Self-Host Your Own AI Image Generator With ComfyUI and FLUX

Stop paying Midjourney $30 a month. Set up FLUX on your own hardware with ComfyUI and generate unlimited images with zero content filters and full privacy.

[Image: Close-up of a graphics card with illuminated fans inside a computer case]

Midjourney charges $30 a month for its Standard plan. Every prompt you type gets logged on their servers. Every image you generate lives on infrastructure you don’t control. And if Midjourney’s content filter decides your prompt is problematic, you’re out of luck — no explanation, no appeal, just a rejected request and a subscription fee that keeps charging.

There’s another option now. FLUX.1 Dev, the open-weight image model from Black Forest Labs, produces output that competes directly with Midjourney v6 — and in some cases beats it, especially for text rendering and photorealism. You can run it on your own hardware, completely offline, with no subscription, no content filters, and no prompts sent anywhere.

This guide walks you through setting up ComfyUI with FLUX.1 Dev on a consumer GPU. The whole process takes about an hour, and the only cost is the hardware you probably already own.

What You Need

Here’s the minimum hardware for a functional setup:

GPU (most important):

  • 8 GB VRAM — Works with quantized GGUF models (Q4/Q5). Expect slower generation but usable quality. RTX 3060 Ti, RTX 4060 Ti.
  • 12 GB VRAM — The practical sweet spot. Runs FLUX.1 Dev in FP8 or GGUF Q8 with good speed. RTX 3060 12GB, RTX 4070.
  • 24 GB VRAM — Full-quality FP16 FLUX with no compromises. RTX 3090, RTX 4090.

System:

  • 16 GB RAM minimum, 32 GB recommended
  • 50 GB free disk space (models are large)
  • Python 3.12 or higher
  • NVIDIA GPU with CUDA support (AMD and Apple Silicon work but with more setup friction)

If you have an RTX 3060 12GB — one of the most common gaming GPUs — you’re already set. They sell used for under $200.
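The VRAM tiers above boil down to a few thresholds. As a quick sketch (thresholds taken from this guide, not hard limits), a throwaway shell helper to map your VRAM to a model variant:

```shell
# Map VRAM (in GB) to the FLUX variant recommended in this guide.
pick_flux_variant() {
  if   [ "$1" -ge 24 ]; then echo "full FP16"
  elif [ "$1" -ge 12 ]; then echo "FP8 or GGUF Q8"
  elif [ "$1" -ge 8  ]; then echo "quantized GGUF (Q4/Q5)"
  else echo "below the minimum in this guide"
  fi
}

pick_flux_variant 12   # -> FP8 or GGUF Q8
```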

Option A: ComfyUI Desktop (Easiest)

If you want the simplest path, ComfyUI now has an official desktop application that handles Python, dependencies, and updates automatically. Download it, install it like any other app, and skip to the model download section below.

The desktop version is a one-click installer that auto-configures your Python environment and keeps itself updated. It’s the right choice if you don’t want to touch a terminal.

Option B: Manual Install (More Control)

If you prefer to manage your own Python environment — or you’re on Linux — the manual route gives you full control.

Step 1: Clone ComfyUI

git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

Step 2: Set Up Python Environment

Use a virtual environment to keep things clean:

python3 -m venv venv
source venv/bin/activate  # Linux/Mac
# venv\Scripts\activate   # Windows

Step 3: Install PyTorch With CUDA

For NVIDIA GPUs with CUDA 12.1:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

For AMD GPUs (ROCm — check pytorch.org for the current wheel index, since the supported ROCm version changes):

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2

For Apple Silicon Macs:

pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu

Step 4: Install ComfyUI Dependencies

pip install -r requirements.txt

Step 5: Start ComfyUI

python main.py

Open your browser to http://127.0.0.1:8188 and you should see the ComfyUI interface.

Download the FLUX.1 Dev Model

This is where your VRAM determines your path.

For 12 GB or More VRAM: FP8 Checkpoint

The single-file FP8 checkpoint is the simplest option. Download flux1-dev-fp8.safetensors from Hugging Face and place it in:

ComfyUI/models/checkpoints/

This is a self-contained file — no extra components needed. It’s slightly lower quality than full FP16 but the difference is hard to spot in practice.
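If you prefer to fetch it from the terminal, Hugging Face serves repository files at a predictable resolve/ URL. A sketch, assuming the file lives in the Comfy-Org/flux1-dev repository mentioned later in this guide (adjust REPO and FILE if the hosting location differs; the download line is commented out because the file is many gigabytes):

```shell
# Build the direct-download URL for the FP8 checkpoint.
# Assumption: hosted in the Comfy-Org/flux1-dev repo referenced in this guide.
REPO="Comfy-Org/flux1-dev"
FILE="flux1-dev-fp8.safetensors"
URL="https://huggingface.co/${REPO}/resolve/main/${FILE}"
echo "$URL"
# wget -P ComfyUI/models/checkpoints/ "$URL"   # uncomment to download
```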

For 8 GB or Less VRAM: GGUF Quantized

GGUF quantization compresses the model significantly while preserving most of the quality. The city96/FLUX.1-dev-gguf repository offers multiple quantization levels:

Quantization   File Size   VRAM Needed   Quality
Q8_0           ~12 GB      ~13 GB        Near-original
Q5_K_S         ~7 GB       ~8 GB         ~95% of original
Q4_K_S         ~5 GB       ~6 GB         Good for most uses

Download your chosen GGUF file and place it in:

ComfyUI/models/unet/

You’ll also need the ComfyUI-GGUF custom nodes. Install them through ComfyUI Manager (the easiest way) or clone them manually:

cd ComfyUI/custom_nodes
git clone https://github.com/city96/ComfyUI-GGUF.git
pip install -r ComfyUI-GGUF/requirements.txt

Additional Required Files

If you chose the GGUF route (or any other standalone diffusion-model file), you'll also need the CLIP text encoders and the VAE. Download these and place them in the appropriate folders:

  • CLIP models — ComfyUI/models/clip/
    • clip_l.safetensors
    • t5xxl_fp16.safetensors (or t5xxl_fp8_e4m3fn.safetensors for lower VRAM)
  • VAE — ComfyUI/models/vae/
    • ae.safetensors

These are available from the Comfy-Org/flux1-dev repository on Hugging Face.

Note: if you used the single-file FP8 checkpoint, the CLIP and VAE are already bundled — you can skip this step.
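If you went the separate-components route, it's worth a quick check that everything landed where ComfyUI will look for it. A small sketch using the paths and filenames from this guide (run it from the directory containing ComfyUI/):

```shell
# Report whether each required model file is in its expected folder.
check_file() { [ -f "$1" ] && echo "ok: $1" || echo "MISSING: $1"; }

check_file ComfyUI/models/clip/clip_l.safetensors
check_file ComfyUI/models/clip/t5xxl_fp16.safetensors
check_file ComfyUI/models/vae/ae.safetensors
```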

Generate Your First Image

  1. Open ComfyUI in your browser (http://127.0.0.1:8188)
  2. In the workflow browser, search for “flux text to image” and load the default workflow
  3. If using GGUF, swap the standard model loader for the GGUF loader node
  4. Type a prompt and hit “Queue Prompt”

Your first generation might take a minute as the model loads into VRAM. Subsequent images will be faster — typically 10-30 seconds on a 12 GB card depending on resolution.
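The browser isn't the only way in: ComfyUI serves an HTTP API on the same port, and a POST to /prompt queues a generation. A rough sketch, assuming you've exported your workflow in the UI's API format first (workflow_api.json is a placeholder filename; the curl line is commented out so this is safe to dry-run):

```shell
# Queue a workflow via ComfyUI's HTTP API: the /prompt endpoint takes the
# exported workflow graph under a "prompt" key.
HOST="http://127.0.0.1:8188"
GRAPH=$(cat workflow_api.json 2>/dev/null || echo '{}')   # placeholder filename
BODY="{\"prompt\": $GRAPH}"
echo "$BODY"   # dry run: show the request body
# curl -s -X POST "$HOST/prompt" -H "Content-Type: application/json" -d "$BODY"
```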

Low VRAM Flags

If you get CUDA out-of-memory errors, restart ComfyUI with memory optimization flags:

# For 8-12 GB VRAM
python main.py --lowvram

# For 6-8 GB VRAM
python main.py --novram

These flags offload parts of the model to system RAM when not actively needed. Generation will be slower, but it works.

Why This Matters

Running image generation locally isn’t just about saving money — though that math works out fast. At $30/month for Midjourney Standard, a used RTX 3090 (~$700) pays for itself in under two years. Every month after that is free.

The bigger wins are privacy and control:

  • No prompt logging. Midjourney records every prompt. Your local setup records nothing unless you choose to.
  • No content filters. Cloud services block prompts without explanation. Locally, you decide what to generate.
  • No account required. No Discord server, no login, no terms of service that change without notice.
  • Offline capability. Once the model is downloaded, you don’t need an internet connection. Generate images on a plane, in a cabin, wherever.
  • Unlimited generations. No fast-hour limits, no relax-mode throttling. Your only constraint is the speed of your GPU.

FLUX.1 Dev’s output quality has reached the point where local generation isn’t a compromise anymore. For workflows that need consistent, private, unlimited image generation — product photography, concept art, design iteration — it’s the better tool.

What You Can Do

  • Start simple. Install ComfyUI Desktop, download the FP8 checkpoint, and generate a few images. You can always dig deeper into GGUF quantization and custom workflows later.
  • Check your GPU. Run nvidia-smi in a terminal to see your VRAM. If it says 8 GB or more, you can run FLUX today.
  • Join the community. The ComfyUI subreddit and the project’s GitHub issues are active and helpful if you hit snags.
  • Explore LoRAs. Once your base setup works, look into LoRA models on Civitai for style-specific fine-tuning — anime, photorealism, specific art styles — all running locally.
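For the GPU check above, this is the relevant nvidia-smi query, wrapped with a fallback so it degrades gracefully on machines without the NVIDIA driver installed:

```shell
# Report GPU name and total VRAM; fall back to a message if nvidia-smi is absent.
if command -v nvidia-smi >/dev/null 2>&1; then
  GPU_INFO=$(nvidia-smi --query-gpu=name,memory.total --format=csv,noheader)
else
  GPU_INFO="no NVIDIA driver found"
fi
echo "$GPU_INFO"
```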

The subscription model for AI image generation made sense when the models were proprietary and the hardware requirements were extreme. Neither of those things is true anymore.