Product photography used to mean expensive studios, professional lighting, and hours of post-processing. Now you can generate a polished product shot in seconds. But which AI tool actually delivers usable results?
We ran the same prompt through four leading image generators to find out: Flux 2, Midjourney V7, GPT Image 1.5, and Google Imagen 4. Here’s what we learned.
The Test
We used a straightforward product photography prompt that e-commerce sellers might actually need:
A minimalist product photo of a matte black ceramic coffee mug on a white marble surface. Soft natural lighting from the left. Steam rising from the cup. Clean white background. Professional studio photography style.
This prompt tests several things: material accuracy (matte ceramic vs glossy), lighting interpretation, steam generation, background cleanliness, and overall photorealism.
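If you plan to reuse this prompt across a catalog, it can help to treat each tested element as a separate field. The following is a minimal sketch, not part of our test setup: the `ProductShot` class and its field names are hypothetical and exist only to show how the prompt above breaks into parts.

```python
from dataclasses import dataclass


@dataclass
class ProductShot:
    """Hypothetical container for the parts of a product-photo prompt."""
    subject: str
    surface: str
    lighting: str
    effect: str
    background: str
    style: str

    def to_prompt(self) -> str:
        # Each part becomes one sentence, joined by single spaces.
        sentences = [
            f"A minimalist product photo of {self.subject} on {self.surface}",
            self.lighting,
            self.effect,
            self.background,
            self.style,
        ]
        return " ".join(s.rstrip(".") + "." for s in sentences)


mug = ProductShot(
    subject="a matte black ceramic coffee mug",
    surface="a white marble surface",
    lighting="Soft natural lighting from the left",
    effect="Steam rising from the cup",
    background="Clean white background",
    style="Professional studio photography style",
)
print(mug.to_prompt())
```

Swapping only `subject` (say, for a different mug color) keeps lighting, background, and style constant, which makes results easier to compare across products.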
The Results
Flux 2 Max
Cost: $0.055/image via API, or free if self-hosted
Time: ~8 seconds
Flux delivered the most photorealistic result. The matte ceramic texture looked convincing, with subtle surface variations you’d expect from real pottery. The steam effect was natural, not overdone. The marble surface had appropriate veining without looking AI-generated.
Strengths: Photorealism, material accuracy, natural lighting
Weaknesses: Occasionally struggles with text if you add branding later
Best for: Product shots without text, lifestyle imagery, realistic renders
The open-source advantage here is real. If you have a 24GB VRAM GPU, you can run Flux 2 locally at zero marginal cost. For an e-commerce store generating hundreds of product images, that adds up fast.
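How fast it adds up depends on what the hardware costs you. Here is a rough break-even sketch; the $2,000 GPU price is a placeholder assumption rather than a quoted figure, and `break_even_images` is a hypothetical helper. Only the $0.055 API price comes from our test.

```python
import math


def break_even_images(hardware_cost_usd: float,
                      api_price_usd: float,
                      local_cost_per_image_usd: float = 0.0) -> int:
    """Number of images after which self-hosting is cheaper than the API.

    local_cost_per_image_usd covers electricity and similar running costs;
    it defaults to zero, matching the 'zero marginal cost' simplification.
    """
    saving = api_price_usd - local_cost_per_image_usd
    if saving <= 0:
        raise ValueError("self-hosting never pays off if local cost >= API price")
    return math.ceil(hardware_cost_usd / saving)


# Assumed $2,000 GPU (placeholder) vs. the $0.055/image Flux 2 Max API price.
print(break_even_images(2000.0, 0.055))  # -> 36364
```

Under those assumptions the card pays for itself after roughly 36,000 images, so the local route mainly makes sense for sustained volume or if you already own the hardware.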
Midjourney V7
Cost: $0.033/image on the Standard plan ($30/month), or no extra per-image charge in Relax mode
Time: ~15 seconds in Fast mode
Midjourney produced the most beautiful image, but arguably not the most accurate one. The lighting was more dramatic than requested, with artistic shadows that might work for a coffee brand’s Instagram but not for an Amazon listing. The steam had an almost painterly quality.
Strengths: Aesthetic appeal, cinematic lighting, artistic interpretation
Weaknesses: Often adds artistic flourishes you didn’t ask for, text accuracy at 71%
Best for: Brand imagery, social media, hero shots where beauty matters more than accuracy
The V7 update brought improved coherence (fewer weird hands and artifacts), but Midjourney still interprets prompts more like a creative director than a technician. It makes choices. Sometimes those choices are brilliant. Sometimes they’re not what you needed.
GPT Image 1.5
Cost: $0.04/image via API
Time: ~6 seconds
OpenAI’s GPT Image model achieved the best prompt adherence. Every element we requested showed up as specified: the matte finish was accurate, the lighting came from the left, and the background was truly clean white (not off-white like Midjourney’s).
Strengths: Precise prompt following, excellent text rendering (87% accuracy), consistent results
Weaknesses: Can feel slightly sterile compared to Midjourney’s artistic touch
Best for: Product listings requiring text, technical accuracy, consistent batch generation
The text rendering is the standout feature here. If your product shot needs a visible logo or label, GPT Image 1.5 gets it right far more often than alternatives. DALL-E 3 was deprecated in May 2026, so this is now OpenAI’s flagship image model.
Google Imagen 4
Cost: $0.02/image (Fast) or $0.04/image (Standard)
Time: ~3 seconds (Fast mode)
Imagen 4 split the difference between Flux’s photorealism and GPT Image’s precision. The Fast mode is remarkably quick, making it practical for iterating on prompts. Google’s tight integration with Workspace apps (generate images directly in Slides or Docs) is a genuine workflow advantage.
Strengths: Speed, price-to-quality ratio, text rendering, Google app integration
Weaknesses: Requires Google Cloud setup for API access, some creative limitations
Best for: High-volume generation, text-heavy designs, users already in Google’s ecosystem
The Numbers
| Model | Cost/Image | Speed | Photorealism | Text Accuracy | Prompt Adherence |
|---|---|---|---|---|---|
| Flux 2 Max | $0.055 (or free local) | ~8s | Excellent | Fair | Good |
| Midjourney V7 | ~$0.033 | ~15s | Good | 71% | Fair (artistic interpretation) |
| GPT Image 1.5 | $0.04 | ~6s | Good | 87% | Excellent |
| Imagen 4 Fast | $0.02 | ~3s | Very Good | ~90% | Very Good |
Traditional product photography runs $20-$150+ per image. Even Midjourney’s subscription works out to a fraction of that cost. For small businesses, the economics are compelling.
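To make that concrete, here is a small sketch that prices a batch using the per-image figures from the table. The 500-image catalog size is a hypothetical example, and `batch_cost` is just an illustrative helper.

```python
# Per-image API prices as listed in the table above (USD).
API_PRICE_USD = {
    "Flux 2 Max": 0.055,
    "Midjourney V7": 0.033,
    "GPT Image 1.5": 0.04,
    "Imagen 4 Fast": 0.02,
}

# Low end of the traditional photography range quoted above (USD per image).
TRADITIONAL_LOW_USD = 20.0


def batch_cost(n_images: int, price_per_image: float) -> float:
    """Total cost in USD for n_images, rounded to cents."""
    return round(n_images * price_per_image, 2)


if __name__ == "__main__":
    n = 500  # hypothetical catalog size
    for model, price in API_PRICE_USD.items():
        print(f"{model:<14} ${batch_cost(n, price):>9.2f}")
    print(f"{'Traditional':<14} ${batch_cost(n, TRADITIONAL_LOW_USD):>9.2f} (low end)")
```

At 500 images, every generator in the table comes in under $30, against $10,000 at the cheapest traditional rate.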
What This Means
The “best” tool depends entirely on what you’re making.
For Amazon/Shopify listings: GPT Image 1.5 or Imagen 4. You need consistent results, accurate colors, and clean backgrounds that meet marketplace requirements. The slight sterility is actually an advantage here.
For brand photography: Midjourney V7. When you’re building a visual identity and want images that make people stop scrolling, Midjourney’s artistic interpretation works in your favor.
For high-volume generation: Flux 2 self-hosted. If you’re generating hundreds of product variants and have capable hardware, running locally eliminates per-image costs entirely. The 12B parameter model needs serious VRAM, but the results justify it.
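For text-heavy designs: GPT Image 1.5 or Ideogram 3.0. If your product images need legible labels, logos, or packaging text, GPT Image 1.5 rendered text correctly 87% of the time in our comparison. Ideogram 3.0 was not part of this test, but it is another option to evaluate for typography.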
The Privacy Angle
Self-hosted Flux is the only option here that doesn’t send your prompts to a third-party server. (Using Flux through the hosted API does.) For businesses working with unreleased products, confidential designs, or competitive intelligence, that matters. Midjourney, OpenAI, and Google all process your prompts on their infrastructure.
Running Flux locally means your product designs stay on your hardware. Given that hosted AI services may log prompts and potentially use them as training data, that’s not a trivial consideration.
What You Can Do
- Start with Imagen 4 Fast for quick iteration at $0.02/image. Once you’ve nailed your prompt, switch to a higher-quality model.
- Use GPT Image 1.5 if you need text in your images. Don’t fight Midjourney’s typography struggles.
- Consider self-hosting Flux if you have an RTX 4090 or better. The upfront complexity pays off at scale.
- Match the tool to the task. Hero images for marketing? Midjourney. Hundreds of product variants? Flux local. Quick mockups? Imagen 4 Fast.
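If you script your image pipeline, the matching step can be as simple as a lookup table. This is a sketch of that idea only: the task keys and the `pick_model` helper are hypothetical names, and the mapping just restates the recommendations above.

```python
# Task -> recommended generator, restating the recommendations in this article.
# The task keys are illustrative labels, not part of any tool's API.
RECOMMENDED = {
    "marketplace_listing": "GPT Image 1.5",   # Imagen 4 is the listed alternative
    "brand_hero": "Midjourney V7",
    "bulk_variants": "Flux 2 (self-hosted)",
    "quick_mockup": "Imagen 4 Fast",
    "text_heavy": "GPT Image 1.5",
}


def pick_model(task: str) -> str:
    """Return the recommended generator for a task, or raise on unknown tasks."""
    try:
        return RECOMMENDED[task]
    except KeyError:
        options = ", ".join(sorted(RECOMMENDED))
        raise ValueError(f"unknown task {task!r}; expected one of: {options}") from None


print(pick_model("brand_hero"))
print(pick_model("quick_mockup"))
```

Keeping the mapping in one place means you can update a single entry when a model improves, without touching the rest of the workflow.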
AI image generation is no longer about finding “the best” tool. It’s about building a workflow that uses each model for what it does well. The generators that dominated 2025 have all improved dramatically, and new entrants like Nano Banana 2 are already challenging the incumbents.
Traditional product photography isn’t dead, but for most e-commerce applications, it’s no longer necessary.