Stable Diffusion is a latent diffusion model for generating images from text prompts. For developers, it is the most accessible entry point to AI image generation: open weights, free to use commercially (with some model-specific caveats), and runnable on consumer hardware. The model landscape has changed quickly. SD 1.5 (2022) is dated but fast and heavily fine-tuned. SDXL (2023) produces higher-quality images at higher compute cost. SD 3 (2024) improved text rendering and prompt adherence. Flux (2024, Black Forest Labs) currently offers the strongest text-to-image quality of the group for both photorealistic and stylized images. If you are building image generation into a product today, start with Flux for quality or SDXL for the broader fine-tuning ecosystem.
The Model Landscape
Stable Diffusion 1.5 (SD 1.5): Released in 2022 (the 1.5 checkpoint itself was published by Runway, continuing the Stable Diffusion line backed by Stability AI). Probably the most widely fine-tuned image base model to date: CivitAI hosts thousands of fine-tuned SD 1.5 models for specific styles (anime, photorealistic, illustration). Resolution: 512x512. Fast inference (1-3 seconds on a consumer GPU). License: CreativeML Open RAIL-M (allows commercial use, subject to the use-based restrictions listed in the license).
SDXL (Stable Diffusion XL): Released in 2023. Two-stage base + refiner architecture. Outputs at 1024x1024. Significantly better quality than SD 1.5 for photorealistic images. Slower (5-10 seconds on a consumer GPU). Large fine-tuning ecosystem. License: CreativeML Open RAIL++-M.
Stable Diffusion 3 Medium: Released in 2024. 2B parameter model. Better text rendering within images and better prompt adherence than previous versions. License: Stability AI Community License (free for personal and non-commercial use, and for commercial use by organizations under the license's annual revenue threshold, US$1M at the time of writing; larger organizations need an enterprise license).
Flux (Black Forest Labs): Released in 2024 by members of the original Stable Diffusion team after they left Stability AI. Three variants: Flux.1 Schnell (fastest, Apache 2.0), Flux.1 Dev (high quality, non-commercial), and Flux.1 Pro (best quality, commercial use via API only). Flux.1 Schnell currently has the best quality-to-speed ratio among openly licensed image generation models, and its Apache 2.0 license means free commercial use.
Running Locally with ComfyUI
ComfyUI is a node-based UI for Stable Diffusion and related models. It gives precise control over every step of the generation pipeline: model selection, sampler, scheduler, ControlNet, LoRA, upscaling, etc.
Installation:
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
pip install -r requirements.txt
python main.py
ComfyUI runs at localhost:8188. You build generation workflows by connecting nodes. Each node is a step in the pipeline (load model, CLIP encode, sample, decode image, save).
For Flux.1 Schnell, download the model from Hugging Face (black-forest-labs/FLUX.1-schnell) and place it in ComfyUI/models/unet/. The ComfyUI community has ready-made Flux workflows you can import.
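The download step can also be scripted. The following is a minimal sketch rather than an official procedure: it assumes the `huggingface_hub` package is installed, that ComfyUI was cloned into the current directory, and that the checkpoint file in the repository is named `flux1-schnell.safetensors` (verify this against the repository's file listing). The helper names `unet_dir` and `download_flux_schnell` are hypothetical.

```python
from pathlib import Path

REPO_ID = "black-forest-labs/FLUX.1-schnell"  # repository named in the text above
FILENAME = "flux1-schnell.safetensors"         # assumed filename; check the repo page


def unet_dir(comfy_root: str = "ComfyUI") -> Path:
    """Return the folder the text above says to place the Flux weights in."""
    return Path(comfy_root) / "models" / "unet"


def download_flux_schnell(comfy_root: str = "ComfyUI") -> Path:
    """Download the checkpoint into ComfyUI/models/unet/ and return its local path."""
    # Imported lazily so this module can be imported without the dependency installed.
    from huggingface_hub import hf_hub_download

    target = unet_dir(comfy_root)
    target.mkdir(parents=True, exist_ok=True)
    local_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=FILENAME,
        local_dir=str(target),
    )
    return Path(local_path)
```

Calling `download_flux_schnell()` from the directory that contains the ComfyUI checkout fetches the file once. The checkpoint is large, so expect a long download on a first run.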
ComfyUI is the preferred tool for power users who want maximum control over generation. For programmatic use from Python, it also has an API mode.
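As a rough illustration of the API mode, the sketch below queues a workflow on a locally running ComfyUI server using only the standard library. It assumes the server accepts a JSON body of the form `{"prompt": <workflow>}` at `/prompt`, which is how ComfyUI's bundled example script does it, and that the workflow was exported from the UI in API format. The function names and the `workflow_api.json` filename are hypothetical.

```python
import json
import urllib.request


def build_prompt_request(workflow: dict, host: str = "127.0.0.1:8188") -> urllib.request.Request:
    """Build the HTTP request that queues a workflow on a running ComfyUI server."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow_path: str, host: str = "127.0.0.1:8188") -> dict:
    """Load an API-format workflow JSON file, submit it, and return the server's JSON reply."""
    with open(workflow_path, "r", encoding="utf-8") as f:
        workflow = json.load(f)
    with urllib.request.urlopen(build_prompt_request(workflow, host)) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    # Requires ComfyUI to be running at localhost:8188 and the file to exist.
    print(queue_workflow("workflow_api.json"))
```

Splitting request construction from sending keeps the part that touches the network small, which makes the payload easy to inspect or unit-test without a GPU.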
Running Locally with Automatic1111
Automatic1111 (AUTOMATIC1111/stable-diffusion-webui, 140k+ GitHub stars) is the most popular Stable Diffusion UI. Less flexible than ComfyUI but easier to get started with. It has a traditional web UI with settings panels rather than a node graph.
Installation:
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh # Linux/Mac
webui-user.bat # Windows
Automatic1111 supports SD 1.5, SDXL, and many extensions. Less optimized for Flux than ComfyUI.
Using via API
For production applications where you do not want to manage GPU infrastructure:
Replicate (replicate.com): Run any Stable Diffusion or Flux model via API. Pay-per-generation (roughly $0.003-$0.015 per image depending on model). Simplest production option for low to moderate volume.
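import replicate  # pip install replicate; the client reads REPLICATE_API_TOKEN from the environment

# Run Flux.1 Schnell on Replicate and block until the generation finishes.
output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "a photorealistic portrait of a software engineer"},
)
# One entry per generated image: a URL string or a file-like object, depending on the client version.
print(output)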
Modal (modal.com): Run your own Flux/SDXL container on serverless GPU infrastructure. More complex to set up than Replicate but significantly cheaper at scale ($0.0002/second on A10G, ~$0.002 per image at 10 seconds per generation). Best for applications generating 10,000+ images/month.
Stability AI API: Stability AI's own hosted API for its official models (SD3, SDXL). $0.065 per image for SDXL, $0.09 for SD3. Not competitive on price at high volume, but convenient if you specifically want Stability's official models.
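To make the price comparison concrete, here is a small back-of-the-envelope sketch using only the figures quoted above. It is an estimate, not a billing calculator: the Modal number ignores cold starts and idle container time, which the text does not quantify, and all prices change over time. The function and constant names are illustrative.

```python
# Per-image prices quoted in the text above (USD). Replicate is given as a range.
REPLICATE_PER_IMAGE = (0.003, 0.015)
MODAL_RATE_PER_SECOND = 0.0002  # A10G rate quoted above
MODAL_SECONDS_PER_IMAGE = 10    # generation time assumed in the text
STABILITY_PER_IMAGE = {"SDXL": 0.065, "SD3": 0.09}


def modal_cost_per_image(
    rate_per_second: float = MODAL_RATE_PER_SECOND,
    seconds: float = MODAL_SECONDS_PER_IMAGE,
) -> float:
    """Per-image cost on per-second billing: rate multiplied by generation time."""
    return rate_per_second * seconds


def monthly_costs(images_per_month: int) -> dict:
    """Estimated monthly spend per provider at a given image volume."""
    low, high = REPLICATE_PER_IMAGE
    costs = {
        "Replicate (low)": low * images_per_month,
        "Replicate (high)": high * images_per_month,
        "Modal (A10G)": modal_cost_per_image() * images_per_month,
    }
    for model, price in STABILITY_PER_IMAGE.items():
        costs[f"Stability API ({model})"] = price * images_per_month
    return costs


if __name__ == "__main__":
    for name, usd in monthly_costs(10_000).items():
        print(f"{name}: ${usd:,.2f}")
```

At 10,000 images per month these figures work out to about $20 on Modal, $30 to $150 on Replicate, and $650 to $900 on the Stability AI API, which is why the per-second option becomes attractive once volume is steady.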
Licensing Considerations
What you can use commercially:
Flux.1 Schnell: Apache 2.0. Fully commercial. Can be integrated into paid products, run as a service, or used in advertising. No use-based restrictions; the standard Apache 2.0 conditions (keeping the license and notices) still apply.
SD 1.5: CreativeML Open RAIL-M. Commercial use allowed. Cannot be used to generate CSAM or for the other explicitly harmful purposes defined in the license. Some fine-tuned models carry additional restrictions.
SDXL: CreativeML Open RAIL++-M. Similar to SD 1.5. Commercial use allowed with the same restrictions.
Flux.1 Dev: Non-commercial only. Cannot be used in commercial products. Use Flux.1 Schnell or Flux.1 Pro (API) for commercial applications.
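SD3 Medium: Stability AI Community License. Commercial use is free only for organizations under the license's annual revenue threshold (US$1M at the time of writing); above it, a paid Stability AI enterprise license is required. Check the current terms before shipping.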
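If a product lets users or operators pick a model, it can help to encode the list above as data and gate deployment on it. The sketch below is one possible shape, not legal advice: the keys, the `LicenseInfo` type, and the three-way `commercial_use` field are all illustrative, and fine-tuned models may add their own terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseInfo:
    license_name: str
    commercial_use: str  # "yes", "no", or "conditional"
    note: str


# Summary of the licensing list above; re-check each license before relying on it.
LICENSES = {
    "flux.1-schnell": LicenseInfo("Apache 2.0", "yes", "paid products, services, advertising"),
    "sd-1.5": LicenseInfo("CreativeML Open RAIL-M", "yes", "use-based restrictions apply"),
    "sdxl": LicenseInfo("CreativeML Open RAIL++-M", "yes", "same restrictions as SD 1.5"),
    "flux.1-dev": LicenseInfo("non-commercial license", "no", "use Schnell or Pro (API) instead"),
    "sd3-medium": LicenseInfo(
        "Stability AI license", "conditional", "commercial use depends on Stability AI's terms"
    ),
}


def can_ship_commercially(model_key: str) -> bool:
    """True only for models the list above marks as unconditionally commercial."""
    info = LICENSES.get(model_key.lower())
    return info is not None and info.commercial_use == "yes"
```

Treating "conditional" and unknown models as not shippable is a deliberately conservative default: a human has to confirm the license before such a model reaches production.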
Use Cases in Products
Practical integrations I have seen work well:
Avatar generation: Generate stylized user avatars from text descriptions or seed images. Works well with SDXL fine-tuned on portrait styles.
Image upscaling: Use Real-ESRGAN or a similar upscaling model to enhance images uploaded by users. Dramatically improves low-resolution images.
Background generation: For e-commerce, generate new backgrounds for existing product photos. Works well with inpainting.
Concept visualization: For project management and ideation tools, generating concept images from text descriptions. Low-stakes creative application where perfect image quality is not required.