SDXL 1.0
A Replicate guide
Or, how I learned to make really weird cats
Stable Diffusion XL 1.0 is a new text-to-image model by Stability AI. It creates beautiful 1024x1024 images with simple prompts.
We’re going to look at how to get the best images by exploring:
- guidance scales
- number of steps
- the scheduler (or sampler) you should use
- what happens at different resolutions
- 🆕 refiners, and how to use them
Jump to the resolution section if you’re just here for weird cats.
Compare settings
Try changing the `scheduler`, `guidance_scale` and `num_inference_steps` to see what happens.
```json
{
  "prompt": "A studio portrait photo of a cat",
  "num_inference_steps": 20,
  "guidance_scale": 7,
  "negative_prompt": "ugly, soft, blurry, out of focus, low quality, garish, distorted, disfigured",
  "seed": 1000,
  "width": 1024,
  "height": 1024,
  "scheduler": "K_EULER"
}
```
Guidance scale
The guidance scale tells the model how closely the output should follow the prompt. Start with a value of about 7.
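Under the hood, this setting is the classifier-free guidance scale: at each denoising step the model makes two noise predictions, one with your prompt and one without, and the final prediction is pushed away from the unconditional one by this factor. A toy sketch with illustrative values (not the real model internals):

```python
def apply_guidance(uncond_pred, cond_pred, guidance_scale):
    # Classifier-free guidance: start from the unconditional prediction
    # and move toward (and past) the conditional one by guidance_scale.
    return [u + guidance_scale * (c - u) for u, c in zip(uncond_pred, cond_pred)]

# A scale of 1 just returns the conditional prediction;
# higher values push the output harder toward the prompt.
print(apply_guidance([0.0, 0.5], [1.0, 0.5], 7.0))
```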
Steps
A larger number of denoising steps increases the quality of the output, but takes longer to generate. Start with about 20 steps. Don’t go too high: beyond a certain point, each extra step helps less and less.
Scheduler
Schedulers (or samplers) define the denoising process. Most will get a decent image in as few as 10 steps with SDXL. Euler and Euler Ancestral give the sharpest and fastest results.
Compare resolutions
SDXL works best at 1024x1024, but what happens when you go bigger or smaller, or use a different aspect ratio?
Try changing `width` and `height` to see what happens.
Aspect ratio
1:1
```json
{
  "prompt": "A studio portrait photo of a cat",
  "num_inference_steps": 50,
  "guidance_scale": 7.5,
  "negative_prompt": "ugly, soft, blurry, out of focus, low quality, garish, distorted, disfigured",
  "seed": 1000,
  "width": 1024,
  "height": 1024,
  "scheduler": "K_EULER"
}
```
Try these dimensions for common aspect ratios:
| Aspect ratio | Resolution |
| --- | --- |
| 1:1 | 1024x1024 |
| 4:3 | 1152x864 |
| 3:2 | 1248x832 |
| 16:9 | 1344x768 |
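What these dimensions have in common is that they stay close to the pixel budget the model was trained at. As a quick sanity sketch (this isn't claimed to be the exact rule SDXL uses, just a check that each pair is divisible by 8 and within a few percent of 1024x1024's pixel count):

```python
RESOLUTIONS = {
    "1:1": (1024, 1024),
    "4:3": (1152, 864),
    "3:2": (1248, 832),
    "16:9": (1344, 768),
}

NATIVE_PIXELS = 1024 * 1024

for name, (w, h) in RESOLUTIONS.items():
    # Dimensions divisible by 8, total pixel count near the native budget.
    assert w % 8 == 0 and h % 8 == 0
    drift = abs(w * h - NATIVE_PIXELS) / NATIVE_PIXELS
    print(f"{name}: {w}x{h}, {drift:.1%} from native pixel count")
```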
Refiner
With SDXL you can use a separate refiner model to add finer detail to your output.
You can use the refiner in two ways:
- one after the other
- as an ‘ensemble of experts’
One after the other
In this mode you take the final output from the SDXL base model and pass it to the refiner. You can define how many steps the refiner takes.
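A config sketch for this mode might look like the following. The values `base_image_refiner` and `refiner_steps` here are assumptions for illustration; check the model's input schema for the exact parameter names:

```json
{
  "prompt": "A studio portrait photo of a cat",
  "num_inference_steps": 50,
  "refiner": "base_image_refiner",
  "refiner_steps": 20
}
```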
Ensemble of experts
In this mode the SDXL base model handles the steps at the beginning (high noise), before handing over to the refining model for the final steps (low noise).
You get a more detailed image from fewer steps.
You can change the point at which that handover happens; we default to 0.8 (80% of the way through the denoising process).
```json
{
  "prompt": "A studio portrait photo of a cat",
  "num_inference_steps": 100,
  "guidance_scale": 7.5,
  "negative_prompt": "ugly, soft, blurry, out of focus, low quality, garish, distorted, disfigured",
  "seed": 1000,
  "width": 1024,
  "height": 1024,
  "scheduler": "K_EULER",
  "refiner": "expert_ensemble_refiner",
  "high_noise_fraction": "0.80"
}
```
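With the example above (100 steps, `high_noise_fraction` of 0.80), the handover can be sketched like this. The exact rounding the real implementation uses isn't claimed here; this just illustrates the split:

```python
def split_steps(num_inference_steps, high_noise_fraction):
    # Sketch: the base model takes the first high_noise_fraction of the
    # steps (high noise), and the refiner takes the remainder (low noise).
    base_steps = int(num_inference_steps * high_noise_fraction)
    refiner_steps = num_inference_steps - base_steps
    return base_steps, refiner_steps

# Base model denoises for 80 steps, then the refiner finishes the last 20.
print(split_steps(100, 0.80))
```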