Text-to-Image

Text-to-Image is the most common use case for diffusion models in Automatic 1111. You can easily generate images from text prompts with Auto 1111 SDK as follows:

  1. Load a .safetensors or .ckpt (checkpoint) weights file and initialize a StableDiffusionPipeline:

from auto1111sdk import StableDiffusionPipeline

pipe = StableDiffusionPipeline("model.safetensors")
  2. Pass a prompt to the pipeline to generate an image:

prompt = "A picture of a realistic dog"
output = pipe.generate_txt2img(prompt = prompt)

The output is a list of PIL Image objects. By default, the pipeline generates 1 image, but you can change this with the num_images parameter:

output = pipe.generate_txt2img(prompt = prompt, num_images = 2)
output[0].save("dog1.png")
output[1].save("dog2.png")
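When num_images is larger, saving each image by index quickly gets repetitive. A minimal sketch of a loop-based alternative, using a list of blank PIL images as a stand-in for the pipeline output (the filenames are illustrative):

```python
from PIL import Image

# Stand-in for the pipeline output: a list of PIL Image objects.
# In real use this would be the list returned by pipe.generate_txt2img.
output = [Image.new("RGB", (512, 512)) for _ in range(2)]

# Save each generated image with an indexed filename.
paths = [f"dog{i + 1}.png" for i in range(len(output))]
for image, path in zip(output, paths):
    image.save(path)
```

The file format is inferred from the extension, so swapping ".png" for ".jpg" also works.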

Parameters:

Currently, Auto 1111 SDK supports the following parameters (with their default values):

prompt: str
negative_prompt: str = ''
seed: int = -1
steps: int = 20
height: int = 512
width: int = 512
cfg_scale: float = 7.5
num_images: int = 1
sampler_name: str = 'Euler'

To view a list of all available samplers, see: https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/4384.