Paper: Pretraining is All You Need for Image-to-Image Translation (arXiv:2205.12952)
Stable-Diffusion-Pokemon-en is an English-language latent text-to-image diffusion model that generates Pokemon-style images from any text prompt.
This model was trained with 🤗 Diffusers, a library for text-to-image diffusion models. For more information about our training method, see train_text_to_image.py.
First, install the package as follows. This package is a modified version of 🤗's Diffusers library for running English Stable Diffusion.
pip install diffusers==0.4.1
Run this command to log in with your HF Hub token if you haven't before:
huggingface-cli login
Running the pipeline with the LMSDiscreteScheduler scheduler:
from diffusers import LMSDiscreteScheduler, StableDiffusionPipeline

# LMS scheduler with the noise schedule passed explicitly
scheduler = LMSDiscreteScheduler(
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    num_train_timesteps=1000,
)

#pretrained_model_name_or_path = "en_model_26000"
pretrained_model_name_or_path = "svjack/Stable-Diffusion-Pokemon-en"
pipe = StableDiffusionPipeline.from_pretrained(
    pretrained_model_name_or_path,
    scheduler=scheduler,
    use_auth_token=True,  # uses the token stored by huggingface-cli login
)
pipe = pipe.to("cuda")

# disable safety_checker
pipe.safety_checker = lambda images, clip_input: (images, False)

imgs = pipe(
    "A cartoon character with a potted plant on his head",
    num_inference_steps=100,
)
image = imgs.images[0]
image.save("output.png")
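The scheduler arguments above (beta_start, beta_end, beta_schedule="scaled_linear", num_train_timesteps) describe the noise schedule. As a rough illustration of what those numbers mean, the sketch below recomputes the schedule in plain Python. It assumes that "scaled_linear" means betas spaced linearly in square-root space and then squared, which is how we understand Diffusers to define it. The helper names scaled_linear_betas and cumulative_alphas are hypothetical and not part of the library.

```python
# Minimal sketch of a "scaled_linear" beta schedule (assumed definition:
# linear in sqrt(beta), then squared). Helper names are illustrative only.

def scaled_linear_betas(beta_start=0.00085, beta_end=0.012, num_train_timesteps=1000):
    """Return one beta per training timestep."""
    lo, hi = beta_start ** 0.5, beta_end ** 0.5
    step = (hi - lo) / (num_train_timesteps - 1)
    return [(lo + i * step) ** 2 for i in range(num_train_timesteps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta): the fraction of signal left at each step."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


betas = scaled_linear_betas()
alphas_cumprod = cumulative_alphas(betas)

# First and last beta match beta_start and beta_end (up to float rounding).
print(len(betas), betas[0], betas[-1])
# The signal fraction starts near 1 and decays towards 0 at the last timestep.
print(alphas_cumprod[0], alphas_cumprod[-1])
```

Under this assumption, the values of beta_start and beta_end fix how little noise is added at the first training timestep and how much at the last. num_inference_steps in the pipeline call only controls how many of those timesteps are visited at sampling time.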