Trying img2img with Stable Diffusion

2022-08-31

2022-08-28

Trying the image prompt in Stable Diffusion. The original data is a 256x256 image with just the text written on it, and "black cats" is specified as the text prompt.

strength=0.75(default)

0.5

0.2 Oh, I misunderstood: this value is the strength on the text prompt side.

0.99

0.9

0.8

0.85

0.88

0.87
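For orientation, here is a minimal sketch of how the strength values above might map to denoising steps. It assumes the img2img script noises the init image for `int(strength * ddim_steps)` steps and then denoises it with the text prompt, with 50 DDIM steps as the default. That is my understanding of the script, not something verified in these notes, and the helper name `encode_steps` is hypothetical.

```python
# Sketch only. Assumption: the init image is noised for
# int(strength * ddim_steps) steps, then denoised under the text prompt.
def encode_steps(strength: float, ddim_steps: int = 50) -> int:
    """Return the assumed number of noising/denoising steps for a strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0.0, 1.0]")
    return int(strength * ddim_steps)


if __name__ == "__main__":
    # Some of the values tried above.
    for s in (0.2, 0.5, 0.75, 0.87, 0.99):
        print(f"strength={s}: {encode_steps(s)} of 50 steps")
```

Under these assumptions, a higher strength replaces more of the init image with noise, so the text prompt has more influence on the result. That is consistent with the remark at 0.2.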

$ time python scripts/img2img.py --prompt "black cats" --init-img c.png --ckpt sd-v1-4.ckpt --strength 0.87 --n_sample=1
real 0m32.746s
user 0m24.091s
sys 0m5.784s

$ for i in {1..10} ; do python scripts/img2img.py --prompt "black cats" --init-img c.png --ckpt sd-v1-4.ckpt --strength 0.87 --n_sample=1 --seed ${i}; done
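The seed sweep above could also be sketched in Python, which makes it easier to vary other inputs later. The flags are copied as-is from the command above. The helper name `build_img2img_cmd` is hypothetical, and the sketch only builds and prints the commands rather than running them.

```python
import shlex


def build_img2img_cmd(prompt, init_img, strength, seed, ckpt="sd-v1-4.ckpt"):
    """Return the argv list for one img2img run (flags copied from the shell command)."""
    return [
        "python", "scripts/img2img.py",
        "--prompt", prompt,
        "--init-img", init_img,
        "--ckpt", ckpt,
        "--strength", str(strength),
        "--n_sample=1",
        "--seed", str(seed),
    ]


# One command per seed, 1 through 10, with everything else fixed.
commands = [build_img2img_cmd("black cats", "c.png", 0.87, seed)
            for seed in range(1, 11)]

if __name__ == "__main__":
    for cmd in commands:
        # To actually run a command: subprocess.run(cmd, check=True)
        print(shlex.join(cmd))
```

Keeping the arguments as a list avoids shell quoting problems with prompts that contain spaces.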

If I had to choose one, this would be the one, but it’s not quite what I was expecting…

I drew rough instructions.

Did it get worse? Hmm, it probably doesn't understand my rough sketch as "instructions for the placement of the cats", but as something like "it's kind of a mess and this area is black".

2022-08-31 from /villagepump/2022/08/28

Maybe "black cats" alone is not enough of a cat element.
How about specifying more, such as the poses?
- If there are two of them, you could specify that too.
And smudge? I don’t think it should be a pen, just the hardest one you can find.

Redrawn!

(I'm changing several conditions at the same time.)

I set the image size to 512.
- I saw that the effect of image size on Stable Diffusion is large.

--prompt "black cats" --strength 0.88

That’s good!

I'm going to revise the rough sketch because it didn't quite convey my intent.

Er, why? You're making a lot of assumptions, lol.

In a case like this, let's change the seed and reroll. The third one: I see. I drew something like a ball in a different color from the cat's body, but the AI interpreted it as "the light blue thing by the cat must be a fish". That's good. I'll add "fish" to the prompt.

It takes roughly 20 seconds for initialization and 30 seconds per image for processing.
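Taking these rough figures at face value, a small sketch of what rerolling costs. Starting the script once per seed pays the initialization every time, while a hypothetical single process that loads the model once would pay it only once. The constants are the approximate numbers from the note above, and both function names are made up for illustration.

```python
INIT_SECONDS = 20       # rough initialization time noted above
PER_IMAGE_SECONDS = 30  # rough per-image processing time noted above


def separate_runs(n_images: int) -> int:
    """Estimated seconds when the script is started once per image."""
    return n_images * (INIT_SECONDS + PER_IMAGE_SECONDS)


def single_run(n_images: int) -> int:
    """Estimated seconds if one process loaded the model once (hypothetical)."""
    return INIT_SECONDS + n_images * PER_IMAGE_SECONDS


if __name__ == "__main__":
    print(separate_runs(10), single_run(10))
```

For 10 rerolls this gives about 500 seconds versus 320 seconds, so the initialization overhead is noticeable but not dominant.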

Trying again at 512 with the image prompt that I thought had gotten "worse" last time.

Much better results than last time. It appears that the problem was caused by the small image size, not by how the image was filled in or how detailed the prompt was.
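One possible way to see why the size matters: Stable Diffusion v1 works on a latent that is 8x smaller than the image in each dimension, with 4 channels. The sketch below computes the latent shape for both sizes. It assumes those v1 defaults, and that the script rounds the image size down to a multiple of 32. The function name `latent_shape` is hypothetical.

```python
def latent_shape(width: int, height: int, channels: int = 4, downsample: int = 8):
    """Return the assumed (channels, height, width) of the latent for an init image."""
    # Assumption: the image size is rounded down to a multiple of 32 first.
    w = width - width % 32
    h = height - height % 32
    return (channels, h // downsample, w // downsample)


if __name__ == "__main__":
    for size in (256, 512):
        c, h, w = latent_shape(size, size)
        print(f"{size}x{size} image: latent {c}x{h}x{w} = {c * h * w} values")
```

Under these assumptions a 256x256 image gives the model a quarter as many latent values to work with as a 512x512 image. The v1 checkpoints were also trained at 512x512, which may be part of why 256 performed worse here.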

Experiments to mask and regenerate a portion of an image once generated

0.8

0.9

0.5 This leaves the noise of the mask visible.

0.7

Experiment with changing the prompt for img2img

When I explained that you put in the input on the left and get the output on the right, I was asked, "You specified the cat, didn't you?"

To be precise, in addition to the image, we specify the prompt "black cat", a random seed, and a value for how much the text and the image should be mixed together.

Experiment with changing the prompt to something else.

black dog

black rabbit

Examples of giving unreasonable instructions:

bicolor cat (only the tip of the tail is white)

tabby cat (only the tip of the tail has stripes)

This page is auto-translated from [/nishio/Stable diffusionのimg2imgを試す](https://scrapbox.io/nishio/Stable diffusionのimg2imgを試す) using DeepL. If something looks interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.
