2022-08-31 image

2022-08-28 image

Stable Diffusion: trying the image prompt

The original data is a 256x256 image with just the text written on it, and “black cats” is specified as the text prompt. imageimageimage

imageimageimage

strength=0.75(default) image

0.5 image

0.2 image Oh, I misunderstood: this is the strength of the text prompt side (higher values follow the text prompt more and keep less of the input image).

0.99 image

0.9 image 0.8 image 0.85 image

0.88 image

0.87 image
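
As a rough aid for reading the strength sweep above: as far as I recall, the CompVis `scripts/img2img.py` turns `--strength` into a number of noising/denoising steps as `int(strength * ddim_steps)`, with `--ddim_steps` defaulting to 50. The sketch below assumes exactly that and is not taken from this page; the function name `encode_steps` is made up for illustration.

```python
# Sketch only: assumes img2img.py computes t_enc = int(strength * ddim_steps)
# and that --ddim_steps is left at an assumed default of 50.

def encode_steps(strength: float, ddim_steps: int = 50) -> int:
    """Number of sampler steps the init image is noised by, then denoised over."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0.0, 1.0]")
    return int(strength * ddim_steps)


if __name__ == "__main__":
    # Strength values taken from the sweep above.
    for s in (0.2, 0.5, 0.75, 0.87, 0.99):
        print(f"strength={s}: {encode_steps(s)} of 50 steps")
```

Under that assumption, 0.2 re-noises the input by only 10 steps, so most of the input image survives, while 0.99 uses 49 steps and keeps almost nothing of it.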

$ time python scripts/img2img.py --prompt "black cats" --init-img c.png --ckpt sd-v1-4.ckpt --strength 0.87 --n_sample=1
real 0m32.746s
user 0m24.091s
sys 0m5.784s

$ for i in {1..10} ; do python scripts/img2img.py --prompt "black cats" --init-img c.png --ckpt sd-v1-4.ckpt --strength 0.87 --n_sample=1 --seed ${i}; done

imageimageimageimageimageimageimageimageimageimage
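
The shell loop above can also be driven from Python, which makes it easier to vary more than the seed later. A minimal sketch, assuming the same flags as the commands above and that it is run from the Stable Diffusion repository root; `build_command` and `run_seeds` are hypothetical helper names, not part of the repository.

```python
import shlex
import subprocess
import sys
from typing import Iterable, List


def build_command(seed: int, prompt: str = "black cats", init_img: str = "c.png",
                  ckpt: str = "sd-v1-4.ckpt", strength: float = 0.87) -> List[str]:
    """Build the same img2img.py command line as above for one seed."""
    return [
        sys.executable, "scripts/img2img.py",
        "--prompt", prompt,
        "--init-img", init_img,
        "--ckpt", ckpt,
        "--strength", str(strength),
        "--n_sample=1",  # spelled exactly as in the command above
        "--seed", str(seed),
    ]


def run_seeds(seeds: Iterable[int]) -> None:
    """Run img2img once per seed; each call reloads the model, like the shell loop."""
    for seed in seeds:
        subprocess.run(build_command(seed), check=True)


if __name__ == "__main__":
    # Dry run: only print the commands for seeds 1..10, matching {1..10}.
    for seed in range(1, 11):
        print(shlex.join(build_command(seed)))
```

Calling `run_seeds(range(1, 11))` would actually execute the ten runs.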

If I had to choose one, this would be the one, but it’s not quite what I was expecting
 image

imageimageimageimageimageimageimageimageimageimage

imageimageimageimageimageimageimageimageimageimage

Drawing rough instructions. image

imageimageimageimageimage

Is it getting worse?

imageimageimage imageimageimage

Hmmm, it probably doesn’t understand my rough instructions as “instructions for where to place the cat”, but as something like “it’s kind of a mess and this area is black”.

2022-08-31 /villagepump/yuyasurarin.icon from /villagepump/2022/08/28

  • Maybe “black cats” is not enough of a cat element.
  • Do you want to try to specify more poses or something?
    ‱ If there are two cats, you could specify that too.
  • And smudge? I don’t think it should be a pen, just the hardest one you can find.

Redrawn! image

(I’m changing several conditions at the same time.)

  ‱ I set the image size to 512.
    ‱ I saw that the effect of image size on Stable Diffusion is large.

  ‱ --prompt "black cats" --strength 0.88

imageimageimage

imageimageimage That’s good!

I’m going to revise the rough sketch because it didn’t quite convey my intent.

imageimage Er, why? It’s making a lot of assumptions (lol).

In such a case, let’s change the seed and reroll.

imageimageimage imageimageimage

The third one, I see. I had changed the color next to the cat’s body, meaning it to be a ball or something, but the AI interpreted it as “the light blue thing by the cat must be a fish”. That’s good. I’ll add “fish” to the prompt.

imageimageimage imageimageimage

It takes roughly 20 seconds for initialization and 30 seconds per image for processing.
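
To make the cost concrete: with about 20 seconds of initialization and 30 seconds per image, a loop that starts a new process for every image pays the initialization each time. A back-of-the-envelope sketch using those two rough figures; the function names are hypothetical, and it assumes that a single invocation producing several images would scale linearly per image, which was not measured here.

```python
INIT_SECONDS = 20       # rough model-loading time noted above
PER_IMAGE_SECONDS = 30  # rough processing time per image noted above


def loop_total(n_images: int) -> int:
    """Total seconds when every image is a separate invocation (init paid each time)."""
    return n_images * (INIT_SECONDS + PER_IMAGE_SECONDS)


def single_process_total(n_images: int) -> int:
    """Total seconds if one invocation produced all images (init paid once)."""
    return INIT_SECONDS + n_images * PER_IMAGE_SECONDS


if __name__ == "__main__":
    n = 6
    print(f"{n} images, one process each: {loop_total(n)} s")
    print(f"{n} images, single process:   {single_process_total(n)} s")
```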

imageimageimage

imageimageimage

Trying again at size 512 with the image prompt that I thought made things “worse” last time. image

Much better results than last time. It appears that the problem was caused by the small image size, not by how filled-in the image was or how detailed the prompt was.

imageimageimage imageimageimage
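
One way to see why size might matter: Stable Diffusion v1 works on a latent that is downsampled from the pixels, and the sd-v1-4 checkpoint was trained at 512x512 resolution. The sketch below assumes the script's defaults as I remember them (a downsampling factor of 8, 4 latent channels, and width/height rounded down to a multiple of 32 on load); `latent_shape` is a made-up helper name.

```python
from typing import Tuple


def latent_shape(width: int, height: int, factor: int = 8,
                 channels: int = 4) -> Tuple[int, int, int]:
    """Latent (channels, height, width) for an init image of the given pixel size."""
    # Assumed behavior: sizes are rounded down to a multiple of 32 before encoding.
    width -= width % 32
    height -= height % 32
    return (channels, height // factor, width // factor)


if __name__ == "__main__":
    for size in (256, 512):
        print(size, latent_shape(size, size))
```

Under these assumptions a 256x256 input gives the model a 32x32 latent, a quarter of the area of the 64x64 latent that corresponds to its training resolution, which may be part of why the 256 results looked worse.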

Experiment: masking part of an already generated image and regenerating it

image

0.8 imageimage

0.9 imageimage

0.5 This leaves the noise mask visible. imageimage

0.7 imageimage

Experiment with changing the prompt for img2img

When I explained that putting in the input on the left gives the output on the right, I was asked, “You specified the cat, didn’t you?”

  • To be precise, in addition to the image, we specify the prompt “black cat”, a random seed, and a value for how much of the text and image should be mixed together imageimage

Experiment with changing the prompt to something else.

black dog imageimageimage

black rabbit imageimageimage

Examples of trying to give reckless instructions

bicolor cat (only the tip of one tail is white) imageimageimage

tabby cat. Only the tip of the tail has stripes. imageimageimage
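
The prompt experiments above hold the image, seed, and strength fixed and change only the text. A minimal sketch of that bookkeeping; `Img2ImgInputs` and `vary_prompt` are hypothetical names, the file name, seed, and strength are placeholders because the values used for these runs are not recorded above, and nothing here calls Stable Diffusion itself.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Img2ImgInputs:
    """The inputs listed above: image, text prompt, random seed, and mixing value."""
    init_img: str    # path to the input image
    prompt: str      # text prompt
    seed: int        # random seed
    strength: float  # how far the result may move away from the input image


def vary_prompt(base: Img2ImgInputs, prompts: List[str]) -> List[Img2ImgInputs]:
    """Copies of `base` that differ only in the prompt."""
    return [replace(base, prompt=p) for p in prompts]


if __name__ == "__main__":
    # Placeholder values, not the ones actually used for the runs above.
    base = Img2ImgInputs(init_img="c.png", prompt="black cats", seed=1, strength=0.88)
    for run in vary_prompt(base, ["black dog", "black rabbit"]):
        print(run)
```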


This page is auto-translated from [/nishio/Stable diffusionたimg2imgă‚’è©Šă™](https://scrapbox.io/nishio/Stable diffusionたimg2imgă‚’è©Šă™) using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.