The same prompt produces outputs that vary this much, depending on the random number seed.

  • image
    • These 18 images were all generated from the same prompt.
    • Even the compositions are all over the place.
  • Why is this?
    • The initial value is just a random value.
    • From there, repeated denoising steps move it closer to "the distribution of images that humans can find meaning in."
    • Since the starting points are completely different, the end points are completely different too.
    • image
      • This schematic is drawn in two dimensions, but in reality there are on the order of 20,000 dimensions.
      • So the "surface" is very wide (curse of dimensionality).
      • Almost all of the distribution lies on that surface.
        • We cannot directly observe the distribution of "the set of things that humans can recognize as pictures" in 20,000-dimensional space.
        • In high-dimensional space, a normal distribution is concentrated almost uniformly on a thin shell around a hypersphere (see the sketch below).
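
This concentration is easy to check numerically. The following minimal sketch (not from the original article; the 20,000-dimension figure is borrowed from the bullet above) draws standard normal samples in d = 20,000 dimensions and shows that nearly every sample has almost exactly the same norm, i.e. it sits in a thin shell around a hypersphere of radius √d:

```python
import numpy as np

d = 20_000          # dimensionality of the image/latent space (illustrative figure)
n = 1_000           # number of samples
rng = np.random.default_rng(0)

# Draw n points from a standard normal distribution in d dimensions.
x = rng.standard_normal((n, d))

# Distance of each point from the origin.
norms = np.linalg.norm(x, axis=1)

print(f"expected radius sqrt(d) = {np.sqrt(d):.1f}")
print(f"mean norm               = {norms.mean():.1f}")
print(f"std of norms            = {norms.std():.2f}")  # tiny relative to the mean
# The relative spread std/mean is on the order of 1/sqrt(2d), so essentially
# every sample lies in a thin shell around the hypersphere of radius sqrt(d).
```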

Another Perspective

  • image
    • Here, the vertical direction corresponds to a "common random number seed."
    • The horizontal direction corresponds to generation from a "common prompt."
  • There are people out there doing trial-and-error prompting with AI image generation services.
    • Those who try different prompts without fixing the seed end up making this kind of observation:
    • image
    • The picture is determined almost entirely by the random seed.
    • Even if you look at this and try to find a connection to prompts A through E, it doesn't make sense.
      • People find meaning even in random events, so some people read a spurious "effect of the prompt" into what is really an "effect of the random seed."
  • Some people have written blog posts saying, “You should experiment with fixing the seed.”
    • Those who have experimented with fixing the seed and changing the prompt make this kind of observation:
    • image
    • This looks like it would be easier to make sense of than the previous example.
      • For example: "E doesn't produce a decent picture, so I'll throw it away."
    • However, trying the same prompts with a different seed gives this result:
      • image
      • I showed it to my wife, who knew nothing about the context.
        • Her impression was that E is quite good in both rows.
        • In the top row, A and E are good; she likes E better.
        • In the bottom row, A, D, and E are good, in the order D > A > E; E is also well drawn, but she doesn't like the face.
      • In other words, even if you "fix the seed and observe with different prompts," all you learn is "how good or bad each prompt is for that particular seed."
        • It's like playing one specific map in a game with random map generation.
          • There's no guarantee that the know-how you gain there will be useful on other maps.
          • Because what is "good" for one seed doesn't carry over to another seed.
        • Seeing a bad result by chance with a certain seed and concluding "this prompt is no good" is a Pessimistic Misconception.
          • If you tried it on more seeds, you might find that "it just happened to be bad the first time; it's actually surprisingly good."
          • But because they underestimate it, they never "try more."
          • This makes it impossible to properly estimate the probability of success (a small numerical sketch follows this list).
            • In the context of reinforcement learning, this is the question of how much weight to put on exploration in order to avoid exactly this kind of pessimistic misconception (the exploration-exploitation trade-off).
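
To put a rough number on how easily that misconception arises, here is a small, purely illustrative simulation (the 30% success rate is a made-up assumption, not a measured value). A prompt that actually produces a good image 30% of the time will look like "no good" to most people who judge it from a single seed, while trying it on several seeds gives a usable estimate of its success probability:

```python
import random

random.seed(0)
p_good = 0.3        # hypothetical true "good image" rate of some prompt
n_people = 10_000   # people who each judge the prompt from one single seed

# Fraction of single-trial judges whose one attempt fails,
# i.e. who walk away convinced the prompt is "no good".
gave_up = sum(random.random() >= p_good for _ in range(n_people)) / n_people
print(f"concluded 'no good' after one try: {gave_up:.0%}")   # roughly 70%

# Trying the same prompt on several seeds gives a far more honest estimate.
n_seeds = 20
hits = sum(random.random() < p_good for _ in range(n_seeds))
print(f"estimated success rate from {n_seeds} seeds: {hits / n_seeds:.0%}")
```
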
  • Observe multiple seeds × multiple prompts.
    • image
    • A has a high chance of producing something reasonably good.
    • E is sometimes wildly off, but sometimes very good.
    • C is just weird.
    • You learn that there are differences in probability distribution like these.
    • This kind of knowledge is much more useful than knowledge gained from seed-fixed trial and error.
      • Though it may become meaningless with future model upgrades.
      • How much of it depends on the particular language model, and how much reflects the structure of language itself?
        • We will know as various models come out in the future.
    • The cat image generation in C3: Computer Created Cats uses Thompson sampling with a Bernoulli reward distribution (reinforcement learning).
      • What is Thompson sampling?
        • For each option, sample from its hypothesized (posterior) distribution and try the option whose sample is the largest.
        • Options with large variance still get tried a reasonable amount.
        • C is not useful and gets automatically dropped, while A and E keep getting new trials.
        • The shape of each distribution is automatically updated as the data from trials accumulates (a minimal code sketch follows this list).
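
As a concrete illustration of the mechanism described above, here is a minimal Beta-Bernoulli Thompson sampling loop. It is a sketch, not the actual C3 code: the prompt names and their "true" success rates are made up, and the human "Like / no Like" feedback is replaced by a coin flip.

```python
import random

random.seed(1)

# Hypothetical prompts; each has an unknown probability of yielding a "good" image.
prompts = ["A", "B", "C", "D", "E"]
true_rate = {"A": 0.30, "B": 0.05, "C": 0.02, "D": 0.15, "E": 0.25}  # made-up values

# Beta(1, 1) prior for every prompt: alpha counts "good" results, beta counts "bad" ones.
alpha = {p: 1 for p in prompts}
beta = {p: 1 for p in prompts}

def generate_and_evaluate(prompt: str) -> bool:
    """Stand-in for 'generate an image and have a human press Like (or not)'."""
    return random.random() < true_rate[prompt]

for step in range(1000):
    # Thompson sampling: draw one sample from each prompt's posterior
    # and try the prompt whose sample is the largest.
    samples = {p: random.betavariate(alpha[p], beta[p]) for p in prompts}
    chosen = max(samples, key=samples.get)

    # Run the trial and update that prompt's posterior with the feedback.
    if generate_and_evaluate(chosen):
        alpha[chosen] += 1
    else:
        beta[chosen] += 1

for p in prompts:
    trials = alpha[p] + beta[p] - 2
    mean = alpha[p] / (alpha[p] + beta[p])
    print(f"prompt {p}: {trials:4d} trials, posterior mean {mean:.2f}")
```

Because samples from a wide (uncertain) posterior occasionally come out on top, unproven prompts keep getting explored, while consistently bad ones like C stop being chosen almost automatically, which is the behavior the bullets above describe.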

Q: I think Thompson sampling needs feedback from evaluating the results, but who is doing that?

  • A: Humans do it.
    • About 1,500 images are generated per day; my wife and I look at them and label them "this one is good, this one is bad," ending up with roughly 100 "good" and 1,400 "bad."
    • At first a human (me) was improving the prompts based on those results, but it became too much of a hassle, so I figured, "with this much data, this can be automated," and automated it.
    • So currently it works like this: if you like something, just press the "Like" button and more good stuff will appear!

Q: You mean it tries more of the prompts associated with the images judged to be good?

  • A: Yes.
    • I'll add more later, since the explanation here is a bit of a jumble.
    • Wrote it up: Thompson Sampling Adoption Process.

Q: Is there any point in refining individual prompts in great depth?

  • A: I think so.
    • If you want to get a good picture, you have to pull out all the stops.
    • Because txt2img is, after all, a method that starts with completely random initial values.
    • If you want more control, you would have to use img2img.

Q: Is it hard to find a good image if you're just doing it casually?

  • A: Hmm, it's hard to say.
    • Without a definition of what constitutes a “good image,” the “probability of getting a good one” is unknown.
    • If you pull a gacha with unknown odds, whether you get a good one or not is just "luck"!
  • Since it is just a random number seed, it is one integer between 0 and 4294967296 (2^32 possible values; see the sketch below).
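
To make the "just an integer" point concrete, here is a minimal PyTorch sketch (not tied to any particular Stable Diffusion library; the (4, 64, 64) latent shape is the typical one for a 512x512 SD 1.x image) showing that fixing the seed fixes the initial noise exactly, and the whole denoising trajectory branches from there:

```python
import torch

def initial_latent(seed: int) -> torch.Tensor:
    """Initial noise for one image in a typical SD latent space of shape (4, 64, 64)."""
    generator = torch.Generator().manual_seed(seed)
    return torch.randn((1, 4, 64, 64), generator=generator)

a = initial_latent(42)
b = initial_latent(42)    # same seed  -> bit-identical starting point
c = initial_latent(43)    # other seed -> completely different noise

print(torch.equal(a, b))  # True: same prompt + same seed reproduces the same start
print(torch.equal(a, c))  # False: everything downstream of the denoising differs
```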

This page is auto-translated from [/nishio/Stable Diffusionのシードとプロンプトの関係](https://scrapbox.io/nishio/Stable Diffusionのシードとプロンプトの関係) using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.