9/1

@shunji_umetani: What I thought when I heard about people using “spells” to output the picture they want: “Oh, the parameter-tuning craftsmen have grown up.” Is a spell optimization problem about to explode onto the scene?

@shunji_umetani: As many researchers know, parameter tuning is an addiction that melts time endlessly. It is a very dangerous occupation: it is addictive and eats your time, yet you hardly learn any specialized skills from it. It would be fine if repeated trial and error in parameter tuning led to studying the internal principles of the software, but that doesn’t happen very often…

9/3

@nishio: I’m starting to feel that parameter tuning or prompt engineering is less a “job” or “profession” and more in the same family as playing Factorio or something. No? Or is it more like a gacha game where you do some light work, get to pull the gacha, and the SSR is a beautiful illustration? A type of gacha in which the tendency of the summoned characters changes depending on the type of work you do.

Humans have a bug: if the wait is short, they just sit and wait. Stable Diffusion returns results about 40 seconds after you put in the prompt, so you end up waiting without meaning to!

  • That is not a good use of time, so early on I built a system where prompts go into a ring buffer and images are generated unattended.
  • As a result, I accumulated 1,500 images a day, and the new bottleneck became the step where my wife and I go through them and choose the best ones.
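
A minimal sketch of what such a ring buffer of prompts might look like. The diary does not show the real implementation, so everything here is an assumption: the function name `run_ring_buffer`, the `generate` callback (a stand-in for one Stable Diffusion call), and the `max_images` limit that replaces the real system's endless run.

```python
from collections import deque
from typing import Callable, Iterable


def run_ring_buffer(prompts: Iterable[str],
                    generate: Callable[[str], None],
                    max_images: int) -> None:
    """Generate images round-robin from a ring buffer of prompts."""
    ring = deque(prompts)
    if not ring:
        return  # nothing queued, nothing to generate
    for _ in range(max_images):
        prompt = ring[0]   # the prompt currently at the head of the ring
        generate(prompt)   # hypothetical: one image generation call
        ring.rotate(-1)    # move the head to the tail so the next prompt comes up


# Usage with a stand-in generator that only records the prompt it was given.
produced = []
run_ring_buffer(["a cat", "a dog", "a fox"], produced.append, max_images=7)
```

With three prompts and seven generations, each prompt is visited in turn and the ring wraps around twice, which is the "leave it alone and it keeps generating" behaviour described above.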

I made an app to assist in this “look at the image and sort it out” part.

  • The images are tiled like this:
    • image
  • Click on it to see more information in a modal, where you can press “Excellent!” to move the file to a folder with good images.
    • image
  • My wife and I tried it today, and we were able to review the 3,300 images we had accumulated in a realistic amount of time.
  • I think identifying bottlenecks and then destroying them is very Factorio-like.
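
The sorting step behind the “Excellent!” button could be sketched roughly as follows. This is not the actual app: the function names, the folder layout, and the `*.png` pattern are all assumptions made for illustration, and the tiled view and modal are left out entirely.

```python
import shutil
from pathlib import Path
from typing import List


def pending_images(inbox: Path) -> List[Path]:
    """List images still waiting for review, sorted by file name."""
    return sorted(inbox.glob("*.png"))  # assumed: generated images are PNG files


def mark_excellent(image_path: Path, good_dir: Path) -> Path:
    """Move one image into the folder of good images and return its new path."""
    good_dir.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    destination = good_dir / image_path.name
    shutil.move(str(image_path), str(destination))
    return destination
```

A UI would call `pending_images` to build the tiles and `mark_excellent` when the button is pressed; images that are never marked simply stay in the inbox.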

Context of the Gacha

  • The most expensive way to pull the gacha is to “put in a prompt, wait 40 seconds, and look at the result.”
    • That is the case where you spend the 40 seconds staring blankly at the image being generated or at the progress bar.
    • It’s similar to the type of game that shows ads between stages, or comic apps where you earn coins by watching ads so you can read the rest of the story.
  • If you pull 30 gachas at once (= 30 images generated from one prompt), the wait is roughly one pomodoro, so you can get on with your normal work in the meantime.
    • But then you have to check the images and specify a new prompt every pomodoro, otherwise the machine sits idle.
  • So I made it read the prompts from a file and generate from each one in turn, in an infinite loop.
    • Now prompt addition, generation, and review are loosely coupled and can be executed whenever you like.
    • The generating part now keeps producing images at a rate of 1,500 per day without human intervention.
    • In gacha terms, it’s the “Daily Bonus 1500 Gacha.”
    • The bottleneck has moved to the step where we look at the gacha results.
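
One way the file-driven loop could be sketched, assuming one prompt per line in a plain text file. The file format, the function names, and the choice to re-read the file on every pass (so that newly added prompts are picked up without restarting the generator) are assumptions, not details from the diary. The second function only reproduces the arithmetic behind "30 gachas is roughly one pomodoro".

```python
from pathlib import Path
from typing import Iterator


def cycle_prompts(prompt_file: Path) -> Iterator[str]:
    """Yield prompts in turn, re-reading the file at the start of every pass."""
    while True:
        lines = prompt_file.read_text(encoding="utf-8").splitlines()
        prompts = [line.strip() for line in lines if line.strip()]
        if not prompts:
            return  # empty file: stop here; a real loop might sleep and retry
        yield from prompts


def batch_wait_minutes(images: int, seconds_per_image: float) -> float:
    """Total wait for one batch, in minutes."""
    return images * seconds_per_image / 60
```

At 40 seconds per image, `batch_wait_minutes(30, 40)` gives 20 minutes, a little under a 25-minute pomodoro. Because the generator only reads the file, adding prompts, generating, and reviewing can each happen on their own schedule.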

We’re starting to look at the next bottleneck to be eliminated.


This page is auto-translated from /nishio/日記2022-09-09 using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.