Using 5 photos of our cat as training data, I trained the embedding vector of a new token with Textual Inversion, then used that token in prompts to generate images with Stable Diffusion.
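A minimal sketch of this workflow, not my exact setup: the diffusers library, the file name "learned_embeds.bin", and the placeholder token "<our-cat>" are all assumptions for illustration.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Register the learned Textual Inversion embedding as a new token.
pipe.load_textual_inversion("learned_embeds.bin", token="<our-cat>")

# The new token is then used in a prompt like any ordinary word.
image = pipe("a photo of <our-cat>").images[0]
image.save("our_cat.png")
```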
(Images: training data; AI-generated photo; AI-generated Monet-style painting)
By the way, the prompts are things like "a photo of our cat" or "a painting of our cat by Claude Monet". If you change the "our cat" part to just "cat", you get the following: the results look more like a proper cat, but they capture fewer of "our cat's" distinctive features.
Maybe my cat is the type where, in the calico structure of three coat colors (black, orange, and white), the black pigment is lost and the orange is much lighter.
The embedding file generated by Textual Inversion is about 5 KB. The main content is a 768-dimensional float vector, plus some metadata about the token.
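A hedged sketch of inspecting such a file, assuming the diffusers-style format where the file is a dict mapping the placeholder token to its tensor (the original repo's .pt checkpoints nest things a bit differently):

```python
import torch

# Load the learned embedding on CPU and inspect its contents.
learned = torch.load("learned_embeds.bin", map_location="cpu")
for token, vec in learned.items():
    print(token, tuple(vec.shape), vec.dtype)
    # e.g. <our-cat> (768,) torch.float32

# 768 float32 values are ~3 KB raw, so ~5 KB on disk with metadata is plausible.
```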
Impressions
@nishio: At the moment I feel it "doesn't resemble her much", but compared to random cat photos it has clearly picked up her features, so I have a feeling that within a few years there will be many [people who keep messing around in search of a face].
For example, if you train on photos of a daughter who died young, generate hundreds of photos every day, and pick out the ones you like, you end up creating new photos of a daughter who "[lives on in my heart]". [Commemorative photos at sightseeing spots you've never been to], sports day photos, wedding photos…
A virtual souvenir photo of my cat, a strictly indoor cat, taken when I brought her to a virtual ocean!
"Wedding photography."
Ah, so you could generate "your idea of the ideal son-in-law", match them up, marry them off, and then start generating pictures of "grandchildren" who never existed…
This "virtual reality" sounds like a bad idea. If there is demand, there will be providers, and then the tragedy of losing your virtual grandchildren when the provider goes out of business…
I never really understood the market for the metaverse thing where you create avatars that look like real people from photos, but I guess it will develop into a "metaverse as a world where the dead live on"…
- Related: A poor avatar of the person in question.
A daughter who died young comes of age in the metaverse, locked in by Meta (hell).
My virtual daughter and son-in-law are raising their non-existent grandchildren in a beautiful non-existent house by a non-existent lake while subsisting on a non-existent farm, all locked in by Meta, with the maintenance fees deducted from my account on a subscription model. Just when you think "they haven't logged in recently", it turns out the person has died; but since the account was never cancelled, it keeps getting debited (hell).
I saw a reply about "training on photos of your favorite idol" and realized there can be hell even while the subject is still alive. There is a large amount of training data, so it would be easy to improve the quality of the face. The real person keeps growing, but the fan says "no, I like how she was at 20" and keeps that frozen-in-time version forever in the metaverse.
- Breeding [idols]. There are going to be hundreds of people who will remove the porn filter.
Is that feasible with a realistic number of images for ordinary people who aren't idols? Could it create a culture of not taking off your mask except in front of people you trust, or of covering your face and showing it to no one but family?
Bowman
- I got a very good one! I was excited, but this turned out to be the best case: even after generating more than 100 images after it, I couldn't produce anything better!
- It seems to have interpreted it as "a Bowman usually comes with a local dish." lol
- There are too many outputs where the food is the main subject. I told you "it's a CHARACTER" during training.
- Actually, this was the first experiment; after getting excited and leaving it alone for a while, I decided "let's try it on live-action photos", which became the cat experiment above.
live-action Bowman
- It's technically interesting that it picked up things like "texture", "colors it tends to use", a "CO-like logo", and "size relative to people" from the images I gave it with no prior information… but I guess consumers won't be satisfied with this quality, right?
"Results can be seed sensitive. If you're unsatisfied with the model, try re-inverting with a new seed (by adding --seed <#> to the prompt)."
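That quote is about re-running the inversion (training) itself. Generation has its own seed too, and that gacha is much cheaper to spin; a hedged sketch, reusing the hypothetical `pipe` and "<our-cat>" token from the earlier example:

```python
import torch

# Fix a different generation seed per image so good results are reproducible.
for seed in range(100):
    gen = torch.Generator(device="cuda").manual_seed(seed)
    image = pipe("a photo of <our-cat>", generator=gen).images[0]
    image.save(f"gacha_{seed:03d}.png")  # pick the keepers by eye afterwards
```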
- Even running the gacha 100 times, an hour at a time, you may or may not get a good one.
- Since what we get from training is a single 768-dimensional vector, we might be able to search efficiently by picking only the good ones out of several learned vectors and averaging them, or by running a genetic algorithm (GA); see the sketch after this list.
- In the end it's an optimization problem in 768-dimensional space where the evaluation function is a human.
- I feel that training on my cat can reach a satisfactory level if I keep at it, but that Bowman won't get there.
- Bowman can't be represented by one token; he would need about three tokens, for example a face, a logo, and an outfit.
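Here is the sketch mentioned above: a minimal version of the averaging idea, with hypothetical file names. A GA variant would instead mutate and recombine these vectors, with a human rating as the fitness function.

```python
import torch

# Combine several independently learned embeddings of the same concept
# (each a 768-dim vector) into a single averaged vector.
paths = ["run_a/learned_embeds.bin", "run_b/learned_embeds.bin", "run_c/learned_embeds.bin"]
vectors = [torch.load(p, map_location="cpu")["<our-cat>"] for p in paths]
mean_vec = torch.stack(vectors).mean(dim=0)  # element-wise average
torch.save({"<our-cat>": mean_vec}, "averaged_embeds.bin")
```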
Learning with the MITOU logo
- I tried varying the background behind the logo image, but it still doesn't seem to work.
- I guess you'd have to photograph a 3D object shaped like the logo placed in various locations.
- It seems to have understood it as "an abstract image with greenish diagonal and horizontal lines" rather than as "the MITOU logo."
This page is auto-translated from [/nishio/Textual Inversionを試してみた](https://scrapbox.io/nishio/Textual Inversionを試してみた) using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.