word2vec and KJ method
I had always thought that word2vec could be used to support the KJ method, but I now realize that the KJ method is itself a method for mapping a position in a vector space to a short sentence written on a sticky note. The phase of mapping short sentences to vectors is left to the instruction "place things that seem related close to each other," and the subsequent steps of forming groups and writing nameplates are what current word2vec lacks. How about picking one of the vectors made by word2vec at random and putting its 50 or so nearest neighbors on the screen? Then, if you pick out a few items that belong together and a few that do not, it would be nice if the view were rotated in the direction that separates them the most.
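Here is a minimal sketch of that interaction, assuming pre-trained vectors loadable with gensim's KeyedVectors (the file name vectors.bin is a placeholder) and using Fisher's linear discriminant from scikit-learn as one concrete reading of "a direction that separates them the most"; the "mate" picks below are stand-ins for what a user would select on screen.

```python
# Sketch: random seed word -> 50 nearest neighbors -> rotate the view
# along the axis that best separates user-picked groups.
import numpy as np
from gensim.models import KeyedVectors
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

kv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)  # placeholder path

# Pick one word at random and fetch its 50 nearest neighbors.
seed = np.random.choice(kv.index_to_key)
neighbors = [w for w, _ in kv.most_similar(seed, topn=50)]

# Suppose the user marks some neighbors as "mates" (label 1)
# and some as "not mates" (label 0); these picks are placeholders.
mates = neighbors[:3]
non_mates = neighbors[-3:]

X = np.array([kv[w] for w in mates + non_mates])
y = np.array([1] * len(mates) + [0] * len(non_mates))

# LDA finds the axis along which the two groups separate the most;
# projecting all 50 neighbors onto it "rotates" the view as described.
lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y)
projected = lda.transform(np.array([kv[w] for w in neighbors]))
for w, p in sorted(zip(neighbors, projected[:, 0]), key=lambda t: t[1]):
    print(f"{p:+.3f}  {w}")
```

LDA is only one choice of projection; even the difference of the two group means would give a usable separating axis. The point is the same either way: the user's picks supervise how the view is rotated.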
-
word2vec maps the meaning of a word to a vector
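As a concrete illustration (a sketch using gensim with a toy corpus of my own; any word2vec implementation would do):

```python
# Sketch: with gensim, each word is mapped to a dense vector.
from gensim.models import Word2Vec

sentences = [["cats", "purr"], ["dogs", "bark"], ["cats", "and", "dogs"]]  # toy corpus
model = Word2Vec(sentences, vector_size=100, min_count=1)
vec = model.wv["cats"]  # a 100-dimensional numpy array
print(vec.shape)        # (100,)
```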
-
The "meaning" of a word changed from a symbol, an element of a discrete set, to a vector, an element of a continuous set.
-
In natural language processing, words with the same written form are treated as identical; they are elements of a discrete set.
- However, this can be interpreted as compression forced by the poor information and communication technology of an earlier stage of development.
- Just as the era when only 16 colors were available looks like a "temporary technological limitation" when viewed from the current era, when a photo can use 160,000 colors per pixel.
- 1 million vocabulary = 20 bits per word
- In contrast, the meaning of a word (the content of the symbol, the signified) is represented by a value of, say, 32 bits x 100 dimensions.
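A quick check of the arithmetic behind these two bullets (taking the 32 bits x 100 dimensions above at face value):

```python
import math

# A symbol drawn from a 1-million-word vocabulary carries about 20 bits.
print(math.log2(1_000_000))  # 19.93...

# A 100-dimensional vector of 32-bit floats carries 3200 bits.
print(32 * 100)  # 3200
```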
-
Related: Signifiant and Signifié.
This page is auto-translated from /nishio/word2vecとKJ法 using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.