Many people conflate two separate issues about Japanese LLMs, so I drew a diagram.
Some people argue from perspective 1 that "Japanese LLMs are pointless".
Perspective 2
- GPT3 reverses the information density.
- Thoughts on the Japanese Language Model
  - > It is important to have a wide pipe to the "AI that thinks across languages", which will keep growing in the future.
  - > The "AI that thinks across languages" is like a newly discovered oil field, with value gushing out of it.
  - > Users of languages with narrower pipes do not get to enjoy much of that value.
- Either performance improvement will plateau, or performance will keep improving without limit.
- If there is a plateau, then the "difference by size" in perspective 1 shrinks.
- Language Dynamics
  - Currently, GPT4 performs better when you communicate with it in English than in Japanese.
- Is a Japanese language model necessary?
  - > Talk of "another smaller model" is futile, but we need a "tokenizer + alpha layer suitable for Japanese".
One solution is like this.
- Whether this is beneficial remains to be seen.
- A "better to try than to do nothing" mentality.
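The "tokenizer suitable for Japanese" point can be made a little more concrete with a rough sketch. Byte-level tokenizers start from UTF-8 bytes before applying any learned merges, so the raw byte count per character gives a crude upper-bound intuition for how much more expensive Japanese text can be. The sample strings and the helper name `utf8_bytes_per_char` are illustrative assumptions, not anything from the original page, and real tokenizers merge bytes, so actual token counts will differ.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Return the average number of UTF-8 bytes per character in `text`.

    This is only a proxy for tokenizer cost: it measures the raw byte
    sequence a byte-level tokenizer would start from, before any merges.
    """
    return len(text.encode("utf-8")) / len(text)


# Hypothetical sample strings chosen for illustration.
english = "language model"
japanese = "言語モデル"

print(utf8_bytes_per_char(english))   # ASCII letters take 1 byte each
print(utf8_bytes_per_char(japanese))  # these kanji and katakana take 3 bytes each
```

A tokenizer whose vocabulary is learned mostly from English text tends to have fewer merges covering Japanese byte sequences, which is one plausible reading of why a Japanese-oriented tokenizer might help.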
PS
- What if performance does not plateau?
nishio: If we assume that "the larger the scale of the training data, the higher the value", then the total volume of text written in Japanese cannot match the total volume written in English, and the gap that comes from the number of speakers will not shrink.
- This resembles the Meiji-era argument: "If we don't make English the official language, we'll be in trouble, won't we?" (the theory of adopting English as an official language)
This page is auto-translated from /nishio/æ„æŹèȘLLMă«éąăă2ă€ăźćéĄ using DeepL. If you looks something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. Iâm very happy to spread my thought to non-Japanese readers.