Many people conflate two separate issues about Japanese LLMs, so I drew a picture.
image

Some people argue that “Japanese LLMs are meaningless” from perspective 1.

Perspective 2

  ‱ GPT3 reverses the information density.
  ‱ Thoughts on the Japanese Language Model
    ‱ > It is important to have a wide pipe to the “AI that thinks across languages”, which will keep growing in the future.
    ‱ > “AI that thinks across languages” is like a newly discovered oil field, and value is gushing forth.
    ‱ > Users of languages with narrower pipes do not enjoy much of the value that comes out of this.
  ‱ Either performance improvement will plateau, or performance will keep improving without limit.
    ‱ If it plateaus, then the “difference by size” in 1 shrinks.
  ‱ Language Dynamics
    ‱ Currently, performance is better when communicating with GPT4 in English than in Japanese.
  ‱ Is a Japanese language model necessary?
    ‱ > The point is that building “another smaller model” is futile, but we need a “tokenizer + alpha layer suitable for Japanese”.

One solution is like this.

  • image
  • Whether this is beneficial remains to be seen.
    • A “better to try than to do nothing” mentality.

PS

  ‱ What if performance does not plateau?
  ‱ nishio: If we assume that “the larger the scale of the training data, the higher the value”, then the total volume of text written in Japanese cannot match the total volume written in English, so the gap that comes from the number of speakers will not shrink.


This page is auto-translated from /nishio/æ—„æœŹèȘžLLMă«é–ąă™ă‚‹2ă€ăźć•éĄŒ using DeepL. If something looks interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.