nishio When data science gained attention in the corporate world, there were companies that boasted, "We have a lot of data," about piles of garbage data with no defined metrics or assumptions. In the same way, there will be companies that say, "We have a large amount of chat logs, so let's use them with an LLM!"
nishio Just as "a list of numbers with no record of what was measured and under what circumstances" is useless for data analysis, "a list of words with no record of what was being discussed and under what circumstances" is useless for an LLM.
nishio LLMs cannot understand context unless the contextual information is stored in a form that LLMs can understand. If conversations and their context (materials and data) are tied together and stored in a machine-readable form, they can still be put to use with LLMs.
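As a rough illustration of what "conversation and context tied together in a machine-readable form" could look like, here is a minimal sketch in Python. The schema (`ChatRecord`, `ContextRef`), the field names, and the sample message are all hypothetical and chosen only for illustration; the original text does not prescribe any particular format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ContextRef:
    """Pointer to the material under discussion (hypothetical schema)."""
    kind: str      # e.g. "document" or "dataset"
    title: str
    excerpt: str   # the part of the material the message refers to


@dataclass
class ChatRecord:
    """One chat message stored together with its context."""
    speaker: str
    timestamp: str  # ISO 8601 string
    text: str
    context: list[ContextRef] = field(default_factory=list)


def to_jsonl(records):
    """Serialize records as JSON Lines, one self-describing record per line."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)


def from_jsonl(blob):
    """Rebuild records from the JSON Lines text produced by to_jsonl."""
    records = []
    for line in blob.splitlines():
        raw = json.loads(line)
        refs = [ContextRef(**c) for c in raw.pop("context")]
        records.append(ChatRecord(context=refs, **raw))
    return records


def render_for_llm(record):
    """Render a record as plain text that keeps the context next to the words."""
    lines = [f"[{record.timestamp}] {record.speaker}: {record.text}"]
    for ref in record.context:
        lines.append(f"  (refers to {ref.kind} '{ref.title}': {ref.excerpt})")
    return "\n".join(lines)


# Made-up sample data: the message alone would be ambiguous without the context.
record = ChatRecord(
    speaker="alice",
    timestamp="2023-04-01T10:00:00",
    text="This number looks too low.",
    context=[ContextRef("dataset", "Q1 sales", "March revenue: 120")],
)
restored = from_jsonl(to_jsonl([record]))[0]
print(render_for_llm(restored))
```

Without the `context` entry, "This number looks too low" is exactly the kind of "list of words" that tells an LLM nothing; with it, the stored log says which number in which material was meant.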
nishio I think companies that had thought carefully, even before the advent of LLMs, about what form information must be stored in to be usable in the future still have a good chance.
This page is auto-translated from /nishio/LLMが理解できる形で文脈情報を保存 using DeepL. If something looks interesting but the auto-translated English is not good enough to understand, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.