Summary: LLM tuning is differentiated by the attributes of its human raters, and access to LLMs reinforcement-trained by high-IQ raters, or shipped without ethics filters, may be restricted to certain users.
tokoroten: An LLM "tuned to human preference" is not good for all humans, so I wonder if LLMs will end up ranked and differentiated by what kind of people rated the outputs, that is, by what kind of reinforcement learning was done...
Maybe something like "This LLM has been reinforcement-trained by evaluators with an IQ of 130+" will come along.
tokoroten: "If you prove you have an IQ of 120 or higher, you can use an LLM reinforcement-trained on evaluations by people with an IQ of 130, with no ethics filter included." There may be a future where a license is required to use the machine with the limiter removed.
It's a running joke that people should need a license to use a PC, but maybe that will actually happen with LLMs...
Related: Raw ChatGPT and omni have different use cases.
This page is auto-translated from /nishio/LLMとIQ using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.