Summary (by GPT): LLM tuning may become differentiated by the attributes of the raters, and access to LLMs tuned by high-IQ raters or shipped without ethics filters may be restricted.

tokoroten: An LLM "tuned to human preference" isn't good for every human, so I wonder if LLMs will become differentiated by what kind of people ranked the outputs, that is, by what kind of reinforcement learning was done…

Maybe something like "This LLM has been reinforcement-trained by evaluators with an IQ of 130 or higher" will come along.

tokoroten: "If you prove you have an IQ of 120 or higher, you can use an LLM reinforcement-trained on evaluations by people with an IQ of 130, with no ethics filter included." There may be a future where a license is required to use the machine with the limiter removed.

It's a standard joke that a license should be required to use a PC, but maybe that will actually happen with LLMs…

Related: Raw ChatGPT and omni have different use cases.


This page is auto-translated from /nishio/LLMとIQ using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.