This problem is called the "exploration-exploitation tradeoff" in the field of reinforcement learning. You cannot find better options if you only choose the option that looks best based on your past experience. That is a lack of exploration. (*1)

On the other hand, if you keep searching for better options and only choose ones you have no experience with, your past experience goes unused. That is a lack of exploitation.

Since exploration and exploitation are in a trade-off relationship, it is necessary to do both in a balanced manner rather than leaning to one side. So how can we make well-balanced choices?
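One common answer (not named on this page, but standard in reinforcement learning) is the ε-greedy strategy: with a small probability ε you explore a random option, and otherwise you exploit the option that currently looks best. Below is a minimal sketch on a toy multi-armed bandit; the three win rates in `true_rates` are hypothetical values chosen only for illustration.

```python
import random

def epsilon_greedy(estimates, epsilon, rng=random):
    """With probability epsilon explore (random arm); otherwise exploit (best-looking arm)."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))  # exploration: try anything
    return max(range(len(estimates)), key=lambda i: estimates[i])  # exploitation

# Toy bandit: hypothetical true win rates for three options (unknown to the agent).
true_rates = [0.3, 0.5, 0.7]
counts = [0, 0, 0]        # how often each arm was chosen
estimates = [0.0, 0.0, 0.0]  # observed average reward per arm

random.seed(0)
for _ in range(10_000):
    arm = epsilon_greedy(estimates, epsilon=0.1)
    reward = 1 if random.random() < true_rates[arm] else 0
    counts[arm] += 1
    # Incremental update of the running average reward for this arm.
    estimates[arm] += (reward - estimates[arm]) / counts[arm]

print(counts)  # the highest-rate arm ends up chosen most often
```

With ε = 0.1, roughly 10% of choices are exploratory, so every option keeps being sampled occasionally, while about 90% of choices exploit accumulated experience. This is the simplest way to execute both sides of the tradeoff at once.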


Footnote *1:

en.icon --- This page is auto-translated from [/nishio/(2.2.3.1) Exploration-exploitation tradeoff](https://scrapbox.io/nishio/(2.2.3.1) Exploration-exploitation tradeoff) using DeepL. If you see something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at [@nishio_en](https://twitter.com/nishio_en). I'm very happy to spread my thoughts to non-Japanese readers.