Monte Carlo tree search - Wikipedia
To balance exploitation and exploration, Levente Kocsis and Csaba Szepesvári introduced the UCT (Upper Confidence Bound applied to Trees) algorithm.
- Kocsis, L. and Szepesvári, C. (2006). Bandit based Monte-Carlo planning. In European Conference on Machine Learning (pp. 282-293). Springer, Berlin, Heidelberg.
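As a rough illustration of how UCT trades exploitation against exploration, here is a minimal sketch of the child-selection step. It assumes the common UCB1-style score (mean reward plus `c * sqrt(ln(parent visits) / child visits)`). The names `uct_score` and `select_child`, the `(total_reward, visits)` tuple layout, and the default `c = sqrt(2)` are my own choices for this example, not taken from the paper.

```python
import math


def uct_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1-style score for one child node (hypothetical helper)."""
    if visits == 0:
        # An unvisited child is always tried first.
        return math.inf
    exploitation = total_reward / visits  # average reward so far
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration


def select_child(children, c=math.sqrt(2)):
    """Return the index of the child with the highest score.

    children: list of (total_reward, visits) tuples.
    Assumption: the parent's visit count is the sum of its children's visits.
    """
    parent_visits = sum(visits for _, visits in children)
    scores = [uct_score(w, n, parent_visits, c) for w, n in children]
    return max(range(len(children)), key=scores.__getitem__)


if __name__ == "__main__":
    children = [(7, 10), (1, 2)]
    print(select_child(children))        # exploration bonus included
    print(select_child(children, c=0))   # pure exploitation
```

With `c = 0` the choice is purely greedy on the average reward. A larger `c` gives more weight to children that have been visited only a few times.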
I explain the [Upper Confidence Bound] algorithm in (2.2.3.2-2) UCB1 algorithm.
Related:
This page is auto-translated from [/nishio/Monte Carlo tree search](https://scrapbox.io/nishio/Monte Carlo tree search) using DeepL. If you find something interesting but the auto-translated English is not good enough to understand, feel free to let me know at @nishio_en. I'm very happy to share my thoughts with non-Japanese readers.