This page was titled “Memory and Abstraction” from 2018-10-22 to 2019-11-10. I renamed it because I could never recall that title when I wanted to refer to the page from other places, so I decided it was not an appropriate name. The new title is “dimension reduction attention”, because the page is about putting dimension reduction into the attention mechanism. memoryabstraction
- Multiply both the query and the key by the same dropout mask when reading from memory in an attention mechanism.
- This is effectively equivalent to reducing the dimensionality of both and then comparing them.
- This is the “dimension reduction and comparison” explained in [Similarity of concepts is not distance].
- Memory is recalled through abstraction rather than through simple similarity of query and key.
- In other words, this is the equivalent of analogy or suggestion.
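The idea in the list above can be sketched in a few lines of NumPy. This is only a minimal illustration under my own assumptions, not a tested design: the function name `shared_dropout_attention`, the `keep_prob` parameter, and the scaling by the number of kept dimensions are all hypothetical choices made for the example. It draws one random mask, applies it to both the query and every key, and computes softmax attention weights from the masked dot products.

```python
import numpy as np

def shared_dropout_attention(query, keys, keep_prob=0.5, rng=None):
    """Attention weights where query and keys share one dropout mask.

    query: shape (d,), keys: shape (n, d).
    Returns (weights of shape (n,), boolean mask of shape (d,)).
    All names and the scaling rule are assumptions for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = query.shape[-1]
    # One mask for the whole lookup: the same dimensions are dropped
    # from the query and from every key.
    mask = rng.random(d) < keep_prob
    scores = (keys * mask) @ (query * mask)
    # Scale by the number of surviving dimensions (assumed choice).
    scores = scores / np.sqrt(max(int(mask.sum()), 1))
    # Numerically stable softmax over the memory slots.
    scores = scores - scores.max()
    weights = np.exp(scores)
    return weights / weights.sum(), mask

rng = np.random.default_rng(0)
query = rng.normal(size=8)
keys = rng.normal(size=(5, 8))
weights, mask = shared_dropout_attention(query, keys, rng=rng)

# Masking both sides gives the same scores as projecting both onto
# the kept dimensions and comparing them in that smaller space.
masked_scores = (keys * mask) @ (query * mask)
reduced_scores = keys[:, mask] @ query[mask]
print(np.allclose(masked_scores, reduced_scores))
```

The final comparison is the point of the example: zeroing the same coordinates in query and key yields exactly the dot product of the two vectors restricted to the surviving coordinates, which is why the shared mask behaves like a dimension reduction followed by a comparison. Because the mask is random, each lookup compares the vectors in a different subspace.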
This page is auto-translated from /nishio/次元削減注意 using DeepL. If you find something interesting but the auto-translated English is not good enough to understand, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.