Transformer https://arxiv.org/pdf/1706.03762.pdf
In short, it is a set of rotating unit vectors indexed by i, each with its own rotation period.
-
Rotation period: [$ \sin(pos), \cos(pos)]
-
Rotation period: the periods are spaced in equal ratios, that is, as a geometric progression.
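As a rough sketch of the rotating-vector picture, the cited paper defines the encoding as sin(pos / 10000^(2i/d_model)) for even dimensions and the matching cos for odd ones, and describes the wavelengths as a geometric progression from 2π to 10000·2π. The function names `rotation_periods` and `positional_encoding` and the `base` parameter below are hypothetical names chosen for this illustration, not taken from any library.

```python
import math


def rotation_periods(d_model, base=10000.0):
    """Period of each (sin, cos) pair: 2*pi * base**(2i / d_model)."""
    return [2 * math.pi * base ** (2 * i / d_model) for i in range(d_model // 2)]


def positional_encoding(pos, d_model, base=10000.0):
    """Interleaved [sin, cos, sin, cos, ...] values for a single position."""
    enc = []
    for i in range(d_model // 2):
        angle = pos / base ** (2 * i / d_model)
        # Each pair is one unit vector rotating at its own speed.
        enc.extend([math.sin(angle), math.cos(angle)])
    return enc


periods = rotation_periods(8)
# Consecutive periods differ by a constant ratio (geometric progression).
ratios = [periods[k + 1] / periods[k] for k in range(len(periods) - 1)]
print(periods)
print(ratios)
print(positional_encoding(3, 8))
```

With a finite `d_model` the longest period in this sketch is 2π·10000^((d_model-2)/d_model), which approaches 10000·2π as `d_model` grows.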
-
Position is usually thought of as a point on the real number line, but here it is expressed as values that cycle as the position advances. So in an infinitely long sequence, "the same position" will eventually recur, but if the cycle is long enough this causes no practical problem. The Earth's surface also wraps around on itself, yet in everyday life we treat it as a Cartesian coordinate system.
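The cycling can be seen with a single rotating unit vector. This is only a toy sketch: `rotating_unit_vector`, `same_point`, and the period of 1000 are made-up values for illustration and do not come from the paper.

```python
import math


def rotating_unit_vector(pos, period):
    """Point on the unit circle reached after `pos` steps of a full turn per `period`."""
    angle = 2 * math.pi * pos / period
    return (math.sin(angle), math.cos(angle))


def same_point(a, b, tol=1e-9):
    """True when two 2-D points coincide up to floating-point tolerance."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))


period = 1000.0  # hypothetical period, chosen only for this example
wraps_around = same_point(rotating_unit_vector(3, period),
                          rotating_unit_vector(1003, period))
neighbours_differ = not same_point(rotating_unit_vector(3, period),
                                   rotating_unit_vector(4, period))
print(wraps_around, neighbours_differ)
```

Positions exactly one period apart land on the same point, while positions closer together than that stay distinguishable, which is why a long enough cycle is harmless in practice.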
-
Rotational encoding
The encoding is added to the input rather than concatenated with it, but that seems to work fine.
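The difference between the two options is easiest to see in the resulting sizes. The helpers `add_encoding` and `concat_encoding` and the sample embedding values are hypothetical, written only to illustrate the point. The encoding used is the position-0 case, where every sin is 0 and every cos is 1.

```python
def add_encoding(embedding, encoding):
    """Element-wise sum: the result keeps the embedding's dimensionality."""
    if len(embedding) != len(encoding):
        raise ValueError("embedding and encoding must have the same length")
    return [e + p for e, p in zip(embedding, encoding)]


def concat_encoding(embedding, encoding):
    """Concatenation: the result's dimensionality is the sum of both lengths."""
    return list(embedding) + list(encoding)


embedding = [0.5, -1.0, 0.25, 2.0]   # made-up token embedding
encoding = [0.0, 1.0, 0.0, 1.0]      # [sin, cos, sin, cos] at position 0

added = add_encoding(embedding, encoding)
concatenated = concat_encoding(embedding, encoding)
print(len(added), len(concatenated))
```

Adding keeps the model width unchanged, whereas concatenating would widen every later layer.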
This page is auto-translated from [/nishio/Positional Encoding](https://scrapbox.io/nishio/Positional Encoding) using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to share my thoughts with non-Japanese readers.