A report that the Transformer, an architecture built solely from attention mechanisms with no RNNs and no CNNs, achieves strong results on translation tasks.

We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.
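The building block the abstract refers to is attention; the paper's core operation, scaled dot-product attention, can be sketched as follows (a minimal NumPy sketch; the function name and array shapes here are illustrative, not from the paper's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # numerically stable softmax over each row of scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # each output row is a weighted average of the value vectors
    return weights @ V

# toy example: one query attending over two key/value pairs
Q = np.array([[1.0, 0.0]])
K = np.array([[1.0, 0.0], [0.0, 1.0]])
V = np.array([[1.0, 2.0], [3.0, 4.0]])
out = scaled_dot_product_attention(Q, K, V)
```

Since the attention weights form a probability distribution over the value rows, each output component lies between the corresponding column's minimum and maximum in `V`.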

Commentary (2017-12): http://deeplearning.hatenablog.com/entry/transformer