Chao Ma & Lexing Ying (2023). Why Self-Attention Is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries. Journal of Machine Learning, 2(3), 194-210. doi:10.4208/jml.221206