In this presentation, Mitchell Hoesing presents "Attention Is All You Need" by Vaswani et al. The paper introduces the Transformer architecture, which is the current state of the art on a number of benchmark tasks. Prior to the Transformer, sequence modeling usually meant an RNN; the Transformer instead uses a feed-forward architecture, more similar in spirit to a CNN, that processes all positions of a sequence in parallel rather than one step at a time.
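The core operation of the Transformer is scaled dot-product attention, which the paper defines as softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch of that formula (the function name and toy shapes below are illustrative, not from the paper):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V,
    as defined in Vaswani et al."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted average of value rows

# Toy example: 3 positions, model dimension 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

Because every position attends to every other position in a single matrix multiply, there is no recurrence, which is what allows the Transformer to be parallelized far more readily than an RNN.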

Presentation Link: https://psu.mediaspace.kaltura.com/media/Group+Meeting/1_tsrf12ei