In this presentation, Mitchell Hoesing presents "Attention Is All You Need" by Vaswani et al. The paper introduces the Transformer architecture, which is the current state of the art on a number of benchmark tasks. Prior to the Transformer, sequence modeling usually meant an RNN; the Transformer instead uses a feed-forward architecture, more similar in spirit to a CNN, that processes all positions of a sequence in parallel rather than one step at a time.
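The core operation of the Transformer is scaled dot-product attention, which the paper defines as softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch of that formula (the function name and toy shapes below are illustrative, not from the paper):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V,
    as defined in Vaswani et al."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted average of value rows

# Toy example: 3 positions, model dimension 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

Because every position attends to every other position in a single matrix multiply, there is no recurrence, which is what allows the Transformer to be parallelized far more readily than an RNN.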

Presentation Link: https://psu.mediaspace.kaltura.com/media/Group+Meeting/1_tsrf12ei