In this presentation, Sourav Panda covers a paper that examines approaches for switching optimizers partway through training. The paper is "Improving Generalization Performance by Switching from Adam to SGD" by Keskar and Socher.

Presentation Link: https://psu.mediaspace.kaltura.com/media/Group+Meeting/1_fn2pnmkn