Allen-Zhu, 2018. Natasha 2: Faster Non-Convex Optimization Than SGD.