Date/Time: 25/04/2024, 10:30 am - 11:30 am
Dr. Peter Bartlett – Professor, Department of Statistics at University of California, Berkeley
CANCELLED
Title: Optimization in Deep Networks: Convergence of Sharpness-Aware Minimization and the Edge of Stability
Abstract: We consider Sharpness-Aware Minimization (SAM), a gradient-based optimization method for deep networks that has exhibited performance improvements on image and language prediction problems. We show that SAM applied to a convex quadratic objective converges to a cycle about the minimum in the direction with the largest curvature. In the non-quadratic case, these oscillations encourage drift toward wider minima by performing gradient descent on the spectral norm of the Hessian. We relate this behavior to an “edge of stability” phenomenon that has been empirically observed in neural networks trained by gradient descent, where curvature increases to the point of instability. Based on joint work with Phil Long and Olivier Bousquet.
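The convex quadratic claim in the abstract can be illustrated with a small simulation. What follows is a minimal sketch, not the speakers' code. It assumes the commonly used SAM update with a normalized ascent step, w <- w - eta * grad L(w + rho * grad L(w) / ||grad L(w)||), applied to a diagonal quadratic L(w) = 0.5 * sum(c_i * w_i^2). The function names, the curvatures, the step size eta, and the radius rho are all illustrative choices. The predicted amplitude eta * rho * c_1 / (2 - eta * c_1) is derived from the one-dimensional fixed point of this particular update, not quoted from the abstract.

```python
import math


def sam_step(w, curvatures, eta, rho):
    """One SAM step on L(w) = 0.5 * sum(c_i * w_i**2), normalized ascent (assumed variant)."""
    grad = [c * x for c, x in zip(curvatures, w)]
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(w)  # exactly at the minimum: the ascent direction is undefined
    # Ascent step of length rho along the normalized gradient.
    perturbed = [x + rho * g / norm for x, g in zip(w, grad)]
    # Descent step using the gradient evaluated at the perturbed point.
    return [x - eta * c * p for x, c, p in zip(w, curvatures, perturbed)]


def run_sam(w0, curvatures, eta, rho, steps):
    """Apply sam_step repeatedly, starting from w0."""
    w = list(w0)
    for _ in range(steps):
        w = sam_step(w, curvatures, eta, rho)
    return w


# Illustrative values: the first coordinate has the largest curvature.
curvatures = [2.0, 0.5]
eta, rho = 0.1, 0.1

w = run_sam([1.0, 1.0], curvatures, eta, rho, 2000)
w_next = sam_step(w, curvatures, eta, rho)

# Amplitude of the two-cycle implied by the 1D fixed point of this update.
predicted = eta * rho * curvatures[0] / (2.0 - eta * curvatures[0])

print("iterate:     ", w)
print("next iterate:", w_next)
print("predicted amplitude:", predicted)
```

Under these assumptions the low-curvature coordinate decays toward zero, while the high-curvature coordinate keeps flipping sign at a fixed magnitude, which is the cycle about the minimum that the abstract describes.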
Coffee served before lecture at 10am