We provide sharp path-dependent generalization and excess error guarantees for the full-batch Gradient Descent (GD) algorithm on smooth losses (possibly non-Lipschitz, possibly nonconvex). At the heart of our analysis is a new technique for bounding the generalization error of deterministic symmetric algorithms, which implies that average output stability and a...
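For concreteness, the full-batch GD update analyzed above is $w_{t+1} = w_t - \eta \nabla f(w_t)$, where the gradient is computed over the entire training set at each step. A minimal sketch follows; the quadratic loss, step size `eta`, and iteration count are illustrative assumptions, not part of the paper's setup.

```python
import numpy as np

def full_batch_gd(grad, w0, eta, steps):
    """Full-batch gradient descent: w_{t+1} = w_t - eta * grad(w_t).

    `grad` is the gradient of the empirical risk over the full
    training set (assumed smooth, as in the setting above).
    """
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w = w - eta * grad(w)  # deterministic full-batch update
    return w

# Illustrative example: smooth quadratic loss f(w) = 0.5 * ||w||^2,
# whose gradient is simply w; GD contracts toward the minimizer 0.
w_final = full_batch_gd(lambda w: w, w0=[1.0, -2.0], eta=0.1, steps=200)
```

Because the update is deterministic and symmetric in the training examples, the same trajectory is produced on every run, which is the property the stability argument exploits.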