Law of Balance and Stationary Distribution of Stochastic Gradient Descent

By Liu Ziyin and others
Stochastic gradient descent (SGD) is the algorithm used to train neural networks. However, it remains poorly understood how SGD navigates the highly nonlinear and degenerate loss landscape of a neural network. In this work, we prove that the minibatch noise of SGD regularizes the solution towards...
August 13, 2023