Convergence of Stochastic Gradient Descent under a Local Łojasiewicz Condition for Deep Neural Networks

Jing An; Jianfeng Lu

doi:10.4208/jml.240724

Abstract

We study the convergence of stochastic gradient descent (SGD) for non-convex objective functions. We establish local convergence with positive probability under the local Łojasiewicz condition introduced by Chatterjee [arXiv:2203.16462, 2022] and an additional local structural assumption on the loss landscape. A key component of our proof is to ensure that the entire SGD trajectory stays inside the local region with positive probability. We also provide examples of finite-width neural networks for which our assumptions hold.

Keywords:

Non-convex optimization, stochastic gradient descent, convergence analysis.
