Convergence of Stochastic Gradient Descent under a Local Łojasiewicz Condition for Deep Neural Networks
Year: 2025
Authors: Jing An, Jianfeng Lu
Journal of Machine Learning, Vol. 4 (2025), Iss. 2 : pp. 89–107
Abstract
We study the convergence of stochastic gradient descent (SGD) for non-convex objective functions. We establish local convergence with positive probability under the local Łojasiewicz condition introduced by Chatterjee [arXiv:2203.16462, 2022] and an additional local structural assumption on the loss landscape. A key component of our proof is to ensure that the entire trajectory of SGD stays inside the local region with positive probability. We also provide examples of finite-width neural networks for which our assumptions hold.
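For orientation, the condition in question is a Polyak-Łojasiewicz-type inequality that holds only on a neighborhood of the initialization. The sketch below states a form consistent with Chatterjee's formulation; the symbols f, x_0, r, alpha, the step size eta, and the sample xi_k are illustrative notation, not the paper's exact statement.

```latex
% Hedged sketch: a local Lojasiewicz (PL-type) condition on a ball
% B(x_0, r) around the initialization x_0, in the spirit of
% Chatterjee [arXiv:2203.16462]; notation and constants are illustrative.
\[
  \alpha \;:=\; \inf_{\substack{x \in B(x_0,\, r) \\ f(x) \neq 0}}
      \frac{\lVert \nabla f(x) \rVert^2}{f(x)} \;>\; 0,
\]
% where f >= 0 is the loss with infimum 0 on B(x_0, r). Under a
% condition of this type, the SGD iterates
\[
  x_{k+1} \;=\; x_k \;-\; \eta\, \nabla f(x_k;\, \xi_k),
\]
% with step size eta and random sample xi_k, can converge to a global
% minimizer inside B(x_0, r) with positive probability, provided the
% trajectory never leaves the ball -- the key step in the paper's proof.
```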
Journal Article Details
Publisher Name: Global Science Press
Language: English
DOI: https://doi.org/10.4208/jml.240724
Published online: 2025-01
Copyright: © Global Science Press
Pages: 19
Keywords: Non-convex optimization, Stochastic gradient descent, Convergence analysis.