Volume 2, Issue 3
Theory of the Frequency Principle for General Deep Neural Networks

Tao Luo, Zheng Ma, Zhi-Qin John Xu & Yaoyu Zhang

CSIAM Trans. Appl. Math., 2 (2021), pp. 484-507.

Published online: 2021-08

  • Abstract

Along with fruitful applications of Deep Neural Networks (DNNs) to realistic problems, empirical studies have recently reported a universal phenomenon, the Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during training. The F-Principle has been very useful in providing both qualitative and quantitative understandings of DNNs. In this paper, we rigorously investigate the F-Principle for the training dynamics of a general DNN at three stages: the initial stage, the intermediate stage, and the final stage. For each stage, a theorem is provided in terms of proper quantities characterizing the F-Principle. Our results are general in the sense that they hold for multilayer networks with general activation functions, general population densities of data, and a large class of loss functions. Our work lays a theoretical foundation for the F-Principle toward a better understanding of the training process of DNNs.
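The low-to-high-frequency learning order described in the abstract is easy to observe numerically. The sketch below is not from the paper: it trains only the output layer of a small tanh network (a linear random-feature simplification of a DNN), with illustrative width, weight scale, and learning rate, and tracks the residual amplitude at a low and a high frequency via the DFT.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target with one low- and one high-frequency component (illustrative choice).
xs = np.linspace(0, 1, 256, endpoint=False)
target = np.sin(2 * np.pi * 1 * xs) + np.sin(2 * np.pi * 8 * xs)

# Random-feature tanh model: only the output weights `a` are trained.
m = 200                            # hidden width
W = rng.normal(0, 4.0, size=m)     # fixed input weights
b = rng.normal(0, 4.0, size=m)     # fixed biases
a = np.zeros(m)                    # trainable output weights

def freq_error(residual, k):
    """Amplitude of frequency k in the residual, via the real DFT."""
    return np.abs(np.fft.rfft(residual))[k]

lr = 0.005                         # small enough for stable full-batch GD
low_err, high_err = [], []
for step in range(2000):
    h = np.tanh(np.outer(xs, W) + b)   # hidden activations, shape (256, m)
    res = h @ a - target               # residual of the current fit
    a -= lr * (h.T @ res) / len(xs)    # gradient step on the squared loss
    low_err.append(freq_error(res, 1))
    high_err.append(freq_error(res, 8))
```

Plotting `low_err` and `high_err` against `step` shows the frequency-1 residual collapsing long before the frequency-8 residual moves appreciably, which is the F-Principle at the initial and intermediate stages of training.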

  • Keywords

Frequency principle, Deep Neural Networks, dynamical system, training process.

  • AMS Subject Headings

68Q32, 68T07, 37N40

  • Copyright

COPYRIGHT: © Global Science Press

  • BibTeX
  • RIS
  • TXT
@Article{CSIAM-AM-2-484,
  author  = {Tao Luo and Zheng Ma and Zhi-Qin John Xu and Yaoyu Zhang},
  title   = {Theory of the Frequency Principle for General Deep Neural Networks},
  journal = {CSIAM Transactions on Applied Mathematics},
  year    = {2021},
  volume  = {2},
  number  = {3},
  pages   = {484--507},
  issn    = {2708-0579},
  doi     = {10.4208/csiam-am.SO-2020-0005},
  url     = {http://global-sci.org/intro/article_detail/csiam-am/19447.html}
}
TY  - JOUR
T1  - Theory of the Frequency Principle for General Deep Neural Networks
AU  - Luo, Tao
AU  - Ma, Zheng
AU  - Xu, Zhi-Qin John
AU  - Zhang, Yaoyu
JO  - CSIAM Transactions on Applied Mathematics
VL  - 2
IS  - 3
SP  - 484
EP  - 507
PY  - 2021
DA  - 2021/08
SN  - 2708-0579
DO  - 10.4208/csiam-am.SO-2020-0005
UR  - https://global-sci.org/intro/article_detail/csiam-am/19447.html
KW  - Frequency principle
KW  - Deep Neural Networks
KW  - dynamical system
KW  - training process
ER  - 

Tao Luo, Zheng Ma, Zhi-Qin John Xu & Yaoyu Zhang. (2021). Theory of the Frequency Principle for General Deep Neural Networks. CSIAM Transactions on Applied Mathematics. 2 (3). 484-507. doi:10.4208/csiam-am.SO-2020-0005