FMI-IMAR Scientific Lecture, Wednesday, 2 April 2025, 3:00 PM, Room 12, Prof. Arnulf Jentzen

Wednesday, 2 April 2025, 3:00 PM, Room 12

Prof. Arnulf Jentzen,
[1] School of Data Science and Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), China
[2] Applied Mathematics: Institute for Analysis and Numerics, Faculty of Mathematics and Computer Science, University of Münster, Germany

Title:
Convergence rates for the Adam optimizer

Abstract:
Stochastic gradient descent (SGD) optimization methods are nowadays the methods of choice for the training of deep neural networks (DNNs) in artificial intelligence systems. In practically relevant training problems, the employed optimization scheme is usually not the plain vanilla SGD method but rather a suitably accelerated and adaptive SGD optimization method such as the famous Adam optimizer. In this work we establish optimal convergence rates for the Adam optimizer for a large class of stochastic optimization problems, covering strongly convex stochastic optimization problems. The key ingredient of our convergence analysis is a new vector field function which we propose to refer to as the Adam vector field. This Adam vector field accurately describes the macroscopic behaviour of the Adam optimization process but differs from the negative gradient of the objective function (the function we intend to minimize) of the considered stochastic optimization problem. In particular, our convergence analysis suggests that Adam typically does not converge to critical points of the objective function (zeros of the gradient of the objective function) of the considered optimization problem but instead converges, with the established rates, to zeros of this Adam vector field. Finally, we present acceleration techniques for Adam in the context of deep learning approximations for partial differential equation (PDE) and optimal control problems. The talk is based on joint works with Steffen Dereich, Thang Do, Robin Graeber, and Adrian Riekert.
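
For readers unfamiliar with the method under discussion, the following is a minimal NumPy sketch of the textbook Adam update (Kingma & Ba, 2015) applied to a toy strongly convex stochastic problem of the kind covered by the convergence result. The toy objective, the learning rate, and all other parameter choices below are illustrative assumptions and are not taken from the references.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8):
    """One iteration of the textbook Adam update (illustrative sketch)."""
    m = beta1 * m + (1 - beta1) * grad        # exponential moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # exponential moving average of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction (t counts from 1)
    v_hat = v / (1 - beta2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Toy strongly convex stochastic problem: minimize E[(theta - X)^2 / 2]
# with X ~ N(1, 0.1^2); the unique minimizer is theta* = 1, and an unbiased
# stochastic gradient at theta is theta - X.
rng = np.random.default_rng(0)
theta, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    grad = theta - rng.normal(1.0, 0.1, size=1)
    theta, m, v = adam_step(theta, grad, m, v, t)
print(theta)  # near the minimizer 1.0
```

With the constant (non-vanishing) learning rate used here, the iterates only settle into a neighbourhood of the limit point rather than converging to it, consistent with the theme of reference [2]; the convergence rates discussed in the talk concern suitably chosen learning rates.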

References:

[1] S. Dereich & A. Jentzen, Convergence rates for the Adam optimizer, arXiv:2407.21078 (2024), 43 pages.

[2] S. Dereich, R. Graeber, & A. Jentzen, Non-convergence of Adam and other adaptive stochastic gradient descent optimization methods for non-vanishing learning rates, arXiv:2407.08100 (2024), 54 pages.

[3] T. Do, A. Jentzen, & A. Riekert, Non-convergence to the optimal risk for Adam and stochastic gradient descent optimization in the training of deep neural networks, arXiv:2503.01660 (2025), 42 pages.

[4] A. Jentzen & A. Riekert, Non-convergence to global minimizers for Adam and stochastic gradient descent optimization and constructions of local minimizers in the training of artificial neural networks, arXiv:2402.05155 (2024), 36 pages, to appear in SIAM/ASA J. Uncertain. Quantif.

Brief bio:
Arnulf Jentzen (born November 1983) has been a presidential chair professor at the Chinese University of Hong Kong, Shenzhen since 2021 and a full professor at the University of Münster since 2019. He began his undergraduate studies in mathematics at Goethe University Frankfurt in Germany in 2004, received his diploma degree there in 2007, and completed his PhD in mathematics there in 2009. The core research topics of his research group are machine learning approximation algorithms, computational stochastics, numerical analysis for high-dimensional partial differential equations (PDEs), stochastic analysis, and computational finance. He currently serves on the editorial boards of several scientific journals, including the Annals of Applied Probability, the Journal of Machine Learning, the SIAM Journal on Scientific Computing, the SIAM Journal on Numerical Analysis, and the SIAM/ASA Journal on Uncertainty Quantification. His research activities have been recognized through several major awards, such as the Felix Klein Prize of the European Mathematical Society (EMS) (2020), an ERC Consolidator Grant from the European Research Council (ERC) (2022), the Joseph F. Traub Prize for Achievement in Information-Based Complexity (2022), and a Frontier of Science Award in Mathematics (jointly with Jiequn Han and Weinan E) from the International Congress of Basic Science (ICBS) (2024). Further details on the activities of his research group can be found at http://www.ajentzen.de.

(talk given within the Probability and Statistics Seminar)
