Statistical theory of deep learning
Date: 2026-04-13
Recently, a lot of progress has been made in the theoretical understanding of deep artificial neural networks. One very promising direction is the statistical approach, which interprets deep learning as a statistical method and builds on existing techniques in mathematical statistics to derive theoretical error bounds and to understand novel phenomena such as benign overfitting and the regularising effect of dropout. The lectures survey this field and describe future challenges.
Preliminary outline:
Lecture 1 (from approximation to generalisation bounds): Universal approximation theorem, approximation rates for shallow neural networks, Barron spaces, advantages of additional hidden layers, deep ReLU networks.
Lecture 2 (theory of gradient descent in machine learning): Optimisation in machine learning, weight balancing phenomenon, analysis of dropout, benign overfitting, grokking.
Course Slides:
https://jschmidthieber.personalweb.utwente.nl/hangz.pdf
Resources:
For questions, please contact a.j.schmidt-hieber@utwente.nl
