Most learning problems are formulated as optimization problems based on an observed data sample (the training set). Optimizing an objective defined from this sample yields an estimator that performs well on the training set. However, the main interest lies in the generalization ability of this estimator, that is, its performance on new observations. With the emergence of large datasets since the 2000s, the connection between the algorithm used and the generalization capacity of the resulting estimator has become a major topic of interest.
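To make this precise with generic notation (an illustration, not taken verbatim from the course text): given a loss $\ell$ and a training sample $(x_1, y_1), \dots, (x_n, y_n)$ drawn from an unknown distribution $\mathcal{D}$, training minimizes the empirical risk, while generalization is measured by the expected risk of the resulting estimator $\hat{f}_n$:

```latex
% Generic notation, used here only for illustration:
% \hat{R}_n is the empirical risk minimized during training,
% R is the expected risk that measures performance on new observations.
\hat{R}_n(f) = \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f(x_i), y_i\bigr),
\qquad
R(f) = \mathbb{E}_{(X,Y)\sim\mathcal{D}}\bigl[\ell\bigl(f(X), Y\bigr)\bigr],
\qquad
R(\hat{f}_n) - \hat{R}_n(\hat{f}_n) \ \text{(generalization gap)}.
```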
Today, the question of generalization remains a major research topic, both in its theoretical and practical aspects. In this course, we will explore both theoretical and heuristic results that help address this problem. Specifically, we will first study several approaches that provide theoretical guarantees on the generalization of algorithms, in particular those based on complexity, stability, and on stopping the optimization early (early stopping, stochastic approximation). In the second part, we will study heuristic approaches and the differences, observed or explained, that arise in the deep learning setting (non-convex and over-parameterized).
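As a purely illustrative sketch (hypothetical code, not course material), the snippet below combines two of the ingredients named above, stochastic gradient descent and early stopping: SGD on a least-squares objective, with the iterate selected by the risk on a held-out validation set.

```python
# Illustrative sketch only: SGD on least-squares, stopped early
# when the validation risk stops improving. All names and the
# synthetic data are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = <w*, x> + noise
n, d = 1000, 50
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_star + 0.5 * rng.normal(size=n)

# Train / validation split used by the early-stopping rule
X_tr, y_tr = X[:800], y[:800]
X_val, y_val = X[800:], y[800:]

def risk(w, X, y):
    """Empirical least-squares risk."""
    return 0.5 * np.mean((X @ w - y) ** 2)

w = np.zeros(d)
step = 0.01  # constant step size (an illustrative choice)
patience, best_val, best_w, bad_epochs = 5, np.inf, w.copy(), 0

for epoch in range(200):
    # One pass of SGD over the training set
    for i in rng.permutation(len(y_tr)):
        grad = (X_tr[i] @ w - y_tr[i]) * X_tr[i]
        w -= step * grad
    # Early stopping: keep the iterate with the best validation risk
    val = risk(w, X_val, y_val)
    if val < best_val:
        best_val, best_w, bad_epochs = val, w.copy(), 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break

print(f"stopped after epoch {epoch + 1}; "
      f"selected iterate has validation risk {risk(best_w, X_val, y_val):.4f}")
```

Here the stopping rule plays the role of a regularizer: limiting the number of passes limits how closely the iterate can fit the training sample.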
Prerequisites: basic knowledge of convex optimization and statistics. Having completed the course on optimization for data science will help in understanding the various algorithms involved.
Non-exhaustive reference list:
- Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, P. Bartlett, S. Mendelson
- The Tradeoffs of Large Scale Learning, L. Bottou, O. Bousquet
- Stability and Generalization, O. Bousquet, A. Elisseeff
- Train faster, generalize better: Stability of stochastic gradient descent, M. Hardt, B. Recht, Y. Singer
- Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n), F. Bach, E. Moulines
- Understanding deep learning requires rethinking generalization, C. Zhang, S. Bengio, M. Hardt, B. Recht, O. Vinyals
- On early stopping in gradient descent learning, Y. Yao, L. Rosasco, A. Caponnetto
- Generalization properties of multiple passes stochastic gradient method, S. Villa
- Competing with the empirical risk minimizer in a single pass, R. Frostig, R. Ge, S. M. Kakade, A. Sidford
- Deep Learning and Generalization, O. Bousquet
- Teaching coordinators: Aymeric Dieuleveut, Hadrien Hendrikx