Search results: 375

MAP546 - ENS Data Challenge (2023-2024)

This is a Data Camp based on the challenges proposed by the ENS: https://challengedata.ens.fr/

You can choose any of the proposed challenges. You will work in groups of three to four students and must provide a solution by the end of March. To help you, we will organize optional coaching sessions on Monday afternoons.

Teaching coordinator: Le Pennec Erwan

Category: Ingénieur 3A / Master 1

MAP547 - Data Science for Business Seminars (2023-2024)

Teaching coordinator: Le Pennec Erwan

Category: Ingénieur 3A / Master 1

MAP553 - Bases de l'Apprentissage Automatique (2023-2024)

Machine learning is a scientific discipline that is concerned with the design and development of algorithms that allow computers to learn from data. A major focus of machine learning is to automatically learn complex patterns and to make intelligent decisions based on them.

This course focuses on the methodology underlying supervised and unsupervised learning, with a particular emphasis on the mathematical formulation of algorithms and the way they can be implemented and used in practice. We will therefore describe some necessary tools from optimization theory and explain how to use them for machine learning. A glimpse of theoretical guarantees, such as upper bounds on the generalization error, is provided during the last lecture.

The methodology will be the main concern of the lectures while some proofs will be done during the PCs. Practice will be done through a challenge.

Teaching coordinator: Bianchi Pascal
Teaching coordinator: El Mhamdi El Mahdi
Teaching coordinator: Klein Thierry
Teaching coordinator: Le Pennec Erwan
Teaching coordinator: Philippenko Constantin

Category: Ingénieur 3A / Master 1

MAP554E - Python for DataSciences (2023-2024)

Category: Ingénieur 3A / Master 1

MAP555 - Traitement du Signal : de Fourier à l'Apprentissage Machine (2023-2024)

Initially known for its successes in telecommunications, signal processing is now part of all domains of data processing that require analysing, extracting and transforming numerical information. This course is an introduction to the field of signal processing and as such requires basic knowledge of analysis (Fourier transform), probability (random variables, random processes) and linear algebra.

The course begins with a presentation of Fourier analysis and analog filtering with some applied examples such as modulation and Fourier optics in astronomy. Next we will introduce signal sampling and digital signal filtering, which has become the de facto standard in practical applications. We will study the very important Fast Fourier Transform (FFT) algorithm and discuss some examples of filtering in image processing. Next we will study the random/stochastic aspects of signals and the optimal linear filtering of signal and noise when modeled as stochastic processes. The modeling of speech will also be taken as an example for the study of auto-regressive models. The last part of the course will briefly introduce several commonly used signal representations, such as the Discrete Cosine Transform (DCT) and the wavelet transforms used in JPEG encoding and image reconstruction. The short-time Fourier transform will also be introduced to model non-stationary signals. Finally, some recent approaches based on machine learning, such as dictionary learning and deep learning signal reconstruction, will be presented.

The course will be completed by practical sessions in Python/Numpy that will allow the students to implement the methods seen in the course on practical problems such as audio signal generation and filtering.
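
As an illustration of the kind of exercise such a session might involve, here is a minimal sketch of a radix-2 FFT and an FFT-based low-pass filter. It is written in plain Python so that it runs without dependencies; the function names (`fft`, `ifft`, `low_pass`), the toy signal and the cutoff are assumptions made for this example and are not taken from the course material.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT computed through the conjugation identity."""
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]


def low_pass(signal, keep_bins):
    """Zero every frequency bin farther than keep_bins from the DC bin."""
    n = len(signal)
    spectrum = fft(signal)
    for k in range(n):
        if min(k, n - k) > keep_bins:
            spectrum[k] = 0j
    return [v.real for v in ifft(spectrum)]


# Toy signal (hypothetical): a slow sine in bin 2 plus a fast sine in bin 20.
N = 64
slow = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
fast = [0.5 * math.sin(2 * math.pi * 20 * t / N) for t in range(N)]
noisy = [s + f for s, f in zip(slow, fast)]

filtered = low_pass(noisy, keep_bins=5)
max_error = max(abs(a - b) for a, b in zip(filtered, slow))
```

With NumPy, the same filter would typically rely on `numpy.fft.fft` and `numpy.fft.ifft` rather than a hand-written transform; the explicit recursion is shown only to make the divide-and-conquer structure of the FFT visible.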

This course will be given in French or English depending on the audience, with lecture material in English.
A working knowledge of Python/Numpy is strongly recommended for the practical sessions.

**Evaluation**: practical session reports and final theoretical+practical exam.

Teaching coordinator: Flamary Rémi

Category: Ingénieur 3A / Master 1

MAP565 - Modélisation aléatoire et statistique des processus (2023-2024)

The aim of this course is to present a rigorous view of modern statistical techniques to answer fundamental questions of modeling and valuation arising in practice.

We will focus on extreme values, multi-dimensional dependencies in data and dynamical aspects, among other things. The methods studied in this course apply to finance and economics, biology (population dynamics, seismology, epidemiology), climatology, network analysis and sports (match data, performance data...).

Teaching coordinator: Rosenbaum Mathieu

Category: Ingénieur 3A / Master 1

MAP566 - Statistics in Action (2023-2024)

The objective of this course is to show students how statistics is used in practice to answer a specific question, by introducing a series of important model-based approaches.

The students will learn to select and use appropriate statistical methodologies and acquire solid and practical skills by working out examples on real-world data sets from various areas including medicine, genomics, ecology, and others.

All analyses will be conducted with the R software. No strong knowledge of R programming is required (only basic scripting).

Website: https://jchiquet.github.io/MAP566/

Course Evaluation: 1 or 2 group projects + 1 PC report + a final exam
Course Language: French with all material in English

Teaching coordinator: Naulet Zacharie

Category: Ingénieur 3A / Master 1

MAP583 - Apprentissage profond de la théorie à la pratique (2023-2024)

Recent developments in neural network approaches (now better known as “deep learning”) have dramatically changed the landscape of several research fields such as image classification, object detection, speech recognition, machine translation, self-driving cars and many more. Due to its promise of leveraging large (sometimes even small) amounts of data in an end-to-end manner, i.e. training a model to extract features by itself and to learn from them, deep learning is increasingly appealing to other fields as well: medicine, time series analysis, biology, simulation...

This course is a deep dive into practical details of deep learning architectures, in which we attempt to demystify deep learning and kick start you into using it in your own field of interest. During this course, you will gain a better understanding of the basis of deep learning and get familiar with its applications. We will show how to set up, train, debug and visualize your own neural network. Along the way, we will be providing practical engineering tricks for training or adapting neural networks to new tasks.

Teaching coordinator: Bianchi Pascal
Teaching coordinator: Scaman Kevin

Category: Ingénieur 3A / Master 1

MAP591 - Signal et image (2023-2024)

Director of the option:
Grégoire Allaire
Email: stages-map591@cmap.polytechnique.fr

Secretariat of the Applied Mathematics Department
Tel.: 01 69 33 46 07
Fax: 01 69 33 46 46.
Email: leyla.marzuk@polytechnique.edu

The analysis and automatic processing of the information in signals or images is a major sector of data processing with significant applications. The range of applications is wide: medical imaging, learning and recognition (images or speech), computer vision, inverse problems, compression, telecommunications...

Internships are offered in France or abroad, in the industrial sector or in research centers.
Our foreign partners are in Europe (UK, Germany, Austria, Switzerland), but also in the United States, Australia...
In any case, the aim is to give students the opportunity to take part in research work or an innovative industrial development, while enriching their knowledge. All these internships include a modeling part and an implementation part, in proportions that depend on the internship.

This option is a continuation of the MAP555 Signal Processing course, but that course is not a prerequisite.

An internship catalog is available online, but we are at your disposal to help you build your dream internship on this topic.

Course language: French

Teaching coordinator: Alouges François

Category: Ingénieur 3A / Master 1

MAP594 - Modélisation probabiliste et statistique (2023-2024)

Professors in charge of the option:
Vincent Bansaye - Probability
Email: bansaye@cmap.polytechnique.fr

Eric Moulines - Statistics
Email: eric.moulines@polytechnique.edu

Aymeric Dieuleveut - Machine Learning
Email: aymeric.dieuleveut@polytechnique.edu

Secretariat of the Applied Mathematics Department
Tel: 01 69 33 46 07
Fax: 01 69 33 46 46
Email: leyla.marzuk@polytechnique.edu

"Probability Modeling and Statistics" internships are generally about building and studying probabilistic models designed to describe and analyse physical, biological, computing or economic phenomena. Depending on the intended goals, models can come from machine learning (statistical learning) and be used as a tool for analysing data and making predictions (estimation, tests, prediction...), or be analysed with probabilistic methods in order to grasp their behaviors and limits. Note that with the global rise of AI, data (big or not) and the design of suitable machine learning algorithms are at the center of many of the internships offered.
The range of applications of these methods is very broad: biology (population dynamics, genetic heritage transmission, phylogenetic selection, biological regulation networks...), communication networks (traffic characterization, probabilistic analysis of protocols, congestion control), insurance (pricing, prediction of reserves), economics (analysis and prediction of macroeconomic aggregates...), etc.

These internships are designed particularly for students who have taken the Applied Mathematics PA (notably the courses "Process and estimate", "Communication network, algorithms and probabilities", "Statistical learning", "Random models in ecology and evolution").

Examples of internships from previous years:

IN FRANCE

EDF
Uncertainty about prediction of electricity consumption.
Analysis of the use of electrical interconnections in Europe.
VEOLIA
Biodiversity modeling in activated sludge basins.
SCHLUMBERGER
Uncertainty assessment for CO2 geological storage integrity.
THOMSON
Navigability with a bias.
TELECOM PARISTECH
Dynamical share of bandwidth in the Internet.
INRIA
Probabilistic methods for the Poisson-Boltzmann equation in molecular dynamics.
INRA
Cyclostationary analysis of the Caledonian climate.
Statistical models for the analysis of biological interaction network.
Study of regrowth dynamics outside crop plots in an agro-ecosystem.
ORANGE
Random walk in the city.

ABROAD

UNIVERSITY OF CALIFORNIA (Berkeley)
Development of flow model based algorithms for highway traffic estimation (Mobile Millennium).
Using mobile phones to estimate travel times in urban networks through the STARMA model.
Traffic forecasting using statistical machine learning.
COLUMBIA UNIVERSITY (New York)
Verification / testing of statistical decadal forecasts.
Subnational Carbon Emissions from Selected Countries.
IMPERIAL COLLEGE (London)
Influence in on-line social networks.
Dissemination of Information in Distributed Networks.
EPFL (Lausanne)
Stability of wireless access networks: impact of topology.
CMM-UNIVERSITY OF CHILE (Santiago)
Mathematical modeling and analysis of metabolic interaction networks.
UNIVERSITA ROMA 3 (Rome)
Mixing time for reversible Markov Chains and applications.
UNIVERSITY OF WATERLOO (Canada)
Bandwidth allocation policies in Wireless Networks.
NRS (Montréal)
Quality of service and pricing of IP networks.

Course language: French

Teaching coordinator: Dieuleveut Aymeric
Teaching coordinator: Gerin Lucas

Category: Ingénieur 3A / Master 1

MAP595 - Mathématiques financières (2023-2024)

Professors in charge of the option:
Stefano De Marco
Email: stages-map595@cmap.polytechnique.fr

Pierre Henry-Labordère
Email: stages-map595@cmap.polytechnique.fr

Nizar Touzi
Email: stages-map595@cmap.polytechnique.fr

Secretariat of the Applied Mathematics Department
Assistant of the department: Leyla Marzuk
T. +33 (0)169334607 – leyla.marzuk@polytechnique.edu

Internships offered in Financial Mathematics generally take place in the research centers of banks or other organizations, such as investment funds. For some internships, it is mandatory to have taken the Year-3 course "Stochastic models in finance". Internships abroad take place either in banks or in academic research centers.

Some internships require several interviews with different teams.

Examples of internships from previous years:

In France

AXA
Optimization of pricing strategy
CMAP
Principal-Agent problem extended to several agents and to jump dynamics. Applications to the structuring and pricing of electricity contracts
CREDIT SUISSE
Genetic methods for portfolio optimization
KEPLER CHEVREUX
Dynamics of order-book data and algorithm detection
SOCIETE GENERALE
Pricing of very long-term structured products
UNIVERSITE PARIS 7
Financial models with arbitrage, application to long-term asset and liability management

Abroad

BLOOMBERG LP
Calibration of a Path Dependent Volatility model to the VIX and S&P markets
BNP PARIBAS London Branch
Models for Overnight indexed swap rates and Libor dynamics, and related Market risk
BRITISH PETROLEUM
Application of machine learning techniques to the least-squares Monte Carlo method
DEUTSCHE BANK London
Pricing and risk management of interest rate derivatives
GOLDMAN SACHS
Predictive flow Analytics & inventory optimization
IMC Trading
Identification of the impact of market participants on the European Futures Market
JANE STREET
The volume synchronized probability of informed trading
JP MORGAN
Capital optimization, funding optimization, derivatives clearing businesses, credit value adjustment
JUMP TRADING INTERNATIONAL
Latent order book in the context of market impact and liquidity drought
MONASH UNIVERSITY
Option pricing with linear market impact
SQUAREPOINT CAPITAL
Multi-period portfolio optimization to manage tail risk in equities
UNIVERSITY OF OXFORD
Numerics for the robust pricing and hedging problem in discrete time

Teaching coordinator: Abi Jaber Eduardo
Teaching coordinator: Djete Mao Fabrice

Category: Ingénieur 3A / Master 1

MAP596A - Internship for Data Science for Business (X) (2023-2024)

Teaching coordinator: Le Pennec Erwan

Category: Ingénieur 3A / Master 1

MAP596B - Internship for Data Science for Business (2023-2024)

Teaching coordinator: Le Pennec Erwan

Category: Ingénieur 3A / Master 1

MAP598 - Internship for Data & Finance (2023-2024)

Teaching coordinator: De Marco Stéfano
Teaching coordinator: Le Pennec Erwan

Category: Ingénieur 3A / Master 1

MAP654I - Practical introduction to machine learning (2023-2024)

The objective of this course is to provide a practical introduction to the field of machine learning. We will discuss the different machine learning problems, from unsupervised (dimensionality reduction, clustering and density estimation) to supervised (classification, regression, ranking). For each method, we will introduce the problem, model it as an optimization problem and discuss the algorithms used to solve it. The practical aspects of each method will also be discussed, along with Python code and existing implementations.

The course will be completed by practical sessions that will allow the students to implement the methods seen in the course on practical problems such as image classification and time series prediction (biomedical and climate data). The objective of the practical session will be not only to learn to use the methods but also to interpret their models and results with respect to the data and the theoretical models.

Course overview:

Introduction

Machine learning problems
Knowing your data
Preprocessing

Unsupervised learning

Dimensionality reduction and
Dictionary learning and collaborative filtering
Clustering and generative modeling
Generative modeling

Supervised learning

Linear models and kernel methods for regression and classification
Nearest neighbors and Bayesian decision
Trees and ensemble methods

ML in practice

Find your problem
Model selection

This course will be given in English with lecture material in English.

Evaluation: practical session reports and an oral exam

Teaching coordinator: Clevenot Stéphanie
Teaching coordinator: Flamary Rémi

Category: Master 2

MAP667T - Biostatistics (2023-2024)

The course aims to introduce both the concepts and the methods used in clinical (or medical) research. It is part of the health science theme of the master program. The course will emphasize the principles and concepts underlying the different goals of clinical research (prediction and causation) and develop specific statistical methods that can be used to plan studies and analyze data in this context.

It will focus on notions and methods that are not covered by other courses of the master (e.g. design, survival analysis, causal inference), which will be tackled from both the theoretical and applied points of view.

Methods will be illustrated on several practical examples.

Category: Master 2

MAP670G - Data Stream Processing (2023-2024)

Course objectives:

Introduction to the concept of data stream processing
Learning the basics of Data Stream Management Systems (DSMS) and how to use them
Understanding the main sampling techniques used for stream processing: sampling, sketching, etc.
Understanding and using the main data stream processing algorithms

Syllabus:

This course deals with the algorithms and software commonly used to process large data streams. It aims at understanding the main difficulties and specificities of this type of data, knowing what different types of streams exist, what theoretical models and practical algorithms can be used to analyze them, and what the right tools are to process these streams.

After an introduction of what data streams are from a conceptual point of view, this class covers the question of data stream processing from two different angles:

A Machine Learning and Data Mining approach to cover the theoretical and algorithmic difficulties of learning from data streams: online learning vs incremental and batch learning, and sampling techniques.
A more practical approach with an introduction to the various systems and software that are used to handle these data.

In terms of organization, the course will alternate between lectures and practical sessions. Finally, during the last class the students will have to present a recent research article of their choice on the subject of data stream processing.
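
To make the sampling techniques mentioned above more concrete, here is a minimal sketch of reservoir sampling (Algorithm R), a classic way to keep a uniform random sample of fixed size from a stream whose length is not known in advance. The function name `reservoir_sample` and its parameters are assumptions made for this illustration; the course material may present the technique differently.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return up to k items drawn uniformly at random from an iterable.

    The stream is read once and only k items are held in memory.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical usage: sample 5 values from a stream of 10,000 integers.
sample = reservoir_sample(range(10_000), k=5, rng=random.Random(0))
```

The memory footprint depends only on `k`, not on the length of the stream, which is why this kind of technique suits unbounded streams.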

Prerequisites:

Basics of SQL
Basics of Machine Learning (supervised and unsupervised)
Knowledge of Java programming is recommended but not mandatory

Evaluation:

The practical sessions will count for ⅔ of the mark
The research paper presentation will count for ⅓ of the mark

Teaching coordinator: Diao Yanlei

Category: Master 2

MAP670H - High-dimensional Matrix Estimation (2023-2024)

Syllabus: Many data-driven learning problems today require analyzing the structure of a high-dimensional matrix with remarkable properties. In recommender systems, this could be a column-sparse matrix or a low-rank matrix, but more sophisticated structures can be considered by combining several notions of sparsity. In graph analysis, popular spectral techniques to detect cliques are based on the analysis of the Laplacian matrix with a specific sparse/low-rank structure. In this course, we will review several mathematical tools useful for developing statistical analysis methods and studying their performance. Such tools include concentration inequalities, convex optimization, perturbation theory and minimax theory.

Numerus Clausus : 30

Class Time: P2 Wednesday morning

Grading – 2.5 ECTS:

Written Exam

Article

Topics covered:

Principal Component Analysis
Spectral clustering
Matrix completion
Robust Statistics
Phase Retrieval
Optimal Transport

Textbook:

Vershynin. High-Dimensional Probability. Cambridge University.
Gross, Recovering low-rank matrices from few coefficients in any basis, 2011, arXiv:0910.1879
Guedon and R. Vershynin. Community detection in sparse networks via Grothendieck’s inequality. Probability Theory and Related Fields, 165(3-4):1025–1049, 2016.
Ma, R. Dudeja, J. Xu, A. Maleki, X. Wang. Spectral Method for Phase Retrieval: an Expectation Propagation Perspective. arXiv:1903.02505
M. Kouw, M. Loog. An introduction to domain adaptation and transfer learning, 2018. arXiv:1812.11806

Teaching coordinator: Lounici Karim

Category: Master 2

MAP670i - Structured Data: Learning and Prediction (2023-2024)

Teaching coordinator: Krzakala Paul

Category: Master 2

MAP670L - Generalisation properties of algorithms in ML (2023-2024)

Most learning problems are formulated as optimization problems based on the observation of a data sample (the training set). Optimizing an objective defined from this sample yields an estimator that performs well on the training set. However, we are generally interested in the generalization ability of this estimator, that is, its performance on a new observation. With the emergence of large amounts of data since the 2000s, the link between the algorithm used and the generalization ability of the associated estimator has become a major topic.

Today, generalization is still a major research question, for both its theoretical and its practical aspects.

This course covers the theoretical and heuristic results that address this problem. More precisely, we will first study the different approaches that provide theoretical guarantees on the generalization of algorithms, in particular approaches based on complexity, on stability and on early stopping methods (early stopping, stochastic approximation). In a second part, we will study heuristic approaches and the differences (explained or observed) in the deep learning setting (non-convex and over-parametrized).

Prerequisites: basic knowledge of convex optimization and statistics. Having taken the course on optimization for data science will help in understanding the algorithms involved.

References (non-exhaustive):

Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, P. Bartlett, S. Mendelson
The Tradeoffs of Large Scale Learning, L. Bottou, O. Bousquet
Stability and Generalization, O. Bousquet, A. Elisseeff
Train faster, generalize better: Stability of stochastic gradient descent, M. Hardt, B. Recht, Y. Singer
Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n), F. Bach, E. Moulines
Understanding deep learning requires rethinking generalization, C. Zhang, S. Bengio, M. Hardt, B. Recht, O. Vinyals
On early stopping in gradient descent learning, Y. Yao, L. Rosasco, A. Caponnetto
Generalization properties of multiple passes stochastic gradient method, S. Villa
Competing with the empirical risk minimizer in a single pass, R. Frostig, R. Ge, S. M. Kakade, A. Sidford
Deep Learning and Generalization, O. Bousquet

Evaluation:

Article presentation / project

Teaching coordinator: Dieuleveut Aymeric

Category: Master 2
