École polytechnique learning platform
This is a Data Camp based on the challenges proposed by the ENS: https://challengedata.ens.fr/
You can choose any of the proposed challenges. You will work in groups of three to four students and must provide a solution by the end of March. To help you, we will organize optional coaching sessions on Monday afternoons.
- Teaching coordinator: Le Pennec Erwan
Machine learning is a scientific discipline that is concerned with the design and development of algorithms that allow computers to learn from data. A major focus of machine learning is to automatically learn complex patterns and to make intelligent decisions based on them.
This course focuses on the methodology underlying supervised and unsupervised learning, with a particular emphasis on the mathematical formulation of algorithms and the way they can be implemented and used in practice. We will therefore describe some necessary tools from optimization theory and explain how to use them for machine learning. A glimpse of theoretical guarantees, such as upper bounds on the generalization error, is provided during the last lecture.
The methodology will be the main concern of the lectures while some proofs will be done during the PCs. Practice will be done through a challenge.
- Teaching coordinator: Bianchi Pascal
- Teaching coordinator: El Mhamdi El Mahdi
- Teaching coordinator: Klein Thierry
- Teaching coordinator: Le Pennec Erwan
- Teaching coordinator: Philippenko Constantin
Initially known for its successes in telecommunications, signal processing is now part of every domain of data processing that requires analysing, extracting and transforming numerical information. This course is an introduction to the field of signal processing and as such requires basic knowledge of analysis (Fourier transform), probability (random variables, random processes) and linear algebra.
The course begins with a presentation of Fourier analysis and analog filtering, with applications such as modulation and Fourier optics in astronomy. Next we will introduce signal sampling and digital signal filtering, which has become the de facto standard in practical applications. We will study the very important Fast Fourier Transform (FFT) algorithm and discuss some examples of filtering in image processing. We will then study the random/stochastic aspects of signals and the optimal linear filtering of signal and noise when they are modeled as stochastic processes. The modeling of speech will also be taken as an example for the study of auto-regressive models. The last part of the course will briefly introduce several commonly used signal representations, such as the Discrete Cosine Transform (DCT) and wavelet transforms, used in JPEG encoding and image reconstruction. The short-time Fourier transform will also be introduced to model non-stationary signals. Finally, some recent approaches based on machine learning, such as dictionary learning and deep learning for signal reconstruction, will be presented.
The course will be completed by practical sessions in Python/Numpy that will allow the students to implement the methods seen in the course on practical problems such as audio signal generation and filtering.
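As an illustration of the kind of exercise such a session might contain, here is a minimal NumPy sketch that generates a two-tone signal and removes the high-frequency tone with an ideal low-pass filter applied in the Fourier domain. The sampling rate, tone frequencies, cutoff and function names are arbitrary choices made for this example; they are not taken from the course material.

```python
import numpy as np


def generate_tones(fs, duration, freqs_hz, amplitudes):
    """Return the time axis and a sum of sine tones sampled at fs Hz."""
    t = np.arange(int(fs * duration)) / fs
    signal = sum(a * np.sin(2 * np.pi * f * t) for f, a in zip(freqs_hz, amplitudes))
    return t, signal


def fft_lowpass(signal, fs, cutoff_hz):
    """Zero every frequency bin above cutoff_hz and transform back."""
    spectrum = np.fft.rfft(signal)                      # FFT of a real signal
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)    # bin frequencies in Hz
    spectrum[freqs > cutoff_hz] = 0.0                   # ideal (brick-wall) low-pass
    return np.fft.irfft(spectrum, n=signal.size)


fs = 8000  # assumed sampling rate in Hz
t, two_tones = generate_tones(fs, 1.0, [440.0, 3000.0], [1.0, 0.5])
filtered = fft_lowpass(two_tones, fs, cutoff_hz=1000.0)
```

Here both tones fall exactly on FFT bins, so `filtered` matches the 440 Hz tone up to rounding error. A brick-wall filter is used only because it is the shortest to write; on signals that are not periodic in the analysis window it produces ringing, which is one reason practical digital filters are designed differently.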
This course will be given in French or English depending on the audience, with lecture material in English.
A working knowledge of Python/Numpy is strongly recommended for the practical sessions.
**Evaluation**: practical session reports and a final theoretical and practical exam.
- Teaching coordinator: Flamary Rémi
The aim of this course is to present a rigorous view of modern statistical techniques for answering fundamental modeling and valuation questions arising in practice.
We will focus on extreme values, multi-dimensional dependencies in data and dynamical aspects, among other things. The application fields of the methods studied in this course include finance and economics, biology (population dynamics, seismology, epidemiology), climatology, network analysis and sport (match data, performance data...).
- Teaching coordinator: Rosenbaum Mathieu
The objective of this course is to show students how statistics is used in practice to answer a specific question, by introducing a series of important model-based approaches.
The students will learn to select and use appropriate statistical methodologies and acquire solid, practical skills by working through examples on real-world data sets from various areas including medicine, genomics, ecology, and others.
All analyses will be conducted with the R software. No strong knowledge of R programming is required (only basic scripting).
Website: https://jchiquet.github.io/MAP566/
Course Evaluation: 1 or 2 group projects + 1 PC report + a final exam
Course Language: French with all material in English
- Teaching coordinator: Naulet Zacharie
Recent developments in neural network approaches (now better known as “deep learning”) have dramatically changed the landscape of several research fields such as image classification, object detection, speech recognition, machine translation, self-driving cars and many more. Due to its promise of leveraging large (sometimes even small) amounts of data in an end-to-end manner, i.e. training a model to extract features by itself and to learn from them, deep learning is increasingly appealing to other fields as well: medicine, time series analysis, biology, simulation...
This course is a deep dive into the practical details of deep learning architectures, in which we attempt to demystify deep learning and kick-start you into using it in your own field of interest. During this course, you will gain a better understanding of the basics of deep learning and get familiar with its applications. We will show how to set up, train, debug and visualize your own neural network. Along the way, we will provide practical engineering tricks for training or adapting neural networks to new tasks.
- Teaching coordinator: Bianchi Pascal
- Teaching coordinator: Scaman Kevin
Director of the option:
Grégoire Allaire
Email: stages-map591@cmap.polytechnique.fr
Secretariat of the Applied Mathematics Department
Tel.: 01 69 33 46 07
Fax: 01 69 33 46 46.
Email: leyla.marzuk@polytechnique.edu
The analysis and automatic processing of the information in signals or images is a major area of data processing with significant applications. Its scope is wide: medical imaging, learning and recognition (images or speech), computer vision, inverse problems, compression, telecommunications...
Internships are offered in France or abroad, in industry or in research centers.
Our foreign partners are in Europe (UK, Germany, Austria, Switzerland), but also in the United States, Australia...
In any case, the aim is to give students the opportunity to take part in research work or an innovative industrial development, while enriching their knowledge. All these internships include a modeling part and an implementation part, in proportions that depend on the internship.
This option is a continuation of the MAP555 Signal processing course, but that course is not a prerequisite.
An internship catalog is available online, but we are at your disposal to help you build your dream internship on this topic.
Course language: French
- Teaching coordinator: Alouges François
Professors in charge of the option:
Vincent Bansaye - Probability
Email: bansaye@cmap.polytechnique.fr
Eric Moulines - Statistics
Email: eric.moulines@polytechnique.edu
Aymeric Dieuleveut - Machine Learning
Email: aymeric.dieuleveut@polytechnique.edu
Secretariat of the Applied Mathematics Department
Tel: 01 69 33 46 07
Fax: 01 69 33 46 46
Email: leyla.marzuk@polytechnique.edu
"Probability Modeling and Statistics" internships are generally about building and studying probabilistic models designed to describe and analyse physical, biological, computing or economic phenomena. Depending on the intended goals, models can come from machine learning (statistical learning) and be used as a tool to analyse data and make predictions (estimation, tests, prediction...), or be analysed with probabilistic methods in order to grasp their behavior and limits. Note that with the global rise of AI, data (big or not) and the design of suitable machine learning algorithms are at the center of many of the internships offered.
The range of applications of these methods is very broad: biology (population dynamics, genetic heritage transmission, phylogenetic selection, biological regulation networks...), communication networks (traffic characterization, probabilistic analysis of protocols, congestion control), insurance (pricing, prediction of reserves), economics (analysis and prediction of macroeconomic aggregates...), etc.
These internships are designed in particular for students who have taken the Applied Mathematics PA (notably the courses "Process and estimate", "Communication network, algorithms and probabilities", "Statistical learning", "Random models in ecology and evolution").
Examples of internships from previous years:
IN FRANCE
- EDF
Uncertainty in the prediction of electricity consumption.
Analysis of the use of electrical interconnections in Europe.
- VEOLIA
Biodiversity modeling in activated sludge basins.
- SCHLUMBERGER
Uncertainty assessment for CO2 geological storage integrity.
- THOMSON
Navigability with a bias.
- TELECOM PARISTECH
Dynamic sharing of bandwidth in the Internet.
- INRIA
Probabilistic methods for the Poisson-Boltzmann equation in molecular dynamics.
- INRA
Cyclostationary analysis of the Caledonian climate.
Statistical models for the analysis of biological interaction networks.
Study of regrowth dynamics outside crop plots in an agro-ecosystem.
- ORANGE
Random walk in the city.
ABROAD
- UNIVERSITY OF CALIFORNIA (Berkeley)
Development of flow-model-based algorithms for highway traffic estimation (Mobile Millennium).
Using mobile phones to estimate travel times in urban networks through the STARMA model.
Traffic forecasting using statistical machine learning.
- COLUMBIA UNIVERSITY (New York)
Verification / testing of statistical decadal forecasts.
Subnational Carbon Emissions from Selected Countries.
- IMPERIAL COLLEGE (London)
Influence in on-line social networks.
Dissemination of Information in Distributed Networks.
- EPFL (Lausanne)
Stability of wireless access networks: impact of topology.
- CMM-UNIVERSITY OF CHILE (Santiago)
Mathematical modeling and analysis of metabolic interaction networks.
- UNIVERSITA ROMA 3 (Rome)
Mixing time for reversible Markov chains and applications.
- UNIVERSITY OF WATERLOO (Canada)
Bandwidth allocation policies in wireless networks.
- NRS (Montréal)
Quality of service and pricing of IP networks.
Course language: French
- Teaching coordinator: Dieuleveut Aymeric
- Teaching coordinator: Gerin Lucas
Professors in charge of the option:
Stefano De Marco
Email: stages-map595@cmap.polytechnique.fr
Pierre Henry-Labordère
Email: stages-map595@cmap.polytechnique.fr
Nizar Touzi
Email: stages-map595@cmap.polytechnique.fr
Secretariat of the Applied Mathematics Department
Assistant of the department: Leyla Marzuk
T. +33 (0)169334607 – leyla.marzuk@polytechnique.edu
Internships offered in financial mathematics generally take place in the research centers of banks or other organizations, such as investment funds. For some internships, it is mandatory to have taken the Year-3 course "Stochastic models in finance". Internships abroad take place either in banks or in academic research centers.
Some internships require several interviews with different teams.
Examples of internships from previous years:
In France
- AXA
Optimization of pricing strategy
- CMAP
Extension of the principal-agent problem to several agents and to jump dynamics. Applications to the structuring and pricing of electricity contracts
- CREDIT SUISSE
Genetic methods for portfolio optimization
- KEPLER CHEVREUX
Dynamics of order-book data and algorithm detection
- SOCIETE GENERALE
Pricing of very long-term structured products
- UNIVERSITE PARIS 7
Financial models with arbitrage, application to long-term asset and liability management
Abroad
- BLOOMBERG LP
Calibration of a Path Dependent Volatility model to the VIX and S&P markets
- BNP PARIBAS London Branch
Models for Overnight indexed swap rates and Libor dynamics, and related Market risk
- BRITISH PETROLEUM
Application of machine learning techniques to the least-squares Monte Carlo method
- DEUTSCHE BANK London
Pricing and risk management of interest rate derivatives
- GOLDMAN SACHS
Predictive flow Analytics & inventory optimization
- IMC Trading
Identification of the impact of market participants on the European Futures Market
- JANE STREET
The volume synchronized probability of informed trading
- JP MORGAN
Capital optimization, funding optimization, derivatives clearing businesses, credit value adjustment
- JUMP TRADING INTERNATIONAL
Latent order book in the context of market impact and liquidity drought
- MONASH UNIVERSITY
Option pricing with linear market impact
- SQUAREPOINT CAPITAL
Multi-period portfolio optimization to manage tail risk in equities
- UNIVERSITY OF OXFORD
Numerics for the robust pricing and hedging problem in discrete time
- Teaching coordinator: Abi Jaber Eduardo
- Teaching coordinator: Djete Mao Fabrice
- Teaching coordinator: Le Pennec Erwan
- Teaching coordinator: De Marco Stéfano
- Teaching coordinator: Le Pennec Erwan
The objective of this course is to provide a practical introduction to the field of machine learning. We will discuss the different machine learning problems, from unsupervised (dimensionality reduction, clustering and density estimation) to supervised (classification, regression, ranking). For each method, we will introduce the problem, formulate it as an optimization problem and discuss the algorithms used to solve it. The practical aspects of each method will also be discussed, along with Python code and existing implementations.
The course will be completed by practical sessions that will allow the students to implement the methods seen in the course on practical problems such as image classification and time series prediction (biomedical and climate data). The objective of the practical session will be not only to learn to use the methods but also to interpret their models and results with respect to the data and the theoretical models.
Course overview:
- Introduction
- Machine learning problems
- Knowing your data
- Preprocessing
- Unsupervised learning
- Dimensionality reduction and
- Dictionary learning and collaborative filtering
- Clustering and generative modeling
- Generative modeling
- Supervised learning
- Linear models and kernel methods for regression and classification
- Nearest neighbors and Bayesian decision
- Trees and ensemble methods
- ML in practice
- Find your problem
- Model selection
This course will be given in English with lecture material in English.
Evaluation: practical session reports and an oral exam
- Teaching coordinator: Clevenot Stéphanie
- Teaching coordinator: Flamary Rémi
- Teaching coordinator: Irurozki Ekhine
The course aims to introduce both the concepts and the methods used in clinical (or medical) research. It is part of the health science theme of the master program. The course will emphasize the principles and concepts underlying the different goals of clinical research (prediction and causation) and develop specific statistical methods that can be used to plan studies and analyze data in this context.
It will focus on notions and methods that are not covered by other courses of the master (e.g. design, survival analysis, causal inference), which will be tackled from both a theoretical and an applied point of view.
Methods will be illustrated on several practical examples.
Course objectives:
- Introduction to the concept of data stream processing
- Learning the basics of Data Stream Management Systems (DSMS) and how to use them
- Understanding the main techniques used for stream processing: sampling, sketching, etc.
- Understanding and using the main data stream processing algorithms
Syllabus:
This course deals with the algorithms and software commonly used to process large data streams. It aims at understanding the main difficulties and specificities of this type of data, knowing what types of streams exist, which theoretical models and practical algorithms can be used to analyze them, and which tools are appropriate for processing them.
After an introduction of what data streams are from a conceptual point of view, this class covers the question of data stream processing from two different angles:
- A Machine Learning and Data Mining approach to cover the theoretical and algorithmic difficulties of learning from data streams: online learning vs incremental and batch learning, and sampling techniques.
- A more practical approach with an introduction to the various systems and software that are used to handle these data.
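To make the idea of sampling from a stream concrete, the sketch below shows reservoir sampling (Algorithm R), a classical technique that keeps a uniform random sample of fixed size from a stream of unknown length in a single pass. It is given only as an illustration: the source does not say which sampling algorithms the course covers, and the function name `reservoir_sample` is hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Return k items drawn uniformly at random from a stream, in one pass."""
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number i+1 replaces a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # both bounds are inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(100_000), k=10, rng=random.Random(0))
```

Only the `k` stored items are kept in memory, whatever the length of the stream, which is the property that makes this kind of algorithm suitable for streams. If the stream holds fewer than `k` items, all of them are returned.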
In terms of organization, the course will alternate between lectures and practical sessions. During the last class, the students will present a recent research article of their choice on the subject of data stream processing.
Prerequisites:
- Basics of the SQL language
- Basics of Machine Learning (supervised and unsupervised)
- Knowledge of Java programming is recommended but not mandatory
Evaluation:
- The practical sessions will count for ⅔ of the mark
- The research paper presentation will count for ⅓ of the mark
- Teaching coordinator: Diao Yanlei
Syllabus: Many data learning problems today require analyzing the structure of a high-dimensional matrix with remarkable properties. In recommender systems, this could be a column-sparse matrix or a low-rank matrix, but more sophisticated structures can be considered by combining several notions of sparsity. In graph analysis, popular spectral techniques for detecting cliques are based on the analysis of the Laplacian matrix with a specific sparse/low-rank structure. In this course, we will review several mathematical tools useful for developing statistical analysis methods and studying their performance. Such tools include concentration inequalities, convex optimization, perturbation theory and minimax theory.
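As a small illustration of what exploiting low-rank structure can look like, the following NumPy sketch builds a noisy observation of a rank-3 matrix and recovers an estimate by truncating its singular value decomposition. This is a generic example, not a method taken from the course: the matrix sizes, rank, noise level and the name `best_rank_r` are assumptions made for the illustration.

```python
import numpy as np


def best_rank_r(matrix, r):
    """Best rank-r approximation in Frobenius norm, via truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep the r largest singular values and their singular vectors.
    return (u[:, :r] * s[:r]) @ vt[:r, :]


rng = np.random.default_rng(0)
rank = 3
# Ground-truth low-rank matrix, built as a product of two thin factors.
low_rank = rng.standard_normal((50, rank)) @ rng.standard_normal((rank, 40))
# Observation corrupted by small dense Gaussian noise.
observed = low_rank + 0.01 * rng.standard_normal(low_rank.shape)

estimate = best_rank_r(observed, rank)
error_before = np.linalg.norm(observed - low_rank)
error_after = np.linalg.norm(estimate - low_rank)
```

By the Eckart-Young theorem, `estimate` is the closest rank-3 matrix to `observed` in Frobenius norm. In this high signal-to-noise setting it is also closer to `low_rank` than the raw observation is; perturbation theory, one of the tools listed above, is what quantifies that kind of gain.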
Numerus clausus: 30
Class Time: P2 Wednesday morning
Grading – 2.5 ECTS:
Written Exam
Article
Topics covered:
- Principal Component Analysis
- Spectral clustering
- Matrix completion
- Robust Statistics
- Phase Retrieval
- Optimal Transport
Textbook:
- Vershynin. High-Dimensional Probability. Cambridge University Press.
- Gross. Recovering low-rank matrices from few coefficients in any basis, 2011. arXiv:0910.1879
- Guedon and R. Vershynin. Community detection in sparse networks via Grothendieck’s inequality. Probability Theory and Related Fields, 165(3-4):1025–1049, 2016.
- Ma, R. Dudeja, J. Xu, A. Maleki, X. Wang. Spectral Method for Phase Retrieval: an Expectation Propagation Perspective. arXiv:1903.02505
- M. Kouw, M. Loog. An introduction to domain adaptation and transfer learning, 2018. arXiv:1812.11806
- Teaching coordinator: Lounici Karim
- Teaching coordinator: Giraldo Jhony
- Teaching coordinator: Krzakala Paul