Day 1: Data wrangling
- Advanced course on Pandas
- Tidy data
- Lab on MovieLens dataset
- Challenge and getting started with RAMP
Day 2: ML Pipelines and hyperparameter search
- Column transformer and pipelines
- Bayesian optimization and hyperparameter search
- Learning curves
Day 3: Metrics and dealing with imbalanced data
- Presentation of the different ML metrics
- Pitfalls of standard metrics on imbalanced data
- ML approaches to deal with imbalanced data
Day 4: Ensemble methods and feature engineering
- Gradient Boosting
- Stacking
- Feature engineering
Day 5: Model inspection
- Partial dependence plots
- Feature importance
Challenges
In addition, students will compete throughout the week in a data challenge.
- Professor: Gramfort Alexandre