Day 1: Data wrangling

- Advanced course on Pandas
- Tidy data
- Lab on MovieLens dataset
- Introduction to the data challenge and getting started with RAMP
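
As a rough illustration of the tidy data topic, the sketch below reshapes a small wide table into one row per observation with pandas. The table and its column names (`user_id`, `movie`, `rating`) are made up for the example and are not taken from the MovieLens lab.

```python
import pandas as pd

# Hypothetical wide table: one row per user, one column per movie.
wide = pd.DataFrame(
    {
        "user_id": [1, 2],
        "Toy Story": [5.0, 3.0],
        "Heat": [4.0, None],  # user 2 did not rate this movie
    }
)

# Tidy form: one row per (user, movie) observation, one column per variable.
tidy = (
    wide.melt(id_vars="user_id", var_name="movie", value_name="rating")
    .dropna(subset=["rating"])  # drop the missing rating
    .reset_index(drop=True)
)
print(tidy)
```

Once the data is in this long form, operations such as `groupby("movie")` apply directly to the observations.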

Day 2: ML Pipelines and hyperparameter search

- Column transformer and pipelines
- Bayesian optimization and hyperparameter search
- Learning curves
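
The Day 2 topics could be combined along the following lines: a minimal sketch, assuming scikit-learn and a synthetic dataset with hypothetical columns `age` and `city`. It uses a plain grid search rather than Bayesian optimization, which scikit-learn does not provide.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic data (assumption): one numeric and one categorical column.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame(
    {
        "age": rng.normal(40, 10, n),
        "city": rng.choice(["Paris", "Lyon", "Nice"], n),
    }
)
y = (X["age"] + rng.normal(0, 5, n) > 40).astype(int)

# Apply a different preprocessing step to each column type.
preprocess = ColumnTransformer(
    [
        ("num", StandardScaler(), ["age"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ]
)
model = Pipeline(
    [("preprocess", preprocess), ("clf", LogisticRegression(max_iter=1000))]
)

# Search over the regularization strength of the final estimator.
search = GridSearchCV(model, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Because preprocessing lives inside the pipeline, it is refit on each cross-validation training fold, which avoids leaking information from the validation fold.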

Day 3: Metrics and dealing with imbalanced data

- Presentation of the different ML metrics
- Pitfalls of standard metrics on imbalanced data
- ML approaches to deal with imbalanced data
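
To illustrate why the choice of metric matters here, the sketch below scores a classifier that always predicts the majority class. The 95/5 class split is an arbitrary assumption for the example.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Synthetic labels (assumption): 95 negatives and 5 positives.
y = np.array([0] * 95 + [1] * 5)
X = np.zeros((100, 1))  # features are irrelevant for this baseline

# Baseline that ignores the features and always predicts the majority class.
clf = DummyClassifier(strategy="most_frequent").fit(X, y)
y_pred = clf.predict(X)

acc = accuracy_score(y, y_pred)           # fraction of correct predictions
bal = balanced_accuracy_score(y, y_pred)  # mean of the per-class recalls
print(acc, bal)
```

Accuracy looks high even though no positive sample is ever detected, while balanced accuracy stays at chance level.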

Day 4: Ensemble methods and feature engineering

- Gradient Boosting
- Stacking
- Feature engineering
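
One possible way to combine gradient boosting and stacking with scikit-learn is sketched below. The synthetic dataset and the choice of base and final estimators are assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic classification problem (assumption).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Base learners produce cross-validated predictions; the final estimator
# is trained on those predictions.
stack = StackingClassifier(
    estimators=[
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)
test_accuracy = stack.score(X_test, y_test)
print(round(test_accuracy, 3))
```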

Day 5: Model inspection

- Partial dependence plots
- Feature importance

Challenges

In addition, students will compete in a data challenge throughout the week.