Date: January 30, 2024
Topic: Ensemble Learners
Recall
Ensemble learners combine the predictions of several machine learning models into a single prediction.
Notes
Ensemble Learners

- Each learner will have its own prediction. We then weigh them in some way to get a single $y$-prediction
    - For classification: voting
    - For regression: take the mean
- Each learner has some sort of bias
    - e.g., linear regression performs best when data is linear
- Having an ensemble allows the individual biases to cancel out, leading to less overfitting
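The two combination rules above can be sketched in a few lines. This is a minimal illustration, assuming unweighted voting and an unweighted mean; the function names and the sample predictions are made up for the example.

```python
from collections import Counter
from statistics import mean

def combine_classification(predictions):
    """Majority vote over the class labels predicted by each learner."""
    # most_common(1) returns [(label, count)]; on a tie the label seen first wins.
    return Counter(predictions).most_common(1)[0][0]

def combine_regression(predictions):
    """Unweighted mean of the numeric predictions from each learner."""
    return mean(predictions)

votes = ["up", "down", "up"]   # hypothetical outputs of three classifiers
estimates = [2.0, 3.0, 4.0]    # hypothetical outputs of three regressors

label = combine_classification(votes)
value = combine_regression(estimates)
print(label, value)
```

A weighted variant would multiply each vote or estimate by a per-learner weight before combining.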
Building an ensemble
- Train several parameterized polynomials of differing degrees
- Train several KNN models using different subsets of the data
- Combine the models above into an ensemble
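As a rough sketch of the recipe above, the snippet below fits two polynomials (degree 0 and degree 1, chosen only because they have simple closed forms) and two 1NN models on different subsets, then averages their predictions. All function names and the toy data are assumptions for illustration.

```python
from statistics import mean

def fit_poly0(xs, ys):
    """Degree-0 polynomial: always predicts the mean of the training targets."""
    c = mean(ys)
    return lambda x: c

def fit_poly1(xs, ys):
    """Degree-1 polynomial fitted by ordinary least squares (closed form)."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_knn(xs, ys, k):
    """KNN regressor on 1-D inputs: mean target of the k nearest points."""
    points = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(points, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict

def ensemble_predict(models, x):
    """Combine the models by taking the mean of their predictions."""
    return mean(model(x) for model in models)

# Toy data following y = 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]

models = [
    fit_poly0(xs, ys),
    fit_poly1(xs, ys),
    fit_knn(xs[::2], ys[::2], k=1),    # subset: even-indexed points
    fit_knn(xs[1::2], ys[1::2], k=1),  # subset: odd-indexed points
]
prediction = ensemble_predict(models, 2.0)
print(prediction)
```

Because every model is just a function from `x` to a prediction, learners of different kinds can sit in the same list.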
<aside>
📌 SUMMARY: Take several different models (can be the same or different kinds) and weigh the predictions to get a single prediction
</aside>
Date: January 30, 2024
Topic: Bootstrap aggregating - Bagging (Ensemble KNN)
Recall
Create several subsets of the data (bags) and train a separate KNN model on each. Combine the models to create an ensemble learner.
Notes
Bagging

- Form $m$ bags of $n' < n$ data points each, where $n$ is the size of the training dataset. The data points are sampled from the training dataset with replacement
- From each bag, train a separate model to create an ensemble learner
With bagging, we can get an ensemble model that mitigates overfitting.
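A minimal sketch of the bag-forming step, assuming the training data is a plain Python list. The function name `make_bags`, the seed, and the sizes ($m = 5$, $n' = 6$, $n = 10$) are arbitrary choices for illustration.

```python
import random

def make_bags(data, m, n_prime, seed=0):
    """Return m bags, each holding n_prime items drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    # random.choices samples with replacement, so an item can repeat within a bag.
    return [rng.choices(data, k=n_prime) for _ in range(m)]

training = list(range(10))                  # stand-in for n = 10 training points
bags = make_bags(training, m=5, n_prime=6)  # m = 5 bags of n' = 6 points each
for bag in bags:
    print(sorted(bag))
```

Since sampling is with replacement, a bag may contain the same point more than once and may miss other points entirely.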
Example of Ensemble Models (KNN)

- The above shows an ensemble of 1NN models
- The resulting ensemble model (from taking the mean of each model) is quite smooth and mitigates overfitting
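To experiment with this, here is a small sketch of a bagged 1NN regressor on 1-D data. The noisy toy dataset, the helper names, and the settings (20 bags of 30 points) are assumptions; the sketch only builds the models so the single 1NN and the ensemble can be compared.

```python
import random
from statistics import mean

def fit_1nn(points):
    """1NN regressor: predict the target of the closest training point."""
    def predict(x):
        return min(points, key=lambda p: abs(p[0] - x))[1]
    return predict

def bagged_1nn(points, m, n_prime, seed=0):
    """Train one 1NN model per bag and average their predictions."""
    rng = random.Random(seed)
    models = [fit_1nn(rng.choices(points, k=n_prime)) for _ in range(m)]
    return lambda x: mean(model(x) for model in models)

# Hypothetical noisy samples of y = x on [0, 5).
noise = random.Random(1)
points = [(i / 10, i / 10 + noise.uniform(-0.5, 0.5)) for i in range(50)]

single = fit_1nn(points)                        # memorizes every noisy point
ensemble = bagged_1nn(points, m=20, n_prime=30)

for x in (1.0, 2.5, 4.0):
    print(x, single(x), ensemble(x))
```

A single 1NN model reproduces each training target exactly, noise included, while the ensemble averages over models that each saw a different sample of the data.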
<aside>
📌 SUMMARY: Using bagging, we can create an ensemble of models trained on different bags of the training data. This leads to a model that can mitigate overfitting
</aside>
Date: January 30, 2024
Topic: Boosting (AdaBoost)
Recall
AdaBoost singles out the data points the model predicts badly so that later models are more exposed to them during training
Notes
AdaBoost (Adaptive Boosting)

Repeat the steps below for each of the $m$ bags
- For each model, test it against the same training dataset.
- Data points whose predictions deviate significantly from the true values are given higher weights
- During re-picking (with replacement) for the next bag, we are more likely to pick the badly performing data points
- After each bag, combine the models trained so far into an ensemble and use its errors on the training data to set the weights for the next bag
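The loop described above can be sketched as follows. This is an illustration of the error-weighted resampling idea from these notes only, not a full AdaBoost implementation (the standard algorithm also assigns a weight to each model). The 1NN learner, the absolute-error weighting, and all names are assumptions.

```python
import random
from statistics import mean

def fit_1nn(points):
    """1NN regressor: predict the target of the closest training point."""
    def predict(x):
        return min(points, key=lambda p: abs(p[0] - x))[1]
    return predict

def boosted_bags(points, m, n_prime, seed=0):
    """Build m models, biasing each new bag towards poorly predicted points."""
    rng = random.Random(seed)
    weights = [1.0] * len(points)  # first bag: every point equally likely
    models = []
    for _ in range(m):
        # Sample the next bag with replacement, in proportion to the weights.
        bag = rng.choices(points, weights=weights, k=n_prime)
        models.append(fit_1nn(bag))
        # Test the ensemble so far against the same training dataset.
        errors = [abs(mean(mod(x) for mod in models) - y) for x, y in points]
        # Larger error means higher weight; the small constant keeps
        # every point selectable even when its error is zero.
        weights = [e + 1e-6 for e in errors]
    return models, weights

# Hypothetical data: y = x with a single outlier at x = 5.
points = [(float(i), float(i)) for i in range(10)]
points[5] = (5.0, 20.0)

models, weights = boosted_bags(points, m=5, n_prime=6)
print([round(w, 3) for w in weights])
```

Printing the weights after each round is a quick way to see which points the procedure keeps returning to.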
<aside>
💡 However, as we increase the number of bags, the same hard-to-predict data points tend to be picked again and again. This can lead to overfitting if many bags are used
</aside>
<aside>
📌 SUMMARY: Bagging and boosting are just wrappers for existing methods. They help to reduce error and overfitting
</aside>