Date: January 21, 2024
Topic: Parametric Regression
Recall
Find the line or polynomial curve that best fits the data. The fitted curve is then used to estimate future values.
Once the model is fitted, the original data points are not needed to make predictions.
Notes
Parametric Regression (Parametric)
Our goal is to find values for the model's parameters (the constant terms) that best fit the data.
Fitting a line

- Given some data points, we can attempt to fit a line (red) to model their behavior
- The line has the form $y=mx+b$; fitting it means choosing the slope $m$ and intercept $b$
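
As a minimal sketch of the line fit above, the closed-form least-squares solution for $m$ and $b$ can be written in plain Python. The function names `fit_line` and `predict` and the data values are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return slope m and intercept b of the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
m, b = fit_line(xs, ys)


def predict(x_query):
    # Only m and b are used here; the training points are not consulted.
    return m * x_query + b
```

After fitting, `xs` and `ys` could be discarded: the model is fully described by the two numbers `m` and `b`.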
Fitting a polynomial

- We may find that it's better to fit a higher-degree polynomial to model the actual behavior of the data
- The more complex the data, the more complex the model needs to be (a higher degree means more coefficients to find)
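
The same idea extends to polynomials: a degree-$d$ fit finds $d+1$ coefficients instead of two. The sketch below solves the least-squares normal equations with Gaussian elimination in plain Python. The name `fit_polynomial` and the data are hypothetical, and in practice a library routine such as `numpy.polyfit` would usually be preferred.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ..., cd] of y = c0 + c1*x + ... + cd*x^d."""
    n = degree + 1
    # Normal equations A c = r, with A[i][j] = sum(x^(i+j)) and r[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    r = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        r[col], r[pivot] = r[pivot], r[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            r[row] -= factor * r[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (r[i] - tail) / A[i][i]
    return coeffs


# Hypothetical data lying exactly on y = x^2 - 3x + 2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 - 3 * x + 2 for x in xs]
c0, c1, c2 = fit_polynomial(xs, ys, degree=2)
```

As with the line, predictions afterwards need only the coefficients, not the data.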
KNN and kernel regression rely on the existing data to make predictions.
KNN selects the closest $k$ points to make a prediction.
Kernel regression weights the data points based on the query, and predicts from there.
K Nearest Neighbor (Non-parametric/instance-based)
Find the $k$ stored data points closest to the query and use them to estimate the output

- For example, for $k=3$, we find the 3 points closest to the query of -5 mm
- These are the 3 points highlighted in red above
- We can then estimate the rainfall by taking the mean of the values of those 3 points
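
The KNN steps above can be sketched in a few lines of plain Python. The function name `knn_predict` and the observations are hypothetical; they are not the values from the lecture plot.

```python
def knn_predict(xs, ys, x_query, k=3):
    """Mean y of the k training points whose x is closest to x_query."""
    # Sort all stored points by distance to the query, then keep the first k.
    by_distance = sorted(zip(xs, ys), key=lambda point: abs(point[0] - x_query))
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical observations: x in mm, y is the observed rainfall.
xs_mm = [-9.0, -6.0, -5.5, -4.0, 0.0, 3.0]
rain = [12.0, 8.0, 7.0, 6.0, 2.0, 0.0]

# The 3 nearest x values to -5 are -5.5, -6.0 and -4.0.
estimate = knn_predict(xs_mm, rain, x_query=-5.0, k=3)
```

Unlike the parametric fits, every prediction here searches the stored data, so the training set has to be kept around.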
Kernel Regression
- Kernel regression assigns every data point a weight based on its distance from the query
- The prediction is then generated from the weighted points (for example, nearer points are weighted higher)
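
One common way to realize this weighting is a weighted average with a Gaussian kernel. The notes do not say which kernel is used, so the Gaussian form, the `bandwidth` parameter, and the name `kernel_predict` below are all assumptions made for illustration.

```python
import math


def kernel_predict(xs, ys, x_query, bandwidth=1.0):
    """Weighted mean of all ys, with Gaussian weights centred on x_query."""
    # Every point contributes; the weight decays with distance from the query.
    weights = [
        math.exp(-((x - x_query) ** 2) / (2 * bandwidth ** 2))
        for x in xs
    ]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


# Two hypothetical points at equal distance from the query get equal weight,
# so the prediction is the plain average of their values.
balanced = kernel_predict([-1.0, 1.0], [0.0, 4.0], x_query=0.0)

# A far-away point gets almost no weight, so the near point dominates.
dominated = kernel_predict([0.0, 10.0], [1.0, 3.0], x_query=0.0)
```

A smaller `bandwidth` makes the prediction depend more heavily on the closest points, while a larger one smooths over more of the data.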
Use parametric models when we know what the model might look like.
Use non-parametric models when the data seems to be complex and we don’t know the underlying distribution.
Parametric vs Non-parametric
Cannonball Example (Parametric)

- Cannonball distance is best estimated using a parametric model, since the ball follows a well-defined trajectory
- Biased problem - we already have some intuition of what the model might be (parametric equation)
- Pros:
- Cons:
Honeybee Example (Non-parametric)

- Honeybee behavior is hard to model mathematically, so a non-parametric approach is more suitable
- Unbiased problem - hard to know what the model will be, since the behavior does not follow a known equation
- Pros:
- Cons:
<aside>
📌 SUMMARY: We have parametric methods like linear regression and non-parametric ones like KNN and kernel regression to model our data’s behavior.
</aside>
Date: January 21, 2024
Topic: Training and Testing
<aside>
📌 SUMMARY: Always test algorithms on data that is newer than the data they were trained on. Implement a common API so that every model exposes the same methods.
</aside>
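
A minimal sketch of both points in the summary, under stated assumptions: the method names `train` and `query`, the two toy learners, the `split_by_time` helper, and the data are all hypothetical, not the API from the lecture. Older rows are used for training, newer rows for testing, and both learners are evaluated through the same two methods.

```python
import math


def split_by_time(rows, train_fraction=0.6):
    """rows must be sorted oldest first; older rows train, newer rows test."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


class MeanLearner:
    """Trivial baseline: always predicts the mean of the training ys."""

    def train(self, xs, ys):
        self.mean = sum(ys) / len(ys)

    def query(self, xs):
        return [self.mean for _ in xs]


class NearestLearner:
    """1-nearest-neighbour: keeps the training data and looks it up."""

    def train(self, xs, ys):
        self.points = list(zip(xs, ys))

    def query(self, xs):
        return [min(self.points, key=lambda p: abs(p[0] - x))[1] for x in xs]


def rmse(predicted, actual):
    """Root-mean-square error between predictions and true values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Hypothetical time-ordered (x, y) observations, oldest first.
rows = [
    (1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0),
    (5.0, 10.0), (6.0, 12.0), (7.0, 14.0), (8.0, 16.0),
]
train_rows, test_rows = split_by_time(rows, train_fraction=0.5)
train_x, train_y = zip(*train_rows)
test_x, test_y = zip(*test_rows)

# Because both learners share train/query, the evaluation loop is identical.
errors = {}
for learner in (MeanLearner(), NearestLearner()):
    learner.train(train_x, train_y)
    errors[type(learner).__name__] = rmse(learner.query(test_x), test_y)
```

Sharing the same method names means the evaluation code never needs to know which model it is testing, so swapping in a new learner requires no other changes.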