Decision Trees Introduction

Date: May 16, 2024

Topic: Classification and Regression

Recall

The output of a task (discrete or continuous) tells us the kind of machine learning problem we are facing

Notes

Classification

We map the data/input space to a set of discrete labels (e.g., True or False, Male or Female, etc.)

Regression

We map the data/input space to continuous predicted values (the output is a real number)

<aside> 💡 The difference between a classification task and a regression task is the output (discrete vs continuous)

</aside>

Classification or Regression?

| Input | Output | Classification | Regression | Output Type |
| --- | --- | --- | --- | --- |
| Credit History | Lend Money? | √ | ✕ | Binary (2 classes) |
| Face Image | High School, College, Grad | √ | ✕ | Discrete (3 classes) |
| Face Image | Age | √ | √ | Can be 150 classes (150 ages) or continuous (mapped to a real number) |
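A minimal sketch of the distinction in the table, assuming toy stand-ins for learned models. The function names, the `years_of_good_credit` input and the threshold of 3 are all hypothetical and only illustrate the output types.

```python
def lend_money(years_of_good_credit: int) -> bool:
    """Classification: the output is one of two discrete labels."""
    # Hypothetical rule standing in for a learned model.
    return years_of_good_credit >= 3


def age_as_class(estimated_age: float) -> int:
    """Classification view of age: one of 150 discrete classes (0..149)."""
    # Truncate to a whole number and clamp into the 150 allowed classes.
    return min(max(int(estimated_age), 0), 149)


def age_as_value(estimated_age: float) -> float:
    """Regression view of age: the output is a real number."""
    return float(estimated_age)


print(lend_money(5))        # a discrete label (True or False)
print(age_as_class(27.6))   # a discrete class index
print(age_as_value(27.6))   # a continuous value
```

The same quantity (age) fits either kind of task; what changes is whether the output is restricted to a finite set of labels.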

Terms for classification learning

Classification Learning

Instances: The inputs, e.g., a set of pixels (an image) or other kinds of data

Concept: A function that maps each input to an output (e.g., T/F)

Target Concept: The actual answer, i.e., the function we want our learned concept to match as closely as possible

Hypothesis Class: The set of all concepts we are willing to entertain

Sample: The training set, made up of inputs paired with their correct outputs (e.g., {rose, T}, {leaf, F})

Candidate: A concept that could be the target concept

Testing Set: A separate set of paired inputs and outputs, used to check how accurately the candidate concept maps its inputs to the expected outputs

<aside> 📌 SUMMARY: Based on some paired inputs and outputs, we want to derive a generalized concept that can predict an output from any similar input

</aside>
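The terms above can be tied together in a small sketch. Only the `{rose, T}`, `{leaf, F}` sample comes from these notes; the concepts in the hypothesis class and the testing set are hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Tuple

Instance = str
Concept = Callable[[Instance], bool]
Examples = List[Tuple[Instance, bool]]

# Sample (training set): paired inputs and outputs from the notes.
training_set: Examples = [("rose", True), ("leaf", False)]

# Hypothetical hypothesis class: the concepts we are willing to entertain.
def always_true(x: Instance) -> bool:
    return True

def is_rose(x: Instance) -> bool:
    return x == "rose"

def is_flower(x: Instance) -> bool:
    return x in {"rose", "tulip", "lily"}

hypothesis_class: List[Concept] = [always_true, is_rose, is_flower]


def accuracy(concept: Concept, examples: Examples) -> float:
    """Fraction of examples where the concept's output matches the label."""
    correct = sum(concept(x) == y for x, y in examples)
    return correct / len(examples)


# Candidates: concepts that agree with every training example.
candidates = [h for h in hypothesis_class if accuracy(h, training_set) == 1.0]

# Hypothetical testing set used to check how well each candidate generalizes.
testing_set: Examples = [("tulip", True), ("lily", True), ("stem", False)]

for h in candidates:
    print(h.__name__, accuracy(h, testing_set))
```

In this sketch two concepts fit the training set equally well, and only the testing set separates them.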

Date: May 17, 2024

Topic: Decision Trees

Recall

Decision Trees are built from:

Nodes
Edges
Leaves (outcomes)

Given a built decision tree, we can follow it from the root down to a leaf to get our predicted class

Notes

What a Decision Tree might look like

A decision tree consists of:
- Nodes: Attributes or features (hungry, raining, etc.)
- Edges: The choice made at each node (True or False, etc.)
- Outcomes: The output of the tree, which gives us our prediction

Example Decision Tree

A possible tree - candidate concept

Using the tree above, we can predict the class for each row of data in the table below

[Table: example data, 7 columns (contents missing)]

Note that the tree traversal does not use every attribute (rain, hungry, and hot date do not matter)

<aside> 📌 SUMMARY: Traversing a decision tree requires making decisions (edges) at every attribute (node), eventually arriving at our prediction (leaf)

</aside>
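The traversal summarized above can be sketched as follows. The example tree from these notes is not reproduced here, so this tree is hypothetical: the `occupied` and `expensive` attributes and the `Node`, `Leaf` and `predict` names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    prediction: bool  # outcome reached at the end of a traversal


@dataclass
class Node:
    attribute: str                                  # feature tested at this node
    branches: Dict[bool, Union["Node", "Leaf"]]     # edge value -> subtree


def predict(tree: Union[Node, Leaf], instance: Dict[str, bool]) -> bool:
    """Follow one edge per node until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.branches[instance[tree.attribute]]
    return tree.prediction


# Hypothetical tree: it tests only two of the instance's attributes.
tree = Node("occupied", {
    True: Node("expensive", {True: Leaf(False), False: Leaf(True)}),
    False: Leaf(False),
})

instance = {"occupied": True, "expensive": False,
            "hungry": True, "raining": False}
print(predict(tree, instance))

# Flipping an attribute the tree never tests leaves the prediction unchanged.
flipped = {**instance, "hungry": False}
print(predict(tree, flipped) == predict(tree, instance))
```

Attributes that never appear as a node, such as `hungry` and `raining` here, have no effect on the result, which matches the note that not every attribute is used during traversal.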