The output of a task (discrete or continuous) tells us what kind of machine learning problem we are facing.
<aside> 💡 The difference between a classification task and a regression task is the output: discrete for classification, continuous for regression.
</aside>
| Input | Output | Classification | Regression | Output Type |
|---|---|---|---|---|
| Credit History | Lend Money? | ✅ | ❌ | Binary (2 classes) |
| Face Image | High School, College, Grad | ✅ | ❌ | Discrete (3 classes) |
| Face Image | Age | ✅ | ✅ | Either discrete (up to 150 classes, one per age) or continuous (mapped to a real number) |
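To make the discrete vs. continuous distinction concrete, here is a minimal sketch that guesses the task type from a handful of example outputs. The function name `problem_type` and the rule it uses (all floats means regression, anything else is treated as class labels) are assumptions made for illustration; they are a crude heuristic, not a standard API.

```python
def problem_type(outputs):
    """Guess the task type from example outputs (illustrative heuristic only).

    Real-valued (float) outputs suggest regression; anything else
    (booleans, strings, integers used as labels) suggests classification.
    """
    if not outputs:
        raise ValueError("need at least one example output")
    if all(isinstance(y, float) for y in outputs):
        return "regression"
    return "classification"


# Rows of the table above, with made-up example values.
lend_money = [True, False, True]                # binary: 2 classes
education = ["High School", "College", "Grad"]  # discrete: 3 classes
age_as_class = [23, 41, 67]                     # age as one of ~150 classes
age_as_number = [23.5, 41.0, 67.2]              # age mapped to a real number

for name, ys in [("lend_money", lend_money), ("education", education),
                 ("age_as_class", age_as_class), ("age_as_number", age_as_number)]:
    print(name, "->", problem_type(ys))
```

The age example shows why the same input can land in either column: it depends only on how we choose to represent the output.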
Terms for classification learning
Instances: The inputs, which can be a bunch of pixels (an image) or any other sort of data
Concept: A function that maps an input to an output (e.g., T/F)
Target Concept: The actual answer, which we want our concept to be as close to as possible
Hypothesis Class: The set of all concepts we are willing to entertain
Sample: The training set, made up of paired inputs and outputs ({rose, T}, {leaf, F})
Candidate: A concept which could be the target concept
Testing Set: A separate set of paired inputs and outputs; we run the candidate concept on the testing inputs and check how accurately its predictions match the testing outputs
<aside> SUMMARY: Based on some paired inputs and outputs, we want to derive a generalized concept that can predict an output from any similar input
</aside>
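The terms above can be sketched in a few lines of code. Everything here beyond the `{rose, T}, {leaf, F}` sample is invented for illustration: the testing-set entries, the deliberately naive candidate rule ("the name contains the letter o"), and the helper `accuracy` are all assumptions.

```python
# Sample (training set): paired inputs and outputs, as in the notes.
sample = [("rose", True), ("leaf", False)]

# Hypothetical testing set, invented for this sketch.
testing_set = [("tulip", True), ("twig", False),
               ("orchid", True), ("stone", False)]


def candidate(instance):
    """A candidate concept: maps an instance to True/False.

    Deliberately naive rule: the name contains the letter 'o'.
    """
    return "o" in instance


def accuracy(concept, labelled_pairs):
    """Fraction of pairs where the concept's output matches the label."""
    correct = sum(concept(x) == y for x, y in labelled_pairs)
    return correct / len(labelled_pairs)


print("accuracy on sample:     ", accuracy(candidate, sample))
print("accuracy on testing set:", accuracy(candidate, testing_set))
```

This candidate agrees with every pair in the sample but gets only half of the testing set right, which is why a candidate is checked against data it was not derived from.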
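Decision Trees are built from:
Nodes: The attributes we ask about
Edges: The possible values of an attribute (the decision taken at that node)
Leaves: The outputs (our prediction)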
Given a built decision tree, we can follow its branches from the root down to a leaf to get our predicted class

What a Decision Tree might look like

A possible tree - candidate concept
From the tree above, we can predict the class for each row of data in the table below
| --- | --- | --- | --- | --- | --- | --- |
Note that we do not use all attributes during the tree traversal (rain, hungry, and hot date don't matter)
<aside> SUMMARY: Traversing a decision tree requires making decisions (edges) at every attribute (node), eventually arriving at our prediction (leaf)
</aside>
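The traversal described above can be sketched as a short loop over a nested dictionary. The tree below is hypothetical: the attributes `occupied` and `happy`, their values, and the leaf predictions are made up for illustration and are not the tree from the figure. Only `rain` and `hungry` are borrowed from the notes, as examples of attributes the tree never looks at.

```python
# Hypothetical tree: internal nodes are dicts, leaves are plain predictions.
TREE = {
    "attribute": "occupied",          # node: the attribute we test
    "branches": {                     # edges: one per attribute value
        False: False,                 # leaf: predicted class
        True: {
            "attribute": "happy",
            "branches": {True: True, False: False},
        },
    },
}


def predict(node, instance):
    """Walk from the root to a leaf, following the edge that matches
    the instance's value for each node's attribute."""
    while isinstance(node, dict):
        value = instance[node["attribute"]]
        node = node["branches"][value]
    return node


instance = {"occupied": True, "happy": True, "rain": False, "hungry": True}
print(predict(TREE, instance))

# Attributes the tree never tests cannot change the prediction.
flipped = dict(instance, rain=True, hungry=False)
print(predict(TREE, flipped) == predict(TREE, instance))
```

Representing leaves as plain values keeps the loop condition simple: traversal stops as soon as the current node is no longer a dictionary.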