Supervised Learning

info

In this chapter you'll be introduced to:

  • Supervised Learning: Understanding how to formulate supervised learning tasks and the components involved.
  • Regression Problems: Focusing on regression where the target variable is continuous.
  • Machine Learning Systems: The steps involved in building a machine learning model, including defining hypotheses and selection criteria.
  • Hypothesis Class and Overfitting vs. Underfitting: Why specifying a hypothesis class is essential, and how trade-offs in model complexity affect performance.

What is Supervised Learning?

Supervised Learning is a type of machine learning where the model is trained using labeled data. The goal is to learn a function that maps inputs to desired outputs based on example input-output pairs. It involves two main stages:

  1. Training (Learning): Learning a model from a set of input-output pairs.
  2. Prediction (Inference): Using the learned model to predict outputs for new, unseen inputs.

In supervised learning, the task is to predict a target variable based on some input variables. The target variable is known during training, allowing the model to learn the relationship between inputs and outputs.

Formulating a Supervised Learning Task

To formulate a supervised learning task, we need to:

  • Define the Input Variables ($\mathbf{x}$): Features or attributes used to make predictions.
  • Define the Target Variable ($t$): The outcome we want to predict.
  • Specify the Task: For example, predicting the weight of an animal based on its size and fur texture.

Each task requires a specific target variable and input features. The formulation depends on the problem we aim to solve.

Input Features

The input in supervised learning is denoted as $\mathbf{x}$, a vector of features known during both training and prediction. These features can be:

  • Real Numbers: Continuous values like size, weight, or temperature.
  • Categorical Variables: Discrete categories like animal type or color.

If we have $d$ features, the input vector is:

$$\mathbf{x} = [x_1, x_2, \dots, x_d]^T$$

Handling Categorical Features

Categorical features cannot be directly used in mathematical models. We need to convert them into numerical representations.

One-Hot Encoding

One common method is one-hot encoding, where each category is represented by a binary vector.

  • Example: Animal Type (Cat, Dog, Elephant)
    • Cat: [1, 0, 0]
    • Dog: [0, 1, 0]
    • Elephant: [0, 0, 1]

This avoids assigning arbitrary numerical values that could imply an unintended order or magnitude between categories.
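As a small sketch, one-hot encoding can be implemented directly with NumPy (the `one_hot` helper and the category list below are illustrative, not part of any particular library):

```python
import numpy as np

# A minimal one-hot encoder for a list of categorical values (illustrative sketch).
def one_hot(values, categories):
    """Map each value to a binary vector with a 1 at its category's index."""
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)), dtype=int)
    for row, v in enumerate(values):
        encoded[row, index[v]] = 1
    return encoded

animals = ["Cat", "Dog", "Elephant", "Dog"]
X = one_hot(animals, categories=["Cat", "Dog", "Elephant"])
# X[0] is [1, 0, 0] (Cat), X[1] is [0, 1, 0] (Dog), and so on.
```

In practice, libraries such as scikit-learn provide ready-made encoders, but the underlying idea is exactly this index-to-binary-vector mapping.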

We'll see other representations in later chapters.

Output Targets

The target variable is denoted as $t$. Depending on its nature, supervised learning tasks are classified as:

  • Regression: If $t$ is a real number (continuous).
  • Classification: If $t$ is a category from a finite set.

In the first few chapters of this course, we focus on regression problems.

Regression Problems

In regression, the goal is to learn a function that maps input features to a continuous output. The regression model is denoted as $h$, where:

$$h: \mathbb{R}^d \rightarrow \mathbb{R}$$

  • $d$: The number of input features.
  • $h(\mathbf{x})$: The predicted output for input $\mathbf{x}$.

Training Data

The training data consists of $M$ examples:

$$\{ (\mathbf{x}^{(m)}, t^{(m)}) \}_{m=1}^M$$

  • $\mathbf{x}^{(m)}$: Input vector for the $m$-th example.
  • $t^{(m)}$: Target output for the $m$-th example.

Prediction

Given a new input $\mathbf{x}^*$, the model predicts the target as:

$$\hat{t} = h(\mathbf{x}^*)$$
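The setup above can be made concrete with a small sketch: a hand-picked linear hypothesis $h(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b$ applied to a toy training set (the weights and data here are arbitrary, chosen only for illustration, not learned):

```python
import numpy as np

# Illustrative sketch: a fixed linear hypothesis h(x) = w·x + b
# (the weights are hand-picked, not the result of training).
w = np.array([2.0, 0.5])   # one weight per feature (d = 2)
b = 1.0

def h(x):
    """Predict a continuous target from a d-dimensional input vector."""
    return float(w @ x + b)

# Training data: M = 3 examples of (input vector, target).
X_train = np.array([[1.0, 2.0], [0.0, 4.0], [3.0, 1.0]])
t_train = np.array([4.0, 3.0, 7.5])

# Prediction for a new, unseen input x*.
x_new = np.array([2.0, 2.0])
t_hat = h(x_new)   # 2*2 + 0.5*2 + 1 = 6.0
```

How to choose $\mathbf{w}$ and $b$ from the training data is exactly what the training step, described next, is responsible for.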

Components of a Machine Learning System

Building a supervised learning model involves several steps:

1. Defining the Hypothesis Class

Humans define a set of possible models (functions) that the machine learning algorithm can choose from. This set is called the hypothesis class, denoted as $\mathcal{H}$.

$$\mathcal{H} \subseteq \{ h \mid h: \mathbb{R}^d \rightarrow \mathbb{R} \}$$

Why Specify a Hypothesis Class?
  • Limiting the hypothesis class prevents the model from fitting noise in the training data, i.e., overfitting.
  • Without restrictions, the model could choose any function, making learning from finite data impossible.

Overfitting

When a model is too complex, it fits the training data very well but performs poorly on new data.

  • Cause: Hypothesis class is too powerful (too many functions).

Underfitting

When a model is too simple, it cannot capture the underlying pattern of the data.

  • Cause: Hypothesis class is too limited.

In this step, we need to find a balance: the hypothesis class should be complex enough to capture the patterns in the data, yet simple enough to generalize well to unseen data.

Example

A comparison of underfitting, balanced fitting, and overfitting, from left to right1.

2. Defining the Selection Criterion

Humans specify a selection criterion (objective function) that evaluates how well a model fits the training data. Denoted as $J$, it is a function of the hypothesis $h$ and the training data.

$$J(h) = \text{Evaluation of } h \text{ on training data}$$

Our goal is to find the hypothesis $h$ that minimizes $J(h)$.

3. Training the Model

The machine learning algorithm selects the best hypothesis from $\mathcal{H}$ by minimizing the selection criterion:

$$h^* = \arg\min_{h \in \mathcal{H}} J(h)$$
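For a tiny finite hypothesis class, this argmin can be computed literally by exhaustive search (an illustrative sketch, not a practical training algorithm — real classes are usually infinite and require optimization methods):

```python
import numpy as np

# Sketch: a finite hypothesis class of lines h_w(x) = w * x, and
# J(h) = mean squared error on the training data. "Training" here is
# an exhaustive argmin over the class (illustrative only).
X_train = np.array([0.0, 1.0, 2.0, 3.0])
t_train = np.array([0.1, 2.1, 3.9, 6.2])   # roughly t = 2x

# 41 candidate slopes between 0 and 4 define the hypothesis class.
hypothesis_class = [lambda x, w=w: w * x for w in np.linspace(0.0, 4.0, 41)]

def J(h):
    """Selection criterion: average squared error on the training set."""
    preds = np.array([h(x) for x in X_train])
    return np.mean((preds - t_train) ** 2)

# h* = argmin over the hypothesis class of J(h).
h_star = min(hypothesis_class, key=J)
# h_star is close to the line t = 2x, matching the data.
```

The `key=J` argument makes `min` return the hypothesis with the smallest criterion value, which is exactly the $\arg\min$ in the formula above.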

4. Making Predictions

With the trained model $h^*$, predictions are made on new, unseen data $\mathbf{x}$:

$$\hat{t} = h^*(\mathbf{x})$$

5. Evaluating the Model

The model's performance is evaluated using a measure of success or error function $E(h)$ on unseen data:

$$E(h) = \text{Evaluation of } h \text{ on unseen data}$$

The objective is to assess how well the model generalizes to new data.

  • Note: $E(h)$ is often different from the training criterion $J(h)$.
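A common way to estimate $E(h)$ is to hold out part of the data and evaluate on it. The sketch below (an assumed toy setup) trains a line by least squares, so $J$ is squared error on the training portion, while $E$ is measured here as mean absolute error on the held-out portion — illustrating that the two criteria need not coincide:

```python
import numpy as np

# Toy data: a noisy line t = 3x + noise (assumed setup for illustration).
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=100)
t = 3.0 * x + rng.normal(0, 0.1, size=100)

# Hold out the last 20 examples as "unseen" data for evaluation.
x_train, t_train = x[:80], t[:80]
x_test, t_test = x[80:], t[80:]

# Fit a line by least squares (this minimizes the training criterion J).
w, b = np.polyfit(x_train, t_train, 1)

def E(pred, target):
    """Measure of success on unseen data: mean absolute error."""
    return np.mean(np.abs(pred - target))

test_error = E(w * x_test + b, t_test)   # should be near the noise level
```

Because the held-out examples played no role in fitting, `test_error` estimates how well the model generalizes, which is precisely what $E(h)$ is meant to capture.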

Recap

In this chapter, we've covered:

  • Formulating Supervised Learning Tasks: Understanding how to define inputs and targets.
  • Input and Output Representation: Handling real and categorical variables.
  • Regression Problems: Focusing on predicting continuous target variables.
  • Components of Machine Learning Systems: Defining hypothesis classes, selection criteria, and the training process.

Footnotes

  1. Portions of this page are reproduced from work created and shared by Scikit-learn and used according to terms described in the BSD 3-Clause License.