Logistic Regression
In this chapter, you'll learn about:
- Binary Classification with Probabilistic Models: Modeling binary outcomes using probabilities.
- Bernoulli Distribution: Understanding the distribution for binary random variables.
- Logistic Function (Sigmoid Function): Introducing the squashing function to map linear combinations to probabilities.
- Logistic Regression Model: Formulating the logistic regression model for binary classification.
- Maximum Likelihood Estimation (MLE): Deriving the loss function for logistic regression.
- Cross-Entropy Loss: Connecting the logistic regression loss to cross-entropy and KL divergence.
- Gradient Computation: Calculating gradients for optimization.
- Convexity and Optimization: Discussing the convex nature of logistic regression and optimization methods.
In previous chapters, we introduced classification problems and explored linear classifiers. We discussed the limitations of using linear regression for classification and the need for models specifically designed for categorical outcomes.
In this chapter, we delve into logistic regression, a fundamental algorithm for binary classification tasks. Logistic regression models the probability that a given input belongs to a particular category, allowing for probabilistic interpretation of predictions. It is widely used due to its simplicity, interpretability, and effectiveness.
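To make the probabilistic interpretation concrete, here is a minimal sketch of how a trained logistic regression model turns a linear score into a probability. The weights `w` and bias `b` below are made-up illustrative values, not the output of any fitting procedure; the sigmoid function itself is introduced formally later in this chapter.

```python
import numpy as np

def sigmoid(z):
    # Squash any real-valued score into (0, 1) so it can be
    # read as a probability.
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and bias for a two-feature problem
# (illustrative values only, not learned from data).
w = np.array([1.5, -0.8])
b = 0.2

x = np.array([0.4, 1.1])           # a single input example
p = sigmoid(w @ x + b)             # estimated P(y = 1 | x)
print(f"P(y = 1 | x) = {p:.3f}")   # prints ~0.480
```

A common decision rule is to predict class 1 whenever this probability exceeds 0.5, though the threshold can be adjusted to trade off false positives against false negatives.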
Binary Classification and the Bernoulli Distribution
Binary Classification Recap
- Objective: Assign an input to one of two classes, labeled as 0 or 1.
- Examples: Spam detection (spam or not spam), disease diagnosis (disease or healthy).
Bernoulli Distribution
- Definition: A discrete probability distribution for a random variable that has two possible outcomes, 1 (success) and 0 (failure).
- Parameter: $p = P(y = 1)$, the probability of success, where $0 \le p \le 1$.
- Probability Mass Function: $P(y \mid p) = p^{y}(1 - p)^{1 - y}$ for $y \in \{0, 1\}$, so that $P(1) = p$ and $P(0) = 1 - p$.
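As a quick illustration, the sketch below evaluates this PMF and draws samples from the distribution in NumPy; the value $p = 0.7$ is an arbitrary assumption chosen for the example.

```python
import numpy as np

def bernoulli_pmf(y, p):
    # P(y | p) = p^y * (1 - p)^(1 - y) for y in {0, 1}
    return p**y * (1.0 - p)**(1 - y)

p = 0.7                        # assumed probability of success
print(bernoulli_pmf(1, p))     # 0.7
print(bernoulli_pmf(0, p))     # 0.3 (up to float rounding)

# Drawing samples from the same distribution: a binomial with
# n = 1 trial is exactly a Bernoulli random variable.
rng = np.random.default_rng(0)
samples = rng.binomial(n=1, p=p, size=10)  # ten Bernoulli draws
print(samples)
```

This single-formula PMF, rather than a case split on $y = 0$ versus $y = 1$, is what makes the likelihood of logistic regression easy to write down and differentiate in the sections that follow.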