Bayesian Learning
info
In this chapter, you'll learn about:
- Bayesian Learning Principles: Understanding the Bayesian framework and how it differs from frequentist approaches.
- Maximum Likelihood and MAP Estimation: Reviewing MLE and MAP in the context of Bayesian inference.
- Bayesian Linear Regression: Applying Bayesian methods to linear regression models.
- Predictive Distributions: Deriving the predictive distribution for unseen data.
- Advantages of Bayesian Learning: Exploring the benefits, such as uncertainty quantification and online learning.
- Hierarchical Bayesian Models: Introducing hyperpriors and empirical Bayes methods.
In previous chapters, we explored parameter estimation techniques like Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP) Estimation. These methods provide point estimates of the parameters. However, they do not capture the uncertainty associated with these estimates.
In this chapter, we delve into Bayesian Learning, a probabilistic framework that models uncertainty by treating parameters as random variables with prior distributions. We will apply Bayesian principles to linear regression, leading to Bayesian Linear Regression, and discuss the computation of predictive distributions for new data points.
Review of MLE and MAP Estimation
Maximum Likelihood Estimation (MLE)
- Framework: Frequentist perspective.
- Assumption: Parameters are unknown but fixed constants.
- Objective:
- Interpretation: Find the parameter value that makes the observed data most probable.
Maximum A Posteriori (MAP) Estimation
- Framework: Bayesian perspective.
- Assumption: Parameters are random variables with a prior distribution .
- Objective:
- Using Bayes' Theorem:
- Interpretation: Find the parameter value that is most probable given the data and prior belief.
Bayesian Learning Principles
Bayesian Framework
- Parameters as Random Variables: All unknown quantities are treated as random variables.
- Prior Distribution: Represents our belief about the parameters before observing data.
- Posterior Distribution: Updated belief after observing data, computed using Bayes' theorem.
Bayesian Decision Theory
- Goal: Make predictions or decisions that minimize expected loss.
- Predictive Distribution: Instead of a point estimate, we compute the distribution over possible outcomes by integrating over all parameter values.
Predictive Distribution
The predictive distribution for a new data point is given by:
- : Likelihood of the target given parameters.
- : Posterior distribution over parameters.
Bayesian Linear Regression
Model Specification
-
Likelihood: