Bias and Variance in Machine Learning

Bias

Bias measures how far off our model's predictions are from the true values on average. High bias means the model is too simple and misses important patterns in the data.

High Bias Examples:

  • Linear regression on non-linear data
  • Underfitting the training data
  • Model assumptions too restrictive
  • Consistent systematic errors
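The first and last bullets can be seen together in a minimal numpy sketch (hypothetical toy data, not from the course): fitting a straight line to quadratic data leaves large residuals with a consistent pattern, the signature of high bias.

```python
import numpy as np

# Toy setup: the true relation is quadratic, but we fit a line.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)
y = x**2 + rng.normal(0, 0.5, size=x.size)

coeffs = np.polyfit(x, y, deg=1)     # linear model: too simple
residuals = y - np.polyval(coeffs, x)

# The errors are systematic, not random: positive near the edges,
# negative in the middle. More data will not fix this; only a more
# flexible model will.
print(np.mean(residuals[np.abs(x) > 2]) > 0)
print(np.mean(residuals[np.abs(x) < 1]) < 0)
```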

Variance

Variance measures how much our model's predictions change when trained on different datasets. High variance means the model is too sensitive to small changes in training data.

High Variance Examples:

  • Deep neural networks with insufficient data
  • Overfitting to training data
  • Model too complex for dataset size
  • Predictions vary wildly with new data
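The last two bullets can be demonstrated directly (hypothetical toy data): refit a flexible model and a simple model on two samples drawn from the same noisy process, and compare how much their predictions move between refits.

```python
import numpy as np

def fit_predict(seed, deg, grid):
    # Draw a fresh training set from the same noisy process, then fit
    # a polynomial of the given degree and predict on a fixed grid.
    r = np.random.default_rng(seed)
    x = np.sort(r.uniform(-1, 1, 20))
    y = np.sin(3 * x) + r.normal(0, 0.3, 20)
    return np.polyval(np.polyfit(x, y, deg), grid)

grid = np.linspace(-1, 1, 50)

# High-variance model: a degree-9 polynomial, refit on two samples.
wiggle = np.mean(np.abs(fit_predict(0, 9, grid) - fit_predict(1, 9, grid)))
# Low-variance model: a straight line, refit on the same two samples.
stable = np.mean(np.abs(fit_predict(0, 1, grid) - fit_predict(1, 1, grid)))

# The complex model's predictions swing far more between datasets.
print(wiggle > stable)
```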

The Bias-Variance Decomposition

Total Error = Bias² + Variance + Irreducible Error

This fundamental equation shows that prediction error comes from three sources: systematic bias, model variance, and inherent noise in the data.
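The decomposition can be checked numerically. In this sketch (assumed toy setup: a deliberately simple linear model fit to a sine curve), we simulate many training sets, estimate each term at a single test point, and confirm that the three terms account for the total error.

```python
import numpy as np

rng = np.random.default_rng(2)
x0, sigma = 0.5, 0.3            # test point and noise level
f = lambda x: np.sin(3 * x)     # true function

preds, sq_errs = [], []
for _ in range(2000):
    x = rng.uniform(-1, 1, 30)
    y = f(x) + rng.normal(0, sigma, 30)
    model = np.polyfit(x, y, 1)             # deliberately simple model
    pred = np.polyval(model, x0)
    preds.append(pred)
    y0 = f(x0) + rng.normal(0, sigma)       # fresh noisy target at x0
    sq_errs.append((y0 - pred) ** 2)

preds = np.array(preds)
bias_sq = (np.mean(preds) - f(x0)) ** 2     # systematic offset, squared
variance = np.var(preds)                    # spread across training sets
total = np.mean(sq_errs)                    # empirical prediction error

# Total Error = Bias^2 + Variance + Irreducible Error (sigma^2),
# up to simulation noise.
print(abs(total - (bias_sq + variance + sigma**2)) < 0.05)
```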

The Bias-Variance Tradeoff

As model complexity increases, bias typically decreases while variance increases. The goal is finding the optimal balance.

  • Simple Models: High Bias, Low Variance (Underfitting)
  • Optimal Models: Balanced Bias and Variance (Good Generalization)
  • Complex Models: Low Bias, High Variance (Overfitting)
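This U-shape can be reproduced with a sweep over model complexity (hypothetical toy data): average test error over many resampled training sets is high for the simplest and the most complex polynomials, and lowest at an intermediate degree.

```python
import numpy as np

rng = np.random.default_rng(3)
f = lambda x: np.sin(3 * x)     # true function
degrees = range(1, 11)
avg_test_mse = []

for deg in degrees:
    errs = []
    for _ in range(300):
        x = rng.uniform(-1, 1, 15)          # small training set
        y = f(x) + rng.normal(0, 0.3, 15)
        model = np.polyfit(x, y, deg)
        xt = rng.uniform(-1, 1, 50)         # fresh test points
        yt = f(xt) + rng.normal(0, 0.3, 50)
        errs.append(np.mean((yt - np.polyval(model, xt)) ** 2))
    avg_test_mse.append(np.mean(errs))

# The best degree sits between the extremes: degree 1 underfits (bias),
# degree 10 overfits (variance).
best = list(degrees)[int(np.argmin(avg_test_mse))]
print(best)
```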

🎯 Reducing Bias

  • Use more complex models
  • Add more features
  • Reduce regularization
  • Increase model flexibility

📊 Reducing Variance

  • Use more training data
  • Apply regularization
  • Use ensemble methods
  • Simplify the model

© 2025 Machine Learning for Health Research Course | Prof. Gennady Roshchupkin
