Bias and Variance in Machine Learning

Bias

Bias measures how far off our model's predictions are from the true values on average. High bias means the model is too simple and misses important patterns in the data.

High Bias Examples:

  • Linear regression on non-linear data
  • Underfitting the training data
  • Model assumptions too restrictive
  • Consistent systematic errors
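The first and last bullets can be seen together in a minimal numpy sketch (hypothetical toy data, not from the course): fitting a straight line to quadratic data leaves large residuals with a consistent pattern, the signature of high bias.

```python
import numpy as np

# Toy setup: the true relation is quadratic, but we fit a line.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)
y = x**2 + rng.normal(0, 0.5, size=x.size)

coeffs = np.polyfit(x, y, deg=1)     # linear model: too simple
residuals = y - np.polyval(coeffs, x)

# The errors are systematic, not random: positive near the edges,
# negative in the middle. More data will not fix this; only a more
# flexible model will.
print(np.mean(residuals[np.abs(x) > 2]) > 0)
print(np.mean(residuals[np.abs(x) < 1]) < 0)
```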

Variance

Variance measures how much our model's predictions change when trained on different datasets. High variance means the model is too sensitive to small changes in training data.

High Variance Examples:

  • Deep neural networks with insufficient data
  • Overfitting to training data
  • Model too complex for dataset size
  • Predictions vary wildly with new data
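The last two bullets can be demonstrated directly (hypothetical toy data): refit a flexible model and a simple model on two samples drawn from the same noisy process, and compare how much their predictions move between refits.

```python
import numpy as np

def fit_predict(seed, deg, grid):
    # Draw a fresh training set from the same noisy process, then fit
    # a polynomial of the given degree and predict on a fixed grid.
    r = np.random.default_rng(seed)
    x = np.sort(r.uniform(-1, 1, 20))
    y = np.sin(3 * x) + r.normal(0, 0.3, 20)
    return np.polyval(np.polyfit(x, y, deg), grid)

grid = np.linspace(-1, 1, 50)

# High-variance model: a degree-9 polynomial, refit on two samples.
wiggle = np.mean(np.abs(fit_predict(0, 9, grid) - fit_predict(1, 9, grid)))
# Low-variance model: a straight line, refit on the same two samples.
stable = np.mean(np.abs(fit_predict(0, 1, grid) - fit_predict(1, 1, grid)))

# The complex model's predictions swing far more between datasets.
print(wiggle > stable)
```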

The Bias-Variance Decomposition

Total Error = Bias² + Variance + Irreducible Error

This fundamental equation shows that prediction error comes from three sources: systematic bias, model variance, and inherent noise in the data.
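The decomposition can be checked numerically. In this sketch (assumed toy setup: a deliberately simple linear model fit to a sine curve), we simulate many training sets, estimate each term at a single test point, and confirm that the three terms account for the total error.

```python
import numpy as np

rng = np.random.default_rng(2)
x0, sigma = 0.5, 0.3            # test point and noise level
f = lambda x: np.sin(3 * x)     # true function

preds, sq_errs = [], []
for _ in range(2000):
    x = rng.uniform(-1, 1, 30)
    y = f(x) + rng.normal(0, sigma, 30)
    model = np.polyfit(x, y, 1)             # deliberately simple model
    pred = np.polyval(model, x0)
    preds.append(pred)
    y0 = f(x0) + rng.normal(0, sigma)       # fresh noisy target at x0
    sq_errs.append((y0 - pred) ** 2)

preds = np.array(preds)
bias_sq = (np.mean(preds) - f(x0)) ** 2     # systematic offset, squared
variance = np.var(preds)                    # spread across training sets
total = np.mean(sq_errs)                    # empirical prediction error

# Total Error = Bias^2 + Variance + Irreducible Error (sigma^2),
# up to simulation noise.
print(abs(total - (bias_sq + variance + sigma**2)) < 0.05)
```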

The Bias-Variance Tradeoff

As model complexity increases, bias typically decreases while variance increases. The goal is finding the optimal balance.

  • Simple Models: High Bias, Low Variance (Underfitting)
  • Optimal Models: Balanced Bias and Variance (Good Generalization)
  • Complex Models: Low Bias, High Variance (Overfitting)
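This U-shape can be reproduced with a sweep over model complexity (hypothetical toy data): average test error over many resampled training sets is high for the simplest and the most complex polynomials, and lowest at an intermediate degree.

```python
import numpy as np

rng = np.random.default_rng(3)
f = lambda x: np.sin(3 * x)     # true function
degrees = range(1, 11)
avg_test_mse = []

for deg in degrees:
    errs = []
    for _ in range(300):
        x = rng.uniform(-1, 1, 15)          # small training set
        y = f(x) + rng.normal(0, 0.3, 15)
        model = np.polyfit(x, y, deg)
        xt = rng.uniform(-1, 1, 50)         # fresh test points
        yt = f(xt) + rng.normal(0, 0.3, 50)
        errs.append(np.mean((yt - np.polyval(model, xt)) ** 2))
    avg_test_mse.append(np.mean(errs))

# The best degree sits between the extremes: degree 1 underfits (bias),
# degree 10 overfits (variance).
best = list(degrees)[int(np.argmin(avg_test_mse))]
print(best)
```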

🎯 Reducing Bias

  • Use more complex models
  • Add more features
  • Reduce regularization
  • Increase model flexibility

📊 Reducing Variance

  • Use more training data
  • Apply regularization
  • Use ensemble methods
  • Simplify the model

© 2025 Machine Learning for Health Research Course | Prof. Gennady Roshchupkin
