🎯 Machine Learning Evaluation Metrics

📊 Confusion Matrix Reference

| | Predicted Positive | Predicted Negative |
| --- | --- | --- |
| Actual Positive | TP (True Positive) | FN (False Negative) |
| Actual Negative | FP (False Positive) | TN (True Negative) |

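As a minimal sketch of how the four cells are filled in, the snippet below counts TP, FN, FP and TN from paired binary labels. The function name `confusion_counts`, the `positive` parameter, and the example labels are illustrative assumptions, not part of the course material.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FN, FP, TN for binary labels; `positive` marks the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive:
            # Actual positive: either caught (TP) or missed (FN).
            counts["TP" if predicted == positive else "FN"] += 1
        else:
            # Actual negative: either a false alarm (FP) or correctly cleared (TN).
            counts["FP" if predicted == positive else "TN"] += 1
    return {key: counts[key] for key in ("TP", "FN", "FP", "TN")}


# Made-up labels for illustration: 1 = disease present, 0 = healthy.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
example_counts = confusion_counts(y_true, y_pred)
print(example_counts)  # {'TP': 3, 'FN': 1, 'FP': 1, 'TN': 3}
```

The rows of the matrix correspond to the outer `if` (actual class) and the columns to the inner choice (predicted class).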
| 📈 Metric | 📝 Definition | 🔢 Formula | 🎯 Use Case |
| --- | --- | --- | --- |
| Accuracy | The proportion of correct predictions among all predictions. | $\frac{TP + TN}{TP + TN + FP + FN}$ | General performance measurement when classes are balanced. |
| Precision | The proportion of true positive predictions among all positive predictions. | $\frac{TP}{TP + FP}$ | Important when false positives are costly (e.g., avoiding unnecessary surgeries, reducing patient anxiety from false alarms). |
| Recall | The proportion of actual positives that are correctly predicted (also known as Sensitivity). | $\frac{TP}{TP + FN}$ | Important when false negatives are costly (e.g., missing cancer diagnosis, failing to detect heart disease). |
| F1-Score | The harmonic mean of Precision and Recall. Balances precision and recall. | $\frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$ | Useful for imbalanced medical datasets (e.g., rare disease detection where both missed cases and false alarms matter). |
| Sensitivity | The ability to correctly identify positive cases (same as Recall). | $\frac{TP}{TP + FN}$ | Critical in healthcare for diagnosing diseases (minimize missed cases). |
| Specificity | The proportion of actual negatives that are correctly predicted. | $\frac{TN}{TN + FP}$ | Important when identifying true negatives is crucial (e.g., confirming healthy patients, drug safety testing). |
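The formulas above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions: the helper names `safe_ratio` and `classification_metrics` are made up for this example, the counts describe a hypothetical rare-disease dataset, and returning `None` for an undefined ratio (zero denominator) is one convention among several.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def classification_metrics(tp, fn, fp, tn):
    """Compute the metrics from the table using the four confusion-matrix counts."""
    precision = safe_ratio(tp, tp + fp)
    recall = safe_ratio(tp, tp + fn)
    if precision is None or recall is None or precision + recall == 0:
        f1 = None  # F1 is undefined when precision or recall is undefined or both are 0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": safe_ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,  # same quantity as recall
        "specificity": safe_ratio(tn, tn + fp),
        "f1": f1,
    }


# Hypothetical counts: 10 diseased patients among 1000, half of them missed.
metrics = classification_metrics(tp=5, fn=5, fp=10, tn=980)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy comes out at 0.985 even though recall is only 0.500 and precision about 0.333, which illustrates why the table reserves accuracy for balanced classes and suggests F1 (here 0.400) for imbalanced data.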

💡 Key Insights for Metric Selection

⚖️ Precision vs Recall Trade-off: High precision reduces false cancer alarms and unnecessary biopsies. High recall ensures fewer missed diagnoses but may increase patient anxiety from false positives.
🏥 Disease Screening: Prioritize Sensitivity/Recall for cancer screening to minimize missed cases. Use Specificity to reduce unnecessary follow-up procedures and patient stress.
💊 Drug Efficacy Testing: High Precision ensures patients receiving treatment actually benefit. High Recall ensures effective treatments aren't missed in clinical trials.
🫀 Emergency Medicine: High Sensitivity for heart attack detection to avoid missing critical cases. Balance with Specificity to prevent overloading emergency resources.
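One way to see the precision vs recall trade-off is to vary the decision threshold of a classifier that outputs risk scores. The sketch below assumes such a score-based model; the scores, labels, thresholds, and the function name `precision_recall_at` are all invented for illustration.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else None  # undefined with no positive predictions
    recall = tp / (tp + fn) if tp + fn else None  # undefined with no actual positives
    return precision, recall


# Made-up risk scores and true labels (1 = disease present, 0 = healthy).
scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

results = {t: precision_recall_at(scores, labels, t) for t in (0.25, 0.50, 0.80)}
for threshold, (precision, recall) in results.items():
    print(f"threshold={threshold:.2f}  precision={precision:.3f}  recall={recall:.3f}")
```

On this toy data, the low threshold (0.25) catches every positive case (recall 1.000) at precision 0.667, while the high threshold (0.80) produces no false alarms (precision 1.000) but misses half the cases (recall 0.500). A screening setting would lean toward the lower threshold; a setting where false alarms are costly would lean toward the higher one.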

© 2025 Machine Learning for Health Research Course | Prof. Gennady Roshchupkin
