🎯 Machine Learning Evaluation Metrics

📊 Confusion Matrix Reference

Predicted
Positive

Predicted
Negative

Actual
Positive

TP
True Positive

FN
False Negative

Actual
Negative

FP
False Positive

TN
True Negative

📈 Metric	📝 Definition	🔢 Formula	🎯 Use Case
Accuracy	The proportion of correct predictions among all predictions.	TP + TN TP + TN + FP + FN	General performance measurement when classes are balanced.
Precision	The proportion of true positive predictions among all positive predictions.	TP TP + FP	Important when false positives are costly (e.g., avoiding unnecessary surgeries, reducing patient anxiety from false alarms).
Recall	The proportion of actual positives that are correctly predicted (also known as Sensitivity).	TP TP + FN	Important when false negatives are costly (e.g., missing cancer diagnosis, failing to detect heart disease).
F1-Score	The harmonic mean of Precision and Recall. Balances precision and recall.	2 × Precision × Recall Precision + Recall	Useful for imbalanced medical datasets (e.g., rare disease detection where both missed cases and false alarms matter).
Sensitivity	The ability to correctly identify positive cases (same as Recall).	TP TP + FN	Critical in healthcare for diagnosing diseases (minimize missed cases).
Specificity	The proportion of actual negatives that are correctly predicted.	TN TN + FP	Important when identifying true negatives is crucial (e.g., confirming healthy patients, drug safety testing).

💡 Key Insights for Metric Selection

⚖️ Precision vs Recall Trade-off High precision reduces false cancer alarms and unnecessary biopsies. High recall ensures fewer missed diagnoses but may increase patient anxiety from false positives.

🏥 Disease Screening Prioritize Sensitivity/Recall for cancer screening to minimize missed cases. Use Specificity to reduce unnecessary follow-up procedures and patient stress.

💊 Drug Efficacy Testing High Precision ensures patients receiving treatment actually benefit. High Recall ensures effective treatments aren't missed in clinical trials.

🫀 Emergency Medicine High Sensitivity for heart attack detection to avoid missing critical cases. Balance with Specificity to prevent overloading emergency resources.