Nested cross-validation separates evaluation from tuning: the outer CV loop gives a nearly unbiased performance estimate on truly unseen data, while the inner CV loop tunes hyperparameters using only the outer-training split.
Step 1 Explanation: We start by dividing the entire dataset into K outer folds. One fold becomes the test set (yellow), while the remaining K-1 folds become the training set (green). This outer test fold will be held out completely and never used for model selection or hyperparameter tuning.
Step 2 Explanation: On the outer training data (green), we perform inner cross-validation to tune hyperparameters. Each inner fold uses a portion of the outer training data for validation (blue) while training on the rest (green). This ensures hyperparameter selection happens only on training data, never touching the outer test fold.
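The inner tuning step above can be sketched with scikit-learn. This is a minimal illustration, not the course's exact setup: the dataset is synthetic, and the logistic-regression model with a `C` grid is an assumed example of a hyperparameter search.

```python
# Minimal sketch of the inner CV step (assumed model and grid, synthetic data).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=300, random_state=0)

# Simulate one outer split: the test part (yellow) is held out completely.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Inner CV sees ONLY the outer-training data (green/blue).
inner = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},  # assumed example grid
    cv=5,  # 5 inner folds carved out of the outer-training data
)
inner.fit(X_tr, y_tr)
print(inner.best_params_)       # hyperparameter chosen without seeing X_te
print(inner.score(X_te, y_te))  # evaluated once on the held-out outer fold
```

After refitting on the full outer-training split (the `GridSearchCV` default), the single score on `X_te` is what one outer iteration contributes to the final estimate.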
Step 3 Explanation: Taken together, the splits cover the entire dataset. The outer test fold (yellow) is completely held out, the inner validation fold (blue) is used only for hyperparameter tuning, and the remaining training data (green) is used for model fitting.
| Outer iter | Best hyperparam | Outer test score |
|---|---|---|
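A table like the one above can be produced by the full nested loop. Again a hedged sketch on synthetic data, assuming the same illustrative logistic-regression model and `C` grid; each outer iteration records the inner-CV winner and its one-time score on the outer test fold.

```python
# Full nested CV loop (assumed model and grid, synthetic data).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold

X, y = make_classification(n_samples=300, random_state=0)
outer = KFold(n_splits=5, shuffle=True, random_state=0)

scores = []
print("Outer iter | Best hyperparam | Outer test score")
for i, (tr, te) in enumerate(outer.split(X), start=1):
    inner = GridSearchCV(
        LogisticRegression(max_iter=1000),
        param_grid={"C": [0.01, 0.1, 1.0, 10.0]},  # assumed example grid
        cv=3,  # inner folds use only the outer-training data
    )
    inner.fit(X[tr], y[tr])            # tuning never touches X[te]
    score = inner.score(X[te], y[te])  # one evaluation on the outer test fold
    scores.append(score)
    print(f"{i:>10} | C={inner.best_params_['C']:<12} | {score:.3f}")
```

The mean of `scores` is the nested-CV performance estimate; note that each outer iteration may select a different hyperparameter value.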
© 2025 Machine Learning for Health Research Course | Prof. Gennady Roshchupkin