Understanding and Fixing Training Problems
🧠 Deep Learning Fundamentals
Vanishing gradients: gradients become exponentially smaller as they propagate backward through deep networks, especially with sigmoid or tanh activations.
Result: Early layers learn very slowly or stop learning entirely.
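A minimal sketch of how to spot this in practice, assuming PyTorch and arbitrary layer sizes (neither is specified in the slides): run one backward pass and inspect per-layer gradient norms; ReLU activations are one common mitigation compared to sigmoid/tanh.

```python
import torch
import torch.nn as nn

# Deep MLP with ReLU activations: ReLU does not squash gradients the way
# sigmoid/tanh do, which helps against vanishing gradients.
model = nn.Sequential(
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)

x = torch.randn(32, 64)           # dummy batch; shapes chosen for illustration
loss = model(x).pow(2).mean()     # placeholder loss
loss.backward()

# Very small gradient norms in the early layers are the classic symptom.
for name, p in model.named_parameters():
    print(f"{name:20s} grad norm = {p.grad.norm():.2e}")
```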
Exploding gradients: gradients become exponentially larger during backpropagation, causing unstable training and NaN values.
Symptoms: Loss shoots to infinity, weights become NaN, and training diverges.
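A standard fix is gradient clipping before the optimizer step. The sketch below assumes PyTorch and dummy data/shapes, and clips the global gradient norm to 1.0.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(32, 64), torch.randn(32, 1)   # dummy data for illustration
loss = nn.functional.mse_loss(model(x), y)

optimizer.zero_grad()
loss.backward()

# Rescale gradients so their global norm is at most 1.0, so a single bad
# batch cannot push the weights to inf/NaN.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```

Lowering the learning rate and adding normalization layers are other standard remedies.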
Overfitting: the model memorizes the training data but fails to generalize to new data.
Signs: Training accuracy >> validation accuracy; a large gap between training and validation loss.
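Common remedies are dropout, weight decay, and early stopping. The sketch below combines all three; the framework (PyTorch), the dummy data, and the hyperparameter values are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Dropout in the model plus weight decay (L2 regularization) in the optimizer.
model = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(),
    nn.Dropout(p=0.5),                      # randomly zeroes activations during training
    nn.Linear(128, 1),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)

# Dummy train/validation splits; replace with your real held-out data.
x_tr, y_tr = torch.randn(256, 64), torch.randn(256, 1)
x_va, y_va = torch.randn(64, 64), torch.randn(64, 1)

# Early stopping: keep the weights from the epoch with the best validation loss.
best_val, patience, bad_epochs = float("inf"), 5, 0
best_state = None
for epoch in range(100):
    model.train()
    optimizer.zero_grad()
    nn.functional.mse_loss(model(x_tr), y_tr).backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = nn.functional.mse_loss(model(x_va), y_va).item()
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
```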
Underfitting: the model is too simple to capture the underlying patterns in the data.
Signs: Both training and validation accuracy are low, indicating high bias.
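The usual fix is more capacity (wider or deeper layers, nonlinear activations), training longer, or weaker regularization. A small PyTorch sketch (the framework and the layer sizes are assumptions) contrasting a too-simple model with a more expressive one:

```python
import torch.nn as nn

# Too simple: a single linear layer cannot capture nonlinear patterns.
underfit_model = nn.Linear(64, 1)

# More capacity: hidden layers with nonlinear activations.
better_model = nn.Sequential(
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
)
```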
Most neural network issues can be prevented with proper architecture design, initialization, and hyperparameter tuning. Always start with proven defaults and adjust based on your specific problem!
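As one example of a proven default, He (Kaiming) initialization is the usual choice for ReLU networks, with Xavier (Glorot) initialization as the analogue for tanh/sigmoid layers. The sketch below uses PyTorch's built-in initializers; the framework and layer sizes are assumptions.

```python
import torch.nn as nn

def init_weights(module):
    # He (Kaiming) init is the standard default for ReLU layers;
    # use Xavier (Glorot) init for tanh/sigmoid layers instead.
    if isinstance(module, nn.Linear):
        nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
        nn.init.zeros_(module.bias)

model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 1))
model.apply(init_weights)
```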
© 2025 Machine Learning for Health Research Course | Prof. Gennady Roshchupkin