Overfitting & Underfitting — The Central Problem of ML

If you understand one ML concept deeply, make it this. Overfitting is one of the most common reasons a model that aced training fails in production.

The spectrum

Underfitting — model too simple; bad on training AND test. It never learned the pattern. (Fix: bigger model, better features.)
Good fit — learned the true pattern; good on both.
Overfitting — model memorised the training data including its noise; great on train, bad on test. It learned the answers, not the concept.
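
To see all three regimes side by side, here is a minimal sketch. It assumes a toy 1-D dataset with 20% flipped labels and a hand-rolled k-nearest-neighbours classifier; neither comes from this lesson, and make_data, knn_predict and accuracy are illustrative names. The only knob is k: a huge k is too simple, k=1 memorises every noisy point.

```python
import random

def make_data(n, rng, noise=0.2):
    """Points x in [0, 1); true label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label  # label noise: the part a model should NOT learn
        data.append((x, label))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > k)

def accuracy(train, data, k):
    """Fraction of points in `data` that a k-NN model built on `train` labels correctly."""
    return sum(knn_predict(train, x, k) == y for x, y in data) / len(data)

rng = random.Random(0)
train = make_data(200, rng)
test = make_data(200, rng)

# k = len(train): one constant prediction everywhere (underfit)
# k = 15:         smooths over the noise (usually the best of the three)
# k = 1:          every training point is its own nearest neighbour (overfit)
for k in (len(train), 15, 1):
    print(f"k={k:>3}  train={accuracy(train, train, k):.2f}  test={accuracy(train, test, k):.2f}")
```

With k=1 the training accuracy is perfect by construction, because each training point votes for itself, noise included. The test score is where the memorisation typically shows up.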

The student analogy

Underfitting = didn't study, fails everything. Overfitting = memorised last year's exact answer key, aces those questions but fails when the exam changes. Good fit = actually understood the subject and handles new questions. ML wants understanding, not memorisation.

How to detect it

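# model is any fitted scikit-learn-style estimator; the scores shown are illustrative.
train_acc = model.score(X_train, y_train)   # e.g. 0.99
test_acc  = model.score(X_test,  y_test)    # e.g. 0.71
# Big gap (0.99 vs 0.71) = classic overfitting.
# Both scores low and close together = underfitting.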
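
The same check can be wrapped in a small helper. This is a rough sketch only: diagnose is a hypothetical function, and the 0.80 and 0.10 thresholds are assumptions picked for illustration. What counts as a "low" score or a "big" gap depends on the problem.

```python
def diagnose(train_score, test_score, min_score=0.80, max_gap=0.10):
    """Rough label for a train/test score pair.

    min_score and max_gap are illustrative assumptions, not universal constants.
    """
    gap = train_score - test_score
    if train_score < min_score:
        return "underfitting"    # could not even fit the training data
    if gap > max_gap:
        return "overfitting"     # fits training data far better than unseen data
    return "reasonable fit"

print(diagnose(0.99, 0.71))  # overfitting
print(diagnose(0.62, 0.60))  # underfitting
print(diagnose(0.91, 0.89))  # reasonable fit
```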

The fixes (in order to try)

More/better data — usually the most effective cure; a large, varied set is much harder to memorise.
Simplify the model — fewer parameters, shallower trees.
Regularisation — penalise complexity (L1/L2, dropout in neural nets).
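Cross-validation — evaluate on multiple splits so one lucky split can't fool you. It doesn't cure overfitting by itself, but it gives you a reliable score for comparing the other fixes.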
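Early stopping — stop training when performance on a held-out validation set stops improving. Use a validation split for this, not the test set, which should stay untouched until the final check.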
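
Two of these fixes are mechanical enough to sketch in plain Python. The block below is a minimal illustration, not a library implementation: k_fold_splits and early_stopping_epoch are hypothetical helpers, the patience parameter is an assumption, and the validation losses are made-up numbers.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_idx, val_idx) index lists; each sample is validated exactly once.

    Assumes the data has already been shuffled.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the best epoch seen before stopping.

    Training stops once the validation loss has gone `patience` epochs
    without beating its best value.
    """
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop here
    return best_epoch

# Cross-validation: 10 samples, 3 folds -> validation folds of size 4, 3, 3.
for train_idx, val_idx in k_fold_splits(10, 3):
    print(len(train_idx), val_idx)

# Early stopping on made-up validation losses.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.45]
print(early_stopping_epoch(val_losses, patience=2))  # 3
```

Note that training stops before the 0.45 at the end is ever seen. Early stopping is a heuristic, and a larger patience trades extra training time for a smaller chance of quitting too soon.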

This is the bias–variance trade-off: too simple = high bias (underfit), too complex = high variance (overfit). The art is the middle.
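
For squared-error regression the trade-off has a standard written form. The symbols here are assumptions that don't appear elsewhere in this lesson: the data follow y = f(x) + ε with zero-mean noise of variance σ², f̂ is the model trained on a random training set, and the expectation runs over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

A too-simple model inflates the first term, a too-complex one inflates the second, and no model can remove the third.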
