Regression vs Classification — With Real Models

Supervised learning splits into two main tasks: regression and classification. Picking the right one, and the right metric for it, is step one of every project.

Regression — predict a number

from sklearn.linear_model import LinearRegression

# predict house price from size
# X_train: 2D array of sizes, y_train: matching prices (assumed already prepared)
model = LinearRegression()
model.fit(X_train, y_train)     # y is a continuous number (price)
model.predict([[1200]])         # -> e.g. [4_850_000]

# it literally learns:  price = w * size + b
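
To make the "price = w * size + b" claim concrete, here is a minimal runnable sketch. The toy data is hypothetical and built with no noise (price = 4000 * size + 50_000), purely so the learned values are easy to check; real data will not fit this cleanly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: price is exactly 4000 * size + 50_000 (no noise),
# so the learned w and b can be checked by eye.
sizes = np.array([[600], [800], [1000], [1200], [1500]])  # shape (n_samples, 1)
prices = 4000 * sizes.ravel() + 50_000

model = LinearRegression()
model.fit(sizes, prices)

w = model.coef_[0]      # learned slope, close to 4000
b = model.intercept_    # learned offset, close to 50_000

# predict() gives the same result as applying the line by hand
by_hand = w * 1200 + b
by_model = model.predict([[1200]])[0]
print(round(w, 2), round(b, 2), round(by_model, 2))
```

After fitting, coef_ holds one weight per input column and intercept_ holds b, so the whole model is just those two numbers.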

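Use for: prices, temperatures, sales, age, CGPA, or anything else on a continuous scale. Judge it with MAE (mean absolute error) and RMSE (root mean squared error), both in the same units as the target, and with R² (the fraction of variance explained). RMSE penalizes large misses more heavily than MAE.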
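
As a rough illustration of how these three metrics are computed, the sketch below uses four made-up true values and predictions (not results from any real model). RMSE is taken as the square root of mean_squared_error, which should work across scikit-learn versions.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical values, chosen so the arithmetic is easy to follow
y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_pred = np.array([110.0, 190.0, 300.0, 360.0])   # errors: 10, -10, 0, -40

mae = mean_absolute_error(y_true, y_pred)            # (10 + 10 + 0 + 40) / 4 = 15
rmse = np.sqrt(mean_squared_error(y_true, y_pred))   # sqrt(1800 / 4), about 21.2
r2 = r2_score(y_true, y_pred)                        # 1 - 1800 / 50000 = 0.964

print(mae, round(rmse, 2), round(r2, 3))
```

The single 40-unit miss pulls RMSE well above MAE, which is why comparing the two gives a hint about how large the worst errors are.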

Classification — predict a category

from sklearn.linear_model import LogisticRegression

# predict pass/fail from marks
# X_train: 2D array of marks, y_train: matching labels (assumed already prepared)
model = LogisticRegression()
model.fit(X_train, y_train)     # y is a class (0 or 1)
model.predict([[42]])           # -> e.g. [0]  (fail)
model.predict_proba([[42]])     # -> e.g. [[0.78, 0.22]]  [P(fail), P(pass)]
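
A self-contained version of the same idea, as a sketch on hypothetical toy data (ten made-up marks with pass/fail labels). The exact probabilities depend on the data and on the default regularization, so no specific output is claimed here; max_iter=1000 is only there to give the solver room on unscaled marks.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: marks out of 100, label 1 = pass, 0 = fail
marks = np.array([[20], [30], [35], [40], [45], [55], [60], [70], [80], [90]])
passed = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

clf = LogisticRegression(max_iter=1000)
clf.fit(marks, passed)

proba = clf.predict_proba([[42]])   # shape (1, 2), columns follow clf.classes_
label = clf.predict([[42]])         # the class with the higher probability

print(clf.classes_)   # column order of predict_proba
print(proba, label)
```

Each row of predict_proba sums to 1, and predict simply returns the class whose column is largest.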

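Use for: spam/not spam, disease/healthy, which digit, sentiment. Judge it with accuracy, precision, recall and F1 (next lesson). Accuracy alone is misleading on imbalanced data: a model that always predicts the majority class can score high while never catching the rare class.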
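
The imbalance problem can be sketched with a made-up screening scenario: 95 healthy cases, 5 diseased, and a "model" that always answers healthy. The labels are hypothetical and there is no trained model here, only a constant prediction.

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score, f1_score

# Hypothetical imbalanced labels: 95 healthy (0), 5 diseased (1)
y_true = np.array([0] * 95 + [1] * 5)

# A do-nothing "model" that always predicts the majority class
y_pred = np.zeros_like(y_true)

acc = accuracy_score(y_true, y_pred)                  # 95 of 100 correct
rec = recall_score(y_true, y_pred, zero_division=0)   # 0 of 5 diseased found
f1 = f1_score(y_true, y_pred, zero_division=0)        # 0, since nothing positive is predicted

print(acc, rec, f1)
```

Accuracy comes out at 0.95 even though every diseased case is missed, which is the reason to look at recall and F1 alongside it.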

Quick test: Is the answer a number on a scale (regression) or a bucket/label (classification)? "How many?" → regression. "Which one?" → classification. Despite the name, Logistic Regression is a classification model.
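
The quick test can also be run roughly in code. This sketch uses scikit-learn's type_of_target helper on three made-up targets, plus is_classifier / is_regressor to confirm what kind of model Logistic Regression is. Treat type_of_target as a hint only: it inspects the values, so integer counts are reported as classes even when regression is the better fit.

```python
from sklearn.base import is_classifier, is_regressor
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.utils.multiclass import type_of_target

# Hypothetical targets, made up for illustration
cgpa = [3.2, 7.85, 9.1]           # numbers on a scale
spam = ["spam", "ham", "spam"]    # two labels
digit = [0, 7, 3, 9]              # several labels

kinds = [type_of_target(y) for y in (cgpa, spam, digit)]
print(kinds)   # ['continuous', 'binary', 'multiclass']

# Despite the name, LogisticRegression is registered as a classifier
logreg_is_clf = is_classifier(LogisticRegression())
logreg_is_reg = is_regressor(LogisticRegression())
linreg_is_reg = is_regressor(LinearRegression())
print(logreg_is_clf, logreg_is_reg, linreg_is_reg)   # True False True
```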
