Your First ML Model with scikit-learn (Full Example)

Enough theory: here is a complete, working model in about 15 lines of code. The same template carries over to many real classification problems on tabular data.

The whole thing

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# 1. Data: features X, labels y
X, y = load_iris(return_X_y=True)

# 2. Split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# 3. Choose & train a model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)          # <- learning happens here

# 4. Predict & evaluate on UNSEEN data
preds = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, preds))   # usually 0.9 or higher; the test set is only 30 flowers

# 5. Use it on something new
print(model.predict([[5.1, 3.5, 1.4, 0.2]]))   # -> [0], the class index for setosa
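
Step 5 returns a class index rather than a species name. The following is a minimal sketch of mapping that index back to a label and checking how confident the model is. It assumes the same iris data and model settings as above; the names `iris`, `sample`, `species`, and `probabilities` are illustrative, and the model is fit on every row here only to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Load the full dataset object so the class names are available.
iris = load_iris()

# Assumption: fit on all rows for brevity; use a train/test split for real evaluation.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(iris.data, iris.target)

sample = [[5.1, 3.5, 1.4, 0.2]]

# predict returns an array of class indices, one per input row.
class_index = model.predict(sample)[0]
species = iris.target_names[class_index]

# predict_proba returns one probability per class, in the order of model.classes_.
probabilities = model.predict_proba(sample)[0]

print(species)          # setosa
print(probabilities)    # three values that sum to 1
```

The probabilities are a quick way to see whether a prediction was a close call or a clear one.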

The pattern that never changes

Notice the shape: model.fit(X, y) to train, then model.predict(X) to predict. Every supervised scikit-learn estimator (linear regression, SVM, gradient boosting) follows this interface. Swap RandomForestClassifier for LogisticRegression, update the import to match, and the rest of the script is unchanged. That consistency makes scikit-learn a good place to start learning.
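
To make the shared interface concrete, here is a rough sketch that runs the same fit/predict loop over three different classifiers. The `candidates` and `scores` dictionaries are illustrative names, and `max_iter=1000` for LogisticRegression is an assumption added so the solver has room to converge on unscaled features.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Three unrelated algorithms, one identical interface.
candidates = {
    "random forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "logistic regression": LogisticRegression(max_iter=1000),  # assumption: raised iteration cap
    "SVM": SVC(),
}

scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)        # same call for every model
    preds = model.predict(X_test)      # same call for every model
    scores[name] = accuracy_score(y_test, preds)
    print(f"{name}: {scores[name]:.2f}")
```

Only the line that constructs each model differs; the training and evaluation code never changes.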

Try it free right now

Open Google Colab, paste the code into a cell, and press Shift+Enter to run it. You just trained a classifier that gets nearly every held-out iris flower right, with nothing to install. Next: regression vs classification in depth.
