PyTorch is one of the most widely used frameworks for deep learning. Here is a complete small network with every line explained. The same template scales from toy problems to research code.
Tensors: NumPy-like arrays that run on GPUs and track gradients
import torch
x = torch.tensor([1.0, 2.0, 3.0])
x * 2 # like NumPy
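device = "cuda" if torch.cuda.is_available() else "cpu"
x = x.to(device)  # move to the GPU if one is available (.to returns a new tensor); this is the superpower
Define the network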
import torch.nn as nn
model = nn.Sequential(
    nn.Linear(4, 16),  # 4 inputs -> 16 hidden units
    nn.ReLU(),         # non-linearity
    nn.Linear(16, 3),  # 16 -> 3 output classes
)
The training loop: the heart of deep learning
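The loop that follows uses X_train and y_train, which the text does not define. Here is a minimal sketch of stand-in data, assuming 4 features and 3 classes to match the network's input and output sizes. The random values and the sample count of 120 are placeholders, not a real dataset:

```python
import torch

torch.manual_seed(0)  # make the random stand-in data reproducible

n_samples = 120  # hypothetical sample count, chosen only for illustration
X_train = torch.randn(n_samples, 4)          # float32 features, shape (120, 4)
y_train = torch.randint(0, 3, (n_samples,))  # int64 class labels in {0, 1, 2}
```

Because these labels are random, the network can only memorise noise. For meaningful results, swap in a real dataset with the same shape, for example Iris (4 features, 3 classes).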
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
# X_train: float tensor of shape (N, 4); y_train: integer class labels (0, 1 or 2) of shape (N,)
for epoch in range(100):
    optimizer.zero_grad()            # 1. reset gradients
    preds = model(X_train)           # 2. forward pass (predict raw class scores)
    loss = loss_fn(preds, y_train)   # 3. measure error
    loss.backward()                  # 4. backprop (compute gradients)
    optimizer.step()                 # 5. update weights (step downhill)
Those five steps are the core of training almost every deep learning model, from this toy to GPT: zero_grad → forward → loss → backward → step. Memorise the rhythm.
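To see why step 1 matters, it may help to watch step 4 on a tiny tensor. This is a sketch rather than part of the template: the tensor w and the squared-sum loss are made up for illustration. It shows that backward() adds to any gradient already stored, which is why the loop resets gradients on every iteration:

```python
import torch

# A small trainable tensor; requires_grad=True asks autograd to track it.
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

loss = (w ** 2).sum()   # d(loss)/dw = 2 * w
loss.backward()         # fills w.grad
first = w.grad.clone()  # tensor([2., 4., 6.])

# A second backward pass without a reset ADDS to the stored gradient.
loss = (w ** 2).sum()
loss.backward()
accumulated = w.grad.clone()  # tensor([4., 8., 12.])

# Clearing the gradient by hand; optimizer.zero_grad() plays this role
# for every parameter the optimizer manages.
w.grad.zero_()
```

Skipping the reset would make each update use the sum of all past gradients instead of the current one.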
Run it free with a GPU
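Paste the code into Google Colab, choose Runtime → Change runtime type → GPU, and you are training on hardware that costs thousands, for free. The model and data only use the GPU once you move them there with .to("cuda"), just like the tensor x earlier. Next: CNNs for images.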