A neural network sounds mystical. It is really just weighted sums passed through simple functions, stacked in layers. Let's build the intuition one piece at a time.
One neuron
A neuron takes inputs, multiplies each by a weight (its learned importance), adds them up, adds a bias, and passes the result through an activation function.
inputs  = [x1, x2, x3]
weights = [w1, w2, w3]               # learned
z = x1*w1 + x2*w2 + x3*w3 + b        # weighted sum (a dot product!)
output = activation(z)               # squash it
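The pseudocode above can be turned into a small runnable sketch. The function names `neuron` and `relu`, the choice of ReLU as the activation, and all the numbers are assumptions made for illustration; in a real network the weights and bias would be learned, not typed in.

```python
def relu(z):
    """One common activation: keep positives, zero out negatives."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation=relu):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias  # dot product + bias
    return activation(z)


# Illustrative values only (hypothetical, not learned).
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 2.0]
bias = 0.5

output = neuron(inputs, weights, bias)
print(output)  # 0.5 - 2.0 + 6.0 + 0.5 = 5.0
```

Passing the activation in as a parameter keeps the weighted sum separate from the squashing step, so a different activation can be swapped in without touching the rest.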
Why the activation function matters
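Without it, stacking layers gains nothing: a linear function of a linear function is still linear, so the whole stack collapses into a single straight-line map, useless for complex patterns. The activation adds non-linearity, letting the network bend and curve to approximate far more complicated functions. A common default is ReLU: max(0, z). It is dead simple: it keeps positives and zeroes out negatives. That tiny kink is enough.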
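A quick way to see the collapse is to stack two one-input "layers" with and without an activation in between. This is a minimal sketch with hand-picked numbers; the names `linear`, `stacked_linear`, `collapsed`, and `stacked_relu` are hypothetical helpers, not part of any library.

```python
def relu(z):
    return max(0.0, z)


def linear(x, w, b):
    return w * x + b


# Two tiny "layers", each with one weight and one bias (illustrative values).
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 4.0


def stacked_linear(x):
    """Layer 2 applied to layer 1 with no activation in between."""
    return linear(linear(x, w1, b1), w2, b2)


def collapsed(x):
    """A single layer equivalent to the stack: w = w2*w1, b = w2*b1 + b2."""
    return linear(x, w2 * w1, w2 * b1 + b2)


def stacked_relu(x):
    """Same stack, but with ReLU between the layers."""
    return linear(relu(linear(x, w1, b1)), w2, b2)


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
same = all(stacked_linear(x) == collapsed(x) for x in xs)
print(same)                         # True: two linear layers act like one
print([stacked_relu(x) for x in xs])  # flat for negative inputs, then sloping
```

With ReLU in the middle, the output stays constant wherever the first layer goes negative and follows a slope elsewhere. No single straight line can reproduce that bend.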
Layers — from pixels to concepts
Stack neurons into layers; stack layers into a network. In an image model, early layers tend to learn simple features (edges), and later layers combine them into more complex ones (shapes → faces). "Deep" learning just means many layers.
Input layer  →  Hidden layer(s)   →  Output layer
[pixels]        [edges → shapes]     ["cat" 0.92]
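The diagram above can be sketched as a forward pass through two layers. Everything specific here is an assumption for illustration: the three-value "pixel" input, the layer sizes, the hand-picked weights, the labels, and the use of softmax to turn the output scores into probabilities. A trained network would learn its own parameters.

```python
import math


def relu(z):
    return max(0.0, z)


def dense(inputs, weights, biases, activation=None):
    """One layer: each row of `weights` holds one neuron's weights."""
    outputs = []
    for neuron_weights, b in zip(weights, biases):
        z = sum(x * w for x, w in zip(inputs, neuron_weights)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical, hand-picked parameters (not learned).
hidden_w = [[0.2, -0.5, 1.0], [0.7, 0.1, -0.3]]
hidden_b = [0.0, 0.1]
output_w = [[2.0, 1.0], [-1.0, -0.5]]
output_b = [0.0, 0.0]


def forward(pixels):
    hidden = dense(pixels, hidden_w, hidden_b, activation=relu)  # hidden layer
    scores = dense(hidden, output_w, output_b)                   # output layer
    return softmax(scores)


labels = ["cat", "dog"]
probs = forward([0.9, 0.1, 0.4])
for label, p in zip(labels, probs):
    print(f"{label}: {p:.2f}")
```

Each layer is the single-neuron computation repeated once per neuron, with the previous layer's outputs as its inputs.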
How it learns
It is the same loop as in most of ML: predict, measure the loss, use backpropagation to compute how much each weight contributed to the error, and nudge each one downhill (gradient descent). Backprop is just the chain rule from calculus, applied efficiently across layers. In practice you never do it by hand, because frameworks compute the gradients automatically.
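To make the loop concrete, here is a minimal sketch that trains a single linear neuron with gradient descent. The gradients are written out by hand with the chain rule only because the model is tiny. The toy data (drawn from y = 2x + 1), the squared-error loss, the learning rate, and the step count are all assumptions chosen for illustration.

```python
# Toy data generated from y = 2x + 1 (an assumption for this sketch).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]


def mean_squared_error(w, b):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(steps=1000, learning_rate=0.05):
    w, b = 0.0, 0.0  # start from arbitrary knob settings
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, y in zip(xs, ys):
            pred = w * x + b      # predict
            error = pred - y      # loss for this example is error ** 2
            # Chain rule: d(loss)/dw = d(loss)/d(pred) * d(pred)/dw
            grad_w += 2 * error * x
            grad_b += 2 * error
        # Nudge each parameter downhill along the average gradient.
        w -= learning_rate * grad_w / len(xs)
        b -= learning_rate * grad_b / len(xs)
    return w, b


initial_loss = mean_squared_error(0.0, 0.0)
w, b = train()
final_loss = mean_squared_error(w, b)
print(f"w={w:.3f}, b={b:.3f}, loss {initial_loss:.3f} -> {final_loss:.6f}")
```

On this data the learned `w` and `b` should approach 2 and 1. A real network repeats the same idea across every layer, with backpropagation reusing intermediate results so the chain rule stays cheap.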
Key insight: a neural network is a giant function with millions of tunable knobs (weights). Training = automatically finding knob settings that map inputs to correct outputs. Nothing more mystical than that.