Embeddings & Vector Search — How AI Understands Meaning

Embeddings are the quiet superpower behind semantic search, recommendations and retrieval-augmented generation (RAG). The idea is beautiful: turn meaning into coordinates.

The core idea

An embedding model converts a piece of text into a fixed-length list of numbers (a vector; 1536 numbers is a common size). The magic: texts with similar meanings get similar vectors, which point in similar directions in that high-dimensional space.

"king"   → [0.21, -0.44, 0.87, ...]
"queen"  → [0.19, -0.41, 0.85, ...]   # close to king
"banana" → [-0.66, 0.12, -0.30, ...]  # far away
# distance in this space = difference in meaning
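
To make "close" and "far away" concrete, here is a minimal sketch that scores the three example words with cosine similarity. It assumes toy 3-D vectors built from only the three components shown above (real embeddings have hundreds or thousands of dimensions), and the `cosine_similarity` helper is written by hand for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-D vectors: just the three components shown above, not real model output.
king = [0.21, -0.44, 0.87]
queen = [0.19, -0.41, 0.85]
banana = [-0.66, 0.12, -0.30]

king_queen = cosine_similarity(king, queen)
king_banana = cosine_similarity(king, banana)
print(f"king vs queen:  {king_queen:.3f}")
print(f"king vs banana: {king_banana:.3f}")
```

With these toy numbers the king/queen score comes out very close to 1, while king/banana is negative: the first pair points the same way, the second does not.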

Semantic search beats keyword search

Search for "how to reset my password" and keyword search can miss a doc titled "recovering account access", because the two share no words. Embeddings match them because their meanings are close: you embed the query, then return the documents whose vectors are nearest to it.

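import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
# 1 = same direction (very similar meaning), 0 = unrelated, -1 = opposite direction
# cosine_similarity expects 2D arrays: query_vec has shape (1, d), doc_vecs has shape (n_docs, d)
scores = cosine_similarity(query_vec, doc_vecs)[0]
ranking = np.argsort(scores)[::-1]   # doc indices, best match first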
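
The "embed the query, find the nearest documents" step can be sketched end to end without any library. This is an illustration under stated assumptions: the 3-D vectors are made up by hand to stand in for real embeddings, and `rank_documents` is a hypothetical helper, not part of any package.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (score, title) pairs sorted from best to worst match."""
    scored = [
        (cosine_similarity(query_vec, vec), title)
        for title, vec in doc_vecs.items()
    ]
    return sorted(scored, reverse=True)

# Made-up vectors; a real system would get these from an embedding model.
query_vec = [0.9, 0.1, 0.2]  # stands in for "how to reset my password"
doc_vecs = {
    "Recovering account access": [0.8, 0.2, 0.1],
    "Pricing plans": [0.1, 0.9, 0.2],
    "Office locations": [0.0, 0.2, 0.9],
}

ranked = rank_documents(query_vec, doc_vecs)
for score, title in ranked:
    print(f"{score:.3f}  {title}")
```

The account-recovery doc ranks first even though its title shares no words with the query, which is the whole point of searching by vector rather than by keyword.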

Vector databases

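To search millions of vectors fast, you use a vector database such as Pinecone or Chroma, or a vector extension for an existing database such as pgvector for PostgreSQL. It stores embeddings and returns the nearest neighbours to a query, typically in milliseconds, usually by using an approximate nearest-neighbour index instead of comparing the query against every stored vector. This is the retrieval engine behind most RAG systems.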
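
To show the shape of the interface (store vectors, then ask for nearest neighbours), here is a minimal in-memory sketch. `TinyVectorStore` and its `add`/`query` methods are hypothetical names invented for this example and do not mirror the API of Pinecone, Chroma or pgvector. It also scans every stored vector, so it only illustrates the idea at small scale.

```python
import heapq
import math

class TinyVectorStore:
    """Hypothetical in-memory store using exact (brute-force) search."""

    def __init__(self):
        self._items = []  # list of (doc_id, unit_vector)

    @staticmethod
    def _normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    def add(self, doc_id, vec):
        # Store unit-length vectors so cosine similarity becomes a dot product.
        self._items.append((doc_id, self._normalize(vec)))

    def query(self, vec, k=3):
        """Return the k most similar (score, doc_id) pairs, best first."""
        q = self._normalize(vec)
        scored = (
            (sum(a * b for a, b in zip(q, v)), doc_id)
            for doc_id, v in self._items
        )
        return heapq.nlargest(k, scored)

# Made-up 3-D vectors standing in for real embeddings.
store = TinyVectorStore()
store.add("reset-password", [0.8, 0.2, 0.1])
store.add("pricing", [0.1, 0.9, 0.2])
store.add("offices", [0.0, 0.2, 0.9])

hits = store.query([0.9, 0.1, 0.2], k=2)
print(hits)
```

A full scan like this costs time proportional to the number of stored vectors on every query. Production systems generally trade a little accuracy for speed by building an approximate index over the vectors.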

Where you'll use them

Semantic search over docs
"Related articles" / product recommendations
Duplicate/plagiarism detection
Feeding relevant context to an LLM (RAG)
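
As one example from this list, duplicate detection can be sketched as "flag every pair whose similarity clears a threshold". The vectors below are made up, `find_near_duplicates` is a hypothetical helper, and the 0.95 threshold is an arbitrary assumption that would need tuning for a real embedding model and dataset.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def find_near_duplicates(vectors, threshold=0.95):
    """Return pairs of ids whose vectors score at or above the threshold."""
    return [
        (id_a, id_b)
        for (id_a, vec_a), (id_b, vec_b) in combinations(vectors.items(), 2)
        if cosine_similarity(vec_a, vec_b) >= threshold
    ]

# Made-up 3-D vectors; "post-1-copy" is a lightly edited copy of "post-1".
articles = {
    "post-1": [0.70, 0.10, 0.20],
    "post-1-copy": [0.69, 0.11, 0.21],
    "post-2": [0.10, 0.80, 0.30],
}

duplicates = find_near_duplicates(articles)
print(duplicates)
```

Comparing every pair grows quadratically with the number of documents, so at larger scale the same check is usually done by querying a vector index for each document's nearest neighbours.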
