Build a "Chat With Your PDF" RAG App

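The "chat with your documents" app is THE portfolio project for 2026 AI roles. Here is the whole pattern, conceptually complete: `load_pdf`, `split`, `embed`, and `llm` in the snippets below are placeholders for whichever PDF parser, text splitter, embedding model, and LLM client you pick.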

Step 1 — Load & chunk

Split the document into overlapping chunks (~500 tokens). Chunks must be small enough to embed meaningfully but large enough to hold context.

text   = load_pdf("syllabus.pdf")
chunks = split(text, size=500, overlap=50)   # overlap keeps context across cuts
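
One way the `split` placeholder above might be filled in is sketched below. It is a rough approximation, not a definitive splitter: it counts whitespace-separated words rather than model tokens, and the name `split_into_chunks` is hypothetical.

```python
def split_into_chunks(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` words; each window repeats the
    last `overlap` words of the previous one.

    Assumption: words stand in for tokens. A real tokenizer will give
    different counts for the same text.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

With `size=5, overlap=2`, a 12-word text yields four chunks, and each chunk starts with the last two words of the one before it.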

Step 2 — Embed & store (indexing, done once)

import chromadb
# Client() is in-memory; chromadb.PersistentClient(path=...) keeps the index on disk
db = chromadb.Client().create_collection("docs")

for i, chunk in enumerate(chunks):
    vec = embed(chunk)                      # embedding model → vector
    db.add(ids=[str(i)], embeddings=[vec], documents=[chunk])
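
To make "text goes in, a fixed-length vector comes out" concrete, here is a toy stand-in for `embed`. It is only an illustration under loud assumptions: `toy_embed` is a hypothetical name, and hashing words into buckets captures word overlap, not meaning, so a real app would call an actual embedding model instead.

```python
import hashlib
import math

def toy_embed(text: str, dims: int = 64) -> list[float]:
    """Hash each word into one of `dims` buckets, then scale to unit length.

    Toy only: a real embedding model places texts with similar *meaning*
    close together; this only rewards shared words.
    """
    vec = [0.0] * dims
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))
```

The property that matters for retrieval is the same as with a real model: every text maps to a vector of the same length, and a similarity score between two vectors can be computed cheaply.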

Step 3 — Retrieve + generate (per question)

def answer(question):
    q_vec   = embed(question)
    hits    = db.query(query_embeddings=[q_vec], n_results=4)
    context = "\n".join(hits["documents"][0])

    prompt = f"""Answer using ONLY this context. If unknown, say "I don't know".

Context:
{context}

Question: {question}"""
    return llm(prompt)
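
Conceptually, the `db.query` call is a nearest-neighbour search over the stored vectors. The sketch below shows that idea as a brute-force scan with cosine similarity. The names `cosine_similarity` and `top_k` are hypothetical, and a vector store such as Chroma uses an index and its own configurable distance metric rather than this linear loop.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vec, chunk_vecs, chunks, k=4):
    """Return the k chunks whose vectors are most similar to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, vec), chunk)
              for vec, chunk in zip(chunk_vecs, chunks)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]
```

A linear scan like this is fine for one short PDF; the reason to use a vector database is that it keeps the same lookup fast as the number of chunks grows.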

That's the entire pattern

Index once, then every question = embed → retrieve top chunks → stuff into prompt → answer. Libraries like LangChain or LlamaIndex wrap this in fewer lines, but build it raw first so you understand each piece.

Level it up for the portfolio

Add a chat UI (streaming responses)
Show the source chunks as citations (builds trust, reduces "is it hallucinating?" doubts)
Support multiple documents; add file upload (drag & drop guide)
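
The citation idea can be sketched as follows: number each retrieved chunk in the context so the model can refer to it, and keep a parallel source list to display under the answer. This assumes the query result has the same nested shape as the `hits["documents"][0]` access in Step 3, with a matching `ids` list; `build_cited_context` is a hypothetical helper name.

```python
def build_cited_context(hits: dict) -> tuple[str, list[str]]:
    """Turn a query result into a numbered context string plus a source list.

    Assumes hits["ids"][0] and hits["documents"][0] are parallel lists
    for a single query.
    """
    ids = hits["ids"][0]
    docs = hits["documents"][0]
    # "[1] chunk text" lets the prompt ask the model to cite by number.
    context_lines = [f"[{n}] {doc}" for n, doc in enumerate(docs, start=1)]
    # The same numbers map back to chunk ids for display in the UI.
    sources = [f"[{n}] chunk {chunk_id}" for n, chunk_id in enumerate(ids, start=1)]
    return "\n\n".join(context_lines), sources
```

For multiple documents, the same approach extends naturally if each chunk is stored with its file name, so that a source line can name the document as well as the chunk.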

Concepts: embeddings · RAG.
