TL;DR: Traditional RAG searches your documents from scratch every time. Karpathy’s LLM Wiki flips this — an AI agent reads your sources once, builds an interlinked markdown wiki, and grows it incrementally. Each new source enriches the existing knowledge base instead of starting over.
The Problem with Traditional RAG
When you upload documents to ChatGPT or NotebookLM and ask a question, the AI searches through your files, pulls relevant chunks, and generates an answer. That works — but ask a similar question tomorrow and the AI does all that work again from scratch. Nothing is saved. Nothing compounds.
For questions that require connecting ideas across multiple documents, the AI has to find and stitch everything together every single time. There’s no memory, no accumulation, no building on previous understanding. As Karpathy puts it: with RAG, every question is like the AI has never read your documents before.
The LLM Wiki Concept
Karpathy’s idea flips the approach entirely. Instead of searching raw documents on every query, you have an AI agent read your documents once and build a structured wiki — a persistent knowledge base made of interlinked markdown files.
When you add a new source (PDF, article, meeting notes), the AI doesn’t just store it. It:
- Extracts key ideas and creates wiki pages for new concepts
- Updates existing pages with new information
- Links related ideas together across sources
- Flags contradictions when a new source disagrees with what’s already there
- Updates the index to reflect the growing knowledge base
Over time, the wiki gets smarter. The connections are already there. The synthesis is already done. When you ask a question, the AI works from a pre-built organized knowledge base.
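The ingest steps above can be sketched as a small pipeline. This is illustrative only: the `llm` callable (prompt string in, response string out) is a hypothetical stand-in, since in practice a coding agent like Claude Code reads and writes the files directly.

```python
# Sketch of the ingest workflow, assuming a hypothetical `llm` callable
# (prompt string -> response string). A real coding agent does this work
# directly on your files; this just shows the shape of the loop.
from pathlib import Path

def ingest(source_path, wiki_dir, llm):
    source = Path(source_path).read_text(encoding="utf-8")
    # 1. Extract the key concepts from the new source.
    concepts = llm("List the key concepts in this source, one per line:\n" + source).splitlines()
    for concept in (c.strip() for c in concepts if c.strip()):
        page = Path(wiki_dir) / f"{concept}.md"
        if page.exists():
            # 2. Merge new information into an existing page, flagging conflicts.
            prompt = ("Merge this source into the page; flag contradictions:\n"
                      f"PAGE:\n{page.read_text(encoding='utf-8')}\nSOURCE:\n{source}")
            page.write_text(llm(prompt), encoding="utf-8")
        else:
            # 3. Create a fresh page, linking related concepts with [[links]].
            page.write_text(llm(f"Write a wiki page on '{concept}' from:\n" + source),
                            encoding="utf-8")
    # 4. Rebuild the index so it reflects the grown wiki.
    entries = [f"- [[{p.stem}]]" for p in sorted(Path(wiki_dir).glob("*.md")) if p.stem != "index"]
    (Path(wiki_dir) / "index.md").write_text("\n".join(entries), encoding="utf-8")
```

The point of the sketch is the structure: extraction, create-or-update, then an index rebuild, so each new source enriches what is already there.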
Karpathy’s Analogy
“Think of Obsidian as the IDE, the LLM as the programmer, and the Wiki as the code base. You rarely write the Wiki yourself — the AI does the writing and organizing. You focus on what goes in and what questions to ask.”
The Three Layers
Layer 1: Raw Sources (Read-Only)
Your original documents — PDFs, articles, meeting notes, whatever you’re working with. The AI reads these but never modifies them. This is your source of truth.
Layer 2: The Wiki (AI-Maintained)
A folder of markdown files that the AI creates and maintains. Contains:
- Index page (table of contents for the entire wiki)
- Concept pages (individual topics with summaries and references)
- Entity pages (specific people, places, tools, projects)
- Summary comparisons (side-by-side analyses)
- All pages interlinked with [[wiki-style links]]
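A concept page might look something like this (a hypothetical example; the exact layout is whatever your schema specifies):

```markdown
# Retrieval-Augmented Generation (RAG)

Summary: Retrieves relevant chunks from raw documents at query time.
Contrast with the pre-built approach in [[LLM Wiki]].

Sources: raw/rag-overview.pdf, raw/karpathy-notes.md

Related: [[Vector Search]], [[Context Windows]]
```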
Layer 3: The Schema (Rules Document)
A configuration file (like CLAUDE.md for Claude Code) that tells the AI how to operate:
- Purpose — what is this knowledge base about?
- Folder structure — where are raw sources, where is wiki output?
- Ingest workflow — when a new source is added: read, extract concepts, create/update wiki pages, update index, log changes
- Page formatting rules — summary at top, source references on every claim, links to related concepts
- Question-answering behavior — consult wiki first, cite sources, flag uncertainty
The schema evolves as the wiki grows. You can always refine it.
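A minimal schema along these lines might read as follows (illustrative only; adapt the sections and folder names to your own vault):

```markdown
# Knowledge Base Rules

## Purpose
A personal wiki on AI tooling, built from sources in raw/.

## Folders
- raw/ is read-only; never modify source documents.
- wiki/ holds all generated pages; index.md is the table of contents.

## Ingest workflow
When a new source appears in raw/: read it, extract key concepts,
create or update wiki pages, link related pages with [[links]],
and update index.md.

## Page format
Start every page with a one-paragraph summary. Cite the source file
for every claim. Link related concepts.

## Answering questions
Consult wiki pages first, cite the pages you used, and flag anything
uncertain or contradicted across sources.
```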
Setup: What You Need
- Obsidian (free) — the viewer. Its graph view makes connections visual, but any text editor works since it’s just markdown files.
- An AI coding agent — Claude Code (used in the demo), OpenAI Codex, Cursor, or any tool that can read/write local files.
- Folder structure:
```
your-vault/
├── raw/         # Source documents (read-only)
├── wiki/        # AI-generated knowledge base
├── templates/   # Optional manual note templates
└── CLAUDE.md    # Schema / rules for the AI
```
- Optional: Obsidian Web Clipper browser extension to save web articles as markdown directly into raw/.
The Workflow
Ingesting a source: Drop a document into raw/, then tell the AI: “I just added a new source to the raw folder. Please read it and update the wiki.” The AI reads the source, extracts key concepts, creates or updates wiki pages, and links everything together. A typical article takes roughly three minutes to process.
Asking questions: The AI consults the wiki pages — not the raw documents. It pulls from concept pages, entity pages, and summary pages, connecting dots across completely different sources. Each answer cites specific wiki pages.
Linting the wiki: Periodically ask the AI to “lint the wiki.” It checks for:
- Contradictions between pages
- Outdated claims
- Orphan pages (no links pointing to them)
- Broken links
- Concepts mentioned but lacking their own page
This keeps the knowledge base healthy as it grows.
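Two of these checks, orphan pages and broken links, are mechanical enough to verify without an LLM at all. Here is a minimal sketch, assuming pages live flat in one folder and link with `[[Page Name]]`; contradictions and outdated claims still need the agent.

```python
# Minimal wiki lint: finds broken [[links]] and orphan pages.
# Assumes a flat wiki folder of .md files linked via [[Page Name]];
# the agent-driven lint also catches contradictions and stale claims.
import re
from pathlib import Path

# Capture the link target, ignoring any |alias or #heading suffix.
LINK = re.compile(r"\[\[([^\]|#]+)")

def lint(wiki_dir):
    pages = {p.stem: p.read_text(encoding="utf-8") for p in Path(wiki_dir).glob("*.md")}
    linked = set()
    broken = []
    for name, text in pages.items():
        for target in LINK.findall(text):
            target = target.strip()
            if target in pages:
                linked.add(target)
            else:
                broken.append((name, target))  # link points at a missing page
    # Orphans: pages nothing links to (the index page needs no inbound links).
    orphans = sorted(set(pages) - linked - {"index"})
    return {"broken_links": broken, "orphans": orphans}
```

You might run this after each ingest and paste the findings back to the agent as a to-do list.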
Use Cases
- Students/researchers: Read papers and articles on a topic over weeks. End up with a structured knowledge base instead of a pile of highlighted PDFs.
- Teachers: Feed in curriculum documents, professional development materials, and articles. Build a personal teaching wiki that grows over time.
- Businesses: Ingest meeting notes, customer call transcripts, project documents. New team members browse the organized wiki instead of digging through Slack history.
- Lifelong learners: Track what you learn from books, podcasts, and articles — like building your own personal encyclopedia.
Limitations
- Personal scale only — works best with around 100 articles. Tens of thousands of pages need more infrastructure than plain markdown files.
- Garbage in, garbage out — the wiki is only as good as the sources you feed it. Curation still matters.
- Requires a coding agent — Obsidian alone doesn’t do any of this. The AI is the engine.
- AI can make mistakes — miscategorization, missed connections. The lint feature helps, but review is still important, especially early on.
Why This Matters
The LLM Wiki is a fundamentally different approach to working with AI and documents. Instead of treating every query as a fresh search, it treats knowledge as something that compounds. Each source makes the whole system smarter, not just the current answer. The pattern works anywhere you’re accumulating knowledge over time and want it organized rather than scattered.
References
- Karpathy’s LLM Wiki (original concept) — https://github.com/karpathy/llm.wiki
- Obsidian — https://obsidian.md
- CLAUDE.md Schema Template — https://go.teachers.tech/LLM_Wiki_CLAUDE
- Claude Code Beginner’s Guide — https://youtu.be/s-Mc26Ytz10
Video: Karpathy’s LLM Wiki - Full Beginner Setup Guide by Teacher’s Tech

