TL;DR: LLM Wiki is a pattern where an LLM incrementally builds and maintains a persistent, interlinked markdown wiki from your raw documents — knowledge compounds over time instead of being rediscovered on every query. Karpathy’s gist hit 5,000+ stars and 3,662 forks in days. But the community discussion revealed structural gaps, enforcement patterns, and a fourth layer the original design missed.
On April 4, 2026, Andrej Karpathy posted a single markdown file to GitHub Gist. It was titled llm-wiki.md. It contained no code, no implementation, no screenshots — just an idea described in abstract terms.
Within days it had 5,000+ stars and 3,662 forks.
The pattern is simple: instead of retrieving raw document chunks on every query (RAG), an LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources. Knowledge is compiled once, then kept current.
The community discussion that followed revealed far more than the original gist. Practitioners shared enforcement patterns, structural critiques, protocol extensions, and warnings about where the pattern will break. This article covers all of it.
The Core Insight: How Good Researchers Actually Work
Most people’s experience with LLMs and documents looks like RAG: upload files, retrieve chunks, generate answers. NotebookLM, ChatGPT file uploads, and most RAG systems work this way. Nothing is built up. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time.
Good researchers don’t work this way. They don’t reread every paper from scratch before answering a question. Over time, they build a knowledge base — summaries, cross-references, connections, notes on where new sources challenge old beliefs. When a new source arrives, it gets integrated into what’s already there, not piled on top.
LLM Wiki automates exactly this behavior. The key difference: the wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you’ve read.
Architecture: Three Layers (That Practitioners Made Four)
Karpathy’s original design describes three layers:
1. Raw Sources (Immutable)
Your curated collection of source documents — articles, papers, images, data files. The LLM reads from them but never modifies them. This is your source of truth. Don’t organize them — that’s the AI’s job.
2. The Wiki (LLM-Owned)
A directory of LLM-generated markdown files. Summaries, entity pages, concept pages, comparisons, synthesis. The LLM owns this layer entirely — it creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent. You read it; the LLM writes it.
3. The Schema (Configuration)
A document — typically CLAUDE.md for Claude Code, AGENTS.md for Codex, or schema.md — that tells the LLM how the wiki is structured, what the conventions are, and what workflows to follow. This is what makes the LLM a disciplined wiki maintainer rather than a generic chatbot. You and the LLM co-evolve this over time.
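Karpathy's gist doesn't prescribe a schema format. A minimal illustrative `schema.md` might look like this (the layout and rules below are examples for the sketch, not from the original gist):

```markdown
# Schema for my research wiki

## Layout
- raw/          - immutable sources; never edit
- wiki/         - LLM-maintained pages
- wiki/index.md - entry point; every page must be reachable from here

## Conventions
- Every page starts with a one-line summary.
- Cite sources as [[raw/<filename>]] next to each claim.
- When a new source contradicts an existing page, flag it under a
  "Contradictions" heading instead of silently overwriting.
```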
4. The Infrastructure Layer (Discovered by Practitioners)
A team that’s been building this for six months found that three layers wasn’t enough. They added a fourth: an infrastructure layer with design records for the agents, rules, hooks, and conventions themselves. “Schema-in-CLAUDE.md works until the schema has non-trivial rationale worth preserving across changes. Then it wants its own records.”
Three Core Operations
Ingest
You drop a new source into the raw collection and tell the LLM to process it. Karpathy prefers to ingest sources one at a time and stay involved — reading summaries, checking updates, guiding emphasis. But you can batch-ingest with less supervision.
A single source might touch 10–15 wiki pages. The LLM reads the source, discusses key takeaways with you, writes a summary page, updates the index, updates relevant entity and concept pages, and appends an entry to the log.
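The bookkeeping half of an ingest (appending to the log, linking the new summary page from the index) can be sketched in a few lines. This is a hypothetical helper, not code from the gist; the file names `log.md` and `index.md` are assumptions:

```python
from datetime import date
from pathlib import Path

def record_ingest(wiki: Path, source_name: str, summary_page: str, touched: list[str]) -> None:
    """Append an ingest entry to the wiki log and link the new summary from the index."""
    log = wiki / "log.md"
    entry = f"- {date.today().isoformat()}: ingested `{source_name}` -> [[{summary_page}]] (updated {len(touched)} pages)\n"
    with log.open("a", encoding="utf-8") as f:
        f.write(entry)

    index = wiki / "index.md"
    line = f"- [[{summary_page}]]\n"
    existing = index.read_text(encoding="utf-8") if index.exists() else "# Index\n"
    if line not in existing:  # keep the index free of duplicate links
        index.write_text(existing + line, encoding="utf-8")
```

In the real pattern the LLM does this itself, guided by the schema; the sketch just makes the mechanics concrete.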
Query
You ask questions against the wiki. The LLM searches for relevant pages, reads them, and synthesizes an answer with citations. Answers can take different forms — a markdown page, a comparison table, a slide deck, a chart.
The critical insight: good answers get filed back into the wiki as new pages. A comparison you asked for, an analysis, a connection you discovered — these are valuable and shouldn’t disappear into chat history. Your explorations compound in the knowledge base just like ingested sources do.
Lint
Periodically, ask the LLM to health-check the wiki. Look for:
- Contradictions between pages
- Stale claims that newer sources have superseded
- Orphan pages with no inbound links
- Important concepts mentioned but lacking their own page
- Missing cross-references
- Data gaps that could be filled with a web search
The LLM is good at suggesting new questions to investigate and new sources to look for. This keeps the wiki healthy as it grows.
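Some of these checks are mechanical enough to script rather than delegate. A minimal orphan-page detector, assuming the wiki uses `[[wikilink]]` syntax and an `index.md` root page:

```python
import re
from pathlib import Path

# Matches [[target]], [[target|alias]], and [[target#section]] wiki links
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def find_orphans(wiki: Path) -> set[str]:
    """Return wiki page names (without .md) that no other page links to."""
    pages = {p.stem for p in wiki.rglob("*.md")}
    linked: set[str] = set()
    for p in wiki.rglob("*.md"):
        for target in WIKILINK.findall(p.read_text(encoding="utf-8")):
            linked.add(Path(target.strip()).stem)  # normalize paths like summaries/foo
    return pages - linked - {"index"}  # the index is the root, not an orphan
```

Contradiction and staleness checks still need the LLM's judgment; link hygiene doesn't.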
LLM Wiki vs. Traditional RAG
RAG (Retrieval Augmented Generation) works by chunking your documents, embedding them in a vector store, and retrieving the most relevant fragments at query time. It’s useful — but nothing accumulates. With every question, the AI reassembles the answer from scratch.
| Aspect | Traditional RAG | LLM Wiki |
|---|---|---|
| Query Process | Retrieve chunks → Re-process → Answer | Read compiled wiki → Synthesize → Answer |
| Knowledge Persistence | Ephemeral (re-built each query) | Persistent (compounds over time) |
| Infrastructure | Vector database, embedding pipeline | Just markdown files |
| Contradiction Handling | Depends on retrieval ranking | Explicitly tracked and flagged |
| Cost per Query | High (re-processes everything) | Low (reads pre-compiled wiki) |
| Transparency | Hidden in vector space | Browse, verify, edit any page |
That said, LLM Wiki isn’t a replacement for RAG in every scenario. If you’re working with thousands of documents, need real-time data, or want to query a large corpus without curating it, RAG with proper infrastructure is still the better tool. LLM Wiki works best at personal scale — roughly up to 100 or maybe 200 sources.
Karpathy’s Four Advantages
- Explicit — You can see exactly what the AI knows and doesn’t know. No hidden vector embeddings, no black-box retrieval.
- Yours — The data lives on your computer, not in a provider’s cloud.
- File over app — It’s just markdown files, not locked into any platform. Any tool can read it, any AI model can work with it.
- Bring your own AI — You can use cloud GPT, open-source models, or switch anytime. The data and the AI models are completely decoupled.
What the Community Discovered: Enforcement Patterns
A team that’s been building LLM Wiki-adjacent systems for six months shared enforcement patterns that go beyond the original design.
Enforce at the Agent Boundary, Not the Conversation Boundary
Rather than trying to block the main conversation from editing the wiki, they let each specialized agent be its own enforcement unit:
- Writer agent — frontmatter excludes Bash and web access; a PreToolUse hook blocks writes to any path outside the content layers
- Maintainer agent — has Bash, but a PreToolUse hook validates every command (no `rm -rf`, no force-push)
- Auditor agent — read-only by design
The main conversation’s write discipline is instructional — it’s trusted to respect the rule in the schema because it’s the “planner,” not the “executor.” Hooks do the heavy lifting on the executors. This gives you structural guarantees on the agents that actually mutate things, without the friction of locking the conversation itself.
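A sketch of that writer-agent path guard, assuming a hook that receives the attempted tool call as JSON on stdin and blocks it with a non-zero exit code. The payload field names (`tool_input`, `file_path`) are assumptions; check your tool's hook documentation:

```python
import json
import sys
from pathlib import Path

ALLOWED_ROOTS = ("wiki/", "outputs/")  # the content layers the writer may touch

def is_write_allowed(path: str, allowed=ALLOWED_ROOTS) -> bool:
    """Reject absolute paths, path traversal, and anything outside the content layers."""
    p = Path(path)
    if p.is_absolute() or ".." in p.parts:
        return False
    return str(p).startswith(allowed)

def main() -> None:
    """Hook entry point: register this script as the writer agent's PreToolUse hook."""
    payload = json.load(sys.stdin)
    target = payload.get("tool_input", {}).get("file_path", "")
    if not is_write_allowed(target):
        print(f"blocked write outside content layers: {target}", file=sys.stderr)
        sys.exit(2)  # non-zero exit blocks the tool call
```

The point of the pattern: the guarantee lives in twenty lines of deterministic code, not in a prompt the model might ignore.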
Four Claim Types, Not Binary Verified/Unverified
Binary verified/unverified isn’t enough — you need to split “inferred” from “unsourced.” They shipped four claim types as Obsidian callouts:
| Claim Type | Meaning | Example |
|---|---|---|
| Source | Verbatim quote with citation | “X causes Y” — Paper A, p. 12 |
| Analysis | Team’s inference from data | “This suggests X is primary driver” |
| Inferred | Model’s reasoning, not yet verified | “Likely connected to Z based on pattern” |
| Gap | Explicitly missing — never fill with a guess | “No source addresses X yet” |
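Rendered as Obsidian callouts, the four claim types might look like this (the callout type names here are illustrative; the team didn't publish theirs):

```markdown
> [!quote] Source
> "X causes Y" - Paper A, p. 12

> [!note] Analysis
> This suggests X is the primary driver. (Team inference)

> [!question] Inferred
> Likely connected to Z based on the pattern in [[paper-a]]. Not yet verified.

> [!warning] Gap
> No source addresses X yet. Do not fill with a guess.
```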
The Analysis/Inferred split prevents paraphrasing bias, where the model rewrites what a source says and nobody can tell afterwards whether it got it right.
Staleness Scoring
Each file carries a score derived from how far behind its outgoing wiki-link dependencies it is. Forward-only, no backlink tracking. Update a source → every downstream file’s staleness score ticks up → the auditor surfaces the worst offenders. This replaces a lot of the “who might have stale claims about this?” review burden that otherwise falls back on humans.
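A minimal sketch of forward-only staleness scoring, using an outgoing-link map and per-page revision timestamps. The data model here is an assumption for illustration, not the team's implementation:

```python
def staleness(page: str, links: dict[str, list[str]], mtime: dict[str, int]) -> int:
    """Count dependencies (reached via outgoing links) modified after this page.

    Forward-only: we walk the page's own wiki links, never backlinks.
    """
    seen: set[str] = set()
    stack = list(links.get(page, []))
    score = 0
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue
        seen.add(dep)
        if mtime.get(dep, 0) > mtime.get(page, 0):
            score += 1  # this dependency changed since the page was last updated
        stack.extend(links.get(dep, []))
    return score

def worst_offenders(links, mtime, n=3):
    """Rank pages so the auditor can surface the stalest ones first."""
    return sorted(links, key=lambda p: staleness(p, links, mtime), reverse=True)[:n]
```

With this in place, "who might have stale claims about this source?" becomes a query rather than a memory exercise.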
The Structural Gaps: Why LLM Wiki Could Collapse
Commenter V-interactions identified four structural problems the pattern doesn’t address:
1. Lifecycle
No mechanism for knowledge aging. Old claims persist even after sources are superseded. The lint operation catches some of this, but there’s no automated decay function. A claim from a 2019 paper that was overturned in 2023 stays in the wiki at equal weight unless someone runs lint and the LLM catches it.
2. Epistemic Filtering
No way to weight sources by credibility. A blog post and a peer-reviewed meta-analysis get equal treatment. The schema can instruct the LLM to prefer certain source types, but there’s no systematic credibility scoring.
3. Entropy
The wiki grows monotonically. There’s no pruning mechanism, no archival strategy, no compression of redundant pages. At scale, the wiki becomes a garden that only grows, never trimmed.
4. Scalability
Works at ~100-200 sources, but the pattern has no answer for what happens at 1,000+. The ingest operation that touches 10-15 pages per source becomes computationally expensive. The linear scan of the index slows down. The LLM’s context window fills up with wiki pages before it can answer a question.
The proposed direction: A knowledge graph with temporal decay, source credibility scoring, and automated pruning. But this is still theoretical — no implementation exists yet.
Extensions the Community Built
FUNGI Protocol: Critical Thinking Layer
Educational sciences researcher FBoschman extended the pattern with a zettelkasten-inspired workflow called FUNGI:
- Fleeting note — capture the raw thought
- Understand — the LLM explains the concept in context
- Nuance — identify tensions with existing knowledge
- Grow — develop connections to other wiki pages
- Integrate — file into the wiki with cross-references
The protocol includes a counter-argument step to resist confirmation bias. Five stages can feel heavy for small notes, so there’s a “quick pass” option. The LLM flags notes that haven’t fully grown, need work, or have interesting tensions with other pages.
The Obsidian Controversy
The most heated discussion was about Obsidian, the most popular frontend for LLM Wiki.
“Obsidian is proprietary software. You cannot run a true ‘personal knowledge base’ when the viewer itself is closed-source, vendor-controlled code that phones home.” — gnusupport
The argument: for a pattern that preaches persistent ownership, compounding knowledge, and bootstrapping, handing the viewing layer to a for-profit company is a contradiction. Suggestions included VS Codium, terminal Markdown renderers, static site generators, or custom minimal viewers.
The counter-argument: Obsidian’s graph view, Dataview queries, and backlinks are unmatched for navigating a wiki. The data is still plain Markdown — you’re not locked in. You can migrate anytime.
The Instance Strategy Question
“How many instances would you recommend? A personal one vs a work one? One per project?” — RonanCodes
No consensus emerged. Some practitioners run separate wikis per domain (research, work, personal). Others maintain a single wiki with schema-level topic separation. The cross-query question — can one wiki query another? — remains unanswered.
The Engelbart Lineage
A commenter pointed out that LLM Wiki is essentially an instantiation of Douglas Engelbart’s 1992 vision from “Toward High-Performance Organizations: A Strategic Role for Groupware”:
- Augmentation System — Human + Tool co-evolving
- CODIAK — Concurrent development, integration, and application of knowledge
- Open Hyperdocument System — Global addressing, back-links, structured documents
- ABC Model — A = core work, B = improve A, C = improve B (bootstrapping via C communities)
Karpathy’s division of labor — “You think. The LLM does the bookkeeping” — is the Engelbart arc realized: the human focuses on taste, judgment, and long-term sense-making while the AI handles the structural maintenance.
Getting Started
The basic workflow:
- Create folders: `raw/` for sources, `wiki/` for the knowledge base, optionally `outputs/` for generated answers
- Dump your sources into `raw/` — articles, notes, PDFs. Don’t organize them.
- Write a `schema.md` file describing your topic, folder structure, wiki rules, and focus areas
- Point an AI agentic tool at your folder and tell it to follow the schema rules
- Browse the wiki — verify, edit, and explore the connections the AI built
- Ask questions against the wiki for synthesized answers
- Run lint periodically to keep the wiki healthy
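Bootstrapping that layout takes a few shell commands (the schema content below is a placeholder to replace with your own rules):

```shell
mkdir -p raw wiki outputs

cat > schema.md <<'EOF'
# Wiki schema
Topic: <your topic>
Layers: raw/ (immutable sources), wiki/ (LLM-owned), outputs/ (generated answers)
Rule: never modify files under raw/.
EOF

# Dump sources in, unorganized -- organizing them is the AI's job
cp ~/Downloads/*.pdf raw/ 2>/dev/null || true
```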
Templates for schema.md are available in the llm_wiki project by Nash Su.
Why This Matters
The LLM Wiki pattern realizes a vision where the associative connections between documents are as valuable as the documents themselves.
But more practically, it solves a real problem: information overload without synthesis is just hoarding. We all have hundreds of saved articles, bookmarked papers, and downloaded PDFs. The value isn’t in collecting them — it’s in connecting them, comparing them, and building a coherent understanding over time.
LLM Wiki automates the connection-building. You focus on reading and asking questions. The AI handles the bookkeeping.
The structural gaps are real — lifecycle, epistemic filtering, entropy, scalability — but they’re research problems, not dealbreakers. The pattern is young, evolving, and being stress-tested by thousands of practitioners right now.
Further Reading
- Karpathy’s original LLM Wiki gist — 5,000+ stars, the source document
- V-interactions: “Your LLM Wiki Will Collapse” — four structural gaps analysis
- llm_wiki on GitHub — open-source implementation with templates
- Engelbart (1992): “Toward High-Performance Organizations” — the intellectual ancestor
