TL;DR: LLM Wiki is a pattern where an LLM incrementally builds and maintains a persistent, interlinked markdown wiki from your raw documents — knowledge compounds over time instead of being rediscovered on every query. Karpathy’s gist hit 5,000+ stars and 3,662 forks in days. But the community discussion revealed structural gaps, enforcement patterns, and a fourth layer the original design missed.
On April 4, 2026, Andrej Karpathy posted a single markdown file to GitHub Gist. It was titled llm-wiki.md. It contained no code, no implementation, no screenshots — just an idea described in abstract terms.
Within days it had 5,000+ stars and 3,662 forks.
The pattern is simple: instead of retrieving raw document chunks on every query (RAG), an LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources. Knowledge is compiled once, then kept current.
The community discussion that followed revealed far more than the original gist. Practitioners shared enforcement patterns, structural critiques, protocol extensions, and warnings about where the pattern will break. This article covers all of it.
The Core Insight: How Good Researchers Actually Work
Most people’s experience with LLMs and documents looks like RAG: upload files, retrieve chunks, generate answers. NotebookLM, ChatGPT file uploads, and most RAG systems work this way. Nothing is built up. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time.
Good researchers don’t work this way. They don’t reread every paper from scratch before answering a question. Over time, they build a knowledge base — summaries, cross-references, connections, notes on where new sources challenge old beliefs. When a new source arrives, it gets integrated into what’s already there, not piled on top.
LLM Wiki automates exactly this behavior. The key difference: the wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you’ve read.
Architecture: Three Layers (That Practitioners Made Four)
Karpathy’s original design describes three layers:
1. Raw Sources (Immutable)
Your curated collection of source documents — articles, papers, images, data files. The LLM reads from them but never modifies them. This is your source of truth. Don’t organize them — that’s the AI’s job.
2. The Wiki (LLM-Owned)
A directory of LLM-generated markdown files. Summaries, entity pages, concept pages, comparisons, synthesis. The LLM owns this layer entirely — it creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent. You read it; the LLM writes it.
3. The Schema (Configuration)
A document — typically CLAUDE.md for Claude Code, AGENTS.md for Codex, or schema.md — that tells the LLM how the wiki is structured, what the conventions are, and what workflows to follow. This is what makes the LLM a disciplined wiki maintainer rather than a generic chatbot. You and the LLM co-evolve this over time.
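Karpathy's gist doesn't prescribe a schema format. A minimal illustrative `schema.md` might look like this (the layout and rules below are examples for the sketch, not from the original gist):

```markdown
# Schema for my research wiki

## Layout
- raw/          - immutable sources; never edit
- wiki/         - LLM-maintained pages
- wiki/index.md - entry point; every page must be reachable from here

## Conventions
- Every page starts with a one-line summary.
- Cite sources as [[raw/<filename>]] next to each claim.
- When a new source contradicts an existing page, flag it under a
  "Contradictions" heading instead of silently overwriting.
```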
4. The Infrastructure Layer (Discovered by Practitioners)
A team that’s been building this for six months found that three layers wasn’t enough. They added a fourth: an infrastructure layer with design records for the agents, rules, hooks, and conventions themselves. “Schema-in-CLAUDE.md works until the schema has non-trivial rationale worth preserving across changes. Then it wants its own records.”
Three Core Operations
Ingest
You drop a new source into the raw collection and tell the LLM to process it. Karpathy prefers to ingest sources one at a time and stay involved — reading summaries, checking updates, guiding emphasis. But you can batch-ingest with less supervision.
A single source might touch 10–15 wiki pages. The LLM reads the source, discusses key takeaways with you, writes a summary page, updates the index, updates relevant entity and concept pages, and appends an entry to the log.
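The bookkeeping half of an ingest (appending to the log, linking the new summary page from the index) can be sketched in a few lines. This is a hypothetical helper, not code from the gist; the file names `log.md` and `index.md` are assumptions:

```python
from datetime import date
from pathlib import Path

def record_ingest(wiki: Path, source_name: str, summary_page: str, touched: list[str]) -> None:
    """Append an ingest entry to the wiki log and link the new summary from the index."""
    log = wiki / "log.md"
    entry = f"- {date.today().isoformat()}: ingested `{source_name}` -> [[{summary_page}]] (updated {len(touched)} pages)\n"
    with log.open("a", encoding="utf-8") as f:
        f.write(entry)

    index = wiki / "index.md"
    line = f"- [[{summary_page}]]\n"
    existing = index.read_text(encoding="utf-8") if index.exists() else "# Index\n"
    if line not in existing:  # keep the index free of duplicate links
        index.write_text(existing + line, encoding="utf-8")
```

In the real pattern the LLM does this itself, guided by the schema; the sketch just makes the mechanics concrete.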
Query
You ask questions against the wiki. The LLM searches for relevant pages, reads them, and synthesizes an answer with citations. Answers can take different forms — a markdown page, a comparison table, a slide deck, a chart.
The critical insight: good answers get filed back into the wiki as new pages. A comparison you asked for, an analysis, a connection you discovered — these are valuable and shouldn’t disappear into chat history. Your explorations compound in the knowledge base just like ingested sources do.
Lint
Periodically, ask the LLM to health-check the wiki. Look for:
- Contradictions between pages
- Stale claims that newer sources have superseded
- Orphan pages with no inbound links
- Important concepts mentioned but lacking their own page
- Missing cross-references
- Data gaps that could be filled with a web search
The LLM is good at suggesting new questions to investigate and new sources to look for. This keeps the wiki healthy as it grows.
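Some of these checks are mechanical enough to script rather than delegate. A minimal orphan-page detector, assuming the wiki uses `[[wikilink]]` syntax and an `index.md` root page:

```python
import re
from pathlib import Path

# Matches [[target]], [[target|alias]], and [[target#section]] wiki links
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def find_orphans(wiki: Path) -> set[str]:
    """Return wiki page names (without .md) that no other page links to."""
    pages = {p.stem for p in wiki.rglob("*.md")}
    linked: set[str] = set()
    for p in wiki.rglob("*.md"):
        for target in WIKILINK.findall(p.read_text(encoding="utf-8")):
            linked.add(Path(target.strip()).stem)  # normalize paths like summaries/foo
    return pages - linked - {"index"}  # the index is the root, not an orphan
```

Contradiction and staleness checks still need the LLM's judgment; link hygiene doesn't.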
LLM Wiki vs. Traditional RAG
RAG (Retrieval Augmented Generation) works by chunking your documents, embedding them in a vector store, and retrieving the most relevant fragments at query time. It’s useful — but nothing accumulates. With every question, the AI reassembles the answer from scratch.
| Aspect | Traditional RAG | LLM Wiki |
|---|---|---|
| Query Process | Retrieve chunks → Re-process → Answer | Read compiled wiki → Synthesize → Answer |
| Knowledge Persistence | Ephemeral (re-built each query) | Persistent (compounds over time) |
| Infrastructure | Vector database, embedding pipeline | Just markdown files |
| Contradiction Handling | Depends on retrieval ranking | Explicitly tracked and flagged |
| Cost per Query | High (re-processes everything) | Low (reads pre-compiled wiki) |
| Transparency | Hidden in vector space | Browse, verify, edit any page |
That said, LLM Wiki isn’t a replacement for RAG in every scenario. If you’re working with thousands of documents, need real-time data, or want to query a large corpus without curating it, RAG with proper infrastructure is still the better tool. LLM Wiki works best at personal scale — roughly up to 100 or maybe 200 sources.
Karpathy’s Four Advantages
- Explicit — You can see exactly what the AI knows and doesn’t know. No hidden vector embeddings, no black-box retrieval.
- Yours — The data lives on your computer, not in a provider’s cloud.
- File over app — It’s just markdown files, not locked into any platform. Any tool can read it, any AI model can work with it.
- Bring your own AI — You can use cloud GPT, open-source models, or switch anytime. The data and the AI models are completely decoupled.
What the Community Discovered: Enforcement Patterns
A team that’s been building LLM Wiki-adjacent systems for six months shared enforcement patterns that go beyond the original design.
Enforce at the Agent Boundary, Not the Conversation Boundary
Rather than trying to block the main conversation from editing the wiki, they let each specialized agent be its own enforcement unit:
- Writer agent — frontmatter excludes Bash and web access; a PreToolUse hook blocks writes to any path outside the content layers
- Maintainer agent — has Bash, but a PreToolUse hook validates every command (no `rm -rf`, no force-push)
- Auditor agent — read-only by design
The main conversation’s write discipline is instructional — it’s trusted to respect the rule in the schema because it’s the “planner,” not the “executor.” Hooks do the heavy lifting on the executors. This gives you structural guarantees on the agents that actually mutate things, without the friction of locking the conversation itself.
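A sketch of that writer-agent path guard, assuming a hook that receives the attempted tool call as JSON on stdin and blocks it with a non-zero exit code. The payload field names (`tool_input`, `file_path`) are assumptions; check your tool's hook documentation:

```python
import json
import sys
from pathlib import Path

ALLOWED_ROOTS = ("wiki/", "outputs/")  # the content layers the writer may touch

def is_write_allowed(path: str, allowed=ALLOWED_ROOTS) -> bool:
    """Reject absolute paths, path traversal, and anything outside the content layers."""
    p = Path(path)
    if p.is_absolute() or ".." in p.parts:
        return False
    return str(p).startswith(allowed)

def main() -> None:
    """Hook entry point: register this script as the writer agent's PreToolUse hook."""
    payload = json.load(sys.stdin)
    target = payload.get("tool_input", {}).get("file_path", "")
    if not is_write_allowed(target):
        print(f"blocked write outside content layers: {target}", file=sys.stderr)
        sys.exit(2)  # non-zero exit blocks the tool call
```

The point of the pattern: the guarantee lives in twenty lines of deterministic code, not in a prompt the model might ignore.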
Four Claim Types, Not Binary Verified/Unverified
Binary verified/unverified isn’t enough — you need to split “inferred” from “unsourced.” They shipped four claim types as Obsidian callouts:
| Claim Type | Meaning | Example |
|---|---|---|
| Source | Verbatim quote with citation | “X causes Y” — Paper A, p. 12 |
| Analysis | Team’s inference from data | “This suggests X is primary driver” |
| Inferred | Model’s reasoning, not yet verified | “Likely connected to Z based on pattern” |
| Gap | Explicitly missing — never fill with a guess | “No source addresses X yet” |
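Rendered as Obsidian callouts, the four claim types might look like this (the callout type names here are illustrative; the team didn't publish theirs):

```markdown
> [!quote] Source
> "X causes Y" - Paper A, p. 12

> [!note] Analysis
> This suggests X is the primary driver. (Team inference)

> [!question] Inferred
> Likely connected to Z based on the pattern in [[paper-a]]. Not yet verified.

> [!warning] Gap
> No source addresses X yet. Do not fill with a guess.
```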
The Analysis/Inferred split prevents paraphrasing bias, where the model rewrites what a source says and nobody can tell afterwards whether it got it right.
Staleness Scoring
Each file carries a score derived from how far behind its outgoing wiki-link dependencies it is. Forward-only, no backlink tracking. Update a source → every downstream file’s staleness score ticks up → the auditor surfaces the worst offenders. This replaces a lot of the “who might have stale claims about this?” review burden that otherwise falls back on humans.
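A minimal sketch of forward-only staleness scoring, using an outgoing-link map and per-page revision timestamps. The data model here is an assumption for illustration, not the team's implementation:

```python
def staleness(page: str, links: dict[str, list[str]], mtime: dict[str, int]) -> int:
    """Count dependencies (reached via outgoing links) modified after this page.

    Forward-only: we walk the page's own wiki links, never backlinks.
    """
    seen: set[str] = set()
    stack = list(links.get(page, []))
    score = 0
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue
        seen.add(dep)
        if mtime.get(dep, 0) > mtime.get(page, 0):
            score += 1  # this dependency changed since the page was last updated
        stack.extend(links.get(dep, []))
    return score

def worst_offenders(links, mtime, n=3):
    """Rank pages so the auditor can surface the stalest ones first."""
    return sorted(links, key=lambda p: staleness(p, links, mtime), reverse=True)[:n]
```

With this in place, "who might have stale claims about this source?" becomes a query rather than a memory exercise.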
The Structural Gaps: Why LLM Wiki Could Collapse
Commenter V-interactions identified four structural problems the pattern doesn’t address:
1. Lifecycle
No mechanism for knowledge aging. Old claims persist even after sources are superseded. The lint operation catches some of this, but there’s no automated decay function. A claim from a 2019 paper that was overturned in 2023 stays in the wiki at equal weight unless someone runs lint and the LLM catches it.
2. Epistemic Filtering
No way to weight sources by credibility. A blog post and a peer-reviewed meta-analysis get equal treatment. The schema can instruct the LLM to prefer certain source types, but there’s no systematic credibility scoring.
3. Entropy
The wiki grows monotonically. There’s no pruning mechanism, no archival strategy, no compression of redundant pages. At scale, the wiki becomes a garden that only grows, never trimmed.
4. Scalability
Works at ~100-200 sources, but the pattern has no answer for what happens at 1,000+. The ingest operation that touches 10-15 pages per source becomes computationally expensive. The linear scan of the index slows down. The LLM’s context window fills up with wiki pages before it can answer a question.
The proposed direction: A knowledge graph with temporal decay, source credibility scoring, and automated pruning. But this is still theoretical — no implementation exists yet.
Extensions the Community Built
FUNGI Protocol: Critical Thinking Layer
Educational sciences researcher FBoschman extended the pattern with a zettelkasten-inspired workflow called FUNGI:
- Fleeting note — capture the raw thought
- Understand — the LLM explains the concept in context
- Nuance — identify tensions with existing knowledge
- Grow — develop connections to other wiki pages
- Integrate — file into the wiki with cross-references
The protocol includes a counter-argument step to resist confirmation bias. Five stages can feel heavy for small notes, so there’s a “quick pass” option. The LLM flags notes that haven’t fully grown, need work, or have interesting tensions with other pages.
The Obsidian Controversy
The most heated discussion was about Obsidian, the most popular frontend for LLM Wiki.
“Obsidian is proprietary software. You cannot run a true ‘personal knowledge base’ when the viewer itself is closed-source, vendor-controlled code that phones home.” — gnusupport
The argument: for a pattern that preaches persistent ownership, compounding knowledge, and bootstrapping, handing the viewing layer to a for-profit company is a contradiction. Suggestions included VS Codium, terminal Markdown renderers, static site generators, or custom minimal viewers.
The counter-argument: Obsidian’s graph view, Dataview queries, and backlinks are unmatched for navigating a wiki. The data is still plain Markdown — you’re not locked in. You can migrate anytime.
The Instance Strategy Question
“How many instances would you recommend? A personal one vs a work one? One per project?” — RonanCodes
No consensus emerged. Some practitioners run separate wikis per domain (research, work, personal). Others maintain a single wiki with schema-level topic separation. The cross-query question — can one wiki query another? — remains unanswered.
The Engelbart Lineage
A commenter pointed out that LLM Wiki is essentially an instantiation of Douglas Engelbart’s 1992 vision from “Toward High-Performance Organizations: A Strategic Role for Groupware”:
- Augmentation System — Human + Tool co-evolving
- CODIAK — Concurrent development, integration, and application of knowledge
- Open Hyperdocument System — Global addressing, back-links, structured documents
- ABC Model — A = core work, B = improve A, C = improve B (bootstrapping via C communities)
Karpathy’s division of labor — “You think. The LLM does the bookkeeping” — is the Engelbart arc realized: the human focuses on taste, judgment, and long-term sense-making while the AI handles the structural maintenance.
Getting Started
The basic workflow:
- Create folders: `raw/` for sources, `wiki/` for the knowledge base, optionally `outputs/` for generated answers
- Dump your sources into `raw/` — articles, notes, PDFs. Don’t organize them.
- Write a `schema.md` file describing your topic, folder structure, wiki rules, and focus areas
- Point an AI agentic tool at your folder and tell it to follow the schema rules
- Browse the wiki — verify, edit, and explore the connections the AI built
- Ask questions against the wiki for synthesized answers
- Run lint periodically to keep the wiki healthy
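Bootstrapping that layout takes a few shell commands (the schema content below is a placeholder to replace with your own rules):

```shell
mkdir -p raw wiki outputs

cat > schema.md <<'EOF'
# Wiki schema
Topic: <your topic>
Layers: raw/ (immutable sources), wiki/ (LLM-owned), outputs/ (generated answers)
Rule: never modify files under raw/.
EOF

# Dump sources in, unorganized -- organizing them is the AI's job
cp ~/Downloads/*.pdf raw/ 2>/dev/null || true
```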
Templates for schema.md are available in the llm_wiki project by Nash Su.
Why This Matters
The LLM Wiki pattern realizes a vision where the associative connections between documents are as valuable as the documents themselves.
But more practically, it solves a real problem: information overload without synthesis is just hoarding. We all have hundreds of saved articles, bookmarked papers, and downloaded PDFs. The value isn’t in collecting them — it’s in connecting them, comparing them, and building a coherent understanding over time.
LLM Wiki automates the connection-building. You focus on reading and asking questions. The AI handles the bookkeeping.
The structural gaps are real — lifecycle, epistemic filtering, entropy, scalability — but they’re research problems, not dealbreakers. The pattern is young, evolving, and being stress-tested by thousands of practitioners right now.
Further Reading
- Karpathy’s original LLM Wiki gist — 5,000+ stars, the source document
- V-interactions: “Your LLM Wiki Will Collapse” — four structural gaps analysis
- llm_wiki on GitHub — open-source implementation with templates
- Engelbart (1992): “Toward High-Performance Organizations” — the intellectual ancestor
