
Karpathy's LLM Wiki Pattern Has 16 Million Views. Here's What It Looks Like Actually Running.

Every time you start a new AI session, you start from zero. The research you did last week, the conclusions you reached last month, the connections you noticed between two papers six months ago: none of it carries over. The AI you're talking to has no memory of any of it. RAG, the dominant architecture for giving AI models access to external documents, retrieves fragments on demand but treats each query as independent. Nothing accumulates. Nothing compounds.

On April 3, 2026, Andrej Karpathy, a co-founder of OpenAI and former head of AI at Tesla, published a GitHub Gist describing a different approach: a three-folder markdown setup where an LLM compiles, maintains, and queries a structured knowledge base without a vector database. The post got 16 million views. That number represents something specific: researchers, developers, product managers, and analysts recognizing a problem they had been working around for years, finally named precisely.

The architecture Karpathy described is correct. The question is what it actually takes to run it. For most people who are not working directly with LLM APIs, the gap between "this is a good idea" and "this is working on my machine" is a multi-day engineering project that becomes an ongoing maintenance commitment. This article explains what the pattern requires, what a working implementation looks like, and what changes when the infrastructure is handled for you.

What Karpathy's LLM Wiki Pattern Actually Is

The LLM wiki architecture Karpathy published centers on a structural insight: knowledge should be assembled before queries, not at query time.

Traditional RAG works as follows. You ask a question, the system searches a vector store for relevant chunks, and the model constructs an answer from whatever surfaces. The quality depends entirely on retrieval: whether the right chunks appear, whether they contain enough context, whether they contradict each other. Each query starts fresh. The system has no understanding of your domain, only access to your documents.

Karpathy's pattern inverts this. Three folders do the work. raw/ holds source material: research papers, articles, notes, YouTube transcripts, GitHub READMEs. wiki/ is where an LLM periodically reads all that raw material and compiles it into structured, encyclopedia-style articles, each synthesizing what the sources say about a concept, resolving conflicts between them, and linking to related entries. index.md is a table of contents compact enough to fit in a single context window.

When you query the system, the model reads the index first, identifies which articles are relevant, loads them, and answers from compiled knowledge rather than raw fragments. The LLM wiki doesn't retrieve information. It has already synthesized it.
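As a concrete sketch of that query path (every name here is illustrative; the Gist specifies the flow, not the code, and `ask_llm` stands in for whichever model API you use):

```python
from pathlib import Path

def ask_llm(prompt: str) -> str:
    """Stand-in for a real model call (an OpenAI/Anthropic client, etc.)."""
    raise NotImplementedError

def select_articles(index_text: str, question: str) -> list[str]:
    # In a real setup the model reads the index and picks relevant entries;
    # here we naively match entry filenames against words in the question.
    paths = [line.split("(")[-1].rstrip(")")          # pull "(wiki/foo.md)" links
             for line in index_text.splitlines() if "(wiki/" in line]
    return [p for p in paths if Path(p).stem.replace("-", " ") in question.lower()]

def answer(wiki_root: Path, question: str) -> str:
    index_text = (wiki_root / "index.md").read_text()  # index first, always
    articles = select_articles(index_text, question)
    context = "\n\n".join((wiki_root / p).read_text() for p in articles)
    return ask_llm(f"Using these compiled notes:\n{context}\n\nQuestion: {question}")
```

The point of the sketch is the order of operations: index first, then a small set of compiled articles, then the model. Nothing touches raw/ at query time.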

The practical implications are concrete. Token usage drops roughly 95 percent compared to loading equivalent raw source material. Every fact is traceable to a readable, editable Markdown file. And because the wiki builds over time, a question asked in month six draws on a richer, more connected knowledge structure than the same question in month one.

What the Gist does not describe is the setup. The architecture assumes familiarity with LLM APIs, comfort writing or running scripts to trigger compile and lint operations, and the discipline to keep those processes running regularly over weeks and months. For the researchers, analysts, and product managers who most need compounding knowledge, that assumption is a significant barrier. A working implementation is not a folder structure. It is a maintenance system, and building that system is the part Karpathy left to the reader.

The Gist's 485 comments are largely a record of what breaks when people try to build this themselves: the raw/ folder stops growing after the first two weeks, the compile step never gets automated, and the lint pass gets skipped until the wiki becomes inconsistent enough to be unreliable.

How AK Wiki Implements the Three Layers

remio is a local-first AI knowledge base that automatically captures content from web browsing, files, and meetings. AK Wiki is an app inside remio built directly on Karpathy's architecture. Each of the three layers maps to a concrete implementation, and the gap between what the pattern describes and what most DIY attempts manage to sustain closes at each one.

Layer 1: The raw/ problem, solved by automated capture

Populating raw/ in a DIY setup requires deliberate curation: deciding what to include, converting content to a consistent format, and maintaining the organizational discipline to keep the folder current. Over time, this becomes its own job. Most implementations that fail do so here. The raw/ folder stops growing after the initial enthusiasm fades, and a knowledge base that stops receiving new material starts degrading from that point forward.

AK Wiki's source layer is remio's existing capture infrastructure. Every web page you clip, every local file you index, every meeting you record flows in automatically. There is no intake decision, no format conversion, no folder to maintain. The material accumulates as you work, not as a separate activity. For a knowledge worker who is not also a developer, this is the difference between having this system and not having it. The intake discipline that kills DIY implementations simply does not exist as a constraint.

Layer 2: Compiled knowledge, organized by topic and concept

The compile operation produces two types of output. Topic Collections are structured knowledge entries organized around themes you define: "AI Agents," "competitive research," "Q2 product decisions." You set the scope; the AI builds the entries from everything you have captured that touches those themes.

Concept Collections are different. These are connections and concepts the AI discovers during compilation that you did not explicitly define. If sources you captured across different weeks and different contexts point toward the same underlying idea, AK Wiki surfaces that relationship as its own entry. The system is not just organizing your knowledge. It is finding structure in it that you did not put there.

Both output types are readable, structured articles, not vector embeddings or retrieval scores. You can read them, edit them, and trace every claim back to the source material it came from. This is the transparency advantage VentureBeat noted when covering Karpathy's approach: an "evolving markdown library maintained by AI" that remains legible to its human owner, unlike a vector index that requires tooling to inspect.

During compilation, AK Wiki also fills knowledge gaps. When your captured material lacks coverage of a concept that appears in your sources, the system searches the web to fill it in. Karpathy's original architecture has no equivalent; it works only with what you explicitly put into raw/.

Layer 3: A queryable topic map that maintains itself

The compiled output organizes into a navigable topic view: the functional equivalent of index.md, updated with each compile run. Unlike a manually maintained index, it does not go stale when you stop curating it. The structure reflects what the system has synthesized from your captured material, not what you remembered to document last Tuesday.

What Compile and Lint Actually Do

Two operations define how AK Wiki maintains knowledge quality over time. Both address failure modes that are well-documented in DIY implementations of Karpathy's pattern.

Compile is not indexing. When a compile run executes, the AI reads your captured content and rewrites knowledge entries based on current understanding. New sources integrate into existing articles rather than appending as separate entries. Contradictions between sources are identified and resolved within the article. The result is a synthesis: a well-compiled entry on a topic you have researched for three months reads differently from one built after three weeks, because the underlying material is richer and the connections are deeper.

This is computationally heavier than building a vector index, which is why it runs on demand or on a schedule rather than continuously. That is the intended design. The output quality is qualitatively different from what retrieval-based systems produce, and the processing cost reflects that difference. For knowledge workers outside technical roles, not having to configure this pipeline is itself a meaningful advantage: the compile step is a button, not a script you wrote.
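A DIY compile step, reduced to its skeleton, might look like the sketch below. This is an assumption-laden illustration, not remio's implementation: `synthesize` is a placeholder for a real model call, and the keyword-based topic routing is far cruder than anything a production compile would use.

```python
from pathlib import Path

# Hypothetical topic routing table; a real compile step would let the model decide.
TOPICS = {"agents": ["agent", "tool use"], "rag": ["retrieval", "embedding"]}

def assign_topics(text: str) -> list[str]:
    """Crude keyword routing of a raw source onto wiki topics."""
    lower = text.lower()
    return [t for t, kws in TOPICS.items() if any(k in lower for k in kws)]

def compile_wiki(raw_dir: Path, wiki_dir: Path, synthesize) -> None:
    """Group raw sources by topic, then rewrite each wiki entry from scratch."""
    buckets: dict[str, list[str]] = {}
    for src in sorted(raw_dir.glob("*.md")):
        text = src.read_text()
        for topic in assign_topics(text):
            buckets.setdefault(topic, []).append(text)
    for topic, sources in buckets.items():
        entry = wiki_dir / f"{topic}.md"
        prior = entry.read_text() if entry.exists() else ""
        # Rewrite, don't append: new material integrates into the existing article.
        entry.write_text(synthesize(topic, prior, sources))
```

The structural point is the `prior` argument: each run rewrites the whole entry from the previous version plus all sources, rather than appending new material to the end.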

Lint is the operation that prevents knowledge from degrading. Karpathy described a lint pass in his original Gist: a periodic scan of the entire wiki to identify inconsistencies, outdated information, and entries that conflict with more recent material. He noted it was necessary. What he did not specify was how to automate it, how frequently to run it, or what to do when the wiki has grown large enough that running it becomes expensive.

In practice, lint is the first thing that gets skipped in DIY implementations. Running it requires remembering to run it, having a working script available, and allocating time and API cost. Most implementations that start strong gradually stop running lint, and the wiki becomes less reliable over the following months without any obvious failure signal. The entries look authoritative. They are increasingly wrong.
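Part of why lint gets skipped is that even its mechanical subset takes code to exist at all. Here is a hedged sketch of one such mechanical check, broken cross-links between entries; the expensive model pass that compares entries for contradictions is deliberately omitted, and nothing here reflects remio's or Karpathy's actual tooling.

```python
import re
from pathlib import Path

LINK = re.compile(r"\]\(([^)]+\.md)\)")  # markdown links to other .md entries

def dead_links(wiki_root: Path) -> dict[str, list[str]]:
    """Mechanical lint check: flag links whose target entry no longer exists.

    A full lint pass would also ask the model to compare entries for
    contradictions and stale claims; that (costly) part is omitted here.
    """
    report: dict[str, list[str]] = {}
    for entry in sorted(wiki_root.rglob("*.md")):
        broken = [target for target in LINK.findall(entry.read_text())
                  if not (entry.parent / target).exists()]
        if broken:
            report[entry.name] = broken
    return report
```

Even this trivial check has to be written, scheduled, and remembered; the semantic half of lint adds API cost on top.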

In AK Wiki, Lint is a built-in operation accessible from the main interface. It can be triggered immediately or scheduled to run automatically. The AI reviews existing entries against the current state of your knowledge base, flags content that has become inconsistent, and updates it. The maintenance loop that Karpathy's architecture requires but does not specify is, in AK Wiki, a button.

What Your Knowledge Base Looks Like Over Time

The value of this architecture is not visible at setup. It accumulates.

At month one, Topic Collections start forming around your most active research areas. The entries are useful: they synthesize a few weeks of captured material into structured articles that answer questions faster than searching raw sources. Cross-topic connections are limited because there is not much to connect yet. The system is already more useful than searching a folder of raw files, but the compounding has not started. This is the stage where DIY implementations feel promising but have not yet produced results that feel essential. It is also, as documented extensively in the Gist comments, the stage where most of them plateau, because the maintenance loop breaks down before the compounding has had time to demonstrate its value.

At month three, Concept Collections begin to matter. The AI has now read material you captured across different contexts: an article clipped three months ago, a meeting transcript from last month, a document indexed last week. It surfaces connections between them that you did not deliberately create. A competitive analysis entry built in month one has been updated to incorporate a product announcement you captured in month three, without any manual action on your part. The knowledge base is no longer just a structured version of what you put in. It is starting to contain understanding you did not explicitly build.

At month six, the system begins answering questions you did not know to ask. When a topic has enough density, the AI draws on months of accumulated context rather than whatever you happened to capture recently. A researcher asking about a new paper's relationship to prior work gets an answer reflecting six months of reading in the field. A product manager asking about a recurring complaint pattern gets a response connecting current feedback to product decisions made in Q1. A developer asking why a certain architectural choice was made finds an entry that traces the reasoning across a thread of discussion that happened over four separate meetings.

Analytics Vidhya put it directly: the significance lies less in the technical implementation than in the paradigm shift it represents, from AI as a tool you query to AI as a system that accumulates understanding of your specific domain over time. The compounding effect Karpathy described is real. What most implementations never reach is the time horizon where that effect becomes visible, because they do not survive long enough to get there.

The people who benefit most from this kind of knowledge system are not always the ones best positioned to build and maintain the infrastructure behind it. AK Wiki separates those two things.

AK Wiki vs. DIY LLM Wiki vs. RAG

These three approaches suit different users, scales, and requirements. The choice between them is not a technical judgment about which architecture is superior. It is a practical question about who you are and what you can realistically sustain.

Building it yourself is the right choice when you need complete architectural control, work in a single focused domain, and have the technical background to build and maintain the infrastructure. Epsilla's analysis of the architecture's enterprise limits makes the point correctly: the pattern identifies RAG's fundamental problems, but it requires its own maintenance system to function in practice. For a developer who wants to own every layer of that system, building it yourself is a reasonable path.

AK Wiki in remio is the right choice when you want the compounding knowledge effect without the infrastructure commitment. Automated capture replaces manual raw/ curation. Built-in compile and lint replace custom scripts and cron jobs. Information supplementation via web search fills gaps that a local-only system cannot address. The result is a compiled knowledge wiki that does not require its owner to also be its engineer.

RAG remains the correct architecture for large-scale enterprise knowledge bases with strict access controls, multi-user requirements, and document corpora that exceed what a compiled wiki can handle. This pattern, whether implemented DIY or through a product, is designed for personal and small-team knowledge management. At enterprise scale, retrieval infrastructure is the right foundation.

  • Raw intake: DIY requires manual curation; AK Wiki captures automatically; RAG indexes automatically

  • Compile: DIY needs a manual script; AK Wiki runs on a button or schedule; RAG indexes continuously

  • Lint: DIY maintenance is manual and often abandoned; AK Wiki has a built-in schedulable lint operation; RAG has no equivalent

  • Transparency: DIY and AK Wiki both produce readable, editable entries; RAG stores vector embeddings that require tooling to inspect

  • Non-technical users: DIY blocks at setup; AK Wiki works out of the box; RAG requires pipeline configuration

  • Scale ceiling: DIY and AK Wiki are suited for personal to small-team scale; RAG is effectively unlimited

The problem Karpathy named is not new. Knowledge workers have been losing context at session boundaries since AI assistants became useful. What the Gist did was describe the architecture that solves it precisely enough that 16 million people recognized what they had been missing.

The architecture is correct. The problem it solves is real. Running it consistently, over months, is the hard part. If you want the knowledge system Karpathy described without building the infrastructure to support it, remio's AI-native second brain includes AK Wiki as an out-of-the-box implementation: automated capture, built-in compile and lint, cross-topic synthesis, information supplementation, and knowledge that compounds from day one.
