llm-wikid: agent-agnostic AI knowledge base with schema-driven compilation for Obsidian

llm-wikid tackles a common pain point in knowledge management with AI: how to efficiently maintain a large, structured knowledge base without relying on retrieval-augmented generation (RAG) at query time. Instead, it compiles knowledge once into a structured wiki format with pre-built cross-references, aiming to improve Q&A performance at scale.

What llm-wikid is and how it works

At its core, llm-wikid is an AI-maintained knowledge base system designed for Obsidian, the popular markdown note-taking app. Inspired by Andrej Karpathy’s LLM Wiki pattern, it eschews the classic RAG approach where documents are retrieved dynamically for each query. Instead, llm-wikid uses a multi-phase ingest pipeline that processes raw knowledge articles into interconnected, annotated wiki pages stored as markdown files.

The system is agent-agnostic, supporting multiple AI agents such as Claude Code, OpenClaw, Hermes, and Codex, all accessible through slash commands within the Obsidian environment. Control over the entire ingest and query pipeline is declaratively defined in a single CLAUDE.md schema file. This file functions as a prompt-as-code blueprint, specifying how to sort articles, resolve URLs, extract media, classify content, compile pages, perform bias checks, and re-index the knowledge base.
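To make the ordering of those stages concrete, here is a minimal shell sketch of a driver that runs them one after another for a single article. The stage names are paraphrased from the list above, and `run_stage`, `run_pipeline`, and the `raw/` path are hypothetical names chosen for illustration. This is not llm-wikid's actual code.

```shell
# Hypothetical stage names paraphrasing the article; not llm-wikid's real scripts.
STAGES="sort resolve_urls extract_media classify compile bias_check reindex"

# Placeholder stage runner: a real stage would call an AI agent with the
# matching instructions from the schema. Here it only logs what would run.
run_stage() {
  printf '[%s] %s\n' "$1" "$2"
}

# Run every stage in order for one article and stop at the first failure.
run_pipeline() {
  for stage in $STAGES; do
    run_stage "$stage" "$1" || {
      printf 'stage %s failed for %s\n' "$stage" "$1" >&2
      return 1
    }
  done
}

run_pipeline "raw/example-article.md"
```

Stopping at the first failed stage is a design choice in this sketch: it keeps a half-processed article from reaching the compile and re-index steps.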

Under the hood, the architecture leans heavily on the Obsidian ecosystem: the Dataview community plugin for querying markdown metadata, Bases for database-style views over note properties, and graph view for visualization. The actual processing commands are shell scripts that orchestrate AI calls and content transformations, which keeps the system lightweight and easily extensible.
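The metadata Dataview queries typically lives in YAML frontmatter at the top of each page, and a shell script can read the same fields with standard tools. The sketch below assumes simple `key: value` lines between the first pair of `---` markers. The function name `frontmatter_get` and the example field names are hypothetical, not taken from the project.

```shell
# Print the value of one top-level frontmatter key from a markdown file.
# Usage: frontmatter_get FILE KEY
frontmatter_get() {
  awk -v key="$2" '
    /^---[[:space:]]*$/ { fence++; next }   # count the frontmatter delimiters
    fence >= 2 { exit }                     # past the closing delimiter: stop
    fence == 1 {                            # inside the frontmatter block
      split($0, kv, ":")
      if (kv[1] == key) {
        sub(/^[^:]*:[[:space:]]*/, "")      # strip the "key: " prefix
        print
        exit
      }
    }
  ' "$1"
}

# Example (hypothetical page and field):
#   frontmatter_get wiki/some-page.md type
```

Nested YAML, lists, and quoted values are out of scope here. A real script would likely hand those to a proper YAML parser.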

Why the CLAUDE.md schema-driven pipeline matters

The defining strength of llm-wikid lies in its schema-driven approach. The CLAUDE.md file encapsulates the entire behavior of the knowledge system — from ingestion rules to quality controls — in markdown instructions rather than code. This means no coding is necessary to modify the pipeline logic; instead, you update the schema and the system adapts.
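One way to picture prompt-as-code: a script pulls the instructions for a single stage out of the markdown schema and hands them to whichever agent is in use. The sketch below assumes each stage lives under its own `## ` heading, which is an assumption about the file layout, not something confirmed by the project. `schema_section` and the heading name in the example are hypothetical.

```shell
# Print the body of one "## <heading>" section from a markdown schema file.
# Usage: schema_section FILE HEADING
schema_section() {
  awk -v want="## $2" '
    /^## / { in_section = ($0 == want); next }  # each level-2 heading opens or closes the section
    in_section { print }
  ' "$1"
}

# Example (hypothetical heading name):
#   schema_section CLAUDE.md "Bias check"
```

With this layout, changing how a stage behaves means editing prose under a heading, and the script that extracts it stays the same.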

This design trades some flexibility and real-time adaptability for predictability and scalability. Because knowledge is compiled once into a structured wiki, a query reads pre-linked pages instead of running a retrieval step, which is meant to keep answers fast and consistent at the scale the project targets (~100 articles / ~400K words). RAG systems, by contrast, can see query latency and consistency degrade as the document corpus grows.
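Pre-built cross-references can be illustrated with Obsidian's `[[wikilink]]` syntax: once pages are compiled, a backlink index is a single pass over the files, with no retrieval step at query time. The sketch below assumes a flat folder of `.md` pages. `link_index` is a hypothetical helper, not part of llm-wikid.

```shell
# Print "target <- source" for every [[wikilink]] found in a folder of pages.
# Aliases ([[Page|alias]]) and heading links ([[Page#Heading]]) resolve to the page name.
# Usage: link_index DIR
link_index() {
  for page in "$1"/*.md; do
    [ -f "$page" ] || continue
    name="$(basename "$page" .md)"
    grep -o '\[\[[^]|#]*' "$page" | sed 's/^\[\[//' |
      while IFS= read -r target; do
        printf '%s <- %s\n' "$target" "$name"
      done
  done | LC_ALL=C sort -u
}

# Example (hypothetical folder):
#   link_index wiki
```

Each output line reads as "this page is referenced by that page", which is the kind of lookup a compiled wiki can answer without searching the corpus again.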

The code is surprisingly clean given the complexity: the shell scripts are modular, the schema is human-readable, and the multi-agent support is well abstracted. Quality controls like bias checks and contradiction detection are baked into the pipeline, which improves the trustworthiness of the knowledge base.

However, this approach requires upfront effort to maintain the schema and manage the ingest pipeline. It’s less suited for highly dynamic content that changes frequently since the knowledge must be recompiled to reflect updates. Also, the reliance on Obsidian means users need familiarity with its ecosystem and markdown conventions.
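The recompilation cost can be limited by only reprocessing what changed. As a rough sketch, a timestamp file can record the last compile, and `find -newer` can list the raw articles modified since then. The `raw/` folder, the `.last-compile` stamp file, and the `stale_sources` helper are all hypothetical. The article does not say how llm-wikid tracks changes.

```shell
# List raw articles modified after the last compile.
# Usage: stale_sources RAW_DIR STAMP_FILE
stale_sources() {
  if [ ! -e "$2" ]; then
    # No stamp file yet: nothing has been compiled, so every article is stale.
    find "$1" -type f -name '*.md'
  else
    find "$1" -type f -name '*.md' -newer "$2"
  fi
}

# Example (hypothetical layout): recompile stale articles, then refresh the stamp.
#   stale_sources raw .last-compile
#   touch .last-compile
```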

Quick start

git clone https://github.com/shannhk/llm-wikid.git my-wiki
cd my-wiki

This minimal setup gets you the repository locally. From there, you configure your CLAUDE.md schema file to define how the system should process your knowledge articles. The slash commands in Obsidian let you interact with the AI agents directly, triggering pipeline stages or querying the compiled wiki.

Verdict

llm-wikid is a solid choice if you want an AI-powered knowledge base that prioritizes structured, maintainable content over on-demand document retrieval. The schema-driven pipeline is a practical example of prompt-as-code that balances automation with control.

It fits well in environments where knowledge is relatively stable and can be curated upfront, such as research groups, technical documentation teams, or personal knowledge management enthusiasts who use Obsidian extensively.

The tradeoff is clear: you lose some agility compared to RAG-based systems but gain scalability, query speed, and quality control. Familiarity with shell scripting and Obsidian is helpful to get the most out of the system.

Overall, llm-wikid shows how combining markdown, CLI orchestration, and AI agents under a declarative schema can build an autonomous, maintainable knowledge system without heavy engineering overhead.

→ GitHub Repo: shannhk/llm-wikid ⭐ 227 · Shell

Noureddine RAMDI / llm-wikid: agent-agnostic AI knowledge base with schema-driven compilation for Obsidian
