llm-wiki: orchestrating multi-agent LLM research into persistent knowledge bases

Large language models (LLMs) are powerful tools, but managing them effectively for complex research tasks remains a challenge. The nvk/llm-wiki repository offers a shell-script-based orchestration layer that transforms any LLM agent into a persistent knowledge base compiler. It’s designed around multi-agent research workflows, enabling parallel investigations with durable provenance and deep cross-referencing.

What llm-wiki does and how it’s built

At its core, llm-wiki acts as an orchestration framework for LLM agents, built entirely in shell script for portability and ease of integration. It turns multiple LLM agents into a cooperative team that can research topics, ingest diverse sources, and compile their findings into a wiki format compatible with Obsidian.

The system supports plugins for multiple LLM runtimes, including Claude Code, OpenAI Codex, OpenCode, and Pi. Each plugin adapts the orchestration layer to the specifics of its runtime, including prompt size and context limits. For example, the Claude Code plugin uses a system prompt of roughly 22K tokens, Codex roughly 3K, and Pi around 1K.

The architecture enables up to 10 agents running in parallel, each contributing to research tasks with configurable modes like “deep” or “retardmax,” trading off depth and speed. Research sessions generate durable provenance files (.session-events.jsonl and .session-checkpoint.json) that record the entire investigative process, allowing replay and auditability.

Source ingestion is flexible: llm-wiki can pull data from URLs, PDFs, MediaWiki dumps, Wayback Machine archives, CSV message archives, and GitHub repos. This breadth supports comprehensive topic exploration.

Technical strengths and tradeoffs

The standout feature is the multi-agent orchestration with a fuzzy router that interprets natural language commands to route tasks appropriately: ingest URLs, answer queries, or initiate research topics. This reduces the cognitive load on the user and automates intent detection.
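
To make the routing idea concrete, here is a minimal Python sketch of the three-way split described above (URL to ingest, question to query, anything else to research). It is a keyword heuristic written for illustration, not llm-wiki’s actual router, whose rules this article does not describe; the function name `route`, the regular expression, and the list of question words are all assumptions.

```python
import re

# Hypothetical heuristic: the real router's rules are not documented here.
URL_RE = re.compile(r"https?://\S+")
QUESTION_WORDS = ("what", "how", "why", "when", "who", "which", "where", "compare")


def route(text: str) -> tuple[str, str]:
    """Return (command, argument) for a free-form /wiki request."""
    stripped = text.strip()
    url = URL_RE.search(stripped)
    if url:
        # Any URL in the request is treated as a source to ingest.
        return ("ingest", url.group(0))
    lowered = stripped.lower()
    if lowered.endswith("?") or lowered.startswith(QUESTION_WORDS):
        # Questions are answered from the existing wiki.
        return ("query", stripped)
    # Everything else is treated as a topic to research.
    return ("research", stripped)


print(route("add https://example.com/article"))   # ('ingest', 'https://example.com/article')
print(route("what do we know about CRISPR?"))     # ('query', 'what do we know about CRISPR?')
print(route("gut-brain axis"))                    # ('research', 'gut-brain axis')
```

A rule-based version like this is easy to test but misses paraphrases, which is presumably why a "fuzzy" router is attractive in the first place.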

The shell-script orchestration keeps dependencies minimal and the tool portable. A plugin system absorbs the differences between LLM runtimes, so research and ingestion tasks present a consistent interface regardless of which agent runs them.

Parallelism is managed carefully to avoid overwhelming APIs or hitting rate limits, with configurable time budgets and depth modes. For instance, “deep” mode uses 8 agents and can run research for hours, while “retardmax” mode pushes up to 10 agents at maximum speed.
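
The interaction between an agent cap and a time budget can be sketched in a few lines of Python. The agent counts for the two modes come from the description above; everything else is an assumption, including the function name `run_rounds`, the idea of dispatching work in rounds, and the reading of `--min-time` as "keep dispatching until this much time has elapsed". llm-wiki’s real scheduler is written in shell and is not described in this article.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Agent counts per mode as stated in the article; other modes are omitted.
MODE_AGENTS = {"deep": 8, "retardmax": 10}


def run_rounds(topics, worker, mode="deep", min_time_s=0.0):
    """Run worker(topic) on a capped thread pool, repeating rounds until
    at least min_time_s seconds have elapsed (hypothetical scheduling)."""
    deadline = time.monotonic() + min_time_s
    findings = []
    rounds = 0
    # max_workers caps how many "agents" run at once for the chosen mode.
    with ThreadPoolExecutor(max_workers=MODE_AGENTS[mode]) as pool:
        while True:
            # pool.map returns results in the same order as topics.
            findings.extend(pool.map(worker, topics))
            rounds += 1
            if time.monotonic() >= deadline:
                break
    return rounds, findings


# str.upper stands in for a real agent call.
print(run_rounds(["fasting", "keto"], str.upper, mode="deep", min_time_s=0))
# (1, ['FASTING', 'KETO'])
```

Capping the pool size is what keeps a run from exceeding the mode’s agent count; a real implementation would also need backoff for rate-limited APIs, which this sketch leaves out.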

Durable provenance tracking is another key strength. By storing session events as a JSONL log (.session-events.jsonl) and checkpoints as a JSON file (.session-checkpoint.json), the system enables replaying research sessions, auditing decisions, and resuming interrupted work. This is rare in LLM tooling, where ephemeral sessions are the norm.
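
As a rough illustration of why an append-only event log makes replay and resume straightforward, here is a Python sketch. Only the two file names come from the article; the event fields (`type`, `topic`), the checkpoint layout, and the helper names are hypothetical and do not reflect llm-wiki’s actual schema.

```python
import json
import tempfile
from pathlib import Path


def append_event(log_path: Path, event: dict) -> None:
    """Append one event as a single JSON line (append-only log)."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def replay(log_path: Path) -> list[dict]:
    """Read events back in the order they were written."""
    with log_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def write_checkpoint(path: Path, events: list[dict]) -> None:
    """Summarise progress so an interrupted session could resume."""
    done = [e["topic"] for e in events if e["type"] == "agent_done"]
    path.write_text(json.dumps({"completed": done, "event_count": len(events)}))


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / ".session-events.jsonl"
    ckpt = Path(tmp) / ".session-checkpoint.json"
    # Hypothetical event shapes, for illustration only.
    append_event(log, {"type": "agent_start", "topic": "fasting"})
    append_event(log, {"type": "agent_done", "topic": "fasting"})
    events = replay(log)
    write_checkpoint(ckpt, events)
    checkpoint = json.loads(ckpt.read_text())
    print(checkpoint)  # {'completed': ['fasting'], 'event_count': 2}
```

Because each event is one self-contained line, a crash mid-session loses at most the line being written, and the checkpoint can always be rebuilt from the log.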

The wiki output is compatible with Obsidian, a popular markdown-based knowledge management tool, which means users can integrate the LLM-generated knowledge base into their existing workflows.

Tradeoffs are clear: orchestrating complex multi-agent workflows in shell scripts can be brittle and hard to scale beyond a certain complexity. The user must also have access to compatible LLM runtimes that support these plugins. Moreover, the system’s efficiency depends heavily on the chosen LLM’s capabilities and context window sizes.

Quick start

The repository provides straightforward installation commands for supported LLM runtimes:

Claude Code (native plugin):

claude plugin install wiki@llm-wiki

OpenAI Codex (marketplace plugin):

codex plugin marketplace add nvk/llm-wiki

Once installed, you can run research as slash commands directly from the agent’s CLI. Here are some examples:

/wiki:research "nutrition" --new-topic            # Create wiki + research in one shot
/wiki:research "gut-brain axis" --wiki nutrition   # Add more research to existing wiki
/wiki:research "fasting" --deep --min-time 2h     # 8 agents, keep going for 2 hours
/wiki:research "keto" --retardmax                 # 10 agents, max speed, ingest everything
/wiki:research "What makes long form articles go viral?" --new-topic  # Question → decompose → playbook
/wiki:thesis "fiber reduces neuroinflammation via SCFAs"  # Thesis-driven: evidence for + against → verdict
/wiki:thesis "cold exposure upregulates BDNF" --min-time 1h  # Deep thesis investigation
/wiki:query "How does fiber affect mood?"         # Ask the wiki
/wiki:query "compare keto and mediterranean" --deep  # Deep cross-referenced answer
/wiki:query --resume                              # Where did I leave off?
/wiki add https://example.com/article             # Fuzzy router detects URL → ingest
/wiki what do we know about CRISPR?               # Fuzzy router detects question → query
/wiki:ingest https://example.com/article          # Manually ingest a source
/wiki:ingest --inbox                              # Process files dropped in inbox/
/wiki:ingest-collection https://github.com/bitcoin/bips --wiki bitcoin  # Bulk import spec repos
/wiki:ingest-collection https://dump.bitcoin.it/dump_20260429_en.xml.bz2 --wiki bitcoin  # Import MediaWiki dumps
/wiki:ingest-collection messages.csv --adapter csv-messages --wiki bitcoin  # Split message archives
/wiki:ingest-collection "https://example.com/*" --adapter wayback-cdx --from 20100101 --to 20200101  # Import archived snapshots
/wiki:compile                                     # Compile any unprocessed sources
/wiki:audit --project gut-brain-pla

These commands demonstrate the system’s versatility: starting new wikis, deep multi-agent research, thesis-driven investigations, querying compiled knowledge, and ingesting a wide range of source types.

Verdict

llm-wiki targets advanced users who want to orchestrate multiple LLM agents for structured, persistent research and knowledge compilation. Its plugin architecture and shell-script foundation prioritize portability and integration with existing LLM runtimes.

It’s not a plug-and-play solution for casual users due to its complexity and dependency on compatible LLM environments. However, for researchers, developers, or teams running multi-agent LLM workflows, it offers powerful orchestration, durable provenance, and flexible ingestion.

The tradeoff is the shell-script orchestration layer, which may limit scalability and robustness compared to dedicated multi-threaded or distributed frameworks. Still, its approach to multi-agent research orchestration and natural language intent routing is worth understanding if you’re building complex LLM-driven knowledge systems.

→ GitHub Repo: nvk/llm-wiki ⭐ 342 · Shell

Noureddine RAMDI / llm-wiki: orchestrating multi-agent LLM research into persistent knowledge bases
