Noureddine RAMDI / ArkhamMirror: A modular, local-first investigative journalism platform with event-driven shards

Created Mon, 04 May 2026 10:23:02 +0000 Modified Sat, 23 May 2026 20:41:27 +0000

mantisfury/ArkhamMirror

ArkhamMirror tackles a familiar pain point: investigative journalism workflows mean juggling large, complex document sets while also doing deep analysis, pattern recognition, and credibility assessment. What sets this platform apart is its architecture: a modular, local-first system that replaces the monolith with independent shards communicating only through events. This design supports extensibility without sacrificing system integrity or requiring deep coding skills for domain-specific extensions.

what ArkhamMirror does and how it works

At its core, ArkhamMirror (also known as SHATTERED) is a document intelligence platform designed specifically for investigative journalism. Its main goal is to support structured analytic techniques, such as Analysis of Competing Hypotheses (ACH), contradiction detection, and deception detection checklists (MOM, POP, MOSES, EVE), all grounded in real-world investigative methods.
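To make ACH concrete: the technique rates every piece of evidence against every hypothesis and favours the hypothesis with the least inconsistent evidence. The snippet below is a minimal sketch of that scoring step only. It is not taken from ArkhamMirror, and the hypotheses, evidence items, and ratings are invented for illustration.

```python
from collections import Counter

# Ratings: "C" = consistent, "I" = inconsistent, "N" = neutral.
# Hypothetical matrix: each evidence item maps hypotheses to a rating.
MATRIX = {
    "invoice dated after contract ended": {"H1 fraud": "C", "H2 clerical error": "C"},
    "same typo in three separate invoices": {"H1 fraud": "C", "H2 clerical error": "I"},
    "payments routed to a new account": {"H1 fraud": "C", "H2 clerical error": "I"},
    "auditor found no missing funds": {"H1 fraud": "I", "H2 clerical error": "C"},
}


def inconsistency_counts(matrix):
    """Count how many evidence items contradict each hypothesis."""
    counts = Counter()
    for ratings in matrix.values():
        for hypothesis, rating in ratings.items():
            counts[hypothesis] += rating == "I"  # True counts as 1
    return dict(counts)


def rank_hypotheses(matrix):
    """Order hypotheses from least to most contradicted (ties by name)."""
    counts = inconsistency_counts(matrix)
    return sorted(counts, key=lambda h: (counts[h], h))


print(inconsistency_counts(MATRIX))
print(rank_hypotheses(MATRIX))
```

ACH ranks by disconfirming evidence rather than confirming evidence, which is why the sketch counts only the "I" ratings.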

The architecture is a standout feature: it uses a ‘Voltron’ pattern consisting of an immutable core frame called ArkhamFrame and 26 pluggable shards. The ArkhamFrame includes 17 services that provide foundational capabilities, while each shard adds domain-specific features. These shards do not directly import code from one another; instead, they communicate exclusively through event passing. Each shard has its own isolated PostgreSQL schema, which helps avoid schema conflicts and data coupling.

The tech stack is Python-based, with PostgreSQL 14+ as the primary database, enhanced with the pgvector extension to enable hybrid semantic and keyword search capabilities. This is crucial for handling the diverse, unstructured data typical in investigative journalism.
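Hybrid search needs some way to merge a keyword ranking with a vector-similarity ranking. The article does not say how ArkhamMirror does this, so the sketch below uses reciprocal rank fusion, one common choice, purely as an illustration. The document IDs and the constant k=60 are assumptions, and in a real setup the two input lists would come from a full-text query and a pgvector similarity query.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one.

    Each document scores 1 / (k + rank) per list it appears in;
    the scores are summed and sorted in descending order.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists, best match first.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while a document found by only one method still survives in the merged result.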

The system supports multiple large language model (LLM) providers. Locally hosted options like Ollama and LM Studio are supported alongside cloud-based services such as OpenAI and Groq. This flexibility allows users to run the platform in environments with or without reliable internet access or GPU acceleration, gracefully degrading features when AI resources are limited.
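One way to picture graceful degradation is a provider chain that tries local options first and reports back when nothing is reachable. This is a hedged sketch, not ArkhamMirror's actual API: `ProviderUnavailable`, `complete_with_fallback`, and the stub providers are hypothetical names, and no real LLM client is called.

```python
from typing import Callable, Optional, Sequence

Provider = Callable[[str], str]  # takes a prompt, returns generated text


class ProviderUnavailable(Exception):
    """Raised by a provider that cannot be reached (hypothetical name)."""


def complete_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> Optional[tuple[str, str]]:
    """Try providers in order; return (provider_name, text) or None."""
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderUnavailable:
            continue  # fall through to the next configured provider
    return None  # nothing reachable: caller should disable the AI feature


# Stand-in providers so the sketch runs without any network access.
def offline_local(prompt: str) -> str:
    raise ProviderUnavailable("no local LLM server listening")


def stub_cloud(prompt: str) -> str:
    return f"summary of: {prompt}"


result = complete_with_fallback(
    "leaked memo", [("local", offline_local), ("cloud", stub_cloud)]
)
nothing = complete_with_fallback("leaked memo", [("local", offline_local)])
print(result, nothing)
```

Returning `None` instead of raising lets the calling feature decide whether to hide itself or fall back to a non-AI code path.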

Another key part of the platform is graph visualization, with more than 10 layout modes and network analytics that help investigators see relationships and patterns.

The data pipeline follows a clear INGEST→EXTRACT→ORGANIZE→ANALYZE→ACT flow, which aligns with investigative workflows and ensures data moves logically through stages from raw input to actionable insight.
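The five stages can be read as a chain of functions, each taking the previous stage's output. The toy version below only illustrates that ordering. The stage bodies are placeholders invented for this sketch (capitalised words stand in for real entity extraction) and say nothing about what ArkhamMirror's shards actually do at each step.

```python
def ingest(doc):
    # INGEST: normalise the raw input.
    return {**doc, "text": doc["raw"].strip()}


def extract(doc):
    # EXTRACT: naive stand-in for entity extraction (capitalised words).
    words = doc["text"].split()
    return {**doc, "entities": [w.strip(".,") for w in words if w[0].isupper()]}


def organize(doc):
    # ORGANIZE: deduplicate and sort what was extracted.
    return {**doc, "entities": sorted(set(doc["entities"]))}


def analyze(doc):
    # ANALYZE: derive a simple metric.
    return {**doc, "entity_count": len(doc["entities"])}


def act(doc):
    # ACT: produce something a reporter can use.
    names = ", ".join(doc["entities"])
    return {**doc, "report": f"{doc['entity_count']} entities: {names}"}


PIPELINE = [ingest, extract, organize, analyze, act]


def run_pipeline(raw_text):
    doc = {"raw": raw_text}
    for stage in PIPELINE:
        doc = stage(doc)
    return doc


final = run_pipeline("  Acme paid Borealis twice. Acme denied it.  ")
print(final["report"])
```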

the strength of the event-driven modular architecture

The defining technical strength of ArkhamMirror is its extreme modularity through the Voltron architecture. The core frame is immutable, providing stability and a consistent foundation. Shards plug into this frame, but they remain isolated and communicate solely by emitting and listening to events.

This event-driven communication enforces loose coupling, making it easier to develop, test, and deploy shards independently. It also allows non-developers or domain experts to contribute by building shards that focus on specific investigative techniques without risking core system stability.
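A stripped-down illustration of the idea: two shards that never import each other and interact only through a shared bus. Everything here is a sketch under assumptions. `EventBus`, the shard classes, and the `document.ingested` event name are hypothetical, and the bus is synchronous and in-process, whereas a real implementation would more likely be asynchronous.

```python
from collections import defaultdict


class EventBus:
    """Minimal publish/subscribe bus (illustrative, synchronous)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def emit(self, event_type, payload):
        # Copy the list so handlers may subscribe during dispatch.
        for handler in list(self._handlers.get(event_type, [])):
            handler(payload)


class IngestShard:
    """Knows only the bus, not which shards are listening."""

    def __init__(self, bus):
        self.bus = bus

    def ingest(self, doc_id, text):
        self.bus.emit("document.ingested", {"doc_id": doc_id, "text": text})


class EntityShard:
    """Reacts to ingested documents without importing IngestShard."""

    def __init__(self, bus):
        self.seen = []
        bus.subscribe("document.ingested", self.on_ingested)

    def on_ingested(self, payload):
        self.seen.append(payload["doc_id"])


bus = EventBus()
entities = EntityShard(bus)
IngestShard(bus).ingest("doc-1", "Acme paid Borealis twice.")
print(entities.seen)
```

Because neither shard references the other, either one can be removed, replaced, or tested against a fake bus without touching the rest of the system.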

Each shard maintaining its own PostgreSQL schema is a practical design choice. It prevents database schema collisions, enables shard-specific optimizations, and simplifies shard-level data management.
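As a rough sketch of per-shard isolation, each shard could derive its own schema name and create it on startup. The `arkham_` prefix and both function names are assumptions made for this example, not the project's real naming scheme. `CREATE SCHEMA IF NOT EXISTS` is standard PostgreSQL.

```python
import re

# Conservative pattern for an unquoted-safe SQL identifier.
_SAFE_IDENT = re.compile(r"[a-z][a-z0-9_]*")


def shard_schema_name(shard_name: str, prefix: str = "arkham_") -> str:
    """Derive a schema name for a shard (the prefix is hypothetical)."""
    candidate = prefix + shard_name.lower().replace("-", "_")
    if not _SAFE_IDENT.fullmatch(candidate):
        # Refuse anything that could smuggle SQL into the statement.
        raise ValueError(f"unsafe schema name: {candidate!r}")
    return candidate


def create_schema_sql(shard_name: str) -> str:
    """Build the DDL a shard could run before creating its tables."""
    return f'CREATE SCHEMA IF NOT EXISTS "{shard_schema_name(shard_name)}";'


print(create_schema_sql("ach"))
print(create_schema_sql("Graph-Viz"))
```

Tables created inside such a schema cannot collide with a same-named table in another shard, and a whole shard's data can be dropped or backed up by schema.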

The tradeoff here is complexity in orchestration and the need for robust event management to avoid event storms or missed messages. The codebase includes a discovery mechanism that auto-detects installed shards, which aids in managing this complexity.

ArkhamMirror’s support for multiple LLM providers, including locally hosted ones, is another strength. It addresses privacy concerns and reduces dependence on cloud services, either of which can be a dealbreaker for investigative work in sensitive or disconnected environments. The fallback mechanisms keep the platform usable even when AI resources are constrained.

The codebase appears well structured, with a clear separation of concerns between the core frame and the shards. Packaging each shard as a Python package and installing it with pip in editable mode supports iterative development and experimentation.

However, this architecture may introduce latency due to event passing and complexity in debugging cross-shard interactions. Developers need to understand asynchronous event-driven patterns and manage PostgreSQL schemas carefully. The system is also relatively heavyweight due to the multiple services and dependencies.

quick start

prerequisites

  • Python 3.10+
  • Node.js 18+ (for local UI development only)
  • PostgreSQL 14+ with pgvector extension

installation

# Install the Frame
cd packages/arkham-frame
pip install -e .

# Install all shards (or select specific ones)
for dir in ../arkham-shard-*/; do
  pip install -e "$dir"
done

# Install spaCy model
python -m spacy download en_core_web_sm

# Install UI dependencies
cd ../arkham-shard-shell
npm install

configuration

Create a .env file or set environment variables.

Start the Frame API (which auto-discovers installed shards):

python -m uvicorn arkham_frame.main:app --host 127.0.0.1 --port 8100

docker deployment

The README recommends Docker deployment for easier setup, including a setup wizard and tenant creation. It requires a domain name, open ports 80 and 443 for Let’s Encrypt, and Docker with Docker Compose installed.

Additional requirements include a local LLM server (LM Studio, Ollama, or vLLM) and pre-cached embedding models.

verdict

ArkhamMirror is a thoughtfully designed platform for investigative journalism that takes a unique architectural approach with its event-driven, modular Voltron design. This makes it flexible and extensible while maintaining system integrity and supporting domain-specific customizations.

It’s a good fit for teams that need a local-first, privacy-conscious setup and want AI-powered analysis without being locked into cloud providers. The use of PostgreSQL with pgvector for hybrid search adds practical power to data querying.

The tradeoffs include the complexity inherent in event-driven modular systems and the overhead of managing multiple shards and schemas. The system demands familiarity with asynchronous event patterns, Python packaging, and PostgreSQL extensions.

While it’s not a lightweight drop-in solution, for investigative journalists or developers building specialized document intelligence tools, ArkhamMirror offers a robust, extensible platform worth exploring. The codebase’s modularity also invites contributions and custom shard development, which can empower domain experts beyond traditional software engineers.

Overall, ArkhamMirror’s architecture and feature set make it a rare example of a highly modular document intelligence platform that integrates AI thoughtfully and flexibly.


→ GitHub Repo: mantisfury/ArkhamMirror ⭐ 406 · Python