DeepDiagram: Streaming XML-driven AI visualization with multi-agent orchestration

DeepDiagram tackles a common pain point in AI-powered diagram generation: the latency and complexity introduced by traditional tool-calling architectures. Instead of invoking external tools as black boxes, DeepDiagram’s agents stream structured XML tags, which a frontend state machine parses in real time. Users see the AI’s evolving design concept while the diagram code is still being generated, which makes the process more transparent and interactive.

What DeepDiagram does and how it’s architected

DeepDiagram is an open-source AI visualization platform that translates natural language commands into a variety of diagrams. It pairs a React 19 frontend with a FastAPI backend, with agents orchestrated by LangGraph, a multi-agent orchestration framework.

The platform includes six specialized agents, each targeting a distinct diagram format or style: MindMap, Flowchart, Data Chart, Draw.io, Mermaid, and Infographic. These agents work concurrently and independently, generating outputs wrapped in <design_concept> and <code> XML tags. Unlike typical AI pipelines that rely on tool calls or subprocesses, DeepDiagram has its agents emit these tagged fragments continuously as a stream. The frontend listens to the stream and advances its UI state machine as each fragment arrives.

A context-aware intelligent router manages request dispatching, using explicit @mentions to direct commands to specific agents, natural language intent recognition by large language models (LLMs), and conversation continuity heuristics. This router ensures that requests reach the most appropriate agent without manual intervention.
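
The three routing signals described above can be sketched as a simple fallback chain. This is a minimal illustration in Python, not DeepDiagram's actual router: the agent handles, the precedence order (mention, then classifier, then previous agent), the default agent, and the classify callback standing in for the LLM call are all assumptions made for the example.

```python
import re
from typing import Callable, Optional

# Hypothetical @mention handles for the six agents; the real handles may differ.
AGENTS = ("mindmap", "flowchart", "chart", "drawio", "mermaid", "infographic")

MENTION = re.compile(r"@(\w+)")


def route(
    message: str,
    last_agent: Optional[str] = None,
    classify: Optional[Callable[[str], Optional[str]]] = None,
    default: str = "mermaid",  # assumed fallback, chosen only for this sketch
) -> str:
    """Pick an agent name for a user message."""
    # 1. An explicit @mention wins outright.
    for name in MENTION.findall(message):
        if name.lower() in AGENTS:
            return name.lower()

    # 2. Otherwise ask an intent classifier (a stand-in for the LLM call).
    #    It returns None when it cannot decide.
    if classify is not None:
        guess = classify(message)
        if guess in AGENTS:
            return guess

    # 3. Otherwise keep talking to the agent that handled the previous turn.
    if last_agent in AGENTS:
        return last_agent

    return default
```

With this ordering, a follow-up such as "make the nodes blue" stays with the previous agent whenever the classifier returns None, while "@mermaid draw a sequence diagram" always goes to the Mermaid agent.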

The system supports multimodal inputs including images, PDFs, and Office documents, enabling richer context beyond plain text. Persistent session history is stored in a PostgreSQL database with Git-like message branching, allowing users to explore different conversation paths.
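
One common way to get Git-like branching is to store a parent pointer on every message, so that a conversation path is the chain from a leaf back to the root. The sketch below shows that idea with an in-memory store in Python. The Message fields and the History class are hypothetical; the article does not show DeepDiagram's actual PostgreSQL schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    id: int
    parent_id: Optional[int]  # None marks the root of a conversation
    role: str
    content: str


class History:
    """In-memory stand-in for a messages table with a parent_id column."""

    def __init__(self) -> None:
        self.messages: dict[int, Message] = {}
        self._next_id = 1

    def add(self, role: str, content: str, parent_id: Optional[int] = None) -> int:
        if parent_id is not None and parent_id not in self.messages:
            raise KeyError(f"unknown parent message {parent_id}")
        msg = Message(self._next_id, parent_id, role, content)
        self.messages[msg.id] = msg
        self._next_id += 1
        return msg.id

    def children(self, parent_id: int) -> list[int]:
        """Sibling replies to the same parent are alternative branches."""
        return [m.id for m in self.messages.values() if m.parent_id == parent_id]

    def branch(self, leaf_id: int) -> list[Message]:
        """Walk parent pointers from a leaf to the root, returned oldest first."""
        path = []
        current: Optional[int] = leaf_id
        while current is not None:
            msg = self.messages[current]
            path.append(msg)
            current = msg.parent_id
        return path[::-1]
```

Regenerating an answer then means adding a second child under the same user message, and switching branches means picking a different leaf to walk back from.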

Two separate Server-Sent Events (SSE) streams are maintained: one for the AI’s reasoning and design concepts, and another for the actual diagram code, enabling independent rendering and smoother user experience.
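
For readers unfamiliar with the SSE wire format, the sketch below frames and parses events in Python. It assumes the two streams are distinguished by event name, with design_concept and code as hypothetical names; DeepDiagram may instead use separate connections or different names. The parser is deliberately simplified and ignores the id and retry fields and comment lines.

```python
def format_sse(event: str, data: str) -> str:
    """Frame one event: an 'event:' line, one 'data:' line per payload line,
    then a blank line that terminates the event."""
    lines = [f"event: {event}"]
    lines += [f"data: {line}" for line in data.split("\n")]
    return "\n".join(lines) + "\n\n"


def parse_sse(stream: str) -> list[tuple[str, str]]:
    """Split a complete SSE text into (event name, data) pairs."""
    events = []
    for block in stream.split("\n\n"):
        if not block.strip():
            continue
        name, data = "message", []  # 'message' is the default event type
        for line in block.split("\n"):
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                name = value
            elif field == "data":
                data.append(value)
        events.append((name, "\n".join(data)))
    return events
```

A client can then send each pair to a different renderer based on the event name, which is what lets the reasoning panel and the diagram canvas update independently.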

The streaming XML output pattern and multi-agent orchestration

The standout technical feature is the replacement of classical tool-calling with a streaming XML tag output pattern. Traditionally, AI agents might invoke external tools or subprocesses to generate diagrams, which adds latency and complexity in managing subprocess lifecycles and parsing outputs.

Here, each LangGraph agent produces XML fragments that are immediately parsed by the frontend state machine. The tags <design_concept> and <code> encapsulate the AI’s intermediate reasoning and final diagram code respectively. This immediate parsing allows the UI to update progressively, showing design rationale before the diagram fully renders.
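
The core difficulty in parsing such a stream is that a tag can be split across chunks, for example "<design_" followed by "concept>". The sketch below shows one way to handle that with a small state machine. It is written in Python for brevity, whereas DeepDiagram's real parser lives in the TypeScript frontend; the StreamingTagParser class and its method names are hypothetical.

```python
TAGS = ("design_concept", "code")


def _partial_suffix(text: str, marker: str) -> int:
    """Length of the longest suffix of text that is a proper prefix of marker."""
    for n in range(min(len(text), len(marker) - 1), 0, -1):
        if text.endswith(marker[:n]):
            return n
    return 0


class StreamingTagParser:
    """Turns arbitrary chunks into (tag, text) events as soon as text is safe to show."""

    def __init__(self) -> None:
        self.buffer = ""
        self.current = None  # name of the tag we are inside, or None

    def feed(self, chunk: str) -> list[tuple[str, str]]:
        self.buffer += chunk
        events = []
        while True:
            if self.current is None:
                # Outside any tag: look for the earliest opening marker.
                hits = [(self.buffer.find(f"<{t}>"), t) for t in TAGS]
                hits = [(i, t) for i, t in hits if i != -1]
                if not hits:
                    # Keep only a tail that might be the start of an opening tag.
                    keep = max(_partial_suffix(self.buffer, f"<{t}>") for t in TAGS)
                    self.buffer = self.buffer[len(self.buffer) - keep:]
                    break
                idx, tag = min(hits)
                self.current = tag
                self.buffer = self.buffer[idx + len(tag) + 2:]
            else:
                close = f"</{self.current}>"
                idx = self.buffer.find(close)
                if idx == -1:
                    # Emit everything except a tail that might be a split closing tag.
                    keep = _partial_suffix(self.buffer, close)
                    text = self.buffer[:len(self.buffer) - keep]
                    self.buffer = self.buffer[len(self.buffer) - keep:]
                    if text:
                        events.append((self.current, text))
                    break
                if idx > 0:
                    events.append((self.current, self.buffer[:idx]))
                self.buffer = self.buffer[idx + len(close):]
                self.current = None
        return events


def collect(chunks: list[str]) -> dict[str, str]:
    """Feed all chunks and concatenate the emitted text per tag."""
    parser = StreamingTagParser()
    sections = {tag: "" for tag in TAGS}
    for chunk in chunks:
        for tag, text in parser.feed(chunk):
            sections[tag] += text
    return sections
```

Holding back only the characters that could begin a closing tag means a stray "<" inside diagram code, such as Draw.io XML, is delayed by at most one chunk rather than being dropped or misread as markup.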

This pattern pays off in transparency: users see what the AI “thinks” in real time and can follow how the final output is assembled. It also simplifies the backend, which no longer needs a tool-invocation layer.

The tradeoff is increased complexity in the frontend parser and state machine, which must robustly handle partially complete XML fragments and synchronize dual SSE streams. This requires careful engineering to maintain consistency and performance.

The intelligent routing mechanism further distinguishes DeepDiagram. It uses a combination of explicit user @mentions, LLM intent classification, and conversation context to dynamically decide which agent(s) to involve. This reduces user friction and enhances the platform’s flexibility in handling diverse diagram requests.

Quick start

To get DeepDiagram running for development, start the backend first:

cd backend
uv sync                # Install dependencies via uv
bash start_backend.sh  # Runs DB migrations + starts FastAPI server

Backend will be available at http://localhost:8000

Then, in another terminal:

cd frontend
npm install
npm run dev

Frontend will run at http://localhost:5173

For production, Docker and Docker Compose are recommended. You need to create a .env file with configuration including your OpenAI-compatible API keys. The backend uses PostgreSQL for message branching and state persistence.

This setup reflects a typical modern full-stack AI platform with separate backend and frontend services, containerization support, and environment-based configuration.

Verdict

DeepDiagram is relevant for developers and teams exploring AI-driven diagram generation where seeing the AI’s reasoning as it builds the diagram is valuable. Its architecture eschews more common tool-calling patterns in favor of a streaming XML output parsed live on the frontend, which is worth understanding if you work on multi-agent AI systems or interactive visualization tools.

The platform’s multimodal input support and persistent Git-like session branching add practical value for complex workflows. The tradeoff lies in the complexity of managing dual SSE streams and the frontend’s XML parsing state machine, which might pose maintenance challenges.

If you’re building AI tools that require immediate feedback loops and want to avoid the overhead of subprocess calls, DeepDiagram’s approach is worth a look. Its React 19 + FastAPI stack combined with LangGraph orchestration is a solid, modern foundation, though expect to invest in understanding the routing and streaming mechanics well.

Overall, DeepDiagram provides a clear example of how to architect multi-agent AI platforms with transparent, streaming outputs that improve user interaction and system responsiveness.


→ GitHub Repo: LingyiChen-AI/DeepDiagram ⭐ 904 · TypeScript

Noureddine RAMDI / DeepDiagram: Streaming XML-driven AI visualization with multi-agent orchestration
