Noureddine RAMDI / AI Knowledge Graph Generator: Building structured graphs from unstructured text with LLMs

Created Mon, 04 May 2026 10:23:01 +0000 Modified Sat, 23 May 2026 20:41:27 +0000

robert-mcdermott/ai-knowledge-graph

AI Knowledge Graph Generator tackles a common challenge in natural language processing: turning unstructured text into structured, navigable knowledge graphs. It uses a three-phase pipeline that combines text chunking, large language model (LLM) powered triple extraction, and entity standardization plus relationship inference to produce a comprehensive graph representation. The approach is backend-agnostic and configurable, making it a practical tool for knowledge extraction workflows.

How AI Knowledge Graph Generator structures text into knowledge graphs

This project is a Python-based tool designed to transform raw text documents into structured knowledge graphs. The architecture is straightforward yet thoughtfully modular. It operates in three primary phases:

  1. Text chunking: The input text is split into manageable chunks, with configurable overlap to preserve context between segments.

  2. LLM-driven SPO extraction: Each chunk is processed through an LLM to extract SPO (Subject-Predicate-Object) triplets that represent entities and their relationships.

  3. Entity standardization and relationship inference: Extracted entities are aligned and standardized using the LLM itself, addressing naming inconsistencies across chunks without requiring a separate named entity recognition model. Additionally, the tool infers new relationships to connect otherwise disconnected graph communities.

The system supports any OpenAI-compatible API endpoint, including Ollama, LM Studio, OpenAI’s API, vLLM, or LiteLLM, making it flexible and backend-agnostic. The output is an interactive HTML visualization generated using NetworkX for graph operations and the Louvain method for community detection.

Configuration is provided via a TOML file, allowing users to adjust chunk sizes, overlap, LLM parameters, and other pipeline options. The tool can be run as a standalone script or installed as a CLI module, catering to different usage preferences.

What sets this knowledge graph generator apart technically

The standout technical feature is the entity standardization phase, which uses the LLM itself to align and unify entities extracted from different chunks. This addresses a classic problem in knowledge graph extraction: inconsistent naming or entity variants that would otherwise fragment the graph. By doing this without a separate named entity recognition (NER) model, the repo simplifies the pipeline and leverages the LLM’s contextual understanding.

Another strength is the pipeline’s modularity and configurability. Users can tweak chunk overlap, extraction prompts, and inference settings, which provides control over the tradeoff between performance and accuracy.

The backend-agnostic design is practical. By abstracting the LLM API interface, the tool can run on different LLM providers or local LLM servers without code changes, a real advantage for privacy-conscious or resource-limited setups.

The code quality is solid given the complexity. The repo uses Python’s NetworkX for graph management and community detection, which is a reliable and well-understood library. The visualization outputs an interactive HTML file, making exploration of complex graphs easier.

Tradeoffs are clear: relying on LLMs for entity standardization and relationship inference means the quality depends heavily on the model chosen and the prompt design. Also, the chunking strategy, while configurable, may still lose some cross-chunk context, which is a common limitation in chunk-based NLP pipelines.

Quick start

The project requires Python 3.11+ and installs dependencies via pip as follows:

1. Clone this repository
2. Install dependencies: `pip install -r requirements.txt`
3. Configure your settings in `config.toml`
4. Run the system:

python generate-graph.py --input your_text_file.txt --output knowledge_graph.html

Alternatively, using UV:

uv run generate-graph.py --input your_text_file.txt --output knowledge_graph.html

Or install as a module and use the CLI:

pip install --upgrade -e .
generate-graph --input your_text_file.txt --output knowledge_graph.html

Verdict

AI Knowledge Graph Generator is a practical, hands-on tool for developers and researchers looking to convert unstructured text into structured knowledge graphs using LLMs. Its modular three-phase pipeline and backend-agnostic design make it flexible for experimentation and integration into larger workflows.

The main limitation is the dependency on LLM quality and prompt engineering, which affects entity extraction and standardization. It’s not a turnkey solution for perfect knowledge graphs but rather a solid foundation that embraces current LLM capabilities and their tradeoffs.

This repo is relevant for AI practitioners interested in knowledge extraction, NLP engineers experimenting with graph-based representations, and anyone needing to visualize entity relationships interactively. Its configurable design and open architecture encourage adaptation and extension for specific use cases.

Overall, it’s a useful reference implementation of LLM-powered knowledge graph extraction that balances simplicity and functionality without overselling capabilities.


→ GitHub Repo: robert-mcdermott/ai-knowledge-graph ⭐ 2,253 · Python