LuaN1aoAgent: Autonomous penetration testing with P-E-R multi-agent causal graph reasoning

LuaN1aoAgent tackles autonomous penetration testing by splitting the traditionally monolithic AI agent role into three specialized collaborators: a Planner, an Executor, and a Reflector. This P-E-R multi-agent system coordinates through a shared event bus and causal graph reasoning, enabling attack plans to evolve dynamically as directed acyclic graphs (DAGs). The result is an AI pentesting agent that adapts in real time, parallelizes independent steps automatically, and keeps decisions traceable with evidence-backed causal chains.

What LuaN1aoAgent does and how it is architected

At its core, LuaN1aoAgent is an autonomous penetration testing agent built in Python. It uses a novel P-E-R (Planner-Executor-Reflector) multi-agent framework combined with causal graph reasoning to automate complex attack workflows.

The Planner agent does not describe its plan in free-form language. Instead, it outputs structured graph-editing operations (ADD_NODE, UPDATE_NODE, DEPRECATE_NODE) that evolve an attack plan represented as a DAG. This lets the system model dependencies explicitly and parallelize tasks that do not depend on each other. The DAG keeps evolving as new evidence and hypotheses are added during an engagement.
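To make the idea of graph-editing operations concrete, here is a minimal sketch of a plan graph that accepts the three operation names mentioned above. Only the operation names come from the project; the dictionary fields (`op`, `id`, `description`, `depends_on`, `status`), the `PlanNode` and `PlanGraph` classes, and the status values are assumptions made for illustration, not LuaN1aoAgent's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    node_id: str
    description: str
    depends_on: set[str] = field(default_factory=set)
    status: str = "pending"  # illustrative states: pending, done, deprecated


class PlanGraph:
    """Attack plan stored as a DAG: each node lists the nodes it depends on."""

    def __init__(self) -> None:
        self.nodes: dict[str, PlanNode] = {}

    def apply(self, op: dict) -> None:
        """Apply one hypothetical graph-editing operation emitted by a planner."""
        kind, node_id = op["op"], op["id"]
        if kind == "ADD_NODE":
            if node_id in self.nodes:
                raise ValueError(f"duplicate node: {node_id}")
            deps = set(op.get("depends_on", []))
            missing = deps - set(self.nodes)
            if missing:
                raise ValueError(f"unknown dependencies: {sorted(missing)}")
            # A new node may only point at existing nodes and edges never
            # change afterwards, so the graph cannot acquire a cycle.
            self.nodes[node_id] = PlanNode(node_id, op["description"], deps)
        elif kind == "UPDATE_NODE":
            node = self.nodes[node_id]
            node.description = op.get("description", node.description)
            node.status = op.get("status", node.status)
        elif kind == "DEPRECATE_NODE":
            # Deprecated nodes are kept, not deleted, so the history stays auditable.
            self.nodes[node_id].status = "deprecated"
        else:
            raise ValueError(f"unknown operation: {kind}")


plan = PlanGraph()
for operation in [
    {"op": "ADD_NODE", "id": "recon", "description": "Enumerate open ports"},
    {"op": "ADD_NODE", "id": "web", "description": "Probe web service", "depends_on": ["recon"]},
    {"op": "UPDATE_NODE", "id": "recon", "status": "done"},
    {"op": "DEPRECATE_NODE", "id": "web"},
]:
    plan.apply(operation)

print({node_id: node.status for node_id, node in plan.nodes.items()})
```

Because every operation is a small, validated record rather than prose, a malformed plan step can be rejected before anything is executed.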

The Executor agent orchestrates security tools through the Model Context Protocol (MCP), which standardizes how tools are described and invoked. It uses context compression and hypothesis persistence to keep the attack state manageable and consistent across steps.
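For readers unfamiliar with MCP, the sketch below shows the general wire shape of a tool invocation: a JSON-RPC 2.0 request with the `tools/call` method. It is not taken from LuaN1aoAgent's code. The helper `build_tool_call` and the `command` argument are assumptions for illustration; `shell_exec` is one of the tool names the project mentions, but its real argument schema may differ.

```python
import itertools
import json

# Monotonic request ids, as JSON-RPC expects each request to carry its own id.
_request_ids = itertools.count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP-style `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical invocation: the argument name "command" is an assumption.
message = build_tool_call("shell_exec", {"command": "id"})
print(message)
```

In practice an MCP client library would handle transport and response parsing; the point here is only that every tool is called through the same envelope.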

The Reflector agent performs four levels (L1-L4) of failure attribution, analyzing errors and preventing repeated mistakes. This autonomous recovery improves robustness and efficiency.

Crucially, all decisions and actions are tied to a causal graph structured as Evidence → Hypothesis → Vulnerability → Exploit, with a confidence score at each step. This evidence-first approach is meant to curb the hallucination problem common in large language models: an attack step without supporting evidence has no place in the graph, so every step stays auditable and traceable.

The entire system is implemented in Python 3.10+, leveraging asynchronous programming for concurrency. It supports integration with LLM APIs compatible with OpenAI formats, including GPT-4o, DeepSeek, Claude-3.5, and others.

Technical strengths and design tradeoffs

The standout feature of LuaN1aoAgent is its multi-agent P-E-R framework with causal graph reasoning. Unlike monolithic LLM agents that handle planning, execution, and reflection in a single role, this design splits concerns into distinct cognitive roles:

The Planner focuses purely on evolving the attack graph plan.
The Executor handles tool orchestration and state management.
The Reflector performs error analysis and recovery.

This separation avoids the “split personality” problem where a single LLM agent struggles to maintain consistent context and roles, leading to errors or inefficiencies.

Having the Planner emit structured graph-editing operations rather than free-form natural language is a clever design choice. It turns the attack plan into a living DAG, and a topological sort of that DAG tells the system which tasks have no unmet dependencies and can therefore run in parallel. This makes the attack process more efficient and lets it adapt as new vulnerabilities are discovered during a run.
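The link between topological sorting and parallelism can be sketched with Python's standard library. The example below uses `graphlib.TopologicalSorter` to find tasks whose dependencies are satisfied and runs each such batch concurrently with `asyncio.gather`. The task names and the `run_task` stand-in are hypothetical, and this is not the project's scheduler.

```python
import asyncio
from graphlib import TopologicalSorter


async def run_task(name: str, log: list[str]) -> None:
    """Stand-in for a real tool invocation."""
    await asyncio.sleep(0.01)
    log.append(name)


async def run_plan(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Run a DAG in waves; every task in a wave runs concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the plan is not a DAG
    waves: list[list[str]] = []
    log: list[str] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks with no unmet dependencies
        await asyncio.gather(*(run_task(name, log) for name in ready))
        sorter.done(*ready)  # unlocks tasks that depended on this wave
        waves.append(ready)
    return waves


# Each key maps to the set of tasks it depends on.
plan = {
    "port_scan": set(),
    "dir_bruteforce": {"port_scan"},
    "banner_grab": {"port_scan"},
    "exploit_attempt": {"dir_bruteforce", "banner_grab"},
}
waves = asyncio.run(run_plan(plan))
print(waves)
# [['port_scan'], ['banner_grab', 'dir_bruteforce'], ['exploit_attempt']]
```

Running in waves is the simplest variant. A finer-grained scheduler would call `done()` as each task finishes, so a dependent task can start without waiting for unrelated tasks in the same wave.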

Using causal graph reasoning with explicit Evidence → Hypothesis → Vulnerability → Exploit chains is another technical highlight. This enforces rigor and traceability, which is critical in penetration testing where audit trails and reproducibility matter. It also reduces LLM hallucinations by requiring evidence-backed reasoning chains.

Routing tool orchestration through MCP standardizes communication with external security tools, which improves modularity and extensibility. Context compression shortens what is sent to the LLM on each step, which helps keep interactions cost-effective and responsive.

The tradeoffs are clear, though. Running high-privilege tools like shell_exec and python_exec poses security risks, so the authors strongly recommend isolated environments such as Docker containers or VMs. The system requires at least 4 GB of RAM (8 GB or more recommended) because of the demands of RAG services and LLM inference.

The reliance on external LLM APIs means costs and latency are factors to consider, although the reported median exploit cost is low ($0.09), indicating efficient usage.

Overall, the codebase balances sophistication with practical constraints, and the architecture is modular enough to extend or adapt to new tools and models.

Quick start

System requirements

Component	Requirements	Notes
Operating System	Linux (recommended) / macOS / Windows (WSL2)	Recommended to run in isolated environments
Python	3.10+	Requires support for asyncio and type hints
LLM API	OpenAI compatible format	Supports GPT-4o, DeepSeek, Claude-3.5, etc.
Memory	Minimum 4GB, recommended 8GB+	RAG services and LLM inference require memory
Network	Internet connection	Access to LLM APIs and knowledge base updates

⚠️ Security Notice: LuaN1ao includes high-privilege tools like shell_exec and python_exec. Running it in Docker containers or virtual machines is strongly recommended to avoid potential risks to the host system.

Installation

# Install dependencies
pip install -r requirements.txt

Configuration

Set environment variables as required by your LLM API provider and other settings (refer to the README for details).
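Before a first run, it can help to check that the expected settings are present. The sketch below does that in a generic way. The variable names `LLM_API_KEY`, `LLM_API_BASE_URL`, and `LLM_MODEL` are placeholders chosen for illustration; the project's README defines the real names.

```python
import os

# Hypothetical variable names; replace them with the ones listed in the README.
REQUIRED_VARS = ("LLM_API_KEY", "LLM_API_BASE_URL", "LLM_MODEL")


def missing_settings(env: dict[str, str], required: tuple[str, ...] = REQUIRED_VARS) -> list[str]:
    """Return the required variables that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_settings(dict(os.environ))
if missing:
    print("Set these before starting:", ", ".join(missing))
else:
    print("All required settings are present.")
```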

With the dependencies installed and the environment configured, LuaN1aoAgent is ready to run.

How to navigate and explore LuaN1aoAgent

If you want to dive deeper, start with the README to understand the full architecture and usage details.

Look into the planner/, executor/, and reflector/ directories to see how the three agents are implemented and interact.

The causal graph management code and MCP protocol handlers are key components to study for understanding how attack plans are modeled and how tools are invoked.

Check out example configurations and benchmark scripts to see how the system evaluates success rate and exploit cost.

The modular design means you can extend the framework by adding new tools or improving the causal reasoning modules.

Verdict

LuaN1aoAgent is aimed at security researchers and penetration testers interested in autonomous AI-driven testing workflows that are auditable and adaptable.

Its P-E-R multi-agent framework with causal graph reasoning is a thoughtful approach to overcoming limitations of single-agent LLM pentesters. The clear separation of planning, execution, and reflection roles, combined with structured DAG planning, sets a solid foundation for reliable automated attacks.

The repo demands careful attention to security best practices due to high-privilege tool usage, and requires a moderate hardware footprint.

While it’s not a plug-and-play tool for casual users, for teams working on advanced autonomous pentesting, LuaN1aoAgent offers a sophisticated platform that balances innovation with practical constraints. The evidence-first causal graph approach also means every attack step is traceable, a critical feature for real-world usage.

For anyone exploring AI-assisted penetration testing, LuaN1aoAgent is worth studying and experimenting with, especially if you want to understand how to architect multi-agent cooperation and rigorous causal reasoning in this domain.


→ GitHub Repo: SanMuzZzZz/LuaN1aoAgent ⭐ 923 · Python

Noureddine RAMDI / LuaN1aoAgent: Autonomous penetration testing with P-E-R multi-agent causal graph reasoning