pentest-agents: a cross-IDE autonomous bug bounty framework with multi-agent AI

pentest-agents is a large autonomous bug bounty framework that deploys 50 specialized AI agents across seven AI coding platforms. It addresses cross-IDE agent portability with a translator layer that renders a single set of Claude Code-native agent definitions into the native format of each other platform: Codex, Gemini, Cursor, Windsurf, Copilot, and OpenClaw.

What pentest-agents does and how it is architected

At its core, this project is a Python-based autonomous penetration testing suite designed to automate bug bounty workflows at scale. The framework ships with about 118,000 lines of code across roughly 760 files, including 50 distinct AI agents, 26 slash commands, and 19 CLI tools.

The architecture revolves around several key components:

Multi-agent system: 50 specialized AI agents operate autonomously, each designed for specific tasks in the bug bounty lifecycle.
Multi-IDE portability layer: The framework supports seven AI coding tools: Claude Code, OpenAI Codex, Gemini, Cursor, Windsurf, Copilot, and OpenClaw. It does this by keeping a single “source of truth” of agents in Claude Code format and translating them into each target’s native agent format.
Autonomous hunt loops: The agents run autonomous hunt loops capable of building A→B exploit chains, enabling complex multi-step attack workflows.
7-Question Gate: A triage step that checks each finding against seven questions before it moves further through the workflow.
Persistent endpoint brain: A brain module tracks endpoints continuously to maintain state and understanding across sessions.
MCP servers: Two MCP servers connect to 16 bug bounty platforms and support bring-your-own (BYO) semantic writeup search.
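
To make the gate idea concrete, here is a minimal sketch of an all-must-pass triage check. The article does not list the seven questions, so the question texts, the Finding class, and passes_gate are hypothetical placeholders, not the project's actual implementation.

```python
from dataclasses import dataclass, field

# Hypothetical question set: the article only states that there are seven.
GATE_QUESTIONS = {
    "in_scope": "Is the asset inside the program's scope?",
    "reproducible": "Does the issue reproduce from a clean session?",
    "impact": "Is there a concrete security impact?",
    "realistic": "Does it work without unrealistic preconditions?",
    "not_duplicate": "Is it distinct from already known issues?",
    "has_poc": "Is there a working proof of concept?",
    "unintended": "Is the behavior unintended rather than documented?",
}


@dataclass
class Finding:
    title: str
    answers: dict = field(default_factory=dict)  # question id -> bool


def passes_gate(finding):
    """Return (passed, failed_ids); an unanswered question counts as a failure."""
    failed = [qid for qid in GATE_QUESTIONS if finding.answers.get(qid) is not True]
    return len(failed) == 0, failed
```

Treating a missing answer as a failure keeps the gate conservative: a finding only moves on once every question has been answered explicitly with yes.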

The project includes about 2,500 lines of payloads embedded within the agents, and the entire system integrates cost tracking hooks for Claude Code to monitor LLM usage.

This architecture supports a high degree of automation and cross-platform flexibility, making it possible to run the same logical agents across very different AI coding environments with minimal manual adaptation.
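
The persistent endpoint brain listed above can be pictured as a small store that survives between sessions. The sketch below assumes a JSON file on disk. The EndpointBrain class, its methods, and the file layout are illustrative assumptions, not the project's actual data model.

```python
import json
from pathlib import Path


class EndpointBrain:
    """Hypothetical endpoint store that reloads its state from a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        # Load earlier observations if a previous session saved any.
        if self.path.exists():
            self.endpoints = json.loads(self.path.read_text())
        else:
            self.endpoints = {}

    def observe(self, method, url, note):
        """Record one sighting of an endpoint and attach a note once."""
        key = f"{method.upper()} {url}"
        entry = self.endpoints.setdefault(key, {"seen": 0, "notes": []})
        entry["seen"] += 1
        if note not in entry["notes"]:
            entry["notes"].append(note)

    def save(self):
        """Write the current state back to disk for the next session."""
        self.path.write_text(json.dumps(self.endpoints, indent=2, sort_keys=True))
```

A second EndpointBrain created with the same path after save() starts with everything the first one recorded, which is the property that lets an agent resume a hunt instead of rediscovering endpoints.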

The multi-IDE portability layer and its technical tradeoffs

What truly distinguishes pentest-agents is the multi-provider agent translator. Instead of implementing agents separately for each AI coding tool, the framework uses Claude Code as the canonical agent definition format. From this single source, the system auto-generates native agents for six other platforms.

This translator handles several key adaptations:

Model field stripping: Non-essential or Claude-specific model fields are removed for compatibility.
Prose normalization: Claude Code-specific prose patterns are rewritten to fit the conventions of each target platform.
Path rewriting: File and resource path references are remapped to work within the target environment.
Body-length caps: For example, Copilot agents have their bodies truncated at 30k characters to fit platform constraints.
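
The adaptations above can be sketched as one small function. This is an illustration under stated assumptions, not the project's translator: it assumes agent files carry a simple `---` delimited header with `key: value` lines, and the names translate_agent, path_map, and prose_rules are hypothetical. Only the 30k cap for Copilot comes from the article.

```python
import re

COPILOT_BODY_CAP = 30_000  # character cap mentioned in the article


def split_header(text):
    """Split a '---' delimited header from the body (assumed file layout)."""
    if text.startswith("---\n"):
        end = text.find("\n---\n", 3)
        if end != -1:
            return text[4:end].splitlines(), text[end + 5:]
    return [], text


def translate_agent(text, path_map, prose_rules=(), body_cap=None):
    """Render one Claude Code style agent file for another target."""
    header, body = split_header(text)
    # Model field stripping: drop the model line from the header.
    header = [line for line in header if not line.startswith("model:")]
    # Path rewriting: plain substring replacement of known prefixes.
    for old, new in path_map.items():
        body = body.replace(old, new)
    # Prose normalization: a crude stand-in using regex substitutions.
    for pattern, replacement in prose_rules:
        body = re.sub(pattern, replacement, body)
    # Body-length cap: truncate when the target imposes a limit.
    if body_cap is not None:
        body = body[:body_cap]
    prefix = "---\n" + "\n".join(header) + "\n---\n" if header else ""
    return prefix + body
```

Calling translate_agent(source, {".claude/": ".github/"}, body_cap=COPILOT_BODY_CAP) would produce a capped variant. The directory names in that call are placeholders. Hard truncation is the simplest way to meet a cap, though it can cut an agent off mid-instruction.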

This cross-IDE portability layer is a significant engineering effort. It keeps development and maintenance consistent: an update is made once in the Claude Code source, then regenerated for the other platforms.

The tradeoff is complexity: the translator must keep up with each platform’s subtle differences and limitations. It also means that some platform-specific optimizations or features might be harder to leverage fully, as the agents must remain compatible across all targets.

Given the scale, the project appears well organized: it ships extensive tooling and CLI utilities to manage the agents, run autonomous loops, and interface with multiple bug bounty platforms, and its skills and commands are split into modules.

Quick start

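The README provides commands for environment setup, workspace scaffolding, and running agents. The first lines run in a shell; the lines that begin with a slash are slash commands entered inside the Claude Code session started by claude: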

# MCP servers are launched via `uv run --with mcp` — no global pip install required.
export HACKERONE_USERNAME=you HACKERONE_TOKEN=your_token
uv run python3 tools/scaffold.py hackerone tesla
cd ~/bounties/hackerone-tesla && claude
/model opus             # Opus 4.7 [1M] — subagents inherit via model: "inherit"
/sync hackerone tesla
/brain init && /status
/hunt tesla.com

The scaffold.py tool provisions workspaces for all supported clients, not just Claude Code, generating the necessary directories and config files to resolve paths inside the bounty workspace.

For installation, the framework offers two modes:

The first mode uses the pre-rendered bundle for a supported AI coding tool directly, with no install step:

git clone https://github.com/H-mmer/pentest-agents-suite
cd pentest-agents-suite/pentest-agents/providers/codex
codex  # or: cd ../gemini && gemini, etc.

The second mode runs the installer to install agents into your own project or globally:

python3 -m tools.installer install --targets all --scope project
python3 -m tools.installer install --targets codex --scope global

The installer rewrites paths so they work regardless of where your project lives, keeping stable references back to the cloned pentest-agents repo.
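
One way to picture that path rewriting is to anchor repo-relative references to an absolute repo root. The sketch below is an assumption about the general technique, not the installer's code: anchor_repo_paths and the agents/ prefix are hypothetical, while tools/ appears in the quick start commands.

```python
import re
from pathlib import PurePosixPath


def anchor_repo_paths(text, repo_root, prefixes=("tools/", "agents/")):
    """Prefix repo-relative paths with the absolute location of the clone."""
    root = PurePosixPath(repo_root)
    alternation = "|".join(re.escape(p) for p in prefixes)
    # Match a prefix only where it starts a path token, so paths that are
    # already absolute (preceded by '/') are left untouched.
    pattern = re.compile(rf"(?<![\w./-])({alternation})")
    return pattern.sub(lambda m: f"{root}/{m.group(1)}", text)
```

Because already-anchored paths are skipped, running the function twice gives the same result as running it once, which matters if an installer is re-run over the same files.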

Verdict

pentest-agents is a heavyweight autonomous bug bounty framework built for practitioners who want to run AI-driven pentesting workflows across multiple AI coding platforms with minimal manual adaptation.

Its multi-IDE portability layer is a notable technical achievement: one source of agent definitions serves seven distinct AI coding environments. This reduces maintenance overhead but adds translator complexity and ties the agents to platform constraints such as Copilot's body-length cap.

The scale of the codebase and the number of agents mean it takes commitment to understand and operate. It is not a lightweight or plug-and-play solution: it demands familiarity with AI coding tools, autonomous agent design, and bug bounty workflows, as well as comfort managing multi-agent systems across several platforms. For teams that want to build or run autonomous bug bounty operations at scale, though, it offers a rich toolkit.

For anyone interested in autonomous pentesting, AI agent orchestration, or cross-platform AI tool integration, pentest-agents is worth exploring. The project pushes the boundary of what can be automated in bug bounty hunting using AI, especially with its persistent endpoint brain and exploit chain building capabilities.


→ GitHub Repo: H-mmer/pentest-agents ⭐ 508 · Python

Noureddine RAMDI / pentest-agents: a cross-IDE autonomous bug bounty framework with multi-agent AI
