Noureddine RAMDI / npcpy: enforcing AI behavioral compliance through architecture for multimodal LLM apps

Created Sat, 23 May 2026 20:41:14 +0000 Modified Sat, 23 May 2026 20:41:27 +0000

NPC-Worldwide/npcpy

npcpy takes a fundamentally different approach to AI system safety and composability. Instead of relying on prompt engineering to enforce behavioral constraints on large language model (LLM) agents, it builds those constraints directly into the software architecture through its NPC Context-Agent-Tool data layer. This structural enforcement aims to make agent behavior more deterministic, auditable, and robust across different environments.

what npcpy does and how it’s built

At its core, npcpy is a Python library that provides composable primitives for building multimodal LLM applications, agentic AI systems, and knowledge graph pipelines. It abstracts interactions with multiple LLM providers, both local and cloud-based, behind a unified interface. Supported local providers include Ollama, llama.cpp, LM Studio, and omlx, while cloud providers such as OpenAI, Anthropic, and Gemini are integrated via API keys.

The architectural centerpiece is the NPC Context-Agent-Tool data layer, a structured communication and control model that enforces behavioral compliance by design rather than by relying on prompt guardrails. This design aims to reduce unpredictability in agent behavior and improve auditability by embedding compliance rules in the software layers.
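To make "compliance by design" concrete, here is a minimal sketch of the general idea, not npcpy's actual data layer. The names `SketchContext`, `SketchAgent`, and `ToolNotPermitted` are hypothetical and exist only for illustration. The point is that the permission check lives in the dispatch code, so no prompt wording can talk the agent past it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet


class ToolNotPermitted(Exception):
    """Raised when an agent requests a tool outside its allowlist."""


@dataclass(frozen=True)
class SketchContext:
    # Hypothetical: the set of tool names this context permits.
    allowed_tools: FrozenSet[str]


@dataclass
class SketchAgent:
    # Hypothetical agent shape; npcpy's real classes may look different.
    name: str
    context: SketchContext
    tools: Dict[str, Callable[..., str]] = field(default_factory=dict)

    def call_tool(self, tool_name: str, *args, **kwargs) -> str:
        # The boundary is enforced here, in code, before any tool runs.
        if tool_name not in self.context.allowed_tools:
            raise ToolNotPermitted(f"{self.name} may not call {tool_name!r}")
        if tool_name not in self.tools:
            raise KeyError(f"unknown tool {tool_name!r}")
        return self.tools[tool_name](*args, **kwargs)


def read_file_stub(path: str) -> str:
    # Stand-in tool that does not touch the filesystem.
    return f"(contents of {path})"


def shell_stub(cmd: str) -> str:
    # Stand-in tool that does not execute anything.
    return f"(ran {cmd})"


reader = SketchAgent(
    name="reader",
    context=SketchContext(allowed_tools=frozenset({"read_file"})),
    tools={"read_file": read_file_stub, "shell": shell_stub},
)
```

Here `reader.call_tool("read_file", "notes.txt")` succeeds, while `reader.call_tool("shell", "ls")` raises `ToolNotPermitted` even though the shell tool is registered.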

npcpy ships with several agent archetypes to cover different use cases:

  • Base Agent: Comes with built-in tools for shell command execution, Python code execution, file editing, and web search.
  • ToolAgent: Allows users to attach arbitrary custom Python functions as tools, facilitating extensibility.
  • CodingAgent: Automatically executes code blocks generated by the LLM, useful for coding-related tasks.
  • NPCArray: Orchestrates multi-agent debates where multiple agents interact and respond to each other’s outputs, enabling more complex multi-agent workflows.
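As a rough illustration of what attaching plain Python functions as tools can involve, the sketch below registers a function and derives a small description from its signature and docstring. `ToolRegistry` and `word_count` are hypothetical names, and this is not how npcpy's ToolAgent is implemented. It only shows the general pattern using the standard library.

```python
import inspect
from typing import Any, Callable, Dict


class ToolRegistry:
    """Hypothetical registry; not npcpy's actual ToolAgent implementation."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, func: Callable[..., Any]) -> Callable[..., Any]:
        # Usable as a decorator; the function name becomes the tool name.
        self._tools[func.__name__] = func
        return func

    def describe(self) -> Dict[str, Dict[str, Any]]:
        # Collect each tool's docstring and parameter names, the kind of
        # metadata a prompt or a provider's tool schema could be built from.
        out: Dict[str, Dict[str, Any]] = {}
        for name, func in self._tools.items():
            params = list(inspect.signature(func).parameters)
            out[name] = {"doc": inspect.getdoc(func) or "", "params": params}
        return out

    def call(self, name: str, **kwargs: Any) -> Any:
        # Dispatch by name with keyword arguments.
        return self._tools[name](**kwargs)


registry = ToolRegistry()


@registry.register
def word_count(text: str) -> int:
    """Count whitespace-separated words in text."""
    return len(text.split())
```

A model-facing layer would then translate `registry.describe()` into whatever tool format the chosen provider expects.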

The stack is pure Python, so it integrates easily into existing Python projects. Optional extras add local inference engines, text-to-speech (TTS), speech-to-text (STT), and API provider libraries, so the same package covers several deployment scenarios.

architectural enforcement of behavioral compliance and agent orchestration

What distinguishes npcpy is its insistence on enforcing compliance and control through architectural design rather than prompt engineering alone. Most LLM agent frameworks treat prompts as the primary method to influence agent behavior, which can be brittle and opaque. npcpy’s NPC data layer models interactions between Context, Agents, and Tools explicitly, allowing the software to govern what an agent can or cannot do.

This approach makes it possible to audit agent decisions and outputs systematically. Instead of hoping that a prompt will prevent unwanted behaviors, npcpy’s architecture provides deterministic boundaries enforced by the code itself.
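One way to picture systematic auditing is a dispatcher that records every tool request, allowed or denied, as a structured record. The sketch below assumes nothing about npcpy's internals. `AuditRecord` and `AuditedDispatcher` are hypothetical names, and the JSON-lines export is just one convenient format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List, Optional, Set


@dataclass
class AuditRecord:
    agent: str
    tool: str
    allowed: bool
    detail: str


class AuditedDispatcher:
    """Hypothetical dispatcher that logs every decision it makes."""

    def __init__(
        self,
        permissions: Dict[str, Set[str]],
        tools: Dict[str, Callable[[str], str]],
    ) -> None:
        self.permissions = permissions
        self.tools = tools
        self.log: List[AuditRecord] = []

    def dispatch(self, agent: str, tool: str, arg: str) -> Optional[str]:
        permitted = tool in self.permissions.get(agent, set())
        if not permitted or tool not in self.tools:
            # Denials are recorded too, so reviewers can see what was attempted.
            self.log.append(AuditRecord(agent, tool, False, "denied"))
            return None
        result = self.tools[tool](arg)
        self.log.append(AuditRecord(agent, tool, True, "ok"))
        return result

    def export(self) -> str:
        # One JSON object per line, in the order decisions were made.
        return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in self.log)


dispatcher = AuditedDispatcher(
    permissions={"writer": {"echo"}},
    tools={"echo": lambda s: s.upper(), "shell": lambda s: s},
)
echo_result = dispatcher.dispatch("writer", "echo", "hi")
shell_result = dispatcher.dispatch("writer", "shell", "ls")
```

After these two calls the log holds one allowed and one denied record, which is the kind of trail that can be reviewed independently of any prompt text.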

The agent archetypes provided showcase different tradeoffs:

  • The Base Agent covers common operational needs with built-in tools, offering a ready-to-go solution.
  • The ToolAgent opens the door to customization but requires users to implement and register their own tools, adding complexity.
  • The CodingAgent automates the execution of LLM-generated code, which means it can also run unintended or unsafe code, so it needs careful sandboxing.
  • The NPCArray multi-agent debate suits scenarios that need multiple perspectives or consensus-building, but managing agent interactions adds complexity, and the shared state can grow combinatorially.
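To get a feel for why multi-agent debates grow quickly, here is a toy round-robin loop with stub agents in place of LLM calls. `make_stub` and `run_debate` are hypothetical helpers, not the NPCArray API. Each stub simply reports how much transcript it was handed.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (speaker name, text)
StubAgent = Callable[[List[Message]], str]


def make_stub(name: str) -> StubAgent:
    # A stand-in for an LLM-backed agent: it only reports its input size.
    def reply(transcript: List[Message]) -> str:
        return f"{name} has seen {len(transcript)} messages"

    return reply


def run_debate(agents: Dict[str, StubAgent], rounds: int) -> List[Message]:
    transcript: List[Message] = []
    for _ in range(rounds):
        for name, agent in agents.items():
            # Every agent reads the full transcript so far before replying.
            transcript.append((name, agent(list(transcript))))
    return transcript


agents = {"optimist": make_stub("optimist"), "skeptic": make_stub("skeptic")}
transcript = run_debate(agents, rounds=2)
```

In this sketch each turn reads everything said before it, so the total amount of text read grows roughly quadratically with the number of turns. That is a simplified view of the state growth the last bullet warns about.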

Under the hood, the codebase is surprisingly clean for a project juggling multiple providers and complex agent orchestration. The modular design and clear separation of concerns make it approachable, though fully mastering the NPC data layer will take some time.

quick start with npcpy

The installation is straightforward with pip and supports several optional extras depending on your target setup:

pip install npcpy                # base
pip install "npcpy[lite]"        # + API provider libraries
pip install "npcpy[local]"       # + ollama, diffusers, transformers, airllm
pip install "npcpy[yap]"         # + TTS/STT
pip install "npcpy[all]"         # everything (quotes keep shells like zsh from globbing the brackets)

System dependencies vary by platform. On Linux, install system packages such as espeak, portaudio19-dev, and ffmpeg, then install Ollama and pull a model:

sudo apt-get install espeak portaudio19-dev python3-pyaudio ffmpeg libcairo2-dev libgirepository1.0-dev
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen3.5:2b

On macOS, the equivalent uses Homebrew:

brew install portaudio ffmpeg pygobject3 ollama
brew services start ollama
ollama pull qwen3.5:2b

Windows users install Ollama and ffmpeg manually and pull the model similarly.

API keys for cloud providers go into a .env file or environment variables:

export OPENAI_API_KEY="your_key"
export ANTHROPIC_API_KEY="your_key"
export GEMINI_API_KEY="your_key"
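Before wiring up an agent, it can help to check which of these keys are actually visible to your Python process. The helper below is a small standalone sketch using only the standard library. The environment variable names are the ones listed above, while the provider labels and the `configured_providers` function are illustrative and not part of npcpy.

```python
import os
from typing import Dict, List, Mapping

# Variable names as listed above; the provider labels are illustrative.
KEY_VARS: Dict[str, str] = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def configured_providers(env: Mapping[str, str] = os.environ) -> List[str]:
    # A provider counts as configured only if its key is set and non-blank.
    return [name for name, var in KEY_VARS.items() if env.get(var, "").strip()]


if __name__ == "__main__":
    print("Cloud providers with keys set:", configured_providers())
```

Passing a plain dict instead of `os.environ` makes the function easy to test without touching your real environment.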

From here, you can start building agents using the provided archetypes, attach tools, and orchestrate multi-agent workflows.

who should consider npcpy and its limitations

npcpy is a solid fit for developers building complex AI agent systems where control, compliance, and auditability are paramount. If you’ve struggled with brittle prompt engineering to enforce agent constraints or want a more deterministic multi-agent system, npcpy’s architectural approach is worth understanding.

That said, the approach comes with complexity overhead. The NPC data layer and multi-agent orchestrations require a solid grasp of the abstractions and may be overkill for simple single-agent tasks or straightforward prompt-based applications.

Also, while npcpy supports multiple local and cloud providers, the quality and performance of your agents heavily depend on those underlying LLMs and inference setups. Running local models requires significant system resources and setup, and cloud APIs add cost and latency considerations.

The automatic code execution of the CodingAgent is useful but can introduce security risks if not sandboxed properly. The multi-agent debates via NPCArray can become complex to debug or tune.
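As a starting point for containing generated code, the sketch below extracts Python blocks from a model reply and runs each one in a separate interpreter process with a timeout and a throwaway working directory. This is not npcpy's CodingAgent and it is not a real sandbox: it does not restrict filesystem or network access, so stronger isolation (containers, VMs, restricted users) is still needed for untrusted code. `extract_python_blocks` and `run_isolated` are hypothetical helper names.

```python
import re
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List

FENCE = "`" * 3  # built programmatically to avoid a literal fence in this example
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def extract_python_blocks(llm_output: str) -> List[str]:
    # Pull out the bodies of fenced python blocks from a model reply.
    return [body.strip() for body in CODE_BLOCK.findall(llm_output)]


def run_isolated(code: str, timeout_s: float = 5.0) -> subprocess.CompletedProcess:
    # Run the code in a child interpreter, inside a temporary directory.
    # subprocess.TimeoutExpired is raised if the child exceeds timeout_s.
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        return subprocess.run(
            [sys.executable, "-I", str(script)],  # -I: isolated mode
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )


reply = "Here is the code:\n" + FENCE + "python\nprint(2 + 3)\n" + FENCE + "\n"
blocks = extract_python_blocks(reply)
result = run_isolated(blocks[0])
```

The captured `result.stdout`, `result.stderr`, and `result.returncode` can then be fed back to the model or logged, instead of letting generated code run inside the agent's own process.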

Overall, npcpy is a thoughtfully designed toolkit for practitioners who want to embed compliance and behavioral rules in AI agent architectures rather than relying solely on prompt engineering. It’s worth exploring if you need reliable, auditable agent systems or are experimenting with multi-agent AI workflows that include tool use and debate.

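# Example: simple instantiation of a base agent
# Note: the import path and class name are as given in this write-up;
# check them against the README of the npcpy version you have installed.
from npcpy.agent import BaseAgent

agent = BaseAgent()  # default configuration
response = agent.run("What is the weather today?")  # send one prompt
print(response)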

This snippet barely scratches the surface but shows how you might start working with the base agent. From here, you can attach tools, switch LLM providers, or orchestrate multiple agents.

The codebase’s modular design and clean abstractions make it approachable for Python developers comfortable with AI and agent design patterns.

npcpy is not for everyone, but for those building serious agentic AI systems with compliance needs, it offers a unique architectural stance that’s worth understanding and testing in your projects.


→ GitHub Repo: NPC-Worldwide/npcpy ⭐ 1,363 · Python