OpenShell: Securing AI agents with runtime-policy sandboxing from NVIDIA

OpenShell addresses a pressing challenge in AI agent deployment: safely running autonomous agents like Claude, Codex, or Copilot inside isolated containers with strong security policies. It’s an alpha-stage runtime from NVIDIA that goes beyond simple sandboxing by layering multiple defenses and enabling dynamic policy updates without downtime.

What OpenShell does and how it’s built

At its core, OpenShell is a runtime environment designed to run autonomous AI agents inside sandboxed containers while enforcing strict security policies. The project is implemented in Rust, a natural fit given the language's focus on memory safety and performance. It supports multiple compute backends, including Docker, Podman, MicroVM, and Kubernetes, which makes it flexible enough for a range of development and deployment scenarios.

OpenShell applies a defense-in-depth strategy by enforcing four protection layers:

Filesystem policies: Restricting access to the host and container file systems based on declarative rules.
Network policies: Fine-grained outbound and inbound network controls, including Layer 7 filtering.
Process policies: Managing which processes can run and their privileges.
Inference policies: Governing AI inference calls, e.g., to external LLMs.

These policies are defined in YAML and can be hot-reloaded at runtime for network and inference layers, allowing changes without restarting the sandbox containers. This capability is a practical security primitive, addressing the common operational headache of updating policies without downtime.
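To make the hot-reload idea concrete, here is a minimal Python sketch of a policy holder that swaps only the reloadable layers in place. It is not OpenShell's implementation (which is in Rust): the `PolicyStore` class, the layer keys, and the choice to reject changes to the filesystem and process layers during a reload are all assumptions made for illustration.

```python
import copy
import threading

# Hypothetical layer names mirroring the four layers described above.
STATIC_LAYERS = ("filesystem", "process")    # assumed fixed for a running sandbox
DYNAMIC_LAYERS = ("network", "inference")    # hot-reloadable per the article


class PolicyStore:
    """Holds the active policy and swaps the reloadable layers atomically."""

    def __init__(self, policy: dict):
        self._lock = threading.Lock()
        self._policy = copy.deepcopy(policy)

    def snapshot(self) -> dict:
        # Return an independent copy so callers cannot mutate the active policy.
        with self._lock:
            return copy.deepcopy(self._policy)

    def reload(self, new_policy: dict) -> None:
        with self._lock:
            # Assumption: static layers may not change without a restart.
            for layer in STATIC_LAYERS:
                if new_policy.get(layer) != self._policy.get(layer):
                    raise ValueError(f"{layer} policy cannot be changed by a reload")
            # Build the updated policy first, then swap it in as one assignment.
            updated = dict(self._policy)
            for layer in DYNAMIC_LAYERS:
                updated[layer] = copy.deepcopy(new_policy.get(layer))
            self._policy = updated
```

Building the new policy off to the side and replacing it in a single assignment means a request evaluated mid-reload sees either the old rules or the new ones, never a mix of both.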

The runtime includes a privacy-aware router that strips caller credentials from requests and injects backend credentials for managed model access. Unlike traditional filesystem-based credential handling, OpenShell uses a provider system to inject agent credentials as environment variables, reducing the risk of credential leakage.
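The strip-and-inject step of such a router can be sketched in a few lines of Python. This is an illustration only: the header names treated as caller credentials, the `route_headers` function, and the bearer-token format are assumptions, not OpenShell's actual behavior.

```python
# Headers assumed to carry caller credentials (hypothetical list).
CALLER_CREDENTIAL_HEADERS = {"authorization", "x-api-key", "cookie"}


def route_headers(headers: dict, backend_token: str) -> dict:
    """Drop caller-supplied credentials, then attach the backend credential."""
    forwarded = {
        name: value
        for name, value in headers.items()
        if name.lower() not in CALLER_CREDENTIAL_HEADERS
    }
    # The backend token is supplied by the router and is never visible to the agent.
    forwarded["Authorization"] = f"Bearer {backend_token}"
    return forwarded


incoming = {"Authorization": "Bearer caller-secret", "Content-Type": "application/json"}
outgoing = route_headers(incoming, backend_token="backend-secret")
```

In this sketch the agent's own token never reaches the model backend, and the backend token never appears in anything the agent sends or receives.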

OpenShell ships with built-in agent skills for common tasks like gateway troubleshooting and policy generation, easing the burden on developers.

What makes OpenShell’s approach technically interesting

The standout feature is the policy-enforced egress routing with hot-reloadable YAML configurations. Most container runtimes or sandboxing tools handle network policies as static allow/deny lists that require container restarts upon changes. OpenShell’s approach lets operators adjust Layer 7 network policies dynamically, for example allowing GET requests but blocking POST requests, all without interrupting running agents.
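A Layer 7 rule of the "allow GET, block POST" kind boils down to matching on the HTTP method, host, and path of each outbound request. The Python sketch below shows one way such an evaluator could look; the `Rule` fields, the first-match-wins ordering, and the default-deny fallback are assumptions for illustration and do not reflect OpenShell's YAML schema or its Rust implementation.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    action: str            # "allow" or "deny"
    host: str              # exact host match (hypothetical field)
    methods: frozenset     # upper-case HTTP methods the rule applies to
    path_glob: str = "*"   # shell-style pattern matched against the path


def decide(rules, method: str, host: str, path: str) -> str:
    """Return the action of the first matching rule, denying by default."""
    for rule in rules:
        if (
            rule.host == host
            and method.upper() in rule.methods
            and fnmatchcase(path, rule.path_glob)
        ):
            return rule.action
    return "deny"


rules = [Rule("allow", "api.example.com", frozenset({"GET"}), "/v1/*")]
```

With these rules, `decide(rules, "GET", "api.example.com", "/v1/models")` is allowed, while a `POST` to the same URL matches no rule and falls through to the default deny.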

This solves a real problem in AI agent security, where network access needs to be tightly controlled to prevent data exfiltration or unauthorized communications, but operational flexibility is also critical.

The defense-in-depth model applied across filesystem, network, process, and inference layers is another strength. Instead of relying on a single security boundary, multiple layers reduce the attack surface and contain potential breaches more effectively.

Under the hood, Rust provides memory safety and safe concurrency without sacrificing performance. The codebase is clean given the complexity of the problem domain, with clear separation between policy evaluation, sandbox lifecycle management, and agent credential handling.

The provider-based credential injection pattern is worth noting. Injecting secrets as environment variables rather than filesystem artifacts reduces the risk that an agent or malicious code could discover sensitive credentials by scanning container file trees.
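The general pattern of passing a secret through the process environment rather than writing it to disk can be shown with Python's standard library. This is a sketch of the pattern, not of OpenShell's provider system: the `run_with_credentials` helper and the `EXAMPLE_API_KEY` variable name are hypothetical.

```python
import os
import subprocess
import sys


def run_with_credentials(argv, credentials: dict):
    """Start a child process with credentials present only in its environment."""
    # Copy the parent environment and overlay the secrets; nothing is written to disk.
    env = {**os.environ, **credentials}
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)


# The child reads the secret from its environment, as an agent would.
result = run_with_credentials(
    [sys.executable, "-c", "import os; print(os.environ['EXAMPLE_API_KEY'])"],
    {"EXAMPLE_API_KEY": "demo-value"},
)
```

Here the child process finds the value in its environment, while the parent's own `os.environ` is left unmodified.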

The project is still in alpha and single-player mode, targeting individual developers. Multi-tenant and enterprise-grade deployments are planned but not yet available. This means that while the architecture is promising, expect some rough edges and evolving features.

Quick start

OpenShell supports macOS, Windows with WSL 2, and Linux hosts. You’ll need a local container runtime such as Docker or Podman, or virtualization enabled on the host for MicroVM sandboxes.

You can install OpenShell via these methods:

Binary (recommended):

curl -LsSf https://raw.githubusercontent.com/NVIDIA/OpenShell/main/install.sh | sh

From PyPI (requires uv):

uv tool install -U openshell

For Kubernetes users, there’s an experimental Helm chart:

helm install openshell oci://ghcr.io/nvidia/openshell/helm-chart

After installation, create a sandbox container for one of the supported agents:

openshell sandbox create -- claude  # or opencode, codex, copilot

The sandbox includes useful developer and networking tools like git, vim, ping, dig, and traceroute, along with language runtimes (Python 3.13 and Node 22).

Every sandbox starts with minimal outbound network access. You can open additional access by adjusting the YAML-based network policies, which can be hot-reloaded.

How to explore the policy enforcement

The real eye-opener is seeing how network policies work in practice. You define Layer 7 rules in YAML that specify which HTTP methods and endpoints are allowed or denied. The runtime enforces these rules inside the network stack of the sandbox, and updates can be applied live without restarting the container.

This dynamic control is rare in container sandboxing and much needed for AI agents that may call out to multiple external services or APIs.

Verdict

OpenShell is a promising step towards secure, flexible AI agent runtimes. Its multi-layer defense model coupled with runtime-policy hot-reloading tackles a gap in current sandboxing tools focused on AI workloads.

The Rust implementation and multi-backend support show careful design, and the privacy-aware credential injection addresses a subtle but important security concern.

As an alpha-stage project, it’s best suited for developers and researchers exploring secure AI agent execution and sandboxing. Production use will require patience as features mature, especially multi-tenant support.

If you’re building or deploying autonomous AI agents and need a sandbox runtime with fine-grained, dynamic policy controls, OpenShell is worth a close look.

Its approach to runtime network policy enforcement and credential management offers insights valuable beyond just this project, potentially influencing how secure AI workflows are architected in the future.


→ GitHub Repo: NVIDIA/OpenShell ⭐ 6,049 · Rust

Noureddine RAMDI / OpenShell: Securing AI agents with runtime-policy sandboxing from NVIDIA
