Claw-Eval offers a Python-based evaluation harness for LLM autonomous agents, featuring 300 tasks and a strict Pass^3 metric to ensure reliable, multi-dimensional benchmarking.
Neon Vision Editor is a native Swift code editor for macOS/iOS/iPadOS that balances minimalism with sandbox compliance via a unique CLI helper using macOS Launch Services.
OpenShell by NVIDIA offers a Rust-based AI agent sandbox runtime with hot-reloadable YAML policies for filesystem, network, process, and inference controls inside containers.
Open Cowork wraps multiple LLMs in a cross-platform desktop app with unique VM-level sandboxing using WSL2 and Lima for safe AI agent command execution.
A curated repo comparing sandboxing technologies for secure, fast AI agent execution. It covers microVMs, containers, WebAssembly, and more, with the tradeoffs between security and speed.
Naersk is a Nix library enabling reproducible Rust builds without Import From Derivation (IFD). It parses Cargo.lock and manages dependencies purely in Nix, ideal for CI/CD and Hydra.
DeerFlow 2.0 is a Python framework for orchestrating AI sub-agents and memory, with support for multiple LLMs and execution sandboxes. It uses a modular config and a setup wizard for flexible deployment.