PasteGuard addresses a real pain point in AI development: how to prevent sensitive data like PII, API keys, and tokens from leaking into cloud-based large language models (LLMs). Instead of forcing developers to change their code or SDKs, PasteGuard offers a transparent proxy that automatically detects and masks sensitive information in API requests. This means you can keep using your existing OpenAI or Anthropic clients but route calls through a local privacy proxy, ensuring data is scrubbed before it leaves your environment.
What PasteGuard is and how it works
PasteGuard is an open-source privacy proxy designed specifically for LLMs. It runs locally (commonly in a Docker container) and intercepts API calls directed at OpenAI or Anthropic endpoints. It automatically detects over 30 types of sensitive data, including personally identifiable information (PII), API keys, and tokens, across 24 supported languages.
Detection relies on Microsoft Presidio, an open-source PII detection framework, which PasteGuard uses to scan and mask requests in real time. The proxy itself is written in TypeScript on Bun (the runtime) and Hono (the HTTP framework), a lightweight combination for serving HTTP.
Under the hood, PasteGuard acts as a reverse proxy: you change the base URL in your AI client’s configuration to point at PasteGuard’s local server (e.g., http://localhost:3000/openai/v1 instead of the real OpenAI URL). PasteGuard then inspects the outgoing requests, applies masking rules, and forwards sanitized requests to the real AI provider.
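To make the inspect-and-mask step concrete, here is a minimal sketch of the idea. It is not PasteGuard's implementation: the real project delegates detection to Microsoft Presidio, whereas this example swaps in two illustrative regular expressions. The `maskText` function, the `[[LABEL_n]]` placeholder format, and the `sk-` key pattern are all assumptions made for illustration.

```typescript
// Simplified stand-in for the detection step. PasteGuard itself uses
// Microsoft Presidio; these regexes are illustrative assumptions only.
type Detector = { label: string; pattern: RegExp };

const DETECTORS: Detector[] = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  // Hypothetical key shape: "sk-" followed by at least 20 alphanumerics.
  { label: "API_KEY", pattern: /\bsk-[A-Za-z0-9]{20,}/g },
];

interface MaskResult {
  masked: string;
  // placeholder -> original value, kept locally (for example, for audit review)
  mapping: Record<string, string>;
}

function maskText(input: string): MaskResult {
  const mapping: Record<string, string> = {};
  let masked = input;
  for (const detector of DETECTORS) {
    let counter = 0;
    // Replace every match with a numbered placeholder and remember the original.
    masked = masked.replace(detector.pattern, (match: string) => {
      counter += 1;
      const placeholder = `[[${detector.label}_${counter}]]`;
      mapping[placeholder] = match;
      return placeholder;
    });
  }
  return { masked, mapping };
}

const result = maskText(
  "Email jane.doe@example.com, key sk-abcdefghijklmnopqrstuvwx"
);
// result.masked is "Email [[EMAIL_1]], key [[API_KEY_1]]"
```

Only `result.masked` would be forwarded upstream in this sketch; the mapping never leaves the machine. A production detector needs far more than two patterns, which is why PasteGuard leans on Presidio rather than hand-written rules.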
Beyond privacy, PasteGuard also supports a “route mode” that selectively redirects sensitive requests to local LLMs like Ollama or vLLM, while non-sensitive requests go to cloud providers. This hybrid routing enables balancing privacy concerns with access to powerful cloud models.
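The routing decision can be pictured as a small function that inspects the prompt and picks an upstream. The sketch below is a hypothetical illustration, not PasteGuard's code or configuration format: `chooseRoute`, `RouteConfig`, the two regex checks, and the local URL (an assumed Ollama-style endpoint) are all invented for this example.

```typescript
// Hypothetical route-mode decision. All names and URLs here are assumptions.
interface RouteConfig {
  localBaseUrl: string; // assumed local LLM endpoint (e.g. Ollama or vLLM)
  cloudBaseUrl: string; // cloud provider endpoint
}

interface Route {
  target: "local" | "cloud";
  baseUrl: string;
}

// Stand-in checks; a real deployment would use a proper PII detector.
// No "g" flag, so RegExp.test stays stateless between calls.
const SENSITIVE_PATTERNS: RegExp[] = [
  /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/, // email address
  /\bsk-[A-Za-z0-9]{20,}/, // hypothetical API key shape
];

function containsSensitiveData(text: string): boolean {
  return SENSITIVE_PATTERNS.some((pattern) => pattern.test(text));
}

function chooseRoute(prompt: string, config: RouteConfig): Route {
  // Sensitive prompts stay on the local model; everything else goes to the cloud.
  return containsSensitiveData(prompt)
    ? { target: "local", baseUrl: config.localBaseUrl }
    : { target: "cloud", baseUrl: config.cloudBaseUrl };
}

const exampleConfig: RouteConfig = {
  localBaseUrl: "http://localhost:11434/v1", // assumption: local OpenAI-compatible server
  cloudBaseUrl: "https://api.openai.com/v1",
};

const sensitiveRoute = chooseRoute("Reply to jane.doe@example.com", exampleConfig);
const publicRoute = chooseRoute("Summarize this public changelog", exampleConfig);
```

Here `sensitiveRoute.target` is `"local"` and `publicRoute.target` is `"cloud"`. The design point is that the prompt is classified before any network call is made, so a sensitive request never reaches the cloud provider at all.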
A built-in dashboard provides auditing capabilities, allowing you to review masked requests and understand what data was redacted.
What sets PasteGuard apart: transparent integration and local-first privacy
The standout feature is the simplicity of integration: no SDK changes or complex instrumentation needed. Just change the base URL in your existing OpenAI or Anthropic client, and PasteGuard transparently intercepts and sanitizes requests. This design minimizes developer friction and supports a wide range of existing applications.
The choice of Microsoft Presidio for PII detection is pragmatic. It supports many languages and data types out of the box, but it does impose limitations: detection accuracy depends on Presidio’s models and may not catch every edge case. False positives or negatives can occur, requiring tuning or custom rules for specific domains.
Running all processing locally is a deliberate tradeoff. Because detection and masking happen inside your own environment, unmasked data is scrubbed before it leaves, and you keep full control over the process. The cost is an extra hop: requests may see added latency compared to direct calls, and the impact depends on your workload and hardware.
The stack is deliberately lean: Bun and Hono provide a modern, efficient runtime and HTTP layer with minimal overhead. SQLite is used for local storage, likely for audit logs and configuration.
The route mode feature adds flexibility, allowing sensitive data to be handled by local LLMs, which can be essential in regulated environments or where cloud usage is restricted.
The dashboard is a valuable addition for operational visibility, helping teams audit privacy compliance in real time.
Overall, the codebase is surprisingly clean for a privacy-focused proxy, with a clear separation of concerns between detection, masking, proxying, and routing.
Quick start
Run PasteGuard locally using Docker:
docker run --rm -p 3000:3000 ghcr.io/sgasser/pasteguard:en
Then point your tools or applications to PasteGuard’s local URL instead of the original AI provider URLs:
| API | PasteGuard URL | Original URL |
|---|---|---|
| OpenAI | http://localhost:3000/openai/v1 | https://api.openai.com/v1 |
| Anthropic | http://localhost:3000/anthropic | https://api.anthropic.com |
This straightforward setup means you can start masking sensitive data with minimal configuration.
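In application code, the swap amounts to changing one string. The sketch below builds an OpenAI-style chat request against the PasteGuard URL from the table without sending it. `prepareOpenAIChatRequest` and `PreparedRequest` are hypothetical helpers written for this article, the model name is a placeholder, and the example assumes the proxy passes the `Authorization` header through to the provider.

```typescript
// Base URLs taken from the table above.
const PASTEGUARD_BASE_URLS = {
  openai: "http://localhost:3000/openai/v1",
  anthropic: "http://localhost:3000/anthropic",
};

// Hypothetical helper type: a plain description of an HTTP request.
interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function prepareOpenAIChatRequest(
  apiKey: string,
  model: string,
  userMessage: string,
  baseUrl: string = PASTEGUARD_BASE_URLS.openai
): PreparedRequest {
  return {
    // Same path as the provider's chat completions endpoint; only the base differs.
    url: `${baseUrl}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Assumption: the proxy forwards this header to the real provider.
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: userMessage }],
    }),
  };
}

const viaProxy = prepareOpenAIChatRequest("YOUR_API_KEY", "example-model", "Hello");
const direct = prepareOpenAIChatRequest(
  "YOUR_API_KEY",
  "example-model",
  "Hello",
  "https://api.openai.com/v1"
);
```

The two requests differ only in `url`, which is the whole point of the design. The result can be handed to `fetch(viaProxy.url, viaProxy)`; with the official OpenAI or Anthropic Node SDKs, the equivalent change is the client's `baseURL` option.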
Verdict
PasteGuard is a solid tool for developers and organizations that use large language models but need to control sensitive data leakage. Its transparent proxy approach minimizes integration overhead while providing robust PII detection powered by Microsoft Presidio.
The local processing model is a double-edged sword: it enhances privacy and auditability but may introduce latency and requires local resources. Detection accuracy depends on Presidio’s capabilities, so domain-specific tuning might be necessary.
For teams embedding LLMs into products or workflows where privacy is paramount, and where changing SDKs or rewriting code is impractical, PasteGuard offers a pragmatic solution. It also fits well in hybrid environments where local and cloud LLMs coexist.
That said, it’s not a silver bullet. It requires running a local proxy, which may not suit all deployment environments. Detection is only as good as the underlying models, so sensitive data leakage cannot be fully ruled out without additional safeguards.
Still, the integration pattern, a one-line base URL swap, is elegant in its simplicity. PasteGuard is worth exploring if you want to add a privacy layer to your AI interactions without refactoring your existing codebase.
→ GitHub Repo: sgasser/pasteguard ⭐ 613 · TypeScript