Noureddine RAMDI / PasteGuard: a local privacy proxy for masking sensitive data in LLM requests

Created Mon, 04 May 2026 10:23:02 +0000 Modified Sat, 23 May 2026 20:41:27 +0000

sgasser/pasteguard

PasteGuard addresses a real pain point in AI development: how to prevent sensitive data like PII, API keys, and tokens from leaking into cloud-based large language models (LLMs). Instead of forcing developers to change their code or SDKs, PasteGuard offers a transparent proxy that automatically detects and masks sensitive information in API requests. This means you can keep using your existing OpenAI or Anthropic clients but route calls through a local privacy proxy, ensuring data is scrubbed before it leaves your environment.

What PasteGuard is and how it works

PasteGuard is an open-source privacy proxy designed specifically for LLMs. It runs locally (commonly in a Docker container) and intercepts API calls directed at OpenAI or Anthropic endpoints. Its core feature is automatic detection of more than 30 types of sensitive data, including personally identifiable information (PII), API keys, and tokens, across 24 supported languages.

The detection engine relies on Microsoft Presidio, a well-regarded PII detection framework, integrated seamlessly here for real-time scanning and masking. The proxy is implemented in TypeScript using Bun and Hono, which provide a lightweight and high-performance HTTP server environment.
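To make the detect-and-mask step concrete, here is a minimal sketch of placeholder masking. It is not PasteGuard's code and does not use Presidio: the two regexes, the `maskText` function, and the `<LABEL_n>` placeholder format are all assumptions chosen for illustration.

```typescript
// Illustrative stand-in for the detection step. PasteGuard relies on
// Microsoft Presidio; these two regexes are assumptions for the demo only.
type Detector = { label: string; pattern: RegExp };

const detectors: Detector[] = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "API_KEY", pattern: /\bsk-[A-Za-z0-9]{16,}\b/g },
];

interface MaskResult {
  masked: string;
  // Maps each placeholder back to the original value it replaced.
  replacements: Map<string, string>;
}

// Replace every detected span with a numbered placeholder (hypothetical format).
function maskText(input: string): MaskResult {
  const replacements = new Map<string, string>();
  let masked = input;
  for (const { label, pattern } of detectors) {
    let count = 0;
    masked = masked.replace(pattern, (match: string) => {
      count += 1;
      const placeholder = `<${label}_${count}>`;
      replacements.set(placeholder, match);
      return placeholder;
    });
  }
  return { masked, replacements };
}

const result = maskText("Mail jane@example.com, key sk-abcdefghijklmnop1234");
console.log(result.masked);
```

The placeholder map stays on the local machine, so only the masked text would be forwarded upstream. A real detector such as Presidio combines pattern matching with language-aware recognizers, which is why it covers far more entity types than a pair of regexes can.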

Under the hood, PasteGuard acts as a reverse proxy: you change the base URL in your AI client’s configuration to point at PasteGuard’s local server (e.g., http://localhost:3000/openai/v1 instead of the real OpenAI URL). PasteGuard then inspects the outgoing requests, applies masking rules, and forwards sanitized requests to the real AI provider.
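The base URL swap can be sketched without any SDK. The helper below only builds a request aimed at the local proxy; `buildChatRequest`, the model name, and the API key are placeholders, and the sketch assumes PasteGuard forwards the standard OpenAI `/chat/completions` path unchanged.

```typescript
// Local PasteGuard URL from the article; assumes the proxy listens on port 3000.
const PASTEGUARD_OPENAI_BASE = "http://localhost:3000/openai/v1";

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build (but do not send) a chat completion request against a given base URL.
// Only the base URL differs between a direct call and a proxied one.
function buildChatRequest(baseUrl: string, apiKey: string, prompt: string): PreparedRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: "gpt-4o-mini", // placeholder model name
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

const viaProxy = buildChatRequest(PASTEGUARD_OPENAI_BASE, "sk-placeholder", "Hello");
console.log(viaProxy.url);
// To actually send it: await fetch(viaProxy.url, viaProxy.init)
```

With the official OpenAI Node SDK, the equivalent change is passing the same URL as the `baseURL` option when constructing the client.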

Beyond privacy, PasteGuard also supports a “route mode” that selectively redirects sensitive requests to local LLMs like Ollama or vLLM, while non-sensitive requests go to cloud providers. This hybrid routing enables balancing privacy concerns with access to powerful cloud models.
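The routing decision described here can be pictured as a small function. This is a hypothetical sketch, not PasteGuard's implementation: `RouteConfig`, `chooseUpstream`, and `resolveTarget` are invented names, and the local URL assumes an Ollama instance on its default port.

```typescript
type Upstream = "local" | "cloud";

interface RouteConfig {
  localUrl: string; // hypothetical local LLM endpoint
  cloudUrl: string; // real provider endpoint
}

// Route to the local model whenever the detector reported at least one entity.
function chooseUpstream(detectedEntities: string[]): Upstream {
  return detectedEntities.length > 0 ? "local" : "cloud";
}

// Resolve the concrete URL a request would be forwarded to.
function resolveTarget(detectedEntities: string[], config: RouteConfig): string {
  return chooseUpstream(detectedEntities) === "local" ? config.localUrl : config.cloudUrl;
}

const config: RouteConfig = {
  localUrl: "http://localhost:11434/v1", // assumed Ollama default
  cloudUrl: "https://api.openai.com/v1",
};

console.log(resolveTarget(["EMAIL"], config)); // sensitive: stays local
console.log(resolveTarget([], config));        // clean: goes to the cloud
```

A production router would likely apply thresholds or per-entity policies instead of an all-or-nothing rule, but the shape of the decision is the same: detection output in, upstream choice out.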

A built-in dashboard provides auditing capabilities, allowing you to review masked requests and understand what data was redacted.

What sets PasteGuard apart: transparent integration and local-first privacy

The standout feature is the simplicity of integration: no SDK changes or complex instrumentation needed. Just change the base URL in your existing OpenAI or Anthropic client, and PasteGuard transparently intercepts and sanitizes requests. This design minimizes developer friction and supports a wide range of existing applications.

The choice of Microsoft Presidio for PII detection is pragmatic. It supports many languages and data types out of the box, but it does impose limitations: detection accuracy depends on Presidio’s models and may not catch every edge case. False positives or negatives can occur, requiring tuning or custom rules for specific domains.

Running all processing locally is a deliberate tradeoff. It enhances privacy by avoiding data leakage and gives you full control, but it may add latency compared to direct calls. The performance impact depends on your workload and hardware.

The architecture is modular: Bun and Hono provide a modern, efficient runtime with minimal overhead. SQLite is used for local storage, likely for audit logs and configuration.

The route mode feature adds flexibility, allowing sensitive data to be handled by local LLMs, which can be essential in regulated environments or where cloud usage is restricted.

The dashboard is a valuable addition for operational visibility, helping teams audit privacy compliance in real time.

Overall, the codebase is surprisingly clean for a privacy-focused proxy, with a clear separation of concerns between detection, masking, proxying, and routing.

Quick start

Run PasteGuard locally using Docker:

docker run --rm -p 3000:3000 ghcr.io/sgasser/pasteguard:en

Then point your tools or applications to PasteGuard’s local URL instead of the original AI provider URLs:

API        PasteGuard URL                    Original URL
OpenAI     http://localhost:3000/openai/v1   https://api.openai.com/v1
Anthropic  http://localhost:3000/anthropic   https://api.anthropic.com

This straightforward setup means you can start masking sensitive data with minimal configuration.

Verdict

PasteGuard is a solid tool for developers and organizations that use large language models but need to control sensitive data leakage. Its transparent proxy approach minimizes integration overhead while providing robust PII detection powered by Microsoft Presidio.

The local processing model is a double-edged sword: it enhances privacy and auditability but may introduce latency and requires local resources. Detection accuracy depends on Presidio’s capabilities, so domain-specific tuning might be necessary.

For teams embedding LLMs into products or workflows where privacy is paramount, and where changing SDKs or rewriting code is impractical, PasteGuard offers a pragmatic solution. It also fits well in hybrid environments where local and cloud LLMs coexist.

That said, it’s not a silver bullet. It requires running a local proxy, which may not suit all deployment environments. Detection is only as good as the underlying models, so sensitive data leakage cannot be fully ruled out without additional safeguards.

Still, the one-line base URL swap is elegant in its simplicity and developer-friendliness. PasteGuard is worth exploring if you want to add a privacy layer to your AI interactions without refactoring your existing codebase.


→ GitHub Repo: sgasser/pasteguard ⭐ 613 · TypeScript