🔒 Open Source · Production Ready

Your LLM calls.
Your data. Zero exposure.

A drop-in privacy proxy that strips PII, secrets, and sensitive data before they reach any AI provider, cloud or local.

Get hosted → View on GitHub
$ pip install llm-privacy-gateway
✓ Zero external dependencies
✓ 10/10 leak prevention benchmark
✓ Works with OpenAI · DeepSeek · Claude · Ollama
✓ AES-256-GCM encrypted audit logs
✓ MIT licensed
The Problem
Every prompt you send goes through
their servers unfiltered.

OpenAI, DeepSeek, Anthropic: they all log requests. One leaked key or prompt could expose patient records, client data, or API secrets.

✗ Without the gateway

Raw prompts, including emails, SSNs, credit cards, and API keys, travel to the cloud provider in plaintext.

"email alice@company.com,
SSN 123-45-6789,
key sk-abc123..."

✓ With the gateway

PII is stripped or tokenized before the request is forwarded. The LLM never sees real sensitive data.

"email [REDACTED_EMAIL],
SSN [REDACTED_SSN],
key [REDACTED_KEY]"

Who needs this?

Healthcare apps processing patient notes. Legal tools reading contracts. Finance platforms with transaction data. Any team using LLMs on sensitive data.


How it works
Three lines to protect every LLM call.

The gateway runs as a local proxy. Your app talks to it instead of OpenAI directly. No SDK changes required.

Step 01

Point your app at the gateway

Change base_url from api.openai.com to localhost:8787. That's it: one line changed, no SDK swap, nothing else to touch.
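As a concrete sketch of that one-line change, assuming the official `openai` Python SDK (v1+); the key placeholder is illustrative:

```python
from openai import OpenAI  # official OpenAI SDK; nothing else changes

client = OpenAI(
    base_url="http://localhost:8787",  # was: https://api.openai.com/v1
    api_key="sk-...",  # your real key; the gateway forwards it upstream
)
# client.chat.completions.create(...) now passes through the gateway first
```

Every request the client makes is scrubbed locally before it leaves your machine.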

Step 02

Gateway scrubs every request

Emails, phones, SSNs, credit cards, API keys: stripped or tokenized before the request leaves your machine.

Step 03

Response is sanitized too

Inbound responses are scanned for PII echoed back by the model. Streamed responses are handled chunk by chunk, with split boundaries covered so nothing leaks between chunks.

Step 04

Everything logged, encrypted

Every request/response pair is written to an AES-256-GCM encrypted local audit log. Only you hold the key.

🏠

True zero-cloud mode via Ollama

Route to a local model and your data never leaves your machine, not even to us. Just point the gateway at Ollama: --upstream-base-url http://localhost:11434. Works with llama3, qwen, deepseek-coder, and any Ollama-compatible model. Verified: ZERO_CLOUD_OK ✓


Benchmark Results
10 out of 10. Zero leaks.

Tested against a curated corpus of 10 leak scenarios covering outbound PII, tool args, tainted inbound, and streaming split-boundary edge cases.

LLM Privacy Gateway

10/10
Cases passed · 0 leaks
Boundary redaction + tokenization + tool-arg scan + taint-aware stream buffer

Regex-only (both ways)

5/10
Cases passed · 5 leaks
Simulated baseline: simple regex on full payload, no taint memory

Regex outbound only

3/10
Cases passed · 7 leaks
Simulated baseline: no inbound scanning, no stream handling

Features
Production-grade. Zero dependencies.

Built in pure Python, stdlib only. No third-party risk in the privacy layer itself.

🔍

Boundary PII Redaction

Emails, phones, SSNs, credit cards, API keys, and custom patterns stripped before any outbound call.
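Boundary redaction is, at its core, pattern matching on the outbound payload. A minimal sketch of the idea (the patterns and placeholder names here are illustrative; the gateway's real rule set is more extensive):

```python
import re

# Illustrative patterns only; the real gateway covers far more cases.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "KEY":   re.compile(r"\bsk-[A-Za-z0-9]{8,}\b"),
}

def redact(text: str) -> str:
    """Replace each match with a [REDACTED_*] placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text

print(redact("email alice@company.com, SSN 123-45-6789, key sk-abc123XY"))
# -> email [REDACTED_EMAIL], SSN [REDACTED_SSN], key [REDACTED_KEY]
```

Custom patterns slot into the same mechanism: one more label, one more regex.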

🎭

Session Tokenization

Soft PII (names, identifiers) replaced with session-random tokens so the LLM still gets context, but no real data.
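The key property is that a token is stable within a session (so the model can refer back to "the same person") and reversible on the way out. A minimal sketch, with a hypothetical class name:

```python
import secrets

class SessionTokenizer:
    """Sketch (hypothetical name): map soft PII to per-session random
    tokens, stable within the session and reversible on responses."""

    def __init__(self):
        self._forward = {}  # real value -> token
        self._reverse = {}  # token -> real value, for restoring responses

    def tokenize(self, value: str) -> str:
        if value not in self._forward:
            token = f"PERSON_{secrets.token_hex(4)}"
            self._forward[value] = token
            self._reverse[token] = value
        return self._forward[value]

    def detokenize(self, text: str) -> str:
        # Restore real values in the sanitized response before returning it
        for token, value in self._reverse.items():
            text = text.replace(token, value)
        return text

t = SessionTokenizer()
prompt = f"Schedule a call with {t.tokenize('Alice Smith')}"
```

The model sees a consistent pseudonym; only the gateway can map it back.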

🔧

Tool Argument Scanning

Function-call arguments inspected for sensitive keys (api_key, password, token) before they reach the model.
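Tool arguments are structured data, so the scan walks the argument object rather than flat text. A minimal sketch of that walk (the sensitive-key list is an illustrative assumption):

```python
SENSITIVE_KEYS = {"api_key", "password", "token", "secret"}

def scrub_tool_args(args: dict) -> dict:
    """Recursively replace values of sensitive keys before forwarding."""
    clean = {}
    for key, value in args.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            clean[key] = scrub_tool_args(value)  # nested tool args too
        else:
            clean[key] = value
    return clean

print(scrub_tool_args({"query": "weather", "auth": {"api_key": "sk-live-123"}}))
# -> {'query': 'weather', 'auth': {'api_key': '[REDACTED]'}}
```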

📡

Stream-safe Sanitization

SSE and NDJSON streams sanitized chunk-by-chunk with a holdback buffer, so nothing leaks across a split boundary.
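The holdback idea: never emit text that could be the first half of a sensitive value. A simplified sketch that holds back to the last whitespace (the real gateway also handles SSE/NDJSON framing, which is omitted here):

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def sanitize_stream(chunks):
    """Emit text only up to the last whitespace seen, so a value split
    across two chunks is matched against the reassembled buffer."""
    buf = ""
    for chunk in chunks:
        buf += chunk
        cut = buf.rfind(" ")
        if cut == -1:
            continue  # no safe emit point yet; keep buffering
        safe, buf = buf[:cut + 1], buf[cut + 1:]
        yield EMAIL.sub("[REDACTED_EMAIL]", safe)
    yield EMAIL.sub("[REDACTED_EMAIL]", buf)  # flush the held-back tail

# The email is split mid-value across two chunks, but still caught:
out = "".join(sanitize_stream(["contact alice@comp", "any.com today"]))
print(out)
# -> contact [REDACTED_EMAIL] today
```

A naive per-chunk regex would miss this: neither "alice@comp" nor "any.com" matches on its own.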

🧠

Taint-aware Policy

Tracks which sessions have seen sensitive values and prevents replays in subsequent requests.
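Conceptually this is per-session taint memory: once a value has been redacted, later attempts to echo it back are caught even if they come from a different direction. A minimal sketch, with a hypothetical class name:

```python
class TaintTracker:
    """Sketch (hypothetical name): remember which sensitive values each
    session has seen and block replays in later requests."""

    def __init__(self):
        self._tainted = {}  # session_id -> set of sensitive values

    def record(self, session_id, values):
        self._tainted.setdefault(session_id, set()).update(values)

    def scrub_replays(self, session_id, text):
        # A value redacted once stays redacted for the whole session,
        # even if a later prompt tries to echo it back verbatim.
        for value in self._tainted.get(session_id, ()):
            text = text.replace(value, "[REDACTED_REPLAY]")
        return text

tracker = TaintTracker()
tracker.record("sess-1", {"123-45-6789"})
print(tracker.scrub_replays("sess-1", "repeat my SSN 123-45-6789"))
# -> repeat my SSN [REDACTED_REPLAY]
```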

📝

Encrypted Audit Logs

Every request/response pair logged with AES-256-GCM encryption. Key rotation supported. Only you can read it.

🚦

Rate Limiting + Auth

Per-IP rate limiting with SQLite backend, bearer token auth, and optional mTLS for enterprise deployments.
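A SQLite-backed per-IP limiter can be sketched as a fixed window over a hits table; the schema and window policy below are illustrative assumptions, not the gateway's actual implementation:

```python
import sqlite3
import time

class RateLimiter:
    """Sketch: fixed-window per-IP rate limit backed by SQLite."""

    def __init__(self, limit: int, window_s: int = 60):
        self.limit, self.window_s = limit, window_s
        self.db = sqlite3.connect(":memory:")  # real deploys use a file
        self.db.execute("CREATE TABLE hits (ip TEXT, ts REAL)")

    def allow(self, ip: str) -> bool:
        now = time.time()
        # Expire hits that fell out of the window, then count this IP
        self.db.execute("DELETE FROM hits WHERE ts < ?", (now - self.window_s,))
        (count,) = self.db.execute(
            "SELECT COUNT(*) FROM hits WHERE ip = ?", (ip,)).fetchone()
        if count >= self.limit:
            return False
        self.db.execute("INSERT INTO hits VALUES (?, ?)", (ip, now))
        return True

limiter = RateLimiter(limit=2)
print([limiter.allow("10.0.0.1") for _ in range(3)])
# -> [True, True, False]
```

Keeping state in SQLite means the limit survives restarts and works across worker processes.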

📊

Prometheus Metrics

/healthz and /metrics endpoints for monitoring. Plug straight into Grafana.
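For orientation, /metrics serves the standard Prometheus text exposition format. A sketch of what that rendering looks like (the metric names here are assumptions, not the gateway's documented metrics):

```python
def render_metrics(counters):
    """Render counters in the Prometheus text exposition format
    that a /metrics endpoint would serve."""
    lines = []
    for name, value in sorted(counters.items()):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

body = render_metrics({
    "gateway_requests_total": 42,
    "gateway_redactions_total": 7,
})
print(body)
```

Any Prometheus scraper, and therefore Grafana, can consume this output directly.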

🏠

Local Model Support

Works with Ollama. Route to llama3, qwen, mistral: data never leaves your device. True zero-cloud.


Pricing
Start free. Scale when you need to.

Self-hosted is always free and open source. Pay only if you want us to host and manage it for you.

Open Source
$0 / forever
Self-hosted. Full source code on GitHub. MIT license.
  • Full gateway + all privacy features
  • Works with any OpenAI-compatible API
  • Local Ollama support
  • Encrypted audit logs
  • Docker + one-command deploy
  • Managed hosting (not included)
  • Email support (not included)
  • Custom policy packs (not included)
Get the code →
Enterprise
$299 / month
For teams with compliance requirements. HIPAA-ready config, custom patterns, priority support.
  • Everything in Pro Hosted
  • Custom PII patterns for your domain
  • HIPAA / SOC2-ready config template
  • mTLS for upstream + client
  • Priority email support (24h)
  • 2-week pilot with leakage report
  • Slack / Teams integration
  • Annual billing available
Contact us →

FAQ
Common questions

Is this real end-to-end encryption?

True E2EE means only the sender and receiver can read a message, but an LLM can't process encrypted text. What this gateway gives you is the practical equivalent: your PII never reaches the cloud provider's servers. For true zero-cloud, use Ollama local mode, where your data never leaves your machine at all.

Does the hosted version mean you see my LLM requests?

No. In the hosted plan, your requests are PII-scrubbed at the gateway (before logging) and forwarded to your chosen LLM provider using your own API token. We never store the raw content or your upstream credentials. Audit logs are encrypted with a key only you hold.

Does this work with OpenAI, Anthropic, DeepSeek, Azure OpenAI?

Yes: any OpenAI-compatible API endpoint works. Anthropic's API format, Gemini streaming, and local Ollama models are also supported. Just point --upstream-base-url at your provider.

What's the latency overhead?

The gateway adds under 5 ms of local processing. It's a lightweight Python proxy: no ML models running on your traffic, just regex and tokenization. For most LLM workflows, you won't notice the difference.

Do I need to change my code to use this?

No. Change one line: set base_url="http://localhost:8787" (or our hosted URL) in your OpenAI client. Everything else stays the same.