DLP: Output Protection
DLP (Data Loss Prevention) in SafeLLM protects against uncontrolled leakage of sensitive data from the LLM to the end user.
Why is DLP important?
LLMs can be manipulated into revealing data from their training set or data belonging to other users (if that data is part of the context). DLP is the last line of defense before a response reaches the user.
Operation Modes (Streaming Modes)
1. Synchronous (block)
- Process: Response Buffering -> Scanning -> Delivery.
- Security: Highest. Data never leaves the sidecar before the scan.
- Note: Increases TTFT (Time To First Token) because the entire response must be generated and scanned before the first token is delivered.
- Operational caution: The full response is buffered in memory. Set DLP_MAX_OUTPUT_LENGTH to cap memory usage for large outputs.
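The buffer -> scan -> deliver order can be sketched as follows. This is a minimal illustration, not SafeLLM's implementation: the function name deliver_blocking and the single email regex are hypothetical stand-ins for the real scanner.

```python
import re
from typing import Iterable, List

# Hypothetical detector: one email regex stands in for the real scanner.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
BLOCK_MESSAGE = "[BLOCKED DUE TO PII LEAK]"


def deliver_blocking(chunks: Iterable[str]) -> List[str]:
    """Return what the client receives; nothing is released before the scan."""
    buffered = "".join(chunks)      # 1. buffer the full response in memory
    if EMAIL_RE.search(buffered):   # 2. scan the buffered text
        return [BLOCK_MESSAGE]      # 3a. deliver a replacement message
    return [buffered]               # 3b. deliver the original in one piece


# An address split across two chunks is still caught, because the scan
# runs on the joined buffer rather than on individual chunks.
print(deliver_blocking(["Contact: jane", "@example.com"]))
```

Because the client only ever receives a single piece after the scan, this also shows why streaming is lost in this mode.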
2. Asynchronous (audit) (OSS + Enterprise)
- Process: Streaming to client -> Background scan of a copy.
- Security: Monitoring and Audit. Does not block leaks in flight, but lets you detect and respond to them as soon as the background scan completes.
- Advantages: No added latency for the user, because the stream is not held back by scanning.
- Stats scope: Audit stats are in-memory per worker. In multi-worker deployments, totals are per-process unless you add a shared store (e.g., Redis).
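A rough asyncio sketch of the audit flow, under stated assumptions: stream_with_audit, scan_copy, and audit_stats are hypothetical names, and a single email regex stands in for the scanner. The counter lives in process memory, which is why each worker would report its own totals.

```python
import asyncio
import re
from collections import Counter
from typing import AsyncIterator, List

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# In-memory, per-process counters: every worker holds its own copy.
audit_stats: Counter = Counter()
# Keep references so pending scan tasks are not garbage collected.
background_tasks = set()


async def scan_copy(text: str) -> None:
    """Background scan of the copied response; it only records findings."""
    audit_stats["scanned"] += 1
    if EMAIL_RE.search(text):
        audit_stats["leaks"] += 1


async def stream_with_audit(chunks: AsyncIterator[str]) -> AsyncIterator[str]:
    """Forward each chunk immediately, then schedule a scan of the copy."""
    copy: List[str] = []
    async for chunk in chunks:
        copy.append(chunk)
        yield chunk  # the client receives the chunk before any scanning
    task = asyncio.create_task(scan_copy("".join(copy)))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
```

Sharing totals across workers would mean replacing the Counter with an external store, as the stats note above suggests.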
When to use
- Compliance: Helps meet GDPR, HIPAA, or SOC 2 requirements by keeping PII from leaving your infrastructure in LLM responses.
- Data Privacy: Prevents the model from leaking training data or context that might contain sensitive information.
- Audit: Use audit mode to monitor leaks without affecting the user experience (UX) during a pilot phase.
Common pitfalls
- Memory Overhead: In block mode, the sidecar must buffer the entire response. If the LLM returns a very long text (e.g., 10k tokens), memory usage will spike.
- Output Buffer Limit: Responses longer than DLP_MAX_OUTPUT_LENGTH (default 500,000 chars) are truncated before scanning. This is a configurable security measure to prevent OOM (Out of Memory) errors, but it means very long responses might not be fully checked for leaks if the limit is exceeded.
- TTFT (Time To First Token): Using block mode breaks streaming. The user won’t see the first word until the entire response is finished and scanned. Use audit mode if streaming UX is critical.
- False Positives in Anonymization: In anonymize mode, some technical terms or random numbers might be mistakenly identified as PII and redacted, potentially making the LLM response confusing.
- APISIX / OpenResty phase limits (OSS): In APISIX, body_filter does not allow network calls (cosockets). If a route attempts to call the sidecar from body_filter to enforce DLP block, the request can fail (500 / empty reply). In OSS, keep DLP in audit (log-only) mode for APISIX, or move blocking logic to an Enterprise-capable path.
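The output buffer limit creates a blind spot that is easy to overlook. The sketch below assumes truncation is a plain prefix cut; scan_with_limit and the email regex are hypothetical. It returns the number of unscanned characters so the gap can at least be logged, which is a suggestion rather than a documented SafeLLM feature.

```python
import re
from typing import List, Tuple

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scan_with_limit(text: str, max_len: int) -> Tuple[List[str], int]:
    """Scan only the first max_len characters and report how many were skipped."""
    scanned = text[:max_len]                 # anything past the cap is never seen
    findings = EMAIL_RE.findall(scanned)
    unscanned = max(0, len(text) - max_len)
    return findings, unscanned


text = "a" * 50 + " leak: eve@example.com"
print(scan_with_limit(text, 50))       # the address sits past the cap and is missed
print(scan_with_limit(text, 500_000))  # the whole text fits, so the address is found
```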
Configuration Example
# High Security (Block Mode)
ENABLE_DLP=true
DLP_STREAMING_MODE=block
DLP_MODE=anonymize
DLP_PII_ENTITIES=["EMAIL_ADDRESS", "PHONE_NUMBER", "CREDIT_CARD", "US_SSN"]
DLP_MAX_OUTPUT_LENGTH=500000

# High Performance (Audit Mode)
# ENABLE_DLP=true
# DLP_STREAMING_MODE=audit
# DLP_MODE=log

Reactions to PII Detection
For synchronous mode, you can choose one of three actions (DLP_MODE):
- block (Enterprise Paid): The response is blocked, and the user sees the message [BLOCKED DUE TO PII LEAK].
- anonymize (Enterprise Paid): Sensitive data is replaced, e.g., Account number: [REDACTED:IBAN].
- log (OSS): The response passes, and a log-only record is produced.
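The three actions can be illustrated with a small dispatcher. This is a sketch under assumptions: apply_dlp_mode and the two regexes are hypothetical, and only the block message and the [REDACTED:...] placeholder format are taken from the list above.

```python
import re
from typing import List, Tuple

# Hypothetical regex table; labels reuse entity names from the configuration.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
VALID_MODES = ("block", "anonymize", "log")


def apply_dlp_mode(text: str, mode: str) -> Tuple[str, List[str]]:
    """Return (text delivered to the user, detected entity labels)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown DLP_MODE: {mode}")
    found = [label for label, rx in PATTERNS.items() if rx.search(text)]
    if not found or mode == "log":
        return text, found  # log: response passes, caller records `found`
    if mode == "block":
        return "[BLOCKED DUE TO PII LEAK]", found
    # anonymize: replace each match with a labeled placeholder
    for label in found:
        text = PATTERNS[label].sub(f"[REDACTED:{label}]", text)
    return text, found


sample = "Reach me at ann@example.com, SSN 123-45-6789."
print(apply_dlp_mode(sample, "anonymize")[0])
# Reach me at [REDACTED:EMAIL_ADDRESS], SSN [REDACTED:US_SSN].
```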
Configuration
ENABLE_DLP=true
DLP_STREAMING_MODE=block
DLP_MODE=log # OSS default; block/anonymize require Enterprise Paid
DLP_PII_ENTITIES=["EMAIL_ADDRESS", "PHONE_NUMBER", "POLISH_PESEL"]

Detected Entities
DLP shares detection engines with the L1.5 layer. In OSS it uses regex-only detection. Enterprise adds GLiNER for contextual detection and country-specific identifiers.
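To make the regex-only behavior concrete, here is a small sketch of a detector that reports matches with their positions. The names Finding, REGEXES, and detect are hypothetical, and the patterns are simplified examples rather than the ones SafeLLM ships. Note that the part number in the usage example is flagged as US_SSN: a regex sees only the shape of the text, not its context.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    entity: str
    start: int
    end: int
    text: str


# Simplified example patterns keyed by entity name.
REGEXES = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def detect(text: str, entities: List[str]) -> List[Finding]:
    """Run only the requested entity patterns; return matches sorted by position."""
    findings = []
    for entity in entities:
        rx = REGEXES.get(entity)
        if rx is None:
            continue  # entity names without a pattern are skipped in this sketch
        for m in rx.finditer(text):
            findings.append(Finding(entity, m.start(), m.end(), m.group()))
    return sorted(findings, key=lambda f: f.start)


text = "Part no. 123-45-6789 shipped to ops@example.com"
for finding in detect(text, ["EMAIL_ADDRESS", "US_SSN"]):
    print(finding.entity, finding.text)
```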