
DLP: Output Protection

DLP (Data Loss Prevention) in SafeLLM is the mechanism that protects against uncontrolled leakage of sensitive data from the LLM to the end user.

LLMs can be manipulated into revealing data from their training set, or data belonging to other users (if it is part of the context). DLP is the last line of defense.

1. Synchronous (block)

  • Process: Response Buffering -> Scanning -> Delivery.
  • Security: Highest. Data never leaves the sidecar before the scan.
  • Note: Increases TTFT (Time To First Token) because the entire response must be generated.
  • Operational caution: The full response is buffered in memory. Set DLP_MAX_OUTPUT_LENGTH to cap memory usage for large outputs.
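The buffer -> scan -> deliver flow above can be sketched in a few lines. This is an illustrative sketch only: `deliver_blocking`, `scan_for_pii`, and the email regex are hypothetical stand-ins, not SafeLLM's actual API.

```python
import re

# Mirrors DLP_MAX_OUTPUT_LENGTH (default 500,000 chars) -- an assumption
# for this sketch, matching the setting described on this page.
MAX_OUTPUT_LENGTH = 500_000

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def scan_for_pii(text: str) -> bool:
    """Return True if the buffered response contains a PII match."""
    return bool(EMAIL_RE.search(text))

def deliver_blocking(chunks) -> str:
    """Buffer -> scan -> deliver. Nothing reaches the client before the scan."""
    buffered = "".join(chunks)[:MAX_OUTPUT_LENGTH]  # cap memory usage
    if scan_for_pii(buffered):
        return "[BLOCKED DUE TO PII LEAK]"
    return buffered
```

Note that the client gets nothing until the whole response has been joined and scanned, which is exactly why this mode increases TTFT.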

2. Asynchronous (audit) (OSS + Enterprise)

  • Process: Streaming to client -> Background scan of a copy.
  • Security: Monitoring and Audit. Does not block leaks “on the fly” but allows for immediate reaction after the fact.
  • Advantages: Zero impact on UX.
  • Stats scope: Audit stats are in-memory per worker. In multi-worker deployments, totals are per-process unless you add a shared store (e.g., Redis).
Why use DLP?

  • Compliance: Essential for meeting GDPR, HIPAA, or SOC2 requirements by preventing PII from leaving your infrastructure via LLM responses.
  • Data Privacy: Prevents the model from leaking training data or context that might contain sensitive information.
  • Audit: Use audit mode to monitor leaks without affecting the user experience (UX) during a pilot phase.

Limitations and caveats

  • Memory Overhead: In block mode, the sidecar must buffer the entire response. If the LLM returns a very long text (e.g., 10k tokens), memory usage will spike.
  • Output Buffer Limit: Responses longer than DLP_MAX_OUTPUT_LENGTH (default 500,000 chars) are truncated before scanning. This is a configurable safeguard against OOM (Out of Memory) errors, but it means very long responses might not be fully checked for leaks if the limit is exceeded.
  • TTFT (Time To First Token): Using block mode breaks streaming. The user won’t see the first word until the entire response is finished and scanned. Use audit mode if streaming UX is critical.
  • False Positives in Anonymization: In anonymize mode, some technical terms or random numbers might be mistakenly identified as PII and redacted, potentially making the LLM response confusing.
  • APISIX / OpenResty phase limits (OSS): In APISIX, body_filter does not allow network calls (cosockets). If a route attempts to call the sidecar from body_filter to enforce DLP block, the request can fail (500 / empty reply). In OSS, keep DLP in audit (log-only) mode for APISIX, or move blocking logic to an Enterprise-capable path.
Example configuration:

# High Security (Block Mode)
ENABLE_DLP=true
DLP_STREAMING_MODE=block
DLP_MODE=anonymize
DLP_PII_ENTITIES=["EMAIL_ADDRESS", "PHONE_NUMBER", "CREDIT_CARD", "US_SSN"]
DLP_MAX_OUTPUT_LENGTH=500000
# High Performance (Audit Mode)
# ENABLE_DLP=true
# DLP_STREAMING_MODE=audit
# DLP_MODE=log

For synchronous mode, you can choose one of three actions (DLP_MODE):

  1. block (Enterprise Paid): The response is blocked, and the user sees the message [BLOCKED DUE TO PII LEAK].
  2. anonymize (Enterprise Paid): Sensitive data is replaced, e.g., Account number: [REDACTED:IBAN].
  3. log (OSS): The response passes, and a log-only record is produced.
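A minimal sketch of how the three actions could branch on a scan result. `apply_dlp` and the single-entry pattern table are illustrative assumptions; the `[BLOCKED DUE TO PII LEAK]` and `[REDACTED:...]` outputs mirror the examples above.

```python
import re

# One illustrative pattern, keyed by an entity name from DLP_PII_ENTITIES.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}

def apply_dlp(text: str, mode: str) -> str:
    """Apply one of the three DLP_MODE actions to a scanned response."""
    findings = [(name, rx) for name, rx in PATTERNS.items() if rx.search(text)]
    if not findings:
        return text
    if mode == "block":  # Enterprise Paid
        return "[BLOCKED DUE TO PII LEAK]"
    if mode == "anonymize":  # Enterprise Paid
        for name, rx in findings:
            text = rx.sub(f"[REDACTED:{name}]", text)
        return text
    # mode == "log" (OSS): response passes; a log-only record would be emitted
    return text
```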
Example OSS configuration:
ENABLE_DLP=true
DLP_STREAMING_MODE=block
DLP_MODE=log # OSS default; block/anonymize require Enterprise Paid
DLP_PII_ENTITIES=["EMAIL_ADDRESS", "PHONE_NUMBER", "POLISH_PESEL"]

DLP shares detection engines with the L1.5 layer. In OSS it uses regex-only detection. Enterprise adds GLiNER for contextual detection and country-specific identifiers.
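Regex-only detection (the OSS path) can be approximated like this. The patterns are rough illustrations and not SafeLLM's actual rules; the GLiNER contextual layer added in Enterprise is not shown.

```python
import re

# Illustrative approximations of two entity patterns; real rules (e.g. a
# PESEL checksum) would be stricter than a plain 11-digit match.
OSS_PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "POLISH_PESEL": re.compile(r"\b\d{11}\b"),
}

def detect(text: str) -> list[str]:
    """Return the entity types found; regex-only, so no context awareness."""
    return [name for name, rx in OSS_PATTERNS.items() if rx.search(text)]
```

The lack of context is why regex-only detection produces the false positives noted above: any 11-digit number would match the PESEL pattern, whether or not it is an identifier.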