Environment Variables

All SafeLLM settings can be configured using environment variables. These variables are parsed by Pydantic and can be set in a .env file or directly in the shell.

| Variable | Default | OSS / ENT | Description |
| --- | --- | --- | --- |
| ENABLE_CACHE | true | OSS | Enables L0 layer (Redis Cache). |
| ENABLE_L1_KEYWORDS | true | OSS | Enables keyword filtering. |
| ENABLE_L2_AI | false | ENT | Enables ONNX models (Prompt Injection). |
| ENABLE_L3_PII | true | OSS | Enables sensitive data detection (L1.5 Layer). |
| ENABLE_DLP | false | OSS | Enables output scanning (Data Loss Prevention). |
| ENABLE_METRICS | true | OSS | Enables /metrics endpoint for Prometheus. |
| SHADOW_MODE | true | OSS | Log violations without blocking traffic. |
| FAIL_OPEN | false | OSS | If true, allow request on layer errors. |
| LOG_LEVEL | INFO | OSS | DEBUG, INFO, WARNING, ERROR. |
| LOG_FORMAT | json | OSS | json or text. |
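
To illustrate how SHADOW_MODE and FAIL_OPEN might combine, here is a minimal Python sketch. It is not SafeLLM's implementation (the real settings are parsed by Pydantic). The helper names `env_bool` and `decide`, the accepted truthy strings, and the rule that layer errors are governed by FAIL_OPEN alone are assumptions made for illustration.

```python
import os

# Assumption: these strings count as "true"; the real parser may accept others.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_bool(name, default, environ=None):
    """Read a boolean flag, falling back to the documented default."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


def decide(violation, layer_error, environ=None):
    """Hypothetical decision rule for a single request."""
    shadow_mode = env_bool("SHADOW_MODE", True, environ)
    fail_open = env_bool("FAIL_OPEN", False, environ)
    if layer_error:
        # FAIL_OPEN=true lets the request through when a layer fails.
        return "allow" if fail_open else "block"
    if violation:
        # SHADOW_MODE=true logs the violation but does not block traffic.
        return "log_only" if shadow_mode else "block"
    return "allow"
```

With the documented defaults, `decide(True, False, {})` returns `"log_only"`, which matches the intent of shipping with shadow mode enabled.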

In the Open Source (OSS) edition, certain features are hardcoded or limited to ensure maximum performance with minimal dependencies:

  • Shadow Mode: Enabled by default (SHADOW_MODE=true) to avoid accidental blocking on first install.
  • PII Detection: Respects USE_FAST_PII. However, since GLiNER is not available in OSS, setting USE_FAST_PII=false will effectively disable PII detection.
  • DLP Mode: DLP_MODE is restricted to log in OSS. block and anonymize require the Enterprise build.
  • Redis: Supports standalone Redis only.
  • AI Guard: Disabled (L2 layer returns safe by default in OSS builds).
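
The OSS limits above can be summarised as a small lookup. This is only a sketch under stated assumptions: the function name `effective_settings`, the `"regex"` and `"gliner"` labels, and the choice to silently coerce DLP_MODE to `log` in OSS (rather than reject it) are hypothetical and not confirmed by the source.

```python
def effective_settings(edition, use_fast_pii, dlp_mode):
    """Return the PII engine and DLP mode that would apply (hypothetical)."""
    if edition == "ENT":
        # Enterprise: USE_FAST_PII=false selects the GLiNER models.
        engine = "regex" if use_fast_pii else "gliner"
        return {"pii_engine": engine, "dlp_mode": dlp_mode}
    # OSS: GLiNER is unavailable, so USE_FAST_PII=false leaves no engine at all.
    engine = "regex" if use_fast_pii else None
    # OSS: block and anonymize need the Enterprise build; assume fallback to log.
    return {"pii_engine": engine, "dlp_mode": "log"}
```

For example, `effective_settings("OSS", False, "block")` yields no PII engine and a DLP mode of `log`, which is the "effectively disabled" case described in the PII Detection bullet.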

These variables only take effect in the Enterprise Edition:

| Variable | Default | Description |
| --- | --- | --- |
| USE_FAST_PII | false | Set to false to use high-precision GLiNER AI models. |
| REDIS_SENTINEL_ENABLED | false | Enables Redis Sentinel for high availability. |
| REDIS_SENTINEL_HOSTS | redis-sentinel:26379 | Comma-separated list of Sentinel nodes (e.g., host1:26379,host2:26379). |
| REDIS_SENTINEL_MASTER | mymaster | Sentinel master name. |
| REDIS_SENTINEL_PASSWORD | - | Optional password for Sentinel nodes. |
| USE_DISTRIBUTED_COALESCER | false | Prevents “thundering herd” by syncing in-flight requests via Redis. |
| COALESCER_LOCK_TTL | 30 | Distributed lock TTL in seconds (prevents deadlocks). |
| COALESCER_RESULT_TTL | 5 | How long to keep results for late followers. |
| ENABLE_AUDIT_LOGS | false | Enables persistent audit logging (Redis Queue -> Worker). |
| AUDIT_QUEUE_NAME | safellm:audit_logs | Redis key for the audit log queue. |
| AUDIT_REDIS_FALLBACK_TO_FILE | false | If Redis fails, write audit logs to a local JSONL file. |
| AUDIT_LOG_PATH | /app/audit_logs/audit.jsonl | Path for fallback audit logs. |
| DASHBOARD_ADMIN_KEY | - | API key required to access the administrative panel. |
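
As an illustration of the REDIS_SENTINEL_HOSTS format, the sketch below splits the comma-separated value into (host, port) pairs. The function name `parse_sentinel_hosts` and its error handling are assumptions; SafeLLM's own parsing may differ.

```python
def parse_sentinel_hosts(raw):
    """Turn 'host1:26379,host2:26379' into [('host1', 26379), ('host2', 26379)]."""
    nodes = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas and surrounding whitespace
        host, sep, port = entry.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected host:port, got {entry!r}")
        nodes.append((host, int(port)))  # int() raises ValueError on a bad port
    return nodes
```

A list of (host, port) tuples is also the shape that common Redis Sentinel clients expect for their node list, so the result can typically be passed on without further conversion.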

| Variable | Default | Description |
| --- | --- | --- |
| L1_BLOCKED_PHRASES | (Internal) | JSON list ["a", "b"] or CSV a,b. Overrides the default security list. |
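
The two accepted formats for L1_BLOCKED_PHRASES could be handled roughly as follows. This is a sketch, not the project's parser: the name `parse_blocked_phrases` and the rule "a leading `[` means JSON, anything else is CSV" are assumptions.

```python
import json


def parse_blocked_phrases(raw):
    """Accept either a JSON list (["a", "b"]) or a CSV string (a,b)."""
    raw = raw.strip()
    if raw.startswith("["):
        phrases = json.loads(raw)  # raises ValueError on malformed JSON
    else:
        phrases = raw.split(",")
    # Normalise whitespace and drop empty entries.
    cleaned = [str(p).strip() for p in phrases]
    return [p for p in cleaned if p]
```

The JSON form is the safer choice when a phrase itself contains a comma, since the CSV form would split it.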

| Variable | Default | Description |
| --- | --- | --- |
| L3_PII_ENTITIES | (Common) | List of PII types (e.g., EMAIL_ADDRESS,PHONE_NUMBER,CREDIT_CARD). |
| L3_PII_THRESHOLD | 0.7 | Confidence threshold for detection (GLiNER or Regex). |
| L3_PII_LANGUAGE | en | Language for GLiNER analysis (ENT only). |
| CUSTOM_FAST_PII_PATTERNS | {} | JSON dict of custom regex patterns (e.g., {"MY_ID":"ID-[0-9]{4}"}). |
| CUSTOM_FAST_PII_MAX_PATTERNS | 50 | Maximum number of allowed custom regex patterns. |
| CUSTOM_FAST_PII_MAX_PATTERN_LENGTH | 256 | Maximum characters in a single custom regex string. |
| CUSTOM_FAST_PII_MAX_TEXT_LENGTH | 20000 | Skip custom regex for texts longer than this to avoid ReDoS risk. |
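
The three CUSTOM_FAST_PII_MAX_* limits can be read as guards around pattern loading and scanning. The sketch below uses the documented defaults; the function names and the choice to raise an error when a limit is exceeded (instead of, say, ignoring the extra patterns) are assumptions.

```python
import json
import re

MAX_PATTERNS = 50          # CUSTOM_FAST_PII_MAX_PATTERNS default
MAX_PATTERN_LENGTH = 256   # CUSTOM_FAST_PII_MAX_PATTERN_LENGTH default
MAX_TEXT_LENGTH = 20000    # CUSTOM_FAST_PII_MAX_TEXT_LENGTH default


def load_custom_patterns(raw):
    """Parse CUSTOM_FAST_PII_PATTERNS and enforce the count and length limits."""
    patterns = json.loads(raw)
    if not isinstance(patterns, dict):
        raise ValueError("expected a JSON object of label -> regex")
    if len(patterns) > MAX_PATTERNS:
        raise ValueError("too many custom patterns")
    compiled = {}
    for label, pattern in patterns.items():
        if len(pattern) > MAX_PATTERN_LENGTH:
            raise ValueError(f"pattern {label!r} is too long")
        compiled[label] = re.compile(pattern)
    return compiled


def scan(text, compiled):
    """Return (label, match) pairs; skip long texts to limit ReDoS exposure."""
    if len(text) > MAX_TEXT_LENGTH:
        return []
    return [(label, m.group(0))
            for label, rx in compiled.items()
            for m in rx.finditer(text)]
```

For instance, loading `{"MY_ID":"ID-[0-9]{4}"}` and scanning `"ticket ID-1234 ok"` reports one `MY_ID` match, while the same pattern is skipped entirely for a text longer than 20000 characters.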

| Variable | Default | Description |
| --- | --- | --- |
| L2_MODEL_PATH | models/... | Path to the ONNX model file. |
| L2_THRESHOLD | 0.9 | Blocking threshold (higher = stricter). |
| L2_MAX_LENGTH | 512 | Max prompt length in tokens for the AI model. Texts longer than this are truncated. |

| Variable | Default | Description |
| --- | --- | --- |
| REDIS_HOST | localhost | Redis server address. |
| REDIS_PORT | 6379 | Redis port. |
| REDIS_DB | 0 | Redis database index. |
| REDIS_PASSWORD | - | Optional Redis password. |
| REDIS_TTL | 3600 | Cache TTL in seconds. |
| REDIS_TIMEOUT | 0.5 | Connection timeout in seconds. |

| Variable | Default | Description |
| --- | --- | --- |
| DLP_STREAMING_MODE | block | block (buffer + scan) or audit (stream + log). |
| DLP_STREAMING_WINDOW_SIZE | 4096 | Sliding window size for streaming DLP scans. |
| DLP_STREAMING_MAX_CHUNK_LENGTH | 4096 | Max chunk length accepted per streaming call. |
| DLP_STREAMING_TTL_SECONDS | 300 | TTL for stream state in seconds. |
| DLP_STREAMING_USE_REDIS | true | Store stream state in Redis for multi-worker safety. |
| DLP_MODE | block | block, anonymize (ENT only), or log (OSS). |
| DLP_PII_ENTITIES | (Common) | Entities to scan in responses. |
| DLP_PII_THRESHOLD | 0.5 | Confidence threshold for output scanning. |
| DLP_MAX_OUTPUT_LENGTH | 500000 | Max characters to scan in one response. |
| DLP_BLOCK_MESSAGE | [BLOCKED...] | Message shown to user when a leak is blocked. |
| DLP_FAIL_OPEN | false | Allow response if DLP scanner fails. |
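
To show why a sliding window matters for streaming scans, here is a minimal in-memory sketch. It keeps the last `window_size` characters so that a value split across two chunks is still detected. The class `StreamScanner`, the simplified email regex, and keeping state in a Python attribute (rather than in Redis, as DLP_STREAMING_USE_REDIS suggests) are all assumptions for illustration.

```python
import re

# Simplified email pattern for demonstration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class StreamScanner:
    def __init__(self, window_size=4096, max_chunk_length=4096):
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.window_size = window_size
        self.max_chunk_length = max_chunk_length
        self.tail = ""  # last window_size characters seen so far

    def feed(self, chunk):
        """Scan one chunk together with the retained tail; return new hits."""
        if len(chunk) > self.max_chunk_length:
            raise ValueError("chunk exceeds max_chunk_length")
        buffer = self.tail + chunk
        tail_len = len(self.tail)
        # Only report matches that extend into the new chunk.
        hits = [m.group(0) for m in EMAIL.finditer(buffer) if m.end() > tail_len]
        self.tail = buffer[-self.window_size:]
        return hits
```

Feeding `"contact: alice@exa"` and then `"mple.com now"` reports the address on the second call, even though neither chunk contains a complete match on its own. A larger window catches longer split values at the cost of rescanning more text per chunk.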

| Variable | Default | Description |
| --- | --- | --- |
| PRELOAD_MODELS | false | Load AI models once at startup (recommended for multi-worker). |
| MAX_BODY_SIZE | 1000000 | Max request body size in bytes (1 MB). |
| REQUEST_TIMEOUT | 30 | Internal request timeout in seconds. |
| ALLOW_HEADER | X-Auth-Result | Header name set by sidecar for APISIX (200 OK case). |

  • Model preload increases RAM usage. With PRELOAD_MODELS=true, each worker keeps models in memory for lower latency.
  • DLP block mode buffers full responses. Large responses consume RAM until the scan completes; keep DLP_MAX_OUTPUT_LENGTH low enough that the buffered responses of all concurrent requests fit in available memory.
  • Worker count matters. If you run multiple workers, multiply memory requirements accordingly to avoid OOM kills.
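
A rough way to apply the notes above is a back-of-the-envelope estimate. The sketch below is an assumption-laden model, not a measurement: the function name, the one-byte-per-character figure, and the sample numbers (a 500 MB model, 10 concurrent buffered responses per worker) are hypothetical and should be replaced with values observed in your deployment.

```python
def estimate_peak_memory_mb(workers, model_mb, concurrent_responses,
                            dlp_max_output_length, bytes_per_char=1):
    """Estimate peak RAM in MB (1 MB = 1,000,000 bytes) across all workers."""
    # Worst case: every concurrent response is buffered up to the DLP limit.
    buffer_mb = concurrent_responses * dlp_max_output_length * bytes_per_char / 1_000_000
    # Each worker holds its own copy of the models plus its own buffers.
    return workers * (model_mb + buffer_mb)


# Hypothetical deployment: 4 workers, 500 MB of models each,
# 10 buffered responses per worker at the default DLP_MAX_OUTPUT_LENGTH.
estimate = estimate_peak_memory_mb(4, 500, 10, 500_000)
```

With these sample inputs the estimate comes to 2020 MB, most of it from model copies, which shows why the worker count is the first thing to check when containers are killed for running out of memory.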