All SafeLLM settings can be configured using environment variables. These variables are parsed by Pydantic and can be set in a .env file or directly in the shell.
| Variable | Default | OSS / ENT | Description |
|---|---|---|---|
| `ENABLE_CACHE` | `true` | OSS | Enables the L0 layer (Redis cache). |
| `ENABLE_L1_KEYWORDS` | `true` | OSS | Enables keyword filtering. |
| `ENABLE_L2_AI` | `false` | ENT | Enables ONNX models (prompt injection detection). |
| `ENABLE_L3_PII` | `true` | OSS | Enables sensitive data detection (L1.5 layer). |
| `ENABLE_DLP` | `false` | OSS | Enables output scanning (Data Loss Prevention). |
| `ENABLE_METRICS` | `true` | OSS | Enables the `/metrics` endpoint for Prometheus. |
| `SHADOW_MODE` | `true` | OSS | Logs violations without blocking traffic. |
| `FAIL_OPEN` | `false` | OSS | If `true`, allows the request when a layer errors. |
| `LOG_LEVEL` | `INFO` | OSS | `DEBUG`, `INFO`, `WARNING`, or `ERROR`. |
| `LOG_FORMAT` | `json` | OSS | `json` or `text`. |
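For example, a conservative first-install configuration might look like this in `.env` (an illustrative sketch; the values shown are the documented defaults):

```
# Log violations without blocking while you validate the rules.
SHADOW_MODE=true
ENABLE_CACHE=true
ENABLE_L1_KEYWORDS=true
ENABLE_L3_PII=true
ENABLE_DLP=false
LOG_LEVEL=INFO
LOG_FORMAT=json
```

Once you are satisfied with what is being logged, flipping `SHADOW_MODE=false` switches the gateway into enforcing mode.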
In the Open Source (OSS) edition, certain features are hardcoded or limited to ensure maximum performance with minimal dependencies:

- Shadow Mode: Enabled by default (`SHADOW_MODE=true`) to avoid accidental blocking on first install.
- PII Detection: Respects `USE_FAST_PII`. However, since GLiNER is not available in OSS, setting `USE_FAST_PII=false` effectively disables PII detection.
- DLP Mode: `DLP_MODE` is restricted to `log` in OSS; `block` and `anonymize` require the Enterprise build.
- Redis: Supports standalone Redis only.
- AI Guard: Disabled (the L2 layer returns `safe` by default in OSS builds).
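The interplay between `SHADOW_MODE` and `FAIL_OPEN` can be sketched roughly as follows (illustrative logic only; the function and return values are hypothetical, not SafeLLM's actual code):

```python
def decide(violation_found: bool, layer_error: bool,
           shadow_mode: bool = True, fail_open: bool = False) -> str:
    """Illustrative request decision combining SHADOW_MODE and FAIL_OPEN."""
    if layer_error:
        # FAIL_OPEN=true lets traffic through when a layer itself fails.
        return "allow" if fail_open else "block"
    if violation_found:
        # SHADOW_MODE=true logs the violation but never blocks.
        return "allow (logged)" if shadow_mode else "block"
    return "allow"

# With OSS defaults, a detected violation is logged but not blocked:
print(decide(violation_found=True, layer_error=False))  # allow (logged)
```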
These variables only take effect in the Enterprise Edition:

| Variable | Default | Description |
|---|---|---|
| `USE_FAST_PII` | `false` | Set to `false` to use the high-precision GLiNER AI models. |
| `REDIS_SENTINEL_ENABLED` | `false` | Enables Redis Sentinel for high availability. |
| `REDIS_SENTINEL_HOSTS` | `redis-sentinel:26379` | Comma-separated list of Sentinel nodes (e.g., `host1:26379,host2:26379`). |
| `REDIS_SENTINEL_MASTER` | `mymaster` | Sentinel master name. |
| `REDIS_SENTINEL_PASSWORD` | - | Optional password for Sentinel nodes. |
| `USE_DISTRIBUTED_COALESCER` | `false` | Prevents a "thundering herd" by syncing in-flight requests via Redis. |
| `COALESCER_LOCK_TTL` | `30` | Distributed lock TTL in seconds (prevents deadlocks). |
| `COALESCER_RESULT_TTL` | `5` | How long to keep results for late followers, in seconds. |
| `ENABLE_AUDIT_LOGS` | `false` | Enables persistent audit logging (Redis queue -> worker). |
| `AUDIT_QUEUE_NAME` | `safellm:audit_logs` | Redis key for the audit log queue. |
| `AUDIT_REDIS_FALLBACK_TO_FILE` | `false` | If Redis fails, write audit logs to a local JSONL file. |
| `AUDIT_LOG_PATH` | `/app/audit_logs/audit.jsonl` | Path for fallback audit logs. |
| `DASHBOARD_ADMIN_KEY` | - | API key required to access the administrative panel. |
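A minimal sketch of how a comma-separated `REDIS_SENTINEL_HOSTS` value can be parsed into `(host, port)` pairs (the helper name and fallback port are assumptions for illustration; SafeLLM's actual parsing may differ):

```python
import os

def parse_sentinel_hosts(raw: str) -> list[tuple[str, int]]:
    """Split 'host1:26379,host2:26379' into [(host, port), ...]."""
    nodes = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.partition(":")
        # Assume the conventional Sentinel port when none is given.
        nodes.append((host, int(port or 26379)))
    return nodes

raw = os.environ.get("REDIS_SENTINEL_HOSTS", "redis-sentinel:26379")
print(parse_sentinel_hosts(raw))
```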
The following variable configures the L1 keyword filter:

| Variable | Default | Description |
|---|---|---|
| `L1_BLOCKED_PHRASES` | (Internal) | JSON list (`["a", "b"]`) or CSV (`a,b`). Overrides the default security list. |
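Accepting either format could be handled along these lines (a sketch, assuming JSON is tried first and CSV is the fallback; not SafeLLM's actual code):

```python
import json

def parse_blocked_phrases(raw: str) -> list[str]:
    """Accept either a JSON list ('["a", "b"]') or a CSV string ('a,b')."""
    raw = raw.strip()
    if raw.startswith("["):
        try:
            parsed = json.loads(raw)
            if isinstance(parsed, list):
                return [str(p) for p in parsed]
        except json.JSONDecodeError:
            pass  # fall through to CSV handling
    return [p.strip() for p in raw.split(",") if p.strip()]

print(parse_blocked_phrases('["ignore previous", "system prompt"]'))
print(parse_blocked_phrases("ignore previous,system prompt"))
```

Both calls yield the same two-phrase list, so operators can use whichever format is easier to manage in their environment.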
The following variables configure L3 PII detection:

| Variable | Default | Description |
|---|---|---|
| `L3_PII_ENTITIES` | (Common) | List of PII types to detect (e.g., `EMAIL_ADDRESS,PHONE_NUMBER,CREDIT_CARD`). |
| `L3_PII_THRESHOLD` | `0.7` | Confidence threshold for detection (GLiNER or regex). |
| `L3_PII_LANGUAGE` | `en` | Language for GLiNER analysis (ENT only). |
| `CUSTOM_FAST_PII_PATTERNS` | `{}` | JSON dict of custom regex patterns (e.g., `{"MY_ID": "ID-[0-9]{4}"}`). |
| `CUSTOM_FAST_PII_MAX_PATTERNS` | `50` | Maximum number of allowed custom regex patterns. |
| `CUSTOM_FAST_PII_MAX_PATTERN_LENGTH` | `256` | Maximum characters in a single custom regex string. |
| `CUSTOM_FAST_PII_MAX_TEXT_LENGTH` | `20000` | Skip custom regexes for texts longer than this to avoid ReDoS risk. |
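How the custom-pattern limits might be enforced can be sketched like this (function name and error handling are hypothetical; only the limit values come from the table above):

```python
import json
import re

MAX_PATTERNS = 50          # CUSTOM_FAST_PII_MAX_PATTERNS
MAX_PATTERN_LENGTH = 256   # CUSTOM_FAST_PII_MAX_PATTERN_LENGTH

def load_custom_patterns(raw: str) -> dict[str, re.Pattern]:
    """Parse a CUSTOM_FAST_PII_PATTERNS value and enforce the documented limits."""
    data = json.loads(raw)
    if len(data) > MAX_PATTERNS:
        raise ValueError(f"too many patterns (max {MAX_PATTERNS})")
    compiled = {}
    for name, pattern in data.items():
        if len(pattern) > MAX_PATTERN_LENGTH:
            raise ValueError(f"pattern {name!r} exceeds {MAX_PATTERN_LENGTH} chars")
        compiled[name] = re.compile(pattern)
    return compiled

patterns = load_custom_patterns('{"MY_ID": "ID-[0-9]{4}"}')
print(bool(patterns["MY_ID"].search("ticket ID-1234")))  # True
```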
The following variables configure the L2 AI guard:

| Variable | Default | Description |
|---|---|---|
| `L2_MODEL_PATH` | `models/...` | Path to the ONNX model file. |
| `L2_THRESHOLD` | `0.9` | Blocking threshold (higher = stricter). |
| `L2_MAX_LENGTH` | `512` | Max prompt length in tokens for the AI model; longer texts are truncated. |
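In effect, the L2 layer compares the model's score against `L2_THRESHOLD` (an illustrative sketch under that assumption; the function name is hypothetical):

```python
def l2_decision(injection_score: float, threshold: float = 0.9) -> str:
    """Illustrative L2 check: block when the model's injection score
    reaches L2_THRESHOLD (a sketch, not SafeLLM's actual code)."""
    return "block" if injection_score >= threshold else "safe"

print(l2_decision(0.95))  # block
print(l2_decision(0.40))  # safe
```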
The following variables configure the Redis connection:

| Variable | Default | Description |
|---|---|---|
| `REDIS_HOST` | `localhost` | Redis server address. |
| `REDIS_PORT` | `6379` | Redis port. |
| `REDIS_DB` | `0` | Redis database index. |
| `REDIS_PASSWORD` | - | Optional Redis password. |
| `REDIS_TTL` | `3600` | Cache TTL in seconds. |
| `REDIS_TIMEOUT` | `0.5` | Connection timeout in seconds. |
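Collecting these into a connection config might look like this (a sketch; the helper name is hypothetical, and the defaults are taken from the table above):

```python
import os

def redis_settings() -> dict:
    """Gather REDIS_* variables with their documented defaults."""
    return {
        "host": os.environ.get("REDIS_HOST", "localhost"),
        "port": int(os.environ.get("REDIS_PORT", "6379")),
        "db": int(os.environ.get("REDIS_DB", "0")),
        "password": os.environ.get("REDIS_PASSWORD") or None,
        "socket_timeout": float(os.environ.get("REDIS_TIMEOUT", "0.5")),
    }

print(redis_settings())
```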
The following variables configure output scanning (DLP):

| Variable | Default | Description |
|---|---|---|
| `DLP_STREAMING_MODE` | `block` | `block` (buffer + scan) or `audit` (stream + log). |
| `DLP_STREAMING_WINDOW_SIZE` | `4096` | Sliding window size for streaming DLP scans. |
| `DLP_STREAMING_MAX_CHUNK_LENGTH` | `4096` | Max chunk length accepted per streaming call. |
| `DLP_STREAMING_TTL_SECONDS` | `300` | TTL for stream state in seconds. |
| `DLP_STREAMING_USE_REDIS` | `true` | Store stream state in Redis for multi-worker safety. |
| `DLP_MODE` | `block` | `block`, `anonymize` (ENT only), or `log` (OSS). |
| `DLP_PII_ENTITIES` | (Common) | Entities to scan for in responses. |
| `DLP_PII_THRESHOLD` | `0.5` | Confidence threshold for output scanning. |
| `DLP_MAX_OUTPUT_LENGTH` | `500000` | Max characters to scan in one response. |
| `DLP_BLOCK_MESSAGE` | `[BLOCKED...]` | Message shown to the user when a leak is blocked. |
| `DLP_FAIL_OPEN` | `false` | Allow the response if the DLP scanner fails. |
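The sliding-window idea behind `DLP_STREAMING_WINDOW_SIZE` is that the scanner keeps the tail of the stream, so a secret split across chunk boundaries is still caught. A simplified sketch (the email regex and function are illustrative stand-ins, not SafeLLM's detectors):

```python
import re

WINDOW_SIZE = 4096  # DLP_STREAMING_WINDOW_SIZE
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def scan_stream(chunks) -> set[str]:
    """Scan streamed chunks with a sliding window so matches that
    straddle chunk boundaries are not missed."""
    window, hits = "", set()
    for chunk in chunks:
        window = (window + chunk)[-WINDOW_SIZE:]  # keep only the tail
        hits.update(EMAIL.findall(window))
    return hits

# The address is split across two chunks but still detected:
print(scan_stream(["contact: alice@exa", "mple.com thanks"]))
```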
The following variables tune server behavior and performance:

| Variable | Default | Description |
|---|---|---|
| `PRELOAD_MODELS` | `false` | Load AI models once at startup (recommended for multi-worker deployments). |
| `MAX_BODY_SIZE` | `1000000` | Max request body size in bytes (1 MB). |
| `REQUEST_TIMEOUT` | `30` | Internal request timeout in seconds. |
| `ALLOW_HEADER` | `X-Auth-Result` | Header name set by the sidecar for APISIX (200 OK case). |
- Model preload increases RAM usage. With `PRELOAD_MODELS=true`, each worker keeps models in memory for lower latency.
- DLP `block` mode buffers full responses. Large responses consume RAM until the scan completes; keep `DLP_MAX_OUTPUT_LENGTH` sane.
- Worker count matters. If you run multiple workers, multiply memory requirements accordingly to avoid OOM kills.
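The sizing arithmetic above can be sketched as follows (all numbers are illustrative assumptions, not measured SafeLLM figures):

```python
def estimated_ram_mb(workers: int, model_mb: float, base_mb: float = 150.0) -> float:
    """Rough sizing sketch: with PRELOAD_MODELS=true, each worker holds
    its own copy of the models on top of a per-worker base footprint."""
    return workers * (base_mb + model_mb)

# e.g., 4 workers, assuming ~500 MB of models and ~150 MB base per worker:
print(estimated_ram_mb(workers=4, model_mb=500))  # 2600.0
```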