All SafeLLM settings can be configured using environment variables. These variables are parsed by Pydantic and can be set in a .env file or directly in the shell.
| Variable | Default | OSS / ENT | Description |
|---|---|---|---|
| `ENABLE_CACHE` | `true` | OSS | Enables the L0 layer (Redis cache). |
| `ENABLE_L1_KEYWORDS` | `true` | OSS | Enables keyword filtering. |
| `ENABLE_L2_AI` | `false` | ENT | Enables ONNX models (prompt injection detection). |
| `ENABLE_L3_PII` | `true` | OSS | Enables sensitive data detection (L1.5 layer). |
| `ENABLE_DLP` | `false` | OSS | Enables output scanning (Data Loss Prevention). |
| `ENABLE_METRICS` | `true` | OSS | Enables the `/metrics` endpoint for Prometheus. |
| `SHADOW_MODE` | `true` | OSS | Logs violations without blocking traffic. |
| `FAIL_OPEN` | `false` | OSS | If `true`, allows the request when a layer errors. |
| `LOG_LEVEL` | `INFO` | OSS | `DEBUG`, `INFO`, `WARNING`, or `ERROR`. |
| `LOG_FORMAT` | `json` | OSS | `json` or `text`. |
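For example, a conservative first-install configuration might look like this in `.env` (an illustrative sketch; the values shown are the documented defaults):

```
# Log violations without blocking while you validate the rules.
SHADOW_MODE=true
ENABLE_CACHE=true
ENABLE_L1_KEYWORDS=true
ENABLE_L3_PII=true
ENABLE_DLP=false
LOG_LEVEL=INFO
LOG_FORMAT=json
```

Once you are satisfied with what is being logged, flipping `SHADOW_MODE=false` switches the gateway into enforcing mode.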
In the Open Source (OSS) edition, certain features are hardcoded or limited to ensure maximum performance with minimal dependencies:

- Shadow Mode: Enabled by default (`SHADOW_MODE=true`) to avoid accidental blocking on first install.
- PII Detection: Respects `USE_FAST_PII`. However, since GLiNER is not available in OSS, setting `USE_FAST_PII=false` effectively disables PII detection.
- DLP Mode: `DLP_MODE` is restricted to `log` in OSS; `block` and `anonymize` require the Enterprise build.
- Redis: Supports standalone Redis only.
- AI Guard: Disabled (the L2 layer returns `safe` by default in OSS builds).
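The interplay between `SHADOW_MODE` and `FAIL_OPEN` can be sketched roughly as follows (illustrative logic only; the function and return values are hypothetical, not SafeLLM's actual code):

```python
def decide(violation_found: bool, layer_error: bool,
           shadow_mode: bool = True, fail_open: bool = False) -> str:
    """Illustrative request decision combining SHADOW_MODE and FAIL_OPEN."""
    if layer_error:
        # FAIL_OPEN=true lets traffic through when a layer itself fails.
        return "allow" if fail_open else "block"
    if violation_found:
        # SHADOW_MODE=true logs the violation but never blocks.
        return "allow (logged)" if shadow_mode else "block"
    return "allow"

# With OSS defaults, a detected violation is logged but not blocked:
print(decide(violation_found=True, layer_error=False))  # allow (logged)
```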
These variables only take effect in the Enterprise Edition:

| Variable | Default | Description |
|---|---|---|
| `USE_FAST_PII` | `false` | Set to `false` to use the high-precision GLiNER AI models. |
| `REDIS_SENTINEL_ENABLED` | `false` | Enables Redis Sentinel for high availability. |
| `REDIS_SENTINEL_HOSTS` | `redis-sentinel:26379` | Comma-separated list of Sentinel nodes (e.g., `host1:26379,host2:26379`). |
| `REDIS_SENTINEL_MASTER` | `mymaster` | Sentinel master name. |
| `REDIS_SENTINEL_PASSWORD` | - | Optional password for Sentinel nodes. |
| `USE_DISTRIBUTED_COALESCER` | `false` | Prevents a "thundering herd" by syncing in-flight requests via Redis. |
| `COALESCER_LOCK_TTL` | `30` | Distributed lock TTL in seconds (prevents deadlocks). |
| `COALESCER_RESULT_TTL` | `5` | How long to keep results for late followers, in seconds. |
| `ENABLE_AUDIT_LOGS` | `false` | Enables persistent audit logging (Redis queue -> worker). |
| `AUDIT_QUEUE_NAME` | `safellm:audit_logs` | Redis key for the audit log queue. |
| `AUDIT_REDIS_FALLBACK_TO_FILE` | `false` | If Redis fails, write audit logs to a local JSONL file. |
| `AUDIT_LOG_PATH` | `/app/audit_logs/audit.jsonl` | Path for fallback audit logs. |
| `DASHBOARD_ADMIN_KEY` | - | API key required to access the administrative panel. |
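A minimal sketch of how a comma-separated `REDIS_SENTINEL_HOSTS` value can be parsed into `(host, port)` pairs (the helper name and fallback port are assumptions for illustration; SafeLLM's actual parsing may differ):

```python
import os

def parse_sentinel_hosts(raw: str) -> list[tuple[str, int]]:
    """Split 'host1:26379,host2:26379' into [(host, port), ...]."""
    nodes = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.partition(":")
        # Assume the conventional Sentinel port when none is given.
        nodes.append((host, int(port or 26379)))
    return nodes

raw = os.environ.get("REDIS_SENTINEL_HOSTS", "redis-sentinel:26379")
print(parse_sentinel_hosts(raw))
```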
The following variable configures the L1 keyword filter:

| Variable | Default | Description |
|---|---|---|
| `L1_BLOCKED_PHRASES` | (Internal) | JSON list (`["a", "b"]`) or CSV (`a,b`). Overrides the default security list. |
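Accepting either format could be handled along these lines (a sketch, assuming JSON is tried first and CSV is the fallback; not SafeLLM's actual code):

```python
import json

def parse_blocked_phrases(raw: str) -> list[str]:
    """Accept either a JSON list ('["a", "b"]') or a CSV string ('a,b')."""
    raw = raw.strip()
    if raw.startswith("["):
        try:
            parsed = json.loads(raw)
            if isinstance(parsed, list):
                return [str(p) for p in parsed]
        except json.JSONDecodeError:
            pass  # fall through to CSV handling
    return [p.strip() for p in raw.split(",") if p.strip()]

print(parse_blocked_phrases('["ignore previous", "system prompt"]'))
print(parse_blocked_phrases("ignore previous,system prompt"))
```

Both calls yield the same two-phrase list, so operators can use whichever format is easier to manage in their environment.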
The following variables configure L3 PII detection:

| Variable | Default | Description |
|---|---|---|
| `L3_PII_ENTITIES` | (Common) | List of PII types to detect (e.g., `EMAIL_ADDRESS,PHONE_NUMBER,CREDIT_CARD`). |
| `L3_PII_THRESHOLD` | `0.7` | Confidence threshold for detection (GLiNER or regex). |
| `L3_PII_LANGUAGE` | `en` | Language for GLiNER analysis (ENT only). |
| `CUSTOM_FAST_PII_PATTERNS` | `{}` | JSON dict of custom regex patterns (e.g., `{"MY_ID": "ID-[0-9]{4}"}`). |
| `CUSTOM_FAST_PII_MAX_PATTERNS` | `50` | Maximum number of allowed custom regex patterns. |
| `CUSTOM_FAST_PII_MAX_PATTERN_LENGTH` | `256` | Maximum characters in a single custom regex string. |
| `CUSTOM_FAST_PII_MAX_TEXT_LENGTH` | `20000` | Skip custom regexes for texts longer than this to avoid ReDoS risk. |
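How the custom-pattern limits might be enforced can be sketched like this (function name and error handling are hypothetical; only the limit values come from the table above):

```python
import json
import re

MAX_PATTERNS = 50          # CUSTOM_FAST_PII_MAX_PATTERNS
MAX_PATTERN_LENGTH = 256   # CUSTOM_FAST_PII_MAX_PATTERN_LENGTH

def load_custom_patterns(raw: str) -> dict[str, re.Pattern]:
    """Parse a CUSTOM_FAST_PII_PATTERNS value and enforce the documented limits."""
    data = json.loads(raw)
    if len(data) > MAX_PATTERNS:
        raise ValueError(f"too many patterns (max {MAX_PATTERNS})")
    compiled = {}
    for name, pattern in data.items():
        if len(pattern) > MAX_PATTERN_LENGTH:
            raise ValueError(f"pattern {name!r} exceeds {MAX_PATTERN_LENGTH} chars")
        compiled[name] = re.compile(pattern)
    return compiled

patterns = load_custom_patterns('{"MY_ID": "ID-[0-9]{4}"}')
print(bool(patterns["MY_ID"].search("ticket ID-1234")))  # True
```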
The following variables configure the L2 AI guard:

| Variable | Default | Description |
|---|---|---|
| `L2_MODEL_PATH` | `models/...` | Path to the ONNX model file. |
| `L2_THRESHOLD` | `0.9` | Blocking threshold (higher = stricter). |
| `L2_MAX_LENGTH` | `512` | Max prompt length in tokens for the AI model; longer texts are truncated. |
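In effect, the L2 layer compares the model's score against `L2_THRESHOLD` (an illustrative sketch under that assumption; the function name is hypothetical):

```python
def l2_decision(injection_score: float, threshold: float = 0.9) -> str:
    """Illustrative L2 check: block when the model's injection score
    reaches L2_THRESHOLD (a sketch, not SafeLLM's actual code)."""
    return "block" if injection_score >= threshold else "safe"

print(l2_decision(0.95))  # block
print(l2_decision(0.40))  # safe
```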
The following variables configure the Redis connection:

| Variable | Default | Description |
|---|---|---|
| `REDIS_HOST` | `localhost` | Redis server address. |
| `REDIS_PORT` | `6379` | Redis port. |
| `REDIS_DB` | `0` | Redis database index. |
| `REDIS_PASSWORD` | - | Optional Redis password. |
| `REDIS_TTL` | `3600` | Cache TTL in seconds. |
| `REDIS_TIMEOUT` | `0.5` | Connection timeout in seconds. |
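Collecting these into a connection config might look like this (a sketch; the helper name is hypothetical, and the defaults are taken from the table above):

```python
import os

def redis_settings() -> dict:
    """Gather REDIS_* variables with their documented defaults."""
    return {
        "host": os.environ.get("REDIS_HOST", "localhost"),
        "port": int(os.environ.get("REDIS_PORT", "6379")),
        "db": int(os.environ.get("REDIS_DB", "0")),
        "password": os.environ.get("REDIS_PASSWORD") or None,
        "socket_timeout": float(os.environ.get("REDIS_TIMEOUT", "0.5")),
    }

print(redis_settings())
```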
The following variables configure output scanning (DLP):

| Variable | Default | Description |
|---|---|---|
| `DLP_STREAMING_MODE` | `block` | `block` (buffer + scan) or `audit` (stream + log). |
| `DLP_STREAMING_WINDOW_SIZE` | `4096` | Sliding window size for streaming DLP scans. |
| `DLP_STREAMING_MAX_CHUNK_LENGTH` | `4096` | Max chunk length accepted per streaming call. |
| `DLP_STREAMING_TTL_SECONDS` | `300` | TTL for stream state in seconds. |
| `DLP_STREAMING_USE_REDIS` | `true` | Store stream state in Redis for multi-worker safety. |
| `DLP_MODE` | `block` | `block`, `anonymize` (ENT only), or `log` (OSS). |
| `DLP_PII_ENTITIES` | (Common) | Entities to scan for in responses. |
| `DLP_PII_THRESHOLD` | `0.5` | Confidence threshold for output scanning. |
| `DLP_MAX_OUTPUT_LENGTH` | `500000` | Max characters to scan in one response. |
| `DLP_BLOCK_MESSAGE` | `[BLOCKED...]` | Message shown to the user when a leak is blocked. |
| `DLP_FAIL_OPEN` | `false` | Allow the response if the DLP scanner fails. |
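The sliding-window idea behind `DLP_STREAMING_WINDOW_SIZE` is that the scanner keeps the tail of the stream, so a secret split across chunk boundaries is still caught. A simplified sketch (the email regex and function are illustrative stand-ins, not SafeLLM's detectors):

```python
import re

WINDOW_SIZE = 4096  # DLP_STREAMING_WINDOW_SIZE
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def scan_stream(chunks) -> set[str]:
    """Scan streamed chunks with a sliding window so matches that
    straddle chunk boundaries are not missed."""
    window, hits = "", set()
    for chunk in chunks:
        window = (window + chunk)[-WINDOW_SIZE:]  # keep only the tail
        hits.update(EMAIL.findall(window))
    return hits

# The address is split across two chunks but still detected:
print(scan_stream(["contact: alice@exa", "mple.com thanks"]))
```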
The following variables tune server behavior and performance:

| Variable | Default | Description |
|---|---|---|
| `PRELOAD_MODELS` | `false` | Load AI models once at startup (recommended for multi-worker deployments). |
| `MAX_BODY_SIZE` | `1000000` | Max request body size in bytes (1 MB). |
| `REQUEST_TIMEOUT` | `30` | Internal request timeout in seconds. |
| `ALLOW_HEADER` | `X-Auth-Result` | Header name set by the sidecar for APISIX (200 OK case). |
- Model preload increases RAM usage. With `PRELOAD_MODELS=true`, each worker keeps models in memory for lower latency.
- DLP `block` mode buffers full responses. Large responses consume RAM until the scan completes; keep `DLP_MAX_OUTPUT_LENGTH` sane.
- Worker count matters. If you run multiple workers, multiply memory requirements accordingly to avoid OOM kills.
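The sizing arithmetic above can be sketched as follows (all numbers are illustrative assumptions, not measured SafeLLM figures):

```python
def estimated_ram_mb(workers: int, model_mb: float, base_mb: float = 150.0) -> float:
    """Rough sizing sketch: with PRELOAD_MODELS=true, each worker holds
    its own copy of the models on top of a per-worker base footprint."""
    return workers * (base_mb + model_mb)

# e.g., 4 workers, assuming ~500 MB of models and ~150 MB base per worker:
print(estimated_ram_mb(workers=4, model_mb=500))  # 2600.0
```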