
L0: Smart Cache

The L0 layer is the first stage of the SafeLLM pipeline. Its job is to reduce cost and response time by avoiding repeated scans of, and queries to, the LLM models for prompts that have already been seen.

Benefits:

  • Near-zero latency: decision in < 0.1 ms on a cache hit.
  • Cost reduction: prevents repetitive, dangerous prompts from being sent to expensive AI models.
  • Performance: relieves the AI layers (L1.5, L2) from re-analyzing the same texts.

When to use it:

  • Cost optimization: when you receive many repetitive queries (e.g., “Help”, “Translate this”, or common greetings) and want to save tokens.
  • DDoS protection: to prevent attackers from draining your LLM credits by sending the same malicious prompt repeatedly.
  • Latency-sensitive apps: to get instant responses for known queries.

Things to watch out for:

  • Cache poisoning: if an attacker manages to get a malicious prompt cached as “safe” (unlikely with SafeLLM’s waterfall design, but possible if rules change), subsequent users will get the cached decision.
  • Redis connection issues: ensure Redis is highly available. If the sidecar cannot connect to Redis, it will bypass the cache (fail-open for the cache), which increases load on the other layers and the LLM.
  • Memory management: without a proper maxmemory-policy, Redis can grow indefinitely. Always use volatile-lru or allkeys-lru.
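The fail-open behaviour can be sketched as follows. This is a minimal illustration, not SafeLLM's actual code: `cached_decision`, `scan`, and the in-memory `FakeRedis` stub are hypothetical names introduced so the example is self-contained; any object with Redis-style `get`/`setex` methods would do.

```python
import hashlib

class FakeRedis:
    """In-memory stand-in for a Redis client (illustration only)."""
    def __init__(self, fail=False):
        self.store, self.fail = {}, fail
    def get(self, key):
        if self.fail:
            raise ConnectionError("redis unreachable")
        return self.store.get(key)
    def setex(self, key, ttl, value):
        if self.fail:
            raise ConnectionError("redis unreachable")
        self.store[key] = value

def cached_decision(client, prompt, scan):
    """Return (verdict, was_cache_hit). On any Redis error, bypass
    the cache (fail-open) and fall through to a full scan."""
    key = "l0:" + hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    try:
        hit = client.get(key)
        if hit is not None:
            return hit, True            # cache hit: near-zero latency
    except Exception:
        return scan(prompt), False      # cache down: fail-open, full scan
    verdict = scan(prompt)              # cache miss: run the full pipeline
    try:
        client.setex(key, 86400, verdict)   # TTL mirrors REDIS_TTL above
    except Exception:
        pass                            # caching is best-effort
    return verdict, False
```

With a healthy client the second identical prompt is served from the cache; with the client failing, every call falls through to the full scan, which is exactly why an L0 outage increases load on the downstream layers.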
ENABLE_CACHE=true
REDIS_HOST="redis-server.internal"
REDIS_PORT=6379
REDIS_TTL=86400 # 24 hours
REDIS_TIMEOUT=0.2 # 200ms timeout for Redis ops
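As a sketch of how a sidecar might consume these settings (the variable names come from the block above; the fallback defaults shown here are illustrative assumptions, not SafeLLM's documented defaults):

```python
import os

# Read the cache settings shown above. The fallback values are
# illustrative assumptions, not SafeLLM's documented defaults.
ENABLE_CACHE = os.getenv("ENABLE_CACHE", "false").lower() == "true"
REDIS_HOST = os.getenv("REDIS_HOST", "localhost")
REDIS_PORT = int(os.getenv("REDIS_PORT", "6379"))
REDIS_TTL = int(os.getenv("REDIS_TTL", "86400"))          # seconds
REDIS_TIMEOUT = float(os.getenv("REDIS_TIMEOUT", "0.2"))  # seconds
```

Keeping REDIS_TIMEOUT short matters: with a 200 ms ceiling on Redis operations, a slow cache degrades into a bypass rather than stalling the whole request path.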

The system uses Redis as a key-value store. The key is the SHA-256 hash of the normalized prompt text.

  • [OSS] Standalone: Single Redis instance. Ideal for smaller deployments.
  • [Enterprise (Paid)] Redis Sentinel: High availability (HA) mode. Automatic failover in case of Master node failure.
  • [Enterprise (Paid)] Distributed Coalescer: A mechanism to prevent “thundering herd”. If multiple pods receive the same prompt at the same time, only one will perform the scan, and others will wait for the result in Redis.
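The coalescing idea can be illustrated in-process. This is a sketch of the generic "single-flight" pattern using threads within one process; SafeLLM's Distributed Coalescer applies the same idea across pods via Redis, and the `SingleFlight` class here is a hypothetical illustration, not its API:

```python
import threading

class SingleFlight:
    """In-process sketch of request coalescing ("single-flight").
    The first caller for a key becomes the leader and runs the scan;
    concurrent callers for the same key wait and reuse its result."""
    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}   # key -> Event guarding an in-progress scan
        self._results = {}

    def do(self, key, fn):
        with self._lock:
            ev = self._inflight.get(key)
            leader = ev is None
            if leader:                       # first caller: run the scan
                ev = self._inflight[key] = threading.Event()
        if leader:
            self._results[key] = fn()
            ev.set()                         # wake the waiters
        else:
            ev.wait()                        # follower: wait for leader
        return self._results[key]
```

Eight simultaneous requests for the same prompt thus trigger exactly one scan; the distributed version replaces the in-memory lock and event with a Redis-held lock and the result stored in Redis.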
ENABLE_CACHE=true
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_TTL=3600 # Entry time-to-live in seconds

SafeLLM performs Unicode normalization (NFKC) before hashing. This prevents “homograph bypass” attacks, where an attacker uses visually identical characters from different Unicode sets to obtain a different hash for a semantically identical prompt.
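A minimal sketch of this key derivation, assuming the only canonicalization step is NFKC (SafeLLM may apply additional normalization; the helper name `cache_key` is illustrative):

```python
import hashlib
import unicodedata

def cache_key(prompt: str) -> str:
    """Illustrative key derivation: NFKC-normalize, then SHA-256."""
    normalized = unicodedata.normalize("NFKC", prompt)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# NFKC folds compatibility characters, so visually equivalent spellings
# collapse to the same key: fullwidth "Ｈ" (U+FF28) and the "ﬁ" ligature
# (U+FB01) hash the same as their plain ASCII forms.
```

Note that NFKC only unifies compatibility equivalents; look-alike characters from unrelated scripts (e.g., Cyrillic confusables) remain distinct code points after NFKC, so they still produce different hashes.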