# L0: Smart Cache

The L0 layer is the first line of the SafeLLM pipeline. Its job is to cut costs and response times by avoiding repeated scans and repeated queries to LLM models.
## Main Tasks

- Near-zero latency: Decision in under 0.1 ms on a cache hit.
- Cost reduction: Prevents repetitive, dangerous prompts from being sent to expensive AI models.
- Performance: Relieves AI layers (L1.5, L2) from analyzing the same texts.
## When to Use

- Cost Optimization: When you have many repetitive queries (e.g., “Help”, “Translate this”, or common greetings) and want to save tokens.
- DDoS Protection: To prevent attackers from draining your LLM credits by sending the same malicious prompt repeatedly.
- Latency-sensitive apps: To get instant responses for known queries.
## Common Pitfalls

- Cache Poisoning: If an attacker manages to get a malicious prompt cached as “safe” (unlikely with SafeLLM’s waterfall design, but possible if rules change), subsequent users will receive the cached decision.
- Redis Connection Issues: Ensure Redis is highly available. If the sidecar cannot connect to Redis, it will bypass the cache (fail-open for cache), which increases load on other layers and the LLM.
- Memory Management: Without a proper `maxmemory-policy`, Redis can grow indefinitely. Always use `volatile-lru` or `allkeys-lru`.
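To make the memory-management advice concrete, a `redis.conf` fragment might look like this; the 2 GB budget is purely illustrative:

```
# Cap Redis memory and evict least-recently-used keys when full.
# volatile-lru only evicts keys that have a TTL set (as L0 entries do);
# allkeys-lru may evict any key if the instance is shared.
maxmemory 2gb
maxmemory-policy volatile-lru
```

With `volatile-lru`, cold cache entries are evicted first while any non-TTL keys on the same instance survive.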
## Configuration Example

```sh
ENABLE_CACHE=true
REDIS_HOST="redis-server.internal"
REDIS_PORT=6379
REDIS_TTL=86400    # 24 hours
REDIS_TIMEOUT=0.2  # 200ms timeout for Redis ops
```

## Technology

The system uses Redis as a key-value database. The cache key is the SHA-256 hash of the normalized prompt text.
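The key derivation can be sketched in Python. Only the NFKC normalization and SHA-256 hashing come from this page; the `safellm:l0:` key prefix and the whitespace `strip()` are illustrative assumptions:

```python
import hashlib
import unicodedata

def cache_key(prompt: str) -> str:
    """Build an L0 cache key: SHA-256 of the normalized prompt.

    NFKC + SHA-256 follow the documented design; the key prefix
    and strip() are assumptions for this sketch.
    """
    normalized = unicodedata.normalize("NFKC", prompt).strip()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return "safellm:l0:" + digest
```

A cache hit is then a single Redis `GET` on this key; on a miss, the verdict produced by the later layers is written back with the configured TTL.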
## Operation Modes

- [OSS] Standalone: Single Redis instance. Ideal for smaller deployments.
- [Enterprise (Paid)] Redis Sentinel: High availability (HA) mode. Automatic failover in case of Master node failure.
- [Enterprise (Paid)] Distributed Coalescer: A mechanism to prevent “thundering herd”. If multiple pods receive the same prompt at the same time, only one will perform the scan, and others will wait for the result in Redis.
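The coalescer idea above can be sketched as a single-flight pattern. `FakeRedis` stands in for the two Redis primitives the pattern needs (`SET NX` as the lock, `GET`/`SET` for the published result); all names here are illustrative, not SafeLLM's actual API:

```python
import hashlib
import time

class FakeRedis:
    """In-memory stand-in for the Redis commands the coalescer uses.

    In production these would be real Redis SET NX EX / GET / SET calls.
    """
    def __init__(self):
        self.store = {}

    def set_nx(self, key, value):
        # Atomic "set if not exists" -- the distributed lock.
        if key in self.store:
            return False
        self.store[key] = value
        return True

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value):
        self.store[key] = value

def coalesced_scan(r, prompt, scan_fn, poll=0.01, timeout=1.0):
    """Run scan_fn at most once per prompt across concurrent callers.

    The caller that wins the lock performs the scan and publishes the
    result; the others poll for it, falling back to a local scan only
    if the winner never publishes within the timeout.
    """
    key = "scan:" + hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if r.set_nx(key + ":lock", "1"):       # we won the race: do the work
        result = scan_fn(prompt)
        r.set(key + ":result", result)     # publish for waiting pods
        return result
    deadline = time.monotonic() + timeout  # we lost: wait for the winner
    while time.monotonic() < deadline:
        result = r.get(key + ":result")
        if result is not None:
            return result
        time.sleep(poll)
    return scan_fn(prompt)                 # fallback: scan locally
```

This is what prevents the “thundering herd”: N pods receiving the same novel prompt trigger one scan, not N.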
## Configuration

```sh
ENABLE_CACHE=true
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_TTL=3600  # Entry time-to-live in seconds
```

## Cache Security

SafeLLM performs Unicode normalization (NFKC) before hashing. This prevents “homograph bypass” attacks, in which an attacker uses visually identical characters from different Unicode representations to obtain a different hash for a semantically identical prompt.
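A minimal demonstration of why the normalization step matters, using the U+FB01 “ﬁ” ligature as the look-alike input (the function name is illustrative; note that NFKC folds compatibility variants such as ligatures and fullwidth forms, while cross-script look-alikes like Cyrillic “а” are a separate concern):

```python
import hashlib
import unicodedata

def prompt_hash(text: str) -> str:
    """Hash a prompt the way the cache does: NFKC-normalize, then SHA-256."""
    normalized = unicodedata.normalize("NFKC", text)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# "ﬁle" (U+FB01 ligature) and "file" render almost identically.
# Without normalization their SHA-256 digests differ, giving the
# attacker a fresh cache miss; with NFKC they share one cache entry.
```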