L2: AI Guard
Enterprise (Paid) only: L2 AI Guard is available in the Enterprise edition. Contact sales@safellm.io for access.
The L2 layer is the most advanced level of protection in SafeLLM, dedicated to combating attacks that static rules cannot detect.
What is Neural Guard?
Section titled “What is Neural Guard?”Traditional keyword filters can be fooled by appropriately constructing sentences (social engineering). The L2 layer uses a neural network model (Prompt Guard) compiled to the ONNX format, which analyzes the semantics and intent of the query.
Detected Threats
Section titled “Detected Threats”- Jailbreak: Attempts to force the model to break its system instructions (e.g., “Imagine you are a hacker…”).
- Indirect Injection: Hidden instructions within data that may be processed by the LLM.
- System Prompt Leakage: Attempts to extract secret instructions that define the model’s behavior.
When to use
Section titled “When to use”- Use L2 when you need protection against semantic attacks, social engineering, and advanced jailbreaks that don’t use specific “bad words”.
- Best for public-facing LLM applications where users might actively try to bypass security filters.
- Essential for high-risk environments where system prompt leakage must be prevented at all costs.
Common pitfalls
Section titled “Common pitfalls”- Over-blocking: Setting
L2_THRESHOLDtoo low can lead to high false-positive rates, blocking legitimate user queries. - Latency sensitive apps: L2 adds 30-70ms per request. If your application requires sub-10ms response times, consider relying more on L1 or using a more powerful CPU.
- Semantic Truncation: The model has a
L2_MAX_LENGTHlimit (default 512 tokens). Prompts longer than this are truncated before reaching the AI model. An attacker could potentially bypass the layer by sending a very long “filler” text followed by the actual jailbreak. You can increase this limit viaL2_MAX_LENGTHif your infrastructure permits, but ensure your application also limits maximum prompt length.
Configuration Example [Enterprise]
Section titled “Configuration Example [Enterprise]”ENABLE_L2_AI=trueL2_THRESHOLD=0.85L2_MODEL_PATH="models/prompt_guard.onnx"L2_MAX_LENGTH=512SHADOW_MODE=truePerformance and Optimization
Section titled “Performance and Optimization”Despite being an AI model, it has been optimized for CPU operation:
- Latency: ~30-70ms (depending on text length).
- Preloading: With
PRELOAD_MODELS=true, the model is loaded once at startup and shared between workers, drastically reducing RAM consumption.
Configuration [Enterprise - Paid]
Section titled “Configuration [Enterprise - Paid]”ENABLE_L2_AI=trueL2_THRESHOLD=0.9 # Blocking threshold (0.0 - 1.0)L2_MODEL_PATH=models/prompt_guard.onnxWe recommend starting with L2_THRESHOLD=0.9 and monitoring logs in SHADOW_MODE to avoid False Positives. When you are confident in the rules, set SHADOW_MODE=false to enforce blocking.