Features Overview
SafeLLM provides a layered "waterfall" security pipeline (L0–L2) designed to reject unsafe requests at the cheapest possible layer, minimizing latency without sacrificing coverage.
Feature Comparison: OSS vs. Enterprise (Paid)
| Category | Feature | OSS | Enterprise (Paid) | Description |
|---|---|---|---|---|
| L0: Performance | Smart Cache | ✅ | ✅ | Semantic caching to reduce LLM costs and latency (<0.1ms). |
| L0: Performance | Distributed Coalescer | ❌ | ✅ | Cross-pod request deduplication to prevent redundant LLM calls. |
| L1: Static Guard | Keyword Guard | ✅ | ✅ | Ultra-fast, O(n) deterministic guard for banned phrases and jailbreaks. |
| L1: Static Guard | Regex PII | ✅ | ✅ | Fast detection of basic PII (emails, phone numbers, credit cards). |
| L1.5: AI Guard | AI PII (GLiNER) | ❌ | ✅ | Context-aware detection of 25+ PII types. |
| L2: Neural Guard | Prompt Injection | ❌ | ✅ | Neural-network (ONNX) analysis for sophisticated jailbreaks. |
| Data Protection | DLP Output Scan | Audit (log-only) | Block/Anonymize/Log | Prevents data leakage in LLM responses. |
| Data Protection | DLP Streaming Mode | Audit (async) | ✅ | Zero-latency PII detection in the output stream. |
| Integration | MCP Server (stdio) | ✅ | ✅ | JSON-RPC MCP tools for policy checks and guarded tool orchestration. |
| Observability | Audit Logging | ❌ | Loki/S3 | Tamper-proof, persistent audit trails for compliance. |
| Observability | Admin Dashboard | ❌ | ✅ | Real-time security posture and rule management. |
The Waterfall Pipeline
Every request is processed through the layers in order, allowing for "short-circuit" rejection:
- L0 (Cache): If a semantically similar prompt was processed recently, return the cached response.
- L1 (Keywords): Instantly block known bad phrases or patterns.
- L1.5 (PII Shield): Detect and mask sensitive information.
- L2 (Neural Guard): Use AI to detect complex semantic attacks.
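The short-circuit flow above can be sketched as a chain of guards, each of which may terminate processing early so later (more expensive) layers never run. Guard names, verdicts, and the banned-phrase list below are illustrative, not SafeLLM's actual interfaces:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Verdict:
    action: str   # "respond" (cache hit), "block", or "allow"
    detail: str = ""

def l0_cache(prompt: str) -> Optional[Verdict]:
    cached = {"hello": "Hi there!"}  # stand-in for a semantic cache lookup
    if prompt in cached:
        return Verdict("respond", cached[prompt])
    return None

def l1_keywords(prompt: str) -> Optional[Verdict]:
    banned = {"ignore previous instructions"}  # hypothetical banned phrase
    if any(phrase in prompt.lower() for phrase in banned):
        return Verdict("block", "keyword match")
    return None

def l2_neural(prompt: str) -> Optional[Verdict]:
    # Placeholder for an ONNX classifier; returns None here (no detection).
    return None

# Layers ordered cheapest-first; cost only accrues until one of them decides.
PIPELINE: list[Callable[[str], Optional[Verdict]]] = [l0_cache, l1_keywords, l2_neural]

def guard(prompt: str) -> Verdict:
    for layer in PIPELINE:
        verdict = layer(prompt)
        if verdict is not None:  # short-circuit: remaining layers are skipped
            return verdict
    return Verdict("allow")
```

The ordering is the point of the waterfall design: a cache hit or keyword match costs microseconds, so the neural layer's inference cost is only paid for prompts that survive the cheap checks.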
Performance Targets
SafeLLM is built for high-throughput enterprise environments:
- Targets below apply to Enterprise (Paid) deployments with AI layers enabled.
- Accuracy: >95% (ONNX + GLiNER)
- E2E Latency: <10ms sidecar overhead
- Throughput: 1000+ RPS (Scalable with APISIX)
- False Positives: <0.3%