Features Overview
SafeLLM provides a layered "waterfall" security pipeline (L0–L2) designed to reject unsafe requests at the cheapest possible layer, minimizing latency without sacrificing coverage.
Feature Comparison: OSS vs. Enterprise (Paid)
| Category | Feature | OSS | Enterprise (Paid) | Description |
|---|---|---|---|---|
| L0: Performance | Smart Cache | ✅ | ✅ | Semantic caching to reduce LLM costs and latency (<0.1ms). |
| L0: Performance | Distributed Coalescer | ❌ | ✅ | Cross-pod request deduplication to prevent redundant LLM calls. |
| L1: Static Guard | Keyword Guard | ✅ | ✅ | Ultra-fast, O(n) deterministic guard for banned phrases and jailbreaks. |
| L1: Static Guard | Regex PII | ✅ | ✅ | Fast detection of basic PII (emails, phone numbers, credit cards). |
| L1.5: AI Guard | AI PII (GLiNER) | ❌ | ✅ | Context-aware detection of 25+ PII types. |
| L2: Neural Guard | Prompt Injection | ❌ | ✅ | Neural-network (ONNX) analysis for sophisticated jailbreaks. |
| Data Protection | DLP Output Scan | Audit (log-only) | Block/Anonymize/Log | Prevents data leakage in LLM responses. |
| Data Protection | DLP Streaming Mode | Audit (async) | ✅ | Zero-latency PII detection in the output stream. |
| Integration | MCP Server (stdio) | ✅ | ✅ | JSON-RPC MCP tools for policy checks and guarded tool orchestration. |
| Observability | Audit Logging | ❌ | Loki/S3 | Tamper-proof, persistent audit trails for compliance. |
| Observability | Admin Dashboard | ❌ | ✅ | Real-time security posture and rule management. |
The Waterfall Pipeline
Every request is processed through the layers in order, allowing for "short-circuit" rejection:
- L0 (Cache): If a semantically similar prompt was processed recently, return the cached response.
- L1 (Keywords): Instantly block known bad phrases or patterns.
- L1.5 (PII Shield): Detect and mask sensitive information.
- L2 (Neural Guard): Use AI to detect complex semantic attacks.
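The short-circuit flow above can be sketched as a chain of guards, each of which may terminate processing early so later (more expensive) layers never run. Guard names, verdicts, and the banned-phrase list below are illustrative, not SafeLLM's actual interfaces:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Verdict:
    action: str   # "respond" (cache hit), "block", or "allow"
    detail: str = ""

def l0_cache(prompt: str) -> Optional[Verdict]:
    cached = {"hello": "Hi there!"}  # stand-in for a semantic cache lookup
    if prompt in cached:
        return Verdict("respond", cached[prompt])
    return None

def l1_keywords(prompt: str) -> Optional[Verdict]:
    banned = {"ignore previous instructions"}  # hypothetical banned phrase
    if any(phrase in prompt.lower() for phrase in banned):
        return Verdict("block", "keyword match")
    return None

def l2_neural(prompt: str) -> Optional[Verdict]:
    # Placeholder for an ONNX classifier; returns None here (no detection).
    return None

# Layers ordered cheapest-first; cost only accrues until one of them decides.
PIPELINE: list[Callable[[str], Optional[Verdict]]] = [l0_cache, l1_keywords, l2_neural]

def guard(prompt: str) -> Verdict:
    for layer in PIPELINE:
        verdict = layer(prompt)
        if verdict is not None:  # short-circuit: remaining layers are skipped
            return verdict
    return Verdict("allow")
```

The ordering is the point of the waterfall design: a cache hit or keyword match costs microseconds, so the neural layer's inference cost is only paid for prompts that survive the cheap checks.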
Performance Targets
SafeLLM is built for high-throughput enterprise environments:
- Targets below apply to Enterprise (Paid) deployments with AI layers enabled.
- Accuracy: >95% (ONNX + GLiNER)
- E2E Latency: <10ms sidecar overhead
- Throughput: 1000+ RPS (Scalable with APISIX)
- False Positives: <0.3%