Observability & Monitoring

SafeLLM is built with full operational visibility and regulatory compliance (GDPR) in mind.

The application exposes a Prometheus-compatible /metrics endpoint, typically scraped every 5-15 seconds.

  • safellm_blocked_requests_total: Counter of blocked requests. The layer label indicates which layer blocked the traffic (e.g., L1_KEYWORDS, L2_AI_GUARD).
  • safellm_scan_duration_seconds: Histogram of scanning duration for each layer. Allows monitoring the impact on TTFT.
  • safellm_cache_hits_total: Hit counter for the Cache layer. A high hit rate indicates significant savings on LLM tokens.
  • safellm_dlp_pii_detected_total: Number of data leak incidents in model outputs.
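The metrics above could be defined with the standard prometheus_client library. This is a hedged sketch, not SafeLLM's actual implementation: the metric names come from this page, but the label sets, help strings, and registry wiring are assumptions.

```python
# Sketch: defining the documented metrics with prometheus_client.
# Metric names match the docs; everything else is an assumption.
from prometheus_client import CollectorRegistry, Counter, Histogram, generate_latest

registry = CollectorRegistry()  # isolated registry for this example

blocked_requests = Counter(
    "safellm_blocked_requests_total",
    "Requests blocked by SafeLLM",
    ["layer"],  # e.g. L1_KEYWORDS, L2_AI_GUARD
    registry=registry,
)
scan_duration = Histogram(
    "safellm_scan_duration_seconds",
    "Per-layer scanning duration",
    ["layer"],
    registry=registry,
)
cache_hits = Counter(
    "safellm_cache_hits_total", "Cache layer hits", registry=registry
)
dlp_pii_detected = Counter(
    "safellm_dlp_pii_detected_total", "PII incidents in model outputs", registry=registry
)

# Recording a blocked request and a scan duration:
blocked_requests.labels(layer="L1_KEYWORDS").inc()
scan_duration.labels(layer="L1_KEYWORDS").observe(0.004)

# generate_latest(registry) renders the text exposition format that
# a /metrics endpoint returns to the Prometheus scraper.
exposition = generate_latest(registry).decode()
```

Exposing `exposition` over HTTP (for example via `prometheus_client.start_http_server`) is all a scraper needs.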

The Enterprise (Paid) version offers a dedicated, tamper-resistant audit logging system optimized for compliance. In the OSS version, audit logging is a no-op; only DLP audit statistics are kept in memory.

SafeLLM NEVER logs the full content of the prompt in audit logs (for privacy reasons). Instead, it logs:

  • request_id: Unique query identifier.
  • prompt_hash: SHA-256 hash of the prompt text (allows correlating the request across systems without logging PII).
  • verdict: Decision (Allowed/Blocked).
  • layer: The layer that made the decision.
  • reason: Reason for the block (e.g., type of PII detected).

Logs are sent asynchronously via a Redis queue so they add no latency to the main request path. The audit_worker process consumes the logs and can forward them to:

  • JSONL files (locally).
  • Grafana Loki: Central logging system.
  • S3: Long-term archiving.
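The producer/worker split can be sketched as below. This is an illustration of the pattern, not SafeLLM's code: an in-process queue stands in for Redis, the JSONL sink stands in for the local-file target, and the function names are assumptions.

```python
# Sketch: asynchronous audit pipeline. A queue (standing in for Redis)
# decouples the hot request path from the audit_worker, which drains
# records and appends them as JSONL. All names here are hypothetical.
import io
import json
import queue

audit_queue: "queue.Queue[dict | None]" = queue.Queue()

def enqueue_audit(record: dict) -> None:
    """Called on the request path: an O(1) enqueue, no I/O, no added latency."""
    audit_queue.put(record)

def audit_worker(sink: io.TextIOBase) -> None:
    """Drains the queue and writes one JSON object per line (JSONL).
    A None sentinel stops the worker."""
    while True:
        record = audit_queue.get()
        if record is None:
            break
        sink.write(json.dumps(record) + "\n")

sink = io.StringIO()
enqueue_audit({"request_id": "r-1", "verdict": "Allowed", "layer": "L0_CACHE"})
enqueue_audit({"request_id": "r-2", "verdict": "Blocked", "layer": "L2_AI_GUARD"})
audit_queue.put(None)  # shut the worker down after the backlog drains
audit_worker(sink)
```

With real Redis the enqueue side would be an LPUSH and the worker a blocking BRPOP loop, but the decoupling idea is the same: the request path only ever pays for the enqueue.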
The relevant configuration options (environment variables):
ENABLE_METRICS=true
ENABLE_AUDIT_LOGS=true # Enterprise (Paid) only
LOG_LEVEL=INFO
LOG_FORMAT=json # Recommended for log collection systems