Operations · 6 min read
AI Security Operations: From Shadow Mode to Board Report in 90 Days
Deploying an AI security control takes one afternoon. The hard part is the 90 days after. Here is the operational playbook.

The Problem Is Not Tooling
You can deploy SafeLLM in an afternoon. A Docker container, an APISIX route, Shadow Mode on — done. The AI traffic flowing through your gateway is now being analysed, logged, and categorised.
That is the easy part. The hard part is what happens over the next 90 days: tuning detection thresholds, deciding what to block versus what to flag, building evidence that satisfies regulators, and producing reports that a board can actually read. Most AI security initiatives fail not because the technology does not work, but because no one planned the operational maturity curve.
This post is the playbook. It covers the journey from first deployment to board-ready reporting, with specific metrics, exit criteria, and decision points at each stage.
Week 1–2: Shadow Mode — Observe Before You Act
What to Deploy
Enable SafeLLM in Shadow Mode on your primary AI-facing routes. Shadow Mode analyses every request and response through the full L0–L2 pipeline but never blocks. Users see no difference. Your security team sees everything.
Start with routes that carry the highest data sensitivity: customer-facing AI assistants, internal copilot endpoints, any route where users can type free-form prompts.
What to Measure
Track these metrics from day one:
- Requests scanned per day — establishes your baseline AI traffic volume
- Sensitive data detections by type — credit cards, IBANs, national IDs, emails, custom patterns
- Policy violations per route — which endpoints generate the most risk
- Detection reason codes — the specific patterns triggering each detection
- False positive candidates — detections that your team reviews and marks as incorrect
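The metrics above can be aggregated from structured gateway logs with a few lines of code. The sketch below assumes a hypothetical JSONL log format; the field names (`route`, `category`, `reason_code`, `reviewed_as_fp`) are illustrative, not SafeLLM's actual schema.

```python
import json
from collections import Counter

# Hypothetical shadow-mode log: one JSON object per detection event.
SAMPLE_LOGS = [
    '{"route": "/ai/assistant", "category": "credit_card", "reason_code": "CC_LUHN", "reviewed_as_fp": false}',
    '{"route": "/ai/assistant", "category": "email", "reason_code": "EMAIL_RFC", "reviewed_as_fp": true}',
    '{"route": "/ai/copilot", "category": "iban", "reason_code": "IBAN_FMT", "reviewed_as_fp": false}',
]

def summarize(log_lines):
    """Aggregate the day-one metrics: volume, detections by type,
    violations per route, reason codes, and false-positive candidates."""
    events = [json.loads(line) for line in log_lines]
    return {
        "requests_scanned": len(events),
        "detections_by_type": Counter(e["category"] for e in events),
        "violations_per_route": Counter(e["route"] for e in events),
        "reason_codes": Counter(e["reason_code"] for e in events),
        "false_positive_candidates": sum(e["reviewed_as_fp"] for e in events),
    }

summary = summarize(SAMPLE_LOGS)
print(summary["requests_scanned"])           # 3
print(summary["false_positive_candidates"])  # 1
```

Run a job like this daily and store the results; the trend lines become the raw material for the month-two reports.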
Exit Criteria
Move to selective enforcement when:
- False positive rate < 2% for your top-5 detection categories, sustained over 7 consecutive days
- Top-5 reason codes are stable — no new dominant pattern appearing that you have not reviewed
- You have assigned an owner for each detection category (security, legal, or platform team)
If you hit day 14 and your false positive rate is still above 5%, extend Shadow Mode. Do not rush enforcement. A blocked legitimate request damages user trust faster than a caught leak improves your security posture.
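The exit criteria are mechanical enough to encode as a gate check. A minimal sketch, with thresholds taken from the criteria above:

```python
def ready_for_enforcement(daily_fp_rates, top5_stable, owners_assigned,
                          threshold=0.02, window=7):
    """Return True when the shadow-mode exit criteria are met:
    FP rate below `threshold` for `window` consecutive days, stable
    top-5 reason codes, and an owner for every detection category."""
    recent = daily_fp_rates[-window:]
    sustained = len(recent) == window and all(r < threshold for r in recent)
    return sustained and top5_stable and owners_assigned

# Seven days under 2%, stable patterns, owners in place: enforce.
print(ready_for_enforcement([0.015] * 7, True, True))            # True
# One spike above threshold: stay in Shadow Mode.
print(ready_for_enforcement([0.015] * 6 + [0.03], True, True))   # False
```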
Week 3–4: Selective Enforcement — Start With High Confidence
What to Block First
Begin enforcement with detection categories where false positives are near zero:
- Credit card numbers — Luhn-validated, extremely low false positive rate
- IBAN numbers — format-validated, country-code checked
- Known jailbreak patterns — exact-match and obfuscation-resistant keyword lists
These categories have clear, unambiguous patterns. When SafeLLM flags a valid credit card number in a prompt, there is no legitimate reason for it to be there.
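Luhn validation is why credit card detection sits in the near-zero false positive bucket: a random 16-digit string passes the checksum only about 10% of the time, so format matching plus the checksum filters most noise. The standard algorithm:

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right,
    subtract 9 from any double above 9, and check the sum mod 10."""
    digits = [int(d) for d in number if d.isdigit()]
    if len(digits) < 13:  # shorter than any real card number
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

print(luhn_valid("4539 1488 0343 6467"))  # True (a common test number)
print(luhn_valid("4539 1488 0343 6468"))  # False
```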
What to Keep in Shadow
Leave ambiguous categories in observation mode with explicit decision owners:
- Email addresses — sometimes legitimate in context (e.g., “send this to john@company.com”)
- Names — high false positive potential depending on context
- Custom patterns — need more data before enforcement decisions
User Feedback Loop
When a request is blocked, the user sees a reason code. Establish a lightweight feedback mechanism: a Slack channel, a form, or an email alias where users can report false positives. Review these weekly. Adjust patterns. Document every change with a policy version increment.
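"Document every change with a policy version increment" is easiest to sustain if each adjustment is a structured record rather than a commit message. A minimal sketch of such a record; the field names and the example entry are illustrative, not a SafeLLM format:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class PolicyChange:
    """One auditable record per pattern adjustment, keyed to a
    monotonically increasing policy version (fields are illustrative)."""
    version: str
    category: str
    change: str
    triggered_by: str   # e.g. a false-positive report from the feedback channel
    approved_by: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

changelog = [
    PolicyChange(
        version="2024.03.1",
        category="email",
        change="Allowlisted internal company addresses in prompts",
        triggered_by="user false-positive report via Slack channel",
        approved_by="security-team",
    ),
]
print(changelog[0].version)  # 2024.03.1
```

A changelog like this is exactly the human-oversight evidence the next paragraph describes.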
This feedback loop is not just operational hygiene — it is evidence of proportionate response under GDPR Article 25 and the EU AI Act’s human oversight requirements.
Month 2–3: Evidence and Reporting
Operational Metrics That Matter
By month two, you have enough data to build meaningful trend reports. Focus on metrics that answer the questions regulators and board members actually ask:
| Metric | What It Answers |
|---|---|
| Traffic coverage % | “What percentage of our AI traffic is monitored?” |
| Detection rate by category | “What types of sensitive data are employees sending to AI?” |
| Block rate vs shadow rate | “What percentage of violations are we actively preventing?” |
| False positive rate (trend) | “Are our controls getting more accurate over time?” |
| Incident readiness score | “If a leak happened today, how fast could we reconstruct events?” |
| Mean time to evidence | “How quickly can we produce an audit package for a specific request?” |
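The headline rates in the table reduce to simple arithmetic over counters your gateway logs should already provide. A sketch, with made-up numbers for illustration:

```python
def coverage_metrics(scanned, total_ai_requests, blocked, shadow_flagged,
                     fps, detections):
    """Compute headline rates from raw counts."""
    violations = blocked + shadow_flagged
    return {
        "traffic_coverage_pct": 100 * scanned / total_ai_requests,
        "block_rate_pct": 100 * blocked / violations if violations else 0.0,
        "false_positive_rate_pct": 100 * fps / detections if detections else 0.0,
    }

# Illustrative month-two numbers, not benchmarks:
m = coverage_metrics(scanned=98_000, total_ai_requests=100_000,
                     blocked=420, shadow_flagged=180,
                     fps=9, detections=600)
print(m["traffic_coverage_pct"])    # 98.0
print(m["block_rate_pct"])          # 70.0
print(m["false_positive_rate_pct"]) # 1.5
```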
Mapping to Regulatory Frameworks
Your metrics map directly to compliance obligations:
- GDPR Article 25 (Data Protection by Design) — traffic coverage %, detection rate, evidence of proportionate technical measures
- EU AI Act (Risk Management & Transparency) — detection categories, human oversight evidence (feedback loop), policy version history
- NIS2 (Incident Reporting) — incident readiness score, mean time to evidence, containment capability demonstration
Executive Summary Format
A board does not want a 30-page technical report. It wants one page with three sections:
- Risk posture — traffic monitored, top threats detected, trend direction (improving/stable/degrading)
- Control effectiveness — enforcement coverage, false positive trend, user feedback summary
- Compliance readiness — regulatory mapping status, open gaps with owners and target dates
Attach the technical appendix for the CISO. The board reads the summary. The regulator reads both.
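Once the monthly metrics job exists, rendering the one-pager can be automated too. A minimal sketch; the section keys and figures are placeholders:

```python
def board_summary(risk, controls, compliance):
    """Render the three-section, one-page executive summary as plain text.
    Each argument is a short dict of headline figures (illustrative keys)."""
    sections = [
        ("Risk posture", risk),
        ("Control effectiveness", controls),
        ("Compliance readiness", compliance),
    ]
    lines = ["AI Security Posture: Executive Summary", ""]
    for title, data in sections:
        lines.append(title)
        for key, value in data.items():
            lines.append(f"  - {key}: {value}")
        lines.append("")
    return "\n".join(lines)

page = board_summary(
    {"traffic monitored": "98%", "trend": "improving"},
    {"enforcement coverage": "3 of 8 categories", "FP rate": "1.5%, falling"},
    {"GDPR Art. 25": "mapped", "open gaps": "2, owners assigned"},
)
print(page.splitlines()[0])  # AI Security Posture: Executive Summary
```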
Build vs Buy: The 12-Month Reality
This section is not a sales pitch. It is a cost model.
Building V1 Is Not the Expensive Part
A senior engineer can build a basic prompt inspection pipeline in 3–6 months. Regex PII detection, keyword blocking, basic logging. It will work for the first few weeks.
The Expensive Part Is Month 4 Onwards
Here is what the 12-month ownership model looks like for a self-built solution:
- False positive tuning — patterns that worked in month 1 generate noise by month 3 as usage patterns shift. Someone has to review, adjust, and redeploy. Continuously.
- Policy updates — new regulation, new internal policy, new AI provider with different data handling terms. Every change requires engineering work.
- Evidence quality — an auditor asks for decision logs from 47 days ago with specific fields. Your logging schema needs to have captured those fields from day one.
- Model updates — jailbreak techniques evolve weekly. Your keyword lists and classifiers need to keep up.
- Operational support — user complaints about false positives, incident reconstruction requests, integration with new routes.
Realistic estimate: 0.5 FTE ongoing for maintenance of a self-built solution, plus periodic senior engineering time for capability upgrades.
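The estimate above can be turned into a back-of-the-envelope annual figure. Every number in this sketch is a placeholder, including the fully loaded engineer cost; substitute your own rates:

```python
def build_vs_buy(fte_cost, maintenance_fte=0.5, build_months=4,
                 upgrade_days_per_quarter=10):
    """Rough 12-month ownership cost of a self-built pipeline:
    initial V1 build, ongoing maintenance, and periodic capability
    upgrades. All defaults are placeholders, not benchmarks."""
    day_cost = fte_cost / 220                # ~220 working days per year
    build = fte_cost * build_months / 12     # V1 build (3-6 month range)
    maintain = fte_cost * maintenance_fte    # tuning, support, policy updates
    upgrades = day_cost * upgrade_days_per_quarter * 4
    return build + maintain + upgrades

# Assuming a fully loaded senior engineer at $200k/year (placeholder):
print(round(build_vs_buy(200_000)))  # 203030
```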
SafeLLM’s OSS edition covers L0–L1.5 at zero licensing cost. Enterprise adds AI Guard, GLiNER, Redis Sentinel HA, and — critically — a team that handles model updates, pattern tuning, and custom implementation support. The question is not whether you can build it. It is whether maintaining it is the best use of your engineering team’s time.
Getting Started
You do not need a call to start. You do not need procurement approval for an observation-only deployment.
git clone https://github.com/safellmio/safellm-apisix-gateway-sidecar
docker compose up -d --build
Shadow Mode is on by default. Deploy to your dev or QA environment. Run it for a week. Read the logs. If the data confirms what you suspected — that sensitive information is flowing through your AI endpoints uncontrolled — you have the evidence to justify the next step.
When you are ready for enforcement, enterprise features, or a technical walkthrough, we are here. No sales deck. Just engineering.