Security · 5 min read
The Real AI Risk Is Not What You Think
Enterprises prepare for sophisticated attackers. The actual threat is a support agent pasting a customer list into ChatGPT at 4:57 PM on a Friday.

Three Things That Happened Last Quarter
A support agent needed to reformat a messy CSV before uploading it to the CRM. She pasted 1,200 customer records — names, emails, phone numbers, one IBAN column she did not notice — into a free ChatGPT session. The formatting came back perfect. So did a GDPR violation, because those records now sat on OpenAI’s infrastructure with no Data Processing Agreement covering that transfer.
A developer was debugging a failing integration test at 11 PM. He copied the full request payload into an AI assistant to ask why the JSON schema validation was rejecting it. The payload contained production database credentials, an API key for the payment gateway, and three customer social security numbers from a test fixture that someone had seeded with real data.
An intern was summarising contract terms for a deal review. She fed 14 pages of a confidential supplier agreement into an AI summariser her university recommended. The summary was excellent. The contract’s exclusivity clause, pricing tiers, and penalty structure were now in a third-party system with no audit trail and no deletion guarantee.
None of these actions triggered a firewall, a WAF, or an IDS. None of them were malicious. All three created regulatory exposure that could cost the organisation millions.
Three Myths That Keep Enterprises Vulnerable
Myth 1: “We have a DPA, so we are covered”
A Data Processing Agreement is a legal framework. It defines who is responsible for what. It does not prevent your support agent from pasting an IBAN into a prompt. It does not generate evidence that you tried to prevent it.
When a regulator investigates, they do not ask whether you had a contract. They ask what technical measures you deployed to enforce data minimisation at the point of processing. Under the EU AI Act, penalties reach €35M or 7% of global turnover. Under GDPR, €20M or 4%. A signed DPA with no runtime enforcement is a liability document, not a protection.
Myth 2: “Our model provider has safety features”
OpenAI, Anthropic, and Google all invest heavily in model safety — content filtering, refusal training, output classifiers. These features control what the model says. They do not control what your employees send.
Model safety does not know that EMP-12345 is your internal employee ID format. It does not know that the string Project Falcon refers to an unannounced acquisition. It cannot enforce your organisation’s data classification policy because it has never seen that policy. Provider safety and enterprise security operate in fundamentally different domains.
Myth 3: “We trained our employees on AI policy”
You probably did. You may even have a well-written acceptable use policy with clear examples. It will not survive contact with operational reality.
The support agent at 4:57 PM on a Friday, with a queue of 23 tickets and a team lead asking why resolution time is up, will paste whatever gets the job done. The developer at 11 PM, with a deployment blocked and a Slack channel full of “any update?” messages, will not pause to check whether the payload contains PII. Policy compliance degrades under time pressure. This is not a training problem — it is a human factors problem, and no amount of awareness sessions will eliminate it.
The Only Reliable Defence Operates at Runtime
If human judgement cannot reliably prevent sensitive data from reaching AI systems, the control must operate independently of human intent. This is the core design principle behind runtime AI security.
SafeLLM implements a layered inspection pipeline that sits between your users and the AI provider:
- L0 — Cache layer: Previously cleared prompts pass through in <0.1ms. Reduces API costs and repeat scanning.
- L1 — Keyword Guard: O(1) detection of known jailbreak patterns, blocklisted terms, and custom keywords. <0.01ms.
- L1.5 — Data Shield: Regex and pattern-based detection of credit cards, IBANs, national IDs, emails, phone numbers, and custom patterns you define. <1ms.
- L2 — AI Guard: ONNX neural classifier for sophisticated prompt injection and content policy violations. ~13.5ms. Enterprise adds GLiNER for 25+ entity types.
Total pipeline overhead: under 15 ms. Throughput: 1,200+ RPS on standard CPU hardware. No GPU required. No internet connection required.
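The layering can be illustrated with a minimal sketch. To be clear: the blocklist terms, PII regexes, and reason codes below are placeholder assumptions for illustration, not SafeLLM's actual rulesets, and the L2 neural stage is omitted entirely.

```python
import re

# Hypothetical L1 blocklist and L1.5 PII patterns -- illustrative only.
BLOCKLIST = {"ignore previous instructions", "system prompt override"}
PII_PATTERNS = {
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}
cache = {}  # L0: previously cleared prompts skip re-scanning

def inspect(prompt):
    if prompt in cache:                           # L0 -- cache layer
        return cache[prompt]
    lowered = prompt.lower()
    for term in BLOCKLIST:                        # L1 -- keyword guard
        if term in lowered:
            return "blocked:keyword:" + term
    for name, pattern in PII_PATTERNS.items():    # L1.5 -- data shield
        if pattern.search(prompt):
            return "blocked:pii:" + name
    # L2 (ONNX neural classifier) would run here; omitted in this sketch.
    cache[prompt] = "allowed:clean"
    return "allowed:clean"
```

A clean prompt is scanned once, cached, and passes straight through on every subsequent occurrence, which is why the cache layer cuts both latency and repeat-scanning cost.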
The pipeline operates as a sidecar to your existing Apache APISIX gateway. No architecture changes. No SDK modifications. No committee meetings.
Evidence Is the Deliverable
Here is the part most security teams overlook: regulators do not penalise organisations for using AI. They penalise organisations for failing to demonstrate that they controlled it.
Every request through SafeLLM generates a decision record:
- SHA-256 prompt hash — proves what was inspected without storing the prompt itself
- Reason-coded decision — e.g. blocked:pii:iban, allowed:clean, redacted:credit_card
- Policy version — which ruleset was active when the decision was made
- Timestamp and route metadata — which endpoint, which user group, when
These records export to Loki, S3, or your SIEM. They map directly to GDPR Article 25 (privacy by design), EU AI Act transparency requirements, and NIS2 incident evidence obligations. When the regulator asks “what measures did you take?”, you hand them a queryable audit trail instead of a PDF policy document.
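A decision record of this shape can be sketched in a few lines. The field names below are illustrative, not SafeLLM's actual export schema; the point is that the record proves what was inspected via a SHA-256 hash without ever storing the prompt itself.

```python
import hashlib
import json
from datetime import datetime, timezone

def decision_record(prompt, decision, policy_version, route, user_group):
    """Build an audit record: hash of the inspected prompt, reason-coded
    decision, active policy version, and route metadata -- no prompt text."""
    return {
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "decision": decision,              # e.g. "blocked:pii:iban"
        "policy_version": policy_version,  # ruleset active at decision time
        "route": route,
        "user_group": user_group,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

record = decision_record(
    prompt="My IBAN is DE89370400440532013000",
    decision="blocked:pii:iban",
    policy_version="ruleset-v3",
    route="/v1/chat/completions",
    user_group="support",
)
print(json.dumps(record, indent=2))
```

Because only the hash leaves the gateway, the audit trail can sit in Loki or S3 indefinitely without itself becoming a store of sensitive data.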
Start With Observation, Not Enforcement
SafeLLM ships with Shadow Mode enabled by default. Deploy it on Monday. It analyses every prompt and response, logs detections, and blocks nothing. By Friday, you have a report showing exactly what sensitive data your AI traffic contains — by type, by route, by frequency.
No disruption. No user complaints. No procurement cycle for the initial assessment.
When you are ready to enforce, start with high-confidence categories (credit cards, IBANs, known jailbreak patterns) and leave ambiguous cases in shadow mode with explicit owners.
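That staged rollout reduces to a small routing decision per detection. The category names and action labels below are assumptions for illustration, not SafeLLM's configuration vocabulary:

```python
# Illustrative per-category rollout: enforce the high-confidence
# categories, keep the ambiguous ones in shadow mode (log, never block).
ENFORCED = {"credit_card", "iban", "jailbreak"}
SHADOW = {"project_codename", "internal_id"}  # ambiguous: assigned owners review logs

def handle(detection_category):
    if detection_category is None:
        return "forward"          # clean prompt: pass through
    if detection_category in ENFORCED:
        return "block"            # high confidence: enforce
    return "forward+log"          # shadow mode: record, do not disrupt
```

Moving a category from shadow to enforced is then a one-line change, made only after its false-positive rate in the logs justifies it.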
The OSS edition is Apache 2.0. Clone the repo, run docker compose up, and start observing.
git clone https://github.com/safellmio/safellm-apisix-gateway-sidecar
docker compose up -d --build
docker compose logs -f safellm

For enterprise features — AI Guard, GLiNER entity detection, Redis Sentinel HA, custom onboarding — reach out for a technical conversation. No sales pitch. Just engineering.



