Testing Overview

SafeLLM is tested at several levels to verify its performance, security, and reliability.

Test Suite Structure

The project includes several levels of tests:

Unit Tests: Verify individual components and layers (e.g., Cache, Keywords, PII) in isolation.
Integration Tests: Verify the interaction between components, such as the full Waterfall Pipeline.
End-to-End (E2E) Tests: Exercise the entire stack, including APISIX, the Sidecar, and a mock upstream model.
Benchmark Tests: Measure latency, RPS (Requests Per Second), and memory usage.
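To make the benchmark metrics concrete, here is a minimal sketch of how latency, RPS, and peak memory could be measured for a single handler. It is not the project's benchmark harness: Python is assumed only for illustration, and `run_benchmark`, `measure_peak_memory`, and the `handler` callable are hypothetical names. The RPS figure is sequential, single-threaded throughput, not a concurrent load test.

```python
import math
import statistics
import time
import tracemalloc
from typing import Callable, Dict


def run_benchmark(handler: Callable[[str], object], prompt: str,
                  iterations: int = 200) -> Dict[str, float]:
    """Call `handler` repeatedly and report latency and throughput.

    `handler` is a hypothetical stand-in for whatever is under test.
    """
    latencies_ms = []
    started = time.perf_counter()
    for _ in range(iterations):
        t0 = time.perf_counter()
        handler(prompt)
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - started

    latencies_ms.sort()
    # Nearest-rank 95th percentile of the sorted latencies.
    p95_index = max(0, math.ceil(0.95 * len(latencies_ms)) - 1)
    return {
        "mean_ms": statistics.fmean(latencies_ms),
        "p95_ms": latencies_ms[p95_index],
        # Sequential requests per second over the whole run.
        "rps": iterations / elapsed if elapsed > 0 else float("inf"),
    }


def measure_peak_memory(handler: Callable[[str], object], prompt: str) -> int:
    """Return the peak number of bytes allocated by Python during one call."""
    tracemalloc.start()
    try:
        handler(prompt)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak
```

Memory is measured in a separate pass because `tracemalloc` adds overhead that would distort the latency numbers if both were collected in the same run.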

You can run the entire test suite using the provided script:

cd safellm-oss
./run_tests.sh

This script will:

We maintain a set of “red team” prompts to verify the effectiveness of our security layers against:

These prompts are used in our integration tests to ensure that no security regression occurs during development.
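A regression check over such prompts can be sketched as follows. This is an illustrative outline under stated assumptions, not the project's actual test code: Python is assumed, the prompt strings are placeholders rather than real red-team samples, and `toy_security_check` and `find_regressions` are hypothetical names standing in for the real security layers and test helper.

```python
from typing import Callable, Iterable, List

# Placeholder prompts. The real red-team set is not reproduced here,
# so these strings are purely illustrative.
RED_TEAM_PROMPTS = [
    "harmless preamble ATTACK_SAMPLE_1 trailing text",
    "ATTACK_SAMPLE_2",
]


def toy_security_check(prompt: str) -> bool:
    """Hypothetical stand-in for the security layers: True means blocked."""
    return "ATTACK_SAMPLE" in prompt


def find_regressions(prompts: Iterable[str],
                     is_blocked: Callable[[str], bool]) -> List[str]:
    """Return every red-team prompt that was NOT blocked."""
    return [prompt for prompt in prompts if not is_blocked(prompt)]
```

An integration test would then assert that the list is empty, for example `assert find_regressions(RED_TEAM_PROMPTS, toy_security_check) == []`. Returning the offending prompts, rather than a single boolean, makes a failure report show exactly which prompts slipped through.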

For a step-by-step manual validation of the full APISIX -> Sidecar -> Upstream flow, see: