Features

Everything you need to test and prove your AI agents are production-ready

Mutation Engine

5 Mutation Types (Open Source)

Paraphrasing, noise injection, tone shifts, basic adversarial attacks, and custom templates
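As a rough illustration of two of these mutation types, the sketch below implements noise injection (adjacent-letter swaps) and a custom template. The function names `inject_noise` and `apply_template` are hypothetical and are not taken from the tool's API; the real engine may work differently.

```python
import random


def inject_noise(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Swap adjacent letters at random to imitate typing noise (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so the same input yields the same mutation
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        both_letters = chars[i].isalpha() and chars[i + 1].isalpha()
        if both_letters and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


def apply_template(text: str, template: str) -> str:
    """Wrap the prompt in a custom template that contains a {prompt} placeholder."""
    return template.format(prompt=text)
```

A swap-based mutation keeps the length and the set of characters unchanged, so the prompt stays recognizable to a human while still stressing the agent's input handling.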

15 Mutation Types (Cloud)

All open-source types, plus sophisticated prompt injections (OWASP Top 10 for LLM Applications), multi-turn attacks, encoding obfuscation, model-specific jailbreaks, and more

Semantic Perturbation

Uses local LLMs to rewrite inputs while preserving user intent, rather than just adding random noise.

Up to 50 Mutations (Open Source)

Generate up to 50 mutations per test run locally

Unlimited Mutations (Cloud)

No limits on mutations per run with cloud GPUs

Invariant Assertions

Deterministic Checks

Substring ("contains") checks, regex matching, latency limits, and JSON validity
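The four deterministic checks can be sketched with the Python standard library alone. The function names below are illustrative assumptions, not the tool's actual assertion API.

```python
import json
import re


def check_contains(response: str, needle: str) -> bool:
    """Case-insensitive substring check."""
    return needle.lower() in response.lower()


def check_regex(response: str, pattern: str) -> bool:
    """True if the pattern matches anywhere in the response."""
    return re.search(pattern, response) is not None


def check_latency(elapsed_ms: float, max_ms: float) -> bool:
    """True if the agent answered within the latency budget."""
    return elapsed_ms <= max_ms


def check_valid_json(response: str) -> bool:
    """True if the response parses as JSON."""
    try:
        json.loads(response)
    except json.JSONDecodeError:
        return False
    return True
```

Because each check is a pure function of the response (and timing), the same input always gives the same verdict, which is what makes these assertions deterministic.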

Semantic Similarity

Vector-based similarity checking using local embeddings to ensure responses maintain semantic meaning
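A minimal sketch of a vector-based similarity check, assuming cosine similarity over embedding vectors and a pass threshold of 0.8. Both the metric and the threshold are assumptions; the embedding model itself is out of scope here, so the vectors are passed in directly.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def passes_similarity(baseline: Sequence[float],
                      mutated: Sequence[float],
                      threshold: float = 0.8) -> bool:
    """True if the mutated response stays close to the baseline response."""
    return cosine_similarity(baseline, mutated) >= threshold
```

In practice, `baseline` would be the embedding of the agent's reply to the original prompt and `mutated` the embedding of its reply to a mutated prompt.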

Basic Safety (Open Source)

Basic PII detection and refusal checks using regex patterns
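Regex-based safety checks of this kind might look like the sketch below. The patterns (email, US Social Security number, and a few refusal phrases) are illustrative assumptions and are far from exhaustive; they are not the tool's actual rule set.

```python
import re
from typing import List

# Hypothetical patterns for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

REFUSAL_PATTERN = re.compile(
    r"\b(I can(?:no|')t|I(?: a|')m unable to|I won't)\b",
    re.IGNORECASE,
)


def find_pii(text: str) -> List[str]:
    """Return the sorted labels of every PII pattern found in the text."""
    return sorted(label for label, rx in PII_PATTERNS.items() if rx.search(text))


def is_refusal(text: str) -> bool:
    """True if the response contains a common refusal phrase."""
    return REFUSAL_PATTERN.search(text) is not None
```

Pattern matching like this is cheap and runs locally, but it only catches PII with a fixed surface form, which is the gap that context-aware detection is meant to cover.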

Advanced Safety (Cloud)

ML-based PII detection (94.7% accuracy), NER for contextual PII, factuality checking with RAG, context-aware scoring

Execution

Sequential Execution (Open Source)

Runs mutations one at a time on your local machine

Parallel Execution (Cloud)

10-20x faster with cloud GPUs: 50 mutations in 20-30 seconds, versus 5-10 minutes when run one at a time on a local machine
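The difference between the two execution modes can be sketched locally. This example uses a thread pool as a stand-in for cloud workers, which is an assumption made for illustration; it does not reproduce the cloud service or its GPU scheduling.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_sequential(agent: Callable[[str], str], prompts: List[str]) -> List[str]:
    """Call the agent on one mutated prompt at a time."""
    return [agent(p) for p in prompts]


def run_parallel(agent: Callable[[str], str],
                 prompts: List[str],
                 workers: int = 8) -> List[str]:
    """Call the agent on many prompts concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(agent, prompts))
```

Both functions return the same results in the same order, so downstream assertions do not need to know which mode produced them. Threads help mainly when the agent call is I/O-bound, such as an HTTP request.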

Local-First (Open Source)

Uses Ollama with Qwen 3 8B. No API costs. Generate 1,000+ mutations for free

Cloud GPUs (Cloud)

Leverage cloud infrastructure for faster testing without melting your laptop

Reporting

Interactive HTML Reports

Beautiful pass/fail matrices with mutation details and failure analysis

JSON Export

Export results as JSON for CI/CD integration and programmatic analysis
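One way a CI job could consume the exported JSON is sketched below. The report schema (`robustness_score`, `results`, `mutation`, `passed`) is a hypothetical layout invented for this example; check the actual export for the real field names.

```python
import json


def ci_exit_code(report_json: str, min_score: float = 0.8) -> int:
    """Return 0 if the report meets the minimum score, otherwise 1."""
    report = json.loads(report_json)
    score = float(report["robustness_score"])  # hypothetical field name
    for result in report.get("results", []):   # hypothetical field name
        if not result["passed"]:
            print(f"FAILED: {result['mutation']}")
    return 0 if score >= min_score else 1
```

A pipeline step would read the report file, pass its contents to `ci_exit_code`, and hand the return value to `sys.exit` so that a low score fails the build.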

Terminal Output

Rich terminal UI with progress bars and real-time updates

Robustness Score

A single numeric score from 0.0 to 1.0 that quantifies agent reliability
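The exact formula is not given here. As one plausible reading, the sketch below assumes the score is simply the fraction of mutations whose assertions all passed; the real score may weight mutation types or checks differently.

```python
from typing import Sequence


def robustness_score(passed: Sequence[bool]) -> float:
    """Fraction of mutations that passed, in the range 0.0 to 1.0 (assumed formula)."""
    if not passed:
        raise ValueError("at least one mutation result is required")
    return sum(1 for ok in passed if ok) / len(passed)
```

For example, three passing mutations out of four give `robustness_score([True, True, False, True])`, which evaluates to 0.75.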

Test History (Cloud)

6-12 months of historical test runs with trend analysis and commit-by-commit comparison

Integrations

HTTP Agents

Test any HTTP-based agent endpoint

Python Callables

Directly test Python functions and callables
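Conceptually, testing a Python callable means feeding it each mutated prompt and checking an invariant on the reply. The harness below is a hand-rolled sketch of that loop; `run_suite` and its arguments are hypothetical names, not the tool's API.

```python
from typing import Callable, Dict, List


def run_suite(agent: Callable[[str], str],
              mutations: List[str],
              invariant: Callable[[str], bool]) -> List[Dict[str, object]]:
    """Run the agent on every mutated prompt and record whether the invariant held."""
    results: List[Dict[str, object]] = []
    for prompt in mutations:
        try:
            reply = agent(prompt)
            passed = invariant(reply)
        except Exception as exc:  # an agent crash counts as a failed mutation
            reply, passed = f"error: {exc}", False
        results.append({"prompt": prompt, "reply": reply, "passed": passed})
    return results
```

Any function that takes a string and returns a string can be passed as `agent`, so an existing chatbot entry point can be tested without wrapping it in an HTTP server first.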

LangChain

Native LangChain chain integration

CI/CD (Cloud)

Works with GitHub Actions, GitLab CI, Jenkins, and CircleCI. Block merges when the trust score drops, and post results as PR comments

Notifications (Cloud)

Slack, email, and webhook alerts when a test run completes

Developer Experience

Simple CLI

Install with pip, configure with YAML, run with one command

YAML Configuration

Human-readable configuration format. Same config works for local and cloud

Rich Terminal UI

Beautiful progress bars and real-time feedback using the Rich library

Type Safety

Full type hints and Pydantic validation for configuration

Rust Performance

Performance-critical operations (scoring, similarity) use Rust bindings
