Features
Everything you need to test and prove your AI agents are production-ready
Mutation Engine
5 Mutation Types (Open Source)
Paraphrasing, noise injection, tone shifts, basic adversarial attacks, and custom templates
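As a rough illustration of what a noise-injection mutation might look like, the sketch below swaps adjacent letters to mimic typos. The function name `inject_noise`, the `rate` parameter, and the swap strategy are assumptions made for this example, not the tool's actual implementation.

```python
import random
from typing import Optional


def inject_noise(text: str, rate: float = 0.1, seed: Optional[int] = None) -> str:
    """Swap adjacent letters with probability `rate` to imitate typing errors."""
    rng = random.Random(seed)  # seeded so a mutation can be reproduced
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        both_letters = chars[i].isalpha() and chars[i + 1].isalpha()
        if both_letters and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so each character moves at most once
        else:
            i += 1
    return "".join(chars)


mutated = inject_noise("Book a flight to Paris for tomorrow", rate=0.2, seed=42)
```

Because only adjacent letters are swapped, the mutated prompt keeps the same characters and length, which makes it easy to confirm that the user's intent is still recognizable.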
15 Mutation Types (Cloud)
All open-source types plus sophisticated prompt injections (OWASP Top 10), multi-turn attacks, encoding obfuscation, model-specific jailbreaks, and more
Semantic Perturbation
Uses local LLMs to rephrase inputs while preserving user intent, rather than only injecting random noise.
Up to 50 Mutations (Open Source)
Generate up to 50 mutations per test run locally
Unlimited Mutations (Cloud)
No limits on mutations per run with cloud GPUs
Invariant Assertions
Deterministic Checks
Substring (contains) checks, regex matching, latency limits, and JSON validity
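The four deterministic checks can be pictured as small pure functions. This is a minimal sketch using only the Python standard library; the function names and signatures are hypothetical and are not taken from the tool's API.

```python
import json
import re
import time
from typing import Callable, Tuple


def check_contains(response: str, needle: str) -> bool:
    """Case-insensitive substring check."""
    return needle.lower() in response.lower()


def check_regex(response: str, pattern: str) -> bool:
    """True if the pattern matches anywhere in the response."""
    return re.search(pattern, response) is not None


def check_valid_json(response: str) -> bool:
    """True if the response parses as JSON."""
    try:
        json.loads(response)
    except json.JSONDecodeError:
        return False
    return True


def check_latency(agent: Callable[[str], str], prompt: str, max_ms: float) -> Tuple[bool, str]:
    """Call the agent once and report whether it answered within max_ms."""
    start = time.perf_counter()
    response = agent(prompt)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return elapsed_ms <= max_ms, response
```

Each check returns a plain boolean, so the same mutation and response always produce the same verdict.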
Semantic Similarity
Vector-based similarity checking using local embeddings to ensure responses maintain semantic meaning
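Vector-based similarity checks are commonly built on cosine similarity between two embedding vectors. The sketch below assumes the embeddings have already been produced by a local model; the embedding model is not specified here, and the 0.8 threshold is an arbitrary placeholder rather than a documented default.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_semantically_similar(
    baseline: Sequence[float], candidate: Sequence[float], threshold: float = 0.8
) -> bool:
    """Pass when the mutated response stays close to the baseline response."""
    return cosine_similarity(baseline, candidate) >= threshold
```

In this sketch, `baseline` would be the embedding of the response to the original prompt and `candidate` the embedding of the response to a mutated prompt.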
Basic Safety (Open Source)
Basic PII detection and refusal checks using regex patterns
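A regex-based safety check might look roughly like the following. The patterns are deliberately simple and illustrative (US-style phone numbers and SSNs, basic email addresses, a few English refusal phrases); they are assumptions for this sketch and not the patterns the tool ships with.

```python
import re
from typing import Dict, List

# Illustrative patterns only; real-world PII detection needs far broader coverage.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

REFUSAL_PATTERN = re.compile(
    r"\b(i can(?:no|')t|i am unable to|i'm unable to|i won't)\b", re.IGNORECASE
)


def find_pii(text: str) -> Dict[str, List[str]]:
    """Return every match per PII category; an empty dict means nothing was found."""
    hits: Dict[str, List[str]] = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[label] = matches
    return hits


def looks_like_refusal(text: str) -> bool:
    """True if the response contains a common refusal phrase."""
    return REFUSAL_PATTERN.search(text) is not None
```

Regex checks like these are fast and predictable but miss contextual PII such as names or addresses written in free text.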
Advanced Veridian (Cloud)
ML-based PII detection (94.7% accuracy), NER for contextual PII, factuality checking with RAG, context-aware scoring
Execution
Sequential Execution (Open Source)
Runs mutations one at a time on your local machine
Parallel Execution (Cloud)
10-20x faster with cloud GPUs. 50 mutations in 20-30 seconds versus 5-10 minutes when run sequentially on a local machine
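The difference between the two execution modes can be sketched locally with a thread pool. This only illustrates the sequential-versus-parallel idea; it is not how the cloud service is implemented, and `run_sequential`, `run_parallel`, and `max_workers` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_sequential(agent: Callable[[str], str], prompts: List[str]) -> List[str]:
    """Send each mutated prompt to the agent one at a time."""
    return [agent(prompt) for prompt in prompts]


def run_parallel(
    agent: Callable[[str], str], prompts: List[str], max_workers: int = 8
) -> List[str]:
    """Send mutated prompts concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(agent, prompts))
```

Threads help most when the agent call is I/O-bound, such as an HTTP request, because the waiting time overlaps across mutations.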
Local-First (Open Source)
Uses Ollama with Qwen 3 8B. No API costs. Generate 1,000+ mutations for free
Cloud GPUs (Cloud)
Leverage cloud infrastructure for faster testing without melting your laptop
Reporting
Interactive HTML Reports
Beautiful pass/fail matrices with mutation details and failure analysis
JSON Export
Export results as JSON for CI/CD integration and programmatic analysis
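In a CI pipeline, an exported JSON report could be turned into a pass/fail gate along these lines. The report schema used here (`robustness_score`, `results`, `passed`) is hypothetical and chosen for illustration; the real export format may differ.

```python
import json
from typing import Any, Dict


def gate_on_report(report: Dict[str, Any], min_score: float = 0.9) -> int:
    """Return a process exit code: 0 if the score meets the gate, 1 otherwise."""
    score = float(report["robustness_score"])  # hypothetical field name
    failed = [r for r in report.get("results", []) if not r.get("passed", False)]
    print(f"robustness score: {score:.2f}, failed mutations: {len(failed)}")
    return 0 if score >= min_score else 1


# Inline sample standing in for an exported report file.
sample = json.loads(
    '{"robustness_score": 0.86,'
    ' "results": [{"id": "m1", "passed": true}, {"id": "m2", "passed": false}]}'
)
exit_code = gate_on_report(sample, min_score=0.8)
```

A CI job would pass the returned code to `sys.exit` so that a score below the threshold fails the build.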
Terminal Output
Rich terminal UI with progress bars and real-time updates
Robustness Score
Numeric score from 0.0 to 1.0 that quantifies agent reliability
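The exact formula behind the robustness score is not stated here. As a minimal sketch, assuming the score is the weighted fraction of mutations whose assertions all pass, it could be computed as follows; the per-mutation weights are a hypothetical addition.

```python
from typing import Iterable, Tuple


def robustness_score(outcomes: Iterable[Tuple[bool, float]]) -> float:
    """Weighted pass rate over (passed, weight) pairs, one pair per mutation."""
    total_weight = 0.0
    passed_weight = 0.0
    for passed, weight in outcomes:
        if weight < 0:
            raise ValueError("weights must be non-negative")
        total_weight += weight
        if passed:
            passed_weight += weight
    if total_weight == 0.0:
        return 0.0  # no mutations were run, so nothing has been demonstrated
    return passed_weight / total_weight


# Two passing mutations of weight 1.0 and one failing mutation of weight 2.0:
score = robustness_score([(True, 1.0), (True, 1.0), (False, 2.0)])  # 2.0 / 4.0 = 0.5
```

With equal weights this reduces to the plain pass rate, so the result always stays within the 0.0 to 1.0 range.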
Test History (Cloud)
6-12 months of historical test runs with trend analysis and commit-by-commit comparison
Integrations
HTTP Agents
Test any HTTP-based agent endpoint
Python Callables
Directly test Python functions and callables
LangChain
Native LangChain chain integration
CI/CD (Cloud)
GitHub Actions, GitLab CI, Jenkins, CircleCI. Block merges when the trust score drops. Post results as PR comments
Notifications (Cloud)
Slack, email, and webhook alerts when a test run completes
Developer Experience
Simple CLI
Install with pip, configure with YAML, run with one command
YAML Configuration
Human-readable configuration format. Same config works for local and cloud
Rich Terminal UI
Beautiful progress bars and real-time feedback using the Rich library
Type Safety
Full type hints and Pydantic validation for configuration
Rust Performance
Performance-critical operations (scoring, similarity) use Rust bindings