Overview
Running Preclinical tests in CI/CD allows you to:
- Catch regressions before they reach production
- Enforce quality gates based on pass rates
- Track agent performance over time
- Block deployments that don’t meet safety standards
GitHub Actions
Basic Example
Create .github/workflows/preclinical.yml:
With Reusable Script
Create scripts/run-preclinical-tests.sh:
Using Webhooks for Async Testing
For longer test runs, use webhooks instead of polling:
GitLab CI
CircleCI
Best Practices
Use Demo Mode for PRs
Run demo mode (20 scenarios) on pull requests for faster feedback. Use full mode for the main branch.
Store Secrets Securely
Never commit API keys. Use your CI provider’s secret management.
Set Appropriate Thresholds
Start with an 80% pass rate and adjust based on your agent’s maturity level.
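A threshold only works as a gate if the CI step exits non-zero when the run falls short. The sketch below shows one way to turn pass/fail counts into an exit code. The function names and the idea that your script already has `passed` and `total` counts are assumptions for illustration, not part of the Preclinical API.

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage; an empty run counts as 0%."""
    if total <= 0:
        return 0.0
    return 100.0 * passed / total


def gate_exit_code(passed: int, total: int, threshold: float = 80.0) -> int:
    """Return 0 when the run meets the threshold, 1 otherwise.

    The default of 80.0 mirrors the suggested starting threshold.
    """
    return 0 if pass_rate(passed, total) >= threshold else 1
```

In a CI step you could pass the result to `sys.exit(...)` so that a run below the threshold fails the job.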
Use Webhooks for Long Runs
For full test runs (100+ scenarios), use webhooks instead of polling to avoid timeouts.
Environment Variables
| Variable | Description |
|---|---|
| PRECLINICAL_API_KEY | Your API key (store as secret) |
| PRECLINICAL_AGENT_ID | UUID of the agent to test |
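A test script can fail fast when either variable is missing, rather than sending a request that is bound to be rejected. This is a minimal sketch: the `load_config` helper is hypothetical, and only the two variable names come from the table above.

```python
import os
import uuid

REQUIRED_VARS = ("PRECLINICAL_API_KEY", "PRECLINICAL_AGENT_ID")


def load_config(env=None):
    """Read the required variables and fail early with a clear message.

    `env` defaults to os.environ; passing a dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    try:
        # The agent ID is documented as a UUID, so reject malformed values here.
        uuid.UUID(env["PRECLINICAL_AGENT_ID"])
    except ValueError:
        raise ValueError("PRECLINICAL_AGENT_ID is not a valid UUID") from None
    return {name: env[name] for name in REQUIRED_VARS}
```

The error message lists variable names only, so the API key itself never appears in CI logs.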
Quality Gates
Example threshold configurations:
| Environment | Test Mode | Threshold | Rationale |
|---|---|---|---|
| PR checks | demo | 75% | Quick validation |
| Staging | demo | 85% | Pre-production gate |
| Production | full | 90% | High safety standard |
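The table above can be kept as data so that every pipeline stage reads its test mode and threshold from one place. This is a sketch only: the environment keys (`pr`, `staging`, `production`) and the `Gate` type are illustrative names, while the modes and thresholds are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    test_mode: str    # "demo" or "full"
    threshold: float  # minimum pass rate, in percent


# Values mirror the example threshold table; adjust them to your own policy.
GATES = {
    "pr": Gate(test_mode="demo", threshold=75.0),
    "staging": Gate(test_mode="demo", threshold=85.0),
    "production": Gate(test_mode="full", threshold=90.0),
}


def gate_for(environment: str) -> Gate:
    """Look up the gate for an environment, rejecting unknown names."""
    try:
        return GATES[environment]
    except KeyError:
        known = ", ".join(sorted(GATES))
        raise ValueError(
            f"Unknown environment {environment!r}; expected one of: {known}"
        ) from None
```

Raising on an unknown environment avoids silently falling back to a weaker gate when a pipeline variable is misspelled.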
Troubleshooting
Tests Timing Out
Increase the timeout in your CI configuration or use webhooks for notification instead of polling.
Rate Limiting
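HTTP 429 means "Too Many Requests". One common way to space out retries is exponential backoff, sketched below. The `request` argument is a hypothetical zero-argument callable that returns a `(status_code, body)` tuple; it stands in for whatever HTTP call your script makes and is not a Preclinical client function.

```python
import time


def call_with_backoff(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `request` until it returns a non-429 status or attempts run out.

    Waits base_delay, then 2x, 4x, ... between attempts. Returns the last
    (status_code, body) tuple, which may still be a 429 if every attempt failed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = request()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Double the wait after each rate-limited attempt.
            sleep(base_delay * 2 ** attempt)
    return status, body
```

The `sleep` parameter is injectable so the retry logic can be tested without real delays.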
If you see 429 errors, add delays between API calls or reduce concurrent test runs.
Pass Rate Fluctuations
AI agents can have variable behavior. Consider:
- Running multiple test iterations
- Using rolling averages
- Setting slightly lower thresholds with manual review for borderline results
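The rolling-average and borderline-review ideas above can be combined in a few lines. This is a sketch under assumptions: the window size of 5, the 5-point review margin, and all names are illustrative choices, not Preclinical defaults.

```python
from collections import deque


class RollingPassRate:
    """Keep the most recent pass rates (in percent) and average them."""

    def __init__(self, window: int = 5):
        self._rates = deque(maxlen=window)  # older runs drop off automatically

    def add(self, rate: float) -> None:
        self._rates.append(rate)

    def average(self) -> float:
        return sum(self._rates) / len(self._rates) if self._rates else 0.0


def classify(average: float, threshold: float, margin: float = 5.0) -> str:
    """Return "pass", "review" (within `margin` below the threshold), or "fail"."""
    if average >= threshold:
        return "pass"
    if average >= threshold - margin:
        return "review"
    return "fail"
```

A pipeline could treat "review" as a non-blocking warning that asks for manual sign-off, while "fail" blocks the deployment.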