What is a Run?
A run represents a single execution of a test against a target AI agent. It tracks:
- Execution status and progress
- Individual scenario results
- Aggregate metrics and pass rates
- Timing information
Run Lifecycle
Status Definitions
| Status | Description |
|---|---|
| pending | Run created, waiting to start |
| running | Scenarios actively executing |
| grading | Conversations complete, grading in progress |
| completed | All grading finished |
| failed | Unrecoverable error occurred |
| canceled | User canceled the run |
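
The statuses above can be modeled as a small enumeration, for example when polling a run from a script. This is a minimal sketch, not Preclinical's actual client code: the names `RunStatus`, `TERMINAL_STATUSES`, and `is_terminal` are hypothetical, and treating `completed`, `failed`, and `canceled` as the end states is an assumption inferred from the descriptions in the table.

```python
from enum import Enum


class RunStatus(str, Enum):
    """Run statuses as listed in the Status Definitions table."""
    PENDING = "pending"
    RUNNING = "running"
    GRADING = "grading"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumption: these three statuses end a run; the others are still in flight.
TERMINAL_STATUSES = frozenset(
    {RunStatus.COMPLETED, RunStatus.FAILED, RunStatus.CANCELED}
)


def is_terminal(status: str) -> bool:
    """Return True if a run in this status will make no further progress."""
    # RunStatus(status) raises ValueError for an unknown status string.
    return RunStatus(status) in TERMINAL_STATUSES
```

A polling loop could call `is_terminal` on each status it reads and stop as soon as it returns `True`.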
Monitoring a Run
Live Progress
While a run is executing, you can monitor:
- Progress Bar: Visual indicator of completed vs. pending scenarios
- Live Feed: Real-time updates as scenarios complete
- Error Alerts: Immediate notification of failures
Viewing Results
Summary View
After completion, the run shows aggregate metrics:
| Metric | Description |
|---|---|
| Pass Rate | Percentage of scenarios that passed |
| Passed | Count of passing scenarios |
| Failed | Count of failing scenarios |
| Errors | Count of execution errors |
| Duration | Total run time |
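
As a rough illustration of how the count-based metrics relate to each other, the sketch below derives them from a list of per-scenario outcomes. The function name `summarize` and the outcome labels `"passed"`, `"failed"`, and `"error"` are hypothetical. The denominator used for the pass rate (all scenarios, including those that errored) is an assumption; the product may compute it differently.

```python
from collections import Counter


def summarize(outcomes):
    """Aggregate per-scenario outcomes into summary metrics.

    `outcomes` is a list of strings, each "passed", "failed", or "error"
    (hypothetical labels chosen for this sketch).
    """
    counts = Counter(outcomes)
    total = len(outcomes)
    passed = counts["passed"]
    # Assumption: pass rate is taken over all scenarios, errors included.
    pass_rate = 100.0 * passed / total if total else 0.0
    return {
        "pass_rate": pass_rate,
        "passed": passed,
        "failed": counts["failed"],
        "errors": counts["error"],
    }
```

For example, `summarize(["passed", "passed", "failed", "error"])` yields a pass rate of 50.0 with 2 passed, 1 failed, and 1 error.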
Detail View
Click any scenario to see:
- Full conversation transcript
- Per-criterion grading results
- Evidence quotes from the transcript
- Timing metrics
Retrying Failed Scenarios
If scenarios fail due to transient errors (network issues, rate limits), click Retry Failed to re-run only the failed scenarios. After the retry:
- Results merge into the existing run
- Pass rate updates automatically
Preclinical automatically retries scenarios that fail due to rate limits. Manual retry is only needed for persistent failures.
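
One way to picture the merge behavior described above: retried results replace the earlier results for the same scenarios, and the pass rate is recomputed over the merged set. This is a sketch under stated assumptions, not the actual implementation. The names `merge_retry` and `pass_rate`, the scenario IDs, and the outcome labels are all hypothetical.

```python
def merge_retry(existing, retried):
    """Merge retried scenario results into the existing run's results.

    Both arguments map a scenario ID to an outcome string. Retried
    scenarios overwrite their earlier outcome; all others are kept.
    """
    merged = dict(existing)  # copy so the original results are not mutated
    merged.update(retried)
    return merged


def pass_rate(results):
    """Percentage of scenarios whose outcome is "passed"."""
    if not results:
        return 0.0
    passed = sum(1 for outcome in results.values() if outcome == "passed")
    return 100.0 * passed / len(results)


before = {"s1": "passed", "s2": "error", "s3": "failed", "s4": "passed"}
after = merge_retry(before, {"s2": "passed"})  # s2 succeeded on retry
```

Here the pass rate moves from 50.0 for `before` to 75.0 for `after`, while the scenario count stays at four because the retry updates the existing run rather than creating a new one.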
Run History
All runs are persisted for historical tracking:
- Compare runs over time
- Track agent improvement
- Identify regression patterns
- Export results for reporting
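
If results are exported, comparing two runs to spot regressions could look something like the sketch below. The export shape (a mapping from scenario ID to outcome) and the name `find_regressions` are assumptions made for illustration; the actual export format is not specified here.

```python
def find_regressions(previous, current):
    """Return IDs of scenarios that passed in `previous` but not in `current`.

    Both arguments map a scenario ID to an outcome string (hypothetical
    export shape). Scenarios missing from `previous` are ignored.
    """
    return sorted(
        scenario_id
        for scenario_id, outcome in current.items()
        if previous.get(scenario_id) == "passed" and outcome != "passed"
    )


earlier_run = {"s1": "passed", "s2": "passed", "s3": "failed"}
later_run = {"s1": "passed", "s2": "failed", "s3": "failed", "s4": "error"}
regressions = find_regressions(earlier_run, later_run)
```

In this example only `s2` counts as a regression: `s3` was already failing, and `s4` has no earlier result to compare against.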