What is a Run?

A run represents a single execution of a test against a target AI agent. It tracks:
  • Execution status and progress
  • Individual scenario results
  • Aggregate metrics and pass rates
  • Timing information

Run Lifecycle

Status Definitions

Status      Description
pending     Run created, waiting to start
running     Scenarios actively executing
grading     Conversations complete, grading in progress
completed   All grading finished
failed      Unrecoverable error occurred
canceled    User canceled the run
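
The statuses above can be modeled as a small enumeration when handling run data in your own scripts. This is a minimal sketch, not Preclinical's actual data model: the `RunStatus` class and `is_terminal` helper are hypothetical names, and treating `completed`, `failed`, and `canceled` as final states is an assumption based on the descriptions in the table.

```python
from enum import Enum


class RunStatus(str, Enum):
    """Run statuses, using the string values listed in the table above."""

    PENDING = "pending"
    RUNNING = "running"
    GRADING = "grading"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumption: a run in one of these states does not change status again.
TERMINAL_STATUSES = frozenset(
    {RunStatus.COMPLETED, RunStatus.FAILED, RunStatus.CANCELED}
)


def is_terminal(status: str) -> bool:
    """Return True if the status string names a finished run.

    Raises ValueError for a string that is not a known status.
    """
    return RunStatus(status) in TERMINAL_STATUSES
```

For example, `is_terminal("grading")` is false because grading is still in progress, while `is_terminal("canceled")` is true.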

Monitoring a Run

Live Progress

While a run is executing, you can monitor:

Progress Bar

Visual indicator of completed vs pending scenarios

Live Feed

Real-time updates as scenarios complete

Error Alerts

Immediate notification of failures
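
If you follow a run from a script rather than the UI, the same idea can be approximated by polling until the run reaches a final status. The sketch below is illustrative only: `wait_for_run` is a hypothetical helper, and `fetch_status` stands in for whatever call returns the current status string in your setup. It does not describe a documented Preclinical API.

```python
import time
from typing import Callable

# Assumption: these three statuses are final.
TERMINAL = {"completed", "failed", "canceled"}


def wait_for_run(
    fetch_status: Callable[[], str],
    poll_seconds: float = 2.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until a terminal status or the timeout is reached."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status!r} after {timeout_seconds}s")
        sleep(poll_seconds)
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays.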

Viewing Results

Summary View

After completion, the run shows aggregate metrics:
Metric      Description
Pass Rate   Percentage of scenarios that passed
Passed      Count of passing scenarios
Failed      Count of failing scenarios
Errors      Count of execution errors
Duration    Total run time
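
To make the relationship between these metrics concrete, here is a minimal sketch of how a pass rate could be derived from the counts. The `RunSummary` class is hypothetical, and counting errored scenarios in the denominator is an assumption; the product may compute the percentage differently.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    passed: int
    failed: int
    errors: int
    duration_seconds: float

    @property
    def total(self) -> int:
        # Assumption: errored scenarios count toward the total.
        return self.passed + self.failed + self.errors

    @property
    def pass_rate(self) -> float:
        """Percentage of scenarios that passed; 0.0 for an empty run."""
        if self.total == 0:
            return 0.0
        return 100.0 * self.passed / self.total
```

With 8 passed, 1 failed, and 1 error, this sketch reports a pass rate of 80.0.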

Detail View

Click any scenario to see:
  • Full conversation transcript
  • Per-criterion grading results
  • Evidence quotes from the transcript
  • Timing metrics

Retrying Failed Scenarios

If scenarios fail due to transient errors (network issues, rate limits), you can re-run them without starting a new run:
  1. Click Retry Failed to re-run only the failed scenarios.
  2. The new results merge into the existing run.
  3. The pass rate updates automatically.
Preclinical automatically retries scenarios that fail due to rate limits. Manual retry is only needed for persistent failures.
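
The merge behavior described above can be pictured as replacing only the failed or errored entries in a run's result map and then recomputing the pass rate. This is a rough sketch under stated assumptions: the function names, the scenario ID keys, and the outcome strings `"passed"`, `"failed"`, and `"error"` are all hypothetical and chosen for illustration.

```python
from typing import Dict


def merge_retry(results: Dict[str, str], retried: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of results with retried outcomes replacing failures.

    Only scenarios whose current outcome is "failed" or "error" are
    overwritten; passing scenarios are left untouched.
    """
    merged = dict(results)
    for scenario_id, outcome in retried.items():
        if merged.get(scenario_id) in ("failed", "error"):
            merged[scenario_id] = outcome
    return merged


def pass_rate(results: Dict[str, str]) -> float:
    """Percentage of scenarios with outcome "passed"; 0.0 if empty."""
    if not results:
        return 0.0
    passed = sum(1 for outcome in results.values() if outcome == "passed")
    return 100.0 * passed / len(results)
```

Because `merge_retry` returns a new dictionary, the original results stay available for comparison before and after the retry.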

Run History

All runs are persisted for historical tracking:
  • Compare runs over time
  • Track agent improvement
  • Identify regression patterns
  • Export results for reporting
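
As one way to spot regressions in exported results, you could compare per-scenario outcomes between an earlier and a later run. The sketch below assumes a hypothetical export shape (a mapping from scenario ID to an outcome string such as `"passed"`); it is not tied to any documented export format.

```python
from typing import Dict, List


def find_regressions(baseline: Dict[str, str], candidate: Dict[str, str]) -> List[str]:
    """Return sorted scenario IDs that passed in baseline but not in candidate.

    Scenarios missing from the candidate run are ignored rather than
    reported, since they were not executed.
    """
    return sorted(
        scenario_id
        for scenario_id, outcome in baseline.items()
        if outcome == "passed"
        and scenario_id in candidate
        and candidate[scenario_id] != "passed"
    )
```

Scenarios that newly pass, or that exist only in the later run, are deliberately left out so the output lists regressions only.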
