# Running Evaluations

## Starting a new evaluation
- Go to the Evaluations page
- Select the Human annotation tab
- Click Start new evaluation
## Configuring your evaluation

- Select your test set: the data you want to evaluate against
- Select your revision: the version of your application to test
> **Warning:** Your test set columns must match the input variables in your revision. If they don't match, you'll see an error message. The sketch after this list illustrates the check.
- Choose evaluators: how you want to measure performance
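To make the column-matching rule concrete, here is a minimal sketch of the check in Python. The column and variable names are hypothetical, and the platform performs this validation for you; this only illustrates the rule.

```python
# Minimal sketch of the column-matching rule described above.
# The test set columns and revision input variables are hypothetical
# examples, not data from a real workspace.

test_set_columns = {"country", "question"}          # columns in your test set
revision_input_variables = {"country", "question"}  # input variables in your revision

missing = revision_input_variables - test_set_columns
if missing:
    # This mirrors the error you would see in the UI when columns don't match.
    raise ValueError(f"Test set is missing columns for input variables: {sorted(missing)}")
print("Test set columns match the revision's input variables.")
```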
## Running the evaluation
After configuring:
- Click Start evaluation
- You'll be redirected to the annotation interface
- Click Run all to generate outputs and begin evaluation (see the sketch below for what this step represents)
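Conceptually, starting an evaluation creates a run from the three choices you configured. The sketch below shows this as a hypothetical REST call; the base URL, endpoint, and payload fields are assumptions for illustration, not a documented API.

```python
# Hypothetical sketch of what "Start evaluation" does conceptually:
# it creates an evaluation run from a test set, a revision, and evaluators.
# The endpoint, payload fields, and base URL are illustrative assumptions.

import requests

BASE_URL = "https://your-instance.example.com/api"  # assumption: your deployment URL

payload = {
    "testset_id": "ts_123",          # the test set chosen in step 1
    "revision_id": "rev_456",        # the application revision chosen in step 2
    "evaluators": ["human_score"],   # the evaluators chosen in step 3
}

response = requests.post(f"{BASE_URL}/evaluations", json=payload, timeout=30)
response.raise_for_status()
evaluation = response.json()
print(f"Evaluation created: {evaluation['id']}")  # then open the annotation interface
```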
## Annotating responses
For each test case:
- Review the input and output
- Use the evaluation form on the right to score the response
- Click Annotate to save your assessment
- Click Next to move to the next test case (the loop is sketched below)
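The annotation workflow is a simple loop over test cases: review each input/output pair, score it, and save. The sketch below models that loop in Python; the data structures, the toy scoring rule, and the `save_annotation` helper are hypothetical stand-ins for the UI actions, not a real SDK.

```python
# Hypothetical sketch of the annotation loop: for each test case, review
# the input/output pair, record a score, and save the assessment.
# All names and data here are illustrative assumptions.

test_cases = [
    {"id": "tc_1", "input": "What is the capital of France?", "output": "Paris"},
    {"id": "tc_2", "input": "What is 2 + 2?", "output": "5"},
]

def save_annotation(case_id: str, score: int, comment: str) -> None:
    # Stands in for clicking "Annotate" in the UI.
    print(f"Saved annotation for {case_id}: score={score}, comment={comment!r}")

for case in test_cases:
    # In the UI, this scoring happens in the evaluation form on the right.
    score = 1 if case["id"] == "tc_1" else 0  # toy scoring rule for the sketch
    save_annotation(case["id"], score, "reviewed in annotation interface")
```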
> **Tip:** Select the Unannotated tab to see only the test cases you haven't reviewed yet.
## Collaboration

You can invite team members to help with the evaluation by sharing the evaluation link. Team members must first be added to your workspace.
## Next steps
- Learn about viewing results
- Try A/B testing to compare variants