Revamping evaluation
We've spent the past month re-engineering our evaluation workflow. Here's what's new:
Running Evaluations
- Simultaneous Evaluations: You can now run multiple evaluations for different app variants and evaluators concurrently.
- Rate Limit Parameters: Specify rate limits when running or re-running evaluations to keep results reliable without exceeding OpenAI rate limits (see the sketch after this list).
- Reusable Evaluators: Configure evaluators such as similarity match, regex match, or AI critique and use them across multiple evaluations.
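To make the new options concrete, here is a minimal sketch of how concurrent runs, rate limit parameters, and reusable evaluators could fit together. The names (RateLimitConfig, run_evaluation, the evaluator keys) are hypothetical illustrations, not the actual SDK.

```python
# Hypothetical sketch: RateLimitConfig, run_evaluation, and the evaluator
# definitions below are illustrative only, not the product's real API.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class RateLimitConfig:
    requests_per_minute: int = 60    # cap on LLM calls per minute
    max_retries: int = 3             # reattempts after rate-limit errors
    retry_delay_seconds: float = 5.0


# Reusable evaluators: configured once, shared across evaluation runs.
EVALUATORS = {
    "similarity_match": {"threshold": 0.8},
    "regex_match": {"pattern": r"^\d{4}-\d{2}-\d{2}$"},
    "ai_critique": {"prompt": "Rate the answer from 1 to 5."},
}


def run_evaluation(variant: str, evaluator: str, rate_limit: RateLimitConfig) -> dict:
    """Placeholder for one evaluation of one app variant with one evaluator."""
    # ... call the LLM app, score outputs with the evaluator, respect rate_limit ...
    return {"variant": variant, "evaluator": evaluator, "status": "finished"}


if __name__ == "__main__":
    rate_limit = RateLimitConfig(requests_per_minute=30)
    jobs = [(v, e) for v in ("variant_a", "variant_b") for e in EVALUATORS]
    # Run evaluations for several variants and evaluators concurrently.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(lambda job: run_evaluation(*job, rate_limit), jobs))
    for result in results:
        print(result)
```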
Evaluation Reports
- Dashboard Improvements: We've upgraded our dashboard interface to better display evaluation results. You can now filter and sort results by evaluator, test set, and outcomes.
- Comparative Analysis: Select multiple evaluation runs and view the results of various LLM applications side-by-side.