Editing an Eval

Edit and version your evals on the Portex Datalab.

Evals are not static. You can edit tasks, answers, and criteria at any time from the Data Studio.

Making Edits

From the Data Studio, navigate to Evals > Edit Eval. Select the eval you want to modify. This opens the Eval Builder with your existing tasks loaded.

You can:

  • Add, remove, or rewrite tasks

  • Update answer keys and grading criteria

  • Adjust weights and pass thresholds

  • Add or replace reference files

When you save your changes, a new version of the eval is created and the leaderboards are re-run.
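
The editable pieces map onto the parts of an eval definition. As a rough illustration only (Portex does not publish an eval schema here, so every field name below is hypothetical), an eval bundles tasks, answer keys, grading criteria, per-task weights, a pass threshold, and reference files:

```python
# Illustrative sketch only: the field names (prompt, answer_key, criteria,
# weight, pass_threshold, reference_files) are hypothetical and exist purely
# to show the kinds of things you can change in the Eval Builder.
eval_definition = {
    "name": "invoice-extraction",            # hypothetical eval name
    "pass_threshold": 0.8,                   # minimum weighted score to pass
    "tasks": [
        {
            "prompt": "Extract the total amount due from the attached invoice.",
            "answer_key": "1,240.50",        # expected answer used for grading
            "criteria": "Exact match after stripping currency symbols.",
            "weight": 2.0,                   # this task counts double
            "reference_files": ["invoice_017.pdf"],
        },
        {
            "prompt": "What is the invoice due date?",
            "answer_key": "2024-06-30",
            "criteria": "Accept any unambiguous date format.",
            "weight": 1.0,
            "reference_files": ["invoice_017.pdf"],
        },
    ],
}
```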

Versioning

Each save creates a new version of the eval dataset. You can revert to any previous version at any time, which is useful if you want to undo a change or experiment with different task sets.

The new version is automatically linked to your existing listing. No action is needed to update the listing after editing.
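
One way to picture the save-and-revert behavior is a history of snapshots. The sketch below is a conceptual model, not Portex's implementation; whether a revert is itself recorded as a new version is not specified here.

```python
# Conceptual model of eval versioning: every save records a snapshot,
# and reverting restores an earlier snapshot as the working definition.
# This is an illustration of the behavior described above, not Portex code.
from dataclasses import dataclass, field


@dataclass
class EvalHistory:
    versions: list = field(default_factory=list)   # snapshots, oldest first
    current: dict = field(default_factory=dict)    # the working definition

    def save(self, eval_definition: dict) -> int:
        """Record a snapshot and return its version number (numbered from 1)."""
        self.current = dict(eval_definition)
        self.versions.append(self.current)
        return len(self.versions)

    def revert(self, version_number: int) -> None:
        """Make an earlier snapshot the working definition again."""
        self.current = dict(self.versions[version_number - 1])


history = EvalHistory()
v1 = history.save({"tasks": ["task A"]})
v2 = history.save({"tasks": ["task A", "task B"]})
history.revert(v1)   # the version 1 task set is the working definition again
```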

Reviewing Results Across Versions

To compare how models perform across different versions of your eval, use the Annotate page in the Data Studio (under Evals > Annotate). The Annotate view lets you inspect results from Portex-administered leaderboard runs for each version, so you can see the impact of changes to tasks or criteria on model scores.
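
As a hypothetical example of the kind of comparison this supports (the model names, scores, and the idea of exporting them are illustrative, not a documented Portex feature), you might line up per-model scores from two leaderboard runs and look at the deltas:

```python
# Hypothetical per-model scores from leaderboard runs on two eval versions.
scores_v1 = {"model-a": 0.72, "model-b": 0.65, "model-c": 0.81}
scores_v2 = {"model-a": 0.69, "model-b": 0.70, "model-c": 0.80}

# Show how each model's score changed after the eval was edited.
for model in sorted(scores_v1):
    delta = scores_v2[model] - scores_v1[model]
    print(f"{model}: {scores_v1[model]:.2f} -> {scores_v2[model]:.2f} ({delta:+.2f})")
```

A large shift for every model usually points to a change in the tasks or grading criteria rather than a change in model quality.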

For more on interpreting results, see Reviewing Results.
