Creating an Eval
How to create an eval on the Datalab.
Portex makes it easy to turn your expert domain knowledge into custom evaluations, or "evals," that you can offer to model builders on the Datalab. Follow this guide to get started with writing and building evals. First, we walk through what makes an eval effective on Portex.
Details
Evaluations, or "evals," have become essential tools for measuring the progress and capabilities of AI. Think of evals as well-designed tests that determine the subjects and domains where AI systems excel or falter.
Writing effective evals on Portex
Follow these guidelines to write and design good evals on Portex.
There are two ways to create evals on Portex: the Eval Builder and the Eval Dataset uploader. We walk through each below.
Writing an Eval with the Eval Builder
The Eval Builder lets experts write evals directly in the platform, with no existing files required.
You may optionally upload reference files to accompany your task; make sure each reference file is in one of the following accepted formats:
.json, .csv, .md, .html, .txt, .webp, .jpg, .jpeg, .png, .gif, .pdf

Creating an Eval Dataset Bundle
If you instead have files that you want to work with in an editor or IDE, you can upload those directly with the Eval Dataset option.
To get started, create a new dataset and choose Eval Dataset; from there you will be prompted to upload your files. Eval datasets on Portex are uploaded as a bundle of four files (two optional), including an optional Core Dataset that can be offered for sale to model builders in addition to per-eval runs. We walk through each file in the bundle below.

Formatting your Eval Dataset Bundle
An eval dataset bundle on Portex has four parts; two are required.
Upload Checklist
Required: tasks.json
Required: answers.json
Optional: reference_files.zip (or .tar.gz)
Optional (but recommended for a premium offering): core_dataset.zip (or .tar.gz)
Rule: every task_id in answers.json must also exist in tasks.json (a quick validation sketch follows).
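You can cross-check this rule yourself before uploading. Below is a minimal Python sketch; it assumes tasks.json and answers.json are each a JSON array of records carrying a task_id field, per the formats described below.

```python
import json

# Load both files; assumes each is a JSON array of records with a "task_id" field.
with open("tasks.json") as f:
    tasks = json.load(f)
with open("answers.json") as f:
    answers = json.load(f)

# Collect all task IDs, then find answer keys that don't map to any task.
task_ids = {t["task_id"] for t in tasks}
missing = [a["task_id"] for a in answers if a["task_id"] not in task_ids]

if missing:
    print(f"Answer keys with no matching task: {missing}")
else:
    print("All answer keys map to a task.")
```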
🧩 Task List
A JSON file containing your tasks. Each record must include:
task_id: unique identifier
task_prompt: the question for the model
reference_file: if your task requires a reference file, include the file name here
Example:
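(A minimal illustrative sketch; the task IDs, prompts, and file name here are hypothetical.)

```json
[
  {
    "task_id": "task_001",
    "task_prompt": "Based on the attached filing, what was the company's year-over-year revenue growth?",
    "reference_file": "filing_q3.pdf"
  },
  {
    "task_id": "task_002",
    "task_prompt": "Define the term 'basis point' as used in fixed-income markets."
  }
]
```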
This file will be downloadable by eval buyers so they can generate model responses.
🧠 Answer Keys
A JSON file mapping task_id to the correct output. Must include:
task_id: unique identifier used above
answer: the expected result (numerical, textual, objective criteria, etc.)
Optional fields you might include to supplement:
rationale: explanation of how the answer was derived, reasoning traces
answer_type: e.g. "percentage", "multiple_choice", "exact_match"
Example:
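(A minimal illustrative sketch matching the tasks above; the answers and rationale are hypothetical.)

```json
[
  {
    "task_id": "task_001",
    "answer": "12.4%",
    "answer_type": "percentage",
    "rationale": "Revenue grew from $8.9M to $10.0M year over year: (10.0 - 8.9) / 8.9 ≈ 12.4%."
  },
  {
    "task_id": "task_002",
    "answer": "One hundredth of a percentage point (0.01%).",
    "answer_type": "exact_match"
  }
]
```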
By default, your answer key is blinded from eval buyers. You may optionally add it to the Core Dataset below for purchase.
📂 Reference Files (Optional)
If your tasks refer to text documents, images, or other supporting files, bundle them into a .zip or .tar.gz archive.
Example archive/zip structure:
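(An illustrative layout; the file names are hypothetical, and each must match a reference_file value in tasks.json.)

```
reference_files.zip
├── filing_q3.pdf
├── balance_sheet.csv
└── chart_revenue.png
```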
Buyers will extract this archive to access the referenced files and generate model responses.
Accepted file types include: .json, .csv, .md, .html, .txt, .webp, .jpg, .jpeg, .png, .gif, .pdf
📘 Core Dataset (Optional)
A .zip or .tar.gz archive that may optionally contain the answer key and/or additional files that support your reasoning or calculations (analyst notes, data tables, or expert commentary).
Example archive structure:
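(An illustrative layout; the contents and file names are hypothetical and, as noted below, up to you to decide.)

```
core_dataset.zip
├── answers.json
├── analyst_notes.md
└── data_tables/
    ├── revenue_history.csv
    └── margin_breakdown.csv
```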
Buyers can use this core dataset to validate or audit the answer logic, or even refine their models using reinforcement learning methods. The contents of this dataset are up to you as the expert to decide, but including the answer key can generally support a higher price point.
Creating an Eval Listing
Once you've uploaded your eval dataset, you can create a listing for it.
The first step will be setting two prices:
A per-eval price: a set price model builders pay each time they submit model responses and receive a performance report.
If you included a Core Dataset, a Core Dataset price and minimum bid: a fixed "buy now" price and a minimum bid for model builders to access the Core Dataset (tasks and answers, as well as reference files and knowledge reference if applicable).

The next step is configuring your license and all other relevant details on the listing editor. Once you publish your listing, your eval will be ready for model builders to access.