Launch an evaluation
To run an evaluation, launch a run of an evaluation recipe. An evaluation recipe is a recipe that produces an EvaluationArtefact.
Adaptive provides a built-in recipe that covers most evaluation use cases. For more tailored usage, such as custom graders, you can create your own recipe with your own completion grading strategy by following our guides Custom Graders and Write an Evaluation Recipe.
Visualize results
Once an evaluation run finishes, it produces an Evaluation Artifact, which contains:
- An evaluation score table that summarises each model's scores for the graders used during the evaluation.
- A detailed list of interactions from all graded samples in the evaluated dataset.
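
To make the relationship between the two parts of the artifact concrete, here is a minimal sketch of how a score table could be derived from per-sample graded interactions. It does not use the Adaptive SDK: the record fields (`model`, `grader`, `score`), the function name `build_score_table`, and the choice of a plain mean as the summary are all assumptions made for illustration, not the platform's actual schema or aggregation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical graded interactions. The field names are illustrative
# and are not taken from the Adaptive artifact format.
interactions = [
    {"model": "model-a", "grader": "faithfulness", "score": 1.0},
    {"model": "model-a", "grader": "faithfulness", "score": 0.5},
    {"model": "model-a", "grader": "helpfulness", "score": 1.0},
    {"model": "model-b", "grader": "faithfulness", "score": 0.0},
    {"model": "model-b", "grader": "faithfulness", "score": 1.0},
    {"model": "model-b", "grader": "helpfulness", "score": 0.5},
]


def build_score_table(records):
    """Summarise per-sample scores as {model: {grader: mean score}}.

    Averaging is an assumption; a real evaluation may summarise differently.
    """
    # Group every score under its (model, grader) pair.
    buckets = defaultdict(list)
    for record in records:
        buckets[(record["model"], record["grader"])].append(record["score"])

    # Collapse each group into one cell of the table.
    table = defaultdict(dict)
    for (model, grader), scores in buckets.items():
        table[model][grader] = mean(scores)
    return {model: dict(row) for model, row in table.items()}


score_table = build_score_table(interactions)
for model, row in sorted(score_table.items()):
    print(model, row)
```

Each row of the resulting table corresponds to one model and each column to one grader, while the underlying list of interactions stays available for inspecting individual samples.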

Evaluation table

