In Adaptive Engine, you can evaluate models to determine which off-the-shelf model performs best on your task, or how much your fine-tuned model improved over its base model or other models after training.

Launch an evaluation

Evaluation is done by launching a run of an evaluation recipe. An evaluation recipe is a recipe that produces an EvaluationArtefact. Adaptive provides a built-in recipe that covers most evaluation use-cases. For more tailored usage (for example, using custom graders), you can create your own recipe with your own completion grading strategy by following our guides Custom Graders and Write an Evaluation Recipe.
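
For illustration, here is a minimal sketch of what launching such a run could look like from a Python script. The client class, method names, and parameters (AdaptiveClient, recipes.launch, and so on) are hypothetical placeholders rather than the actual Adaptive SDK surface; refer to the guides above for the real interface.

```python
# Hypothetical sketch only: class, method, and parameter names below are
# illustrative placeholders, not the actual Adaptive SDK.
from adaptive import AdaptiveClient  # assumed client import

client = AdaptiveClient(api_key="YOUR_API_KEY")  # assumed authentication

# Launch a run of the built-in evaluation recipe against two models.
run = client.recipes.launch(
    recipe="evaluation",                             # assumed built-in recipe name
    models=["my-finetuned-model", "my-base-model"],  # models to compare
    dataset="my-eval-dataset",                       # dataset of samples to grade
)
run.wait()  # block until the evaluation run has finished
```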

Visualize results

Once an evaluation run is finished, it will produce an Evaluation Artefact. As illustrated in the sketch after the list below, this will contain:
  1. An evaluation score table that summarises all models' scores for the graders that were used during the eval
  2. A detailed list of interactions from all graded samples in the evaluated dataset.
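
Continuing the hypothetical sketch above, reading these two parts programmatically could look like the following; the accessor and field names (artefact(), score_table, interactions) are assumptions, not the documented API.

```python
# Hypothetical sketch only: accessor and field names are illustrative placeholders.
artefact = run.artefact()  # assumed accessor for the produced EvaluationArtefact

# 1. Score table: one score per (model, grader) pair.
for row in artefact.score_table:
    print(row.model, row.grader, row.score)

# 2. Graded interactions: each sample's prompt, completion, and assigned grades.
for interaction in artefact.interactions:
    print(interaction.prompt, interaction.completion, interaction.grades)
```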

Evaluation table