Adaptive Engine allows you to annotate your LLM completions with scalar, boolean, or preference feedback. Feedback powers your continuous improvement journey in Adaptive ML: you can use it for observability, evaluation, and as training objectives. There are 2 types of feedback you can log: metrics and preference sets. A metric is a boolean or scalar value attached to an individual completion. A preference set is a pair of completions, where one is marked as preferred and the other as dispreferred.

Taxonomy of feedback types in Adaptive Engine

To log feedback, you must first register a feedback key to log against. If you want to associate a metric with a use case, so you can later organize and analyze aggregate performance or individual results for it, you can link the metric to that use case.
Create an Adaptive client first
Register feedback key
register_key = adaptive.feedback.register_key(
    key="acceptance",
    kind="bool",
    scoring_type="higher_is_better",
)
See the SDK Reference for the full method definition.
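You can register scalar feedback keys the same way. The snippet below is a minimal sketch that registers a key for the CSAT metric used later in this guide; the kind value "scalar" is an assumption, so check the SDK Reference for the exact accepted values.
# Sketch: register a scalar feedback key for customer satisfaction scores (0-5).
# The kind value "scalar" is assumed; verify accepted values in the SDK Reference.
csat_key = adaptive.feedback.register_key(
    key="CSAT",
    kind="scalar",
    scoring_type="higher_is_better",
)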

Metrics

Metrics are valuable when you can measure or quantify some dimension of a completion. Examples include:
  • immediate human feedback, such as acceptance/rejection of a completion
  • downstream impact of a completion, such as customer churn or avoidance of it
  • user satisfaction [0-5] for a given conversation
  • execution feedback for generated code, such as success/error
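For instance, execution feedback for generated code (the last example above) can be captured as a boolean metric. The snippet below is a minimal sketch using the log_metric call covered later in this guide; it assumes a boolean key named "code_execution" has already been registered, and the key name is purely illustrative.
# Sketch: log execution feedback for generated code as a boolean metric.
# Assumes a boolean feedback key named "code_execution" was registered beforehand;
# the key name is illustrative.
run_succeeded = True  # outcome of executing the generated code
adaptive.feedback.log_metric(
    value=run_succeeded,
    feedback_key="code_execution",
    completion_id=completion_id,  # completion_id returned with the inference response
)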
Since metrics have a quantifiable value, Adaptive allows you to track their progress over time, helping you visualize the ongoing production feedback for newly trained models or prompting strategies beyond static, point-in-time evaluations.

The Adaptive Engine UI allows you to track and monitor production feedback for your models against selected feedback keys

Preference sets

Preference sets are useful when you cannot quantify your feedback, but can only provide a relative judgement between 2 completions. For example, you may be able to judge one completion as less toxic than another without being able to label either of them as toxic or non-toxic on its own. Although preference sets are not plotted in the Adaptive UI, you can still use them for [preference fine-tuning](v0.7/fine-tuning/adapt). See Log Feedback to learn how to use the Adaptive SDK to log feedback, and the SDK Reference for all feedback-related methods.

Log Feedback

All metric feedback must be logged against a feedback_key (see Feedback). When you make an inference request, the API response includes a completion_id UUID along with the model’s output (see Make inference requests to learn more). You must log your feedback for an output using its completion_id.
Make sure to use the response’s completion_id for logging, not its id.
You can access the completion_id for a Chat API response as follows:
completion_id = response.json()["choices"][0]["completion_id"]
If you are passing stream=True to the Chat API to stream completions, you can find the same completion_id in each streamed chunk as follows:
Adaptive SDK / OpenAI Python
for chunk in streaming_response:
  completion_id = chunk.choices[0].completion_id
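For example, the minimal sketch below accumulates the streamed text while capturing the completion_id for later feedback logging; it assumes OpenAI-style streaming chunks where the text delta is exposed as choice.delta.content.
# Sketch: accumulate streamed content and capture the completion_id.
# Assumes `streaming_response` comes from a Chat request with stream=True and
# that chunks follow the OpenAI streaming format (choice.delta.content).
completion_id = None
chunks = []
for chunk in streaming_response:
    choice = chunk.choices[0]
    completion_id = choice.completion_id  # same value in every chunk
    if choice.delta.content:
        chunks.append(choice.delta.content)
full_text = "".join(chunks)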

Log metric feedback

Metric feedback allows you to score a completion with scalar or boolean values. For example, the code snippet below logs that Llama 3.1 8B’s completion to your prompt scored a CSAT (customer satisfaction score) of 5.
Create an Adaptive client first
response = adaptive.feedback.log_metric(
  value=5,
  feedback_key="CSAT",
  completion_id=completion_id,
  details="This answer was perfect" # optional text details of 
)
As shown in the snippet above, you can log textual details to give more context or justification for the provided feedback. See the SDK Reference for the full method definition.
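Boolean metrics are logged with the same call. The minimal sketch below logs against the "acceptance" key registered earlier; whether boolean values are passed as True/False or 0/1 is an assumption to verify in the SDK Reference.
# Sketch: log a boolean metric against the "acceptance" key registered earlier.
response = adaptive.feedback.log_metric(
  value=True, # the user accepted the completion
  feedback_key="acceptance",
  completion_id=completion_id,
)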

Log preference feedback

Preference feedback allows you to log a pairwise comparison between 2 completions. You can also log a tie, marking the 2 completions as equally good or equally bad.
Adaptive SDK
response = adaptive.feedback.log_preference(
  feedback_key="acceptance",
  preferred_completion="completion_id_1",
  other_completion="completion_id_2",
)
See the SDK Reference for the full method definition.
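Putting this together, a minimal sketch might generate 2 completions for the same prompt, capture their completion_ids as described in Log Feedback, and then compare them; the variable names below are illustrative.
# Sketch: log a preference between 2 completions generated for the same prompt.
# completion_id_a and completion_id_b are assumed to have been captured from two
# earlier Chat API responses.
completion_id_a = "..." # completion_id of the preferred completion
completion_id_b = "..." # completion_id of the dispreferred completion
response = adaptive.feedback.log_preference(
  feedback_key="acceptance",
  preferred_completion=completion_id_a,
  other_completion=completion_id_b,
)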