Make inference requests
Get completions from models deployed on Adaptive Engine
You can run inference for a target use case (and optionally a specific model) using the Adaptive Python SDK, the OpenAI Python library, or direct HTTP requests. The Adaptive SDK adopts a messages format similar to OpenAI's Python library.
If you do not set a `model`, the Adaptive client routes your request to the model you've set as default for the client's use case, or to one of the models included in an active A/B test.
Interactions (pairs of [messages, completion] resulting from chat requests) are logged and saved to the Adaptive Interaction Store by default.
Create an Adaptive or OpenAI client first if applicable.
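For example, here is a minimal sketch using the Adaptive SDK. The import path, client constructor, and `chat.create` method name are assumptions based on the SDK's OpenAI-style messages format; the URL, API key, use case, and model keys are placeholders to replace with your own.

```python
from adaptive import Adaptive  # assumed import path; see the SDK Reference

# Placeholder deployment URL and API key
client = Adaptive(base_url="https://your-deployment.example.com", api_key="YOUR_API_KEY")

# Hypothetical chat method: the call shape mirrors OpenAI's messages format
completion = client.chat.create(
    use_case="my_use_case",   # target use case key
    model="my_model",         # optional: omit to route to the default model or an active A/B test
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is a torque wrench used for?"},
    ],
    labels={"project": "RAG Bot", "topic": "Industrial Tools"},  # optional tags
)
```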
See the SDK Reference for the full method definition.
If you are using the OpenAI Python library, `model` should be `use_case_key/[optional_model_key]`.
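The sketch below shows the same request through the OpenAI Python library; the base URL path is a placeholder for your deployment's OpenAI-compatible endpoint.

```python
from openai import OpenAI

# Placeholder base URL: point the client at your Adaptive Engine deployment
client = OpenAI(
    base_url="https://your-deployment.example.com/v1",
    api_key="YOUR_API_KEY",
)

completion = client.chat.completions.create(
    model="my_use_case/my_model",  # use_case_key/[optional_model_key]
    messages=[{"role": "user", "content": "What is a torque wrench used for?"}],
)
```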
If you are using `requests`, `model` should be `use_case_key/[optional_model_key]`.
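For instance, a sketch with `requests`, under the assumption that the deployment exposes an OpenAI-compatible `/chat/completions` route:

```python
import requests

# Placeholder URL and key; the /v1/chat/completions path is an assumption
response = requests.post(
    "https://your-deployment.example.com/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "my_use_case/my_model",  # use_case_key/[optional_model_key]
        "messages": [{"role": "user", "content": "What is a torque wrench used for?"}],
    },
)
print(response.json())
```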
If you are using curl, `model` should likewise be `use_case_key/[optional_model_key]`; the JSON body matches the `requests` example above.
You can optionally tag requests with labels, as illustrated above. Labels are useful for organizing and categorizing completions.
The `labels` parameter is a dictionary of user-defined key-value pairs, for example `labels = {"project": "RAG Bot", "topic": "Industrial Tools"}`.
If you are using the OpenAI Python library, `labels` is not a supported parameter, but you can pass your labels in `metadata`, and they will be logged like any other label in Adaptive Engine.
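For example, reusing the OpenAI client from above with the same placeholder labels:

```python
# Labels passed via `metadata` are logged like any other label in Adaptive Engine
completion = client.chat.completions.create(
    model="my_use_case/my_model",
    messages=[{"role": "user", "content": "What is a torque wrench used for?"}],
    metadata={"project": "RAG Bot", "topic": "Industrial Tools"},
)
```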
Although the OpenAI Python library can be used to make inference requests to your Adaptive Engine deployment, not all of Adaptive Engine's input parameters are supported by OpenAI's library, and vice versa. See the Chat API Reference for a list of supported parameters.
Both the Adaptive SDK's and the OpenAI Python library's return types are Pydantic models, which enable autocompletion in your editor. You can access the model's response text with:
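```python
# OpenAI-style response shape: first choice's message content
print(completion.choices[0].message.content)
```

The attribute path above is the OpenAI-style shape; for the Adaptive SDK's exact field names, check the SDK Reference.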