Send chat completions to deployed models using the Adaptive SDK, OpenAI Python library, or any HTTP client. If you omit model, requests route to the project’s default model, or to a model in an active A/B test. Interactions (prompt + completion pairs) are logged automatically. See Interactions for details.

Chat completions

response = adaptive.chat.create(
    model="llama-3.1-8b-instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    labels={"project": "support-bot"},
)
print(response.choices[0].message.content)
Parameter | Type | Description
model | str | Model key. Omit to use the project default.
messages | list | Chat messages with role and content.
labels | dict | Key-value pairs for filtering interactions.
stream | bool | Enable streaming (default: False).
temperature | float | Sampling temperature.
max_tokens | int | Maximum tokens to generate.
stop | list | Stop sequences.
top_p | float | Top-p sampling threshold.
session_id | str or UUID | Session ID for KV-cache reuse across turns.
store | bool | Whether to log the interaction (default: True).
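Passing the same session_id on every turn of a conversation lets the server reuse its KV cache. A minimal sketch of multi-turn bookkeeping, assuming the adaptive client from the example above; the Conversation helper is hypothetical, not part of the SDK:

```python
import uuid

class Conversation:
    """Hypothetical helper: one conversation, one session_id, growing history."""

    def __init__(self, system_prompt):
        self.session_id = str(uuid.uuid4())  # generated client-side, reused every turn
        self.messages = [{"role": "system", "content": system_prompt}]

    def request_kwargs(self, user_text):
        """Append the user turn and build kwargs for adaptive.chat.create."""
        self.messages.append({"role": "user", "content": user_text})
        return {"messages": self.messages, "session_id": self.session_id}

    def record_reply(self, assistant_text):
        """Append the assistant turn so the next request carries full history."""
        self.messages.append({"role": "assistant", "content": assistant_text})

convo = Conversation("You are a helpful assistant.")
kwargs = convo.request_kwargs("Hello!")
# response = adaptive.chat.create(**kwargs)
# convo.record_reply(response.choices[0].message.content)
```

Each call sends the accumulated messages plus the same session_id, so later turns can hit the warm cache.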

Streaming

stream = adaptive.chat.create(
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
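To accumulate the streamed deltas into a single string rather than printing them, the loop above can be folded into a small helper. A sketch, assuming chunks shaped like the SDK's (a choices list whose entries carry delta.content); the stand-in chunks below are for illustration only:

```python
from types import SimpleNamespace as NS

def collect_stream(stream):
    """Join streamed delta contents, skipping empty-choice and None-delta chunks."""
    parts = []
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)

# Stand-in chunks mimicking the stream's shape (not real SDK objects):
fake_stream = [
    NS(choices=[NS(delta=NS(content="Hel"))]),
    NS(choices=[NS(delta=NS(content="lo!"))]),
    NS(choices=[]),  # e.g. a final chunk with no choices
]
text = collect_stream(fake_stream)  # "Hello!"
```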

Get the completion ID

Use completion_id to log Metrics against the response:
completion_id = response.choices[0].completion_id

Vision requests

Models with the Multimodal tag accept images alongside text. Images must be base64-encoded data URIs (JPEG, PNG, WebP, or GIF, up to 10 MB each).
import base64

with open("photo.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = adaptive.chat.create(
    model="your-vlm-key",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {"type": "image_url", "image_url": f"data:image/png;base64,{image_data}"},
            ],
        }
    ],
)
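The constraints above (base64 data URI, four accepted formats, 10 MB per image) can be enforced client-side before sending. A minimal sketch; the image_part helper is hypothetical, not part of the SDK:

```python
import base64

SUPPORTED_FORMATS = {"jpeg", "png", "webp", "gif"}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # 10 MB limit stated above

def image_part(data: bytes, fmt: str) -> dict:
    """Build an Adaptive-style image content part from raw image bytes."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported image format: {fmt}")
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the 10 MB limit")
    encoded = base64.b64encode(data).decode()
    return {"type": "image_url", "image_url": f"data:image/{fmt};base64,{encoded}"}
```

The returned dict drops straight into a message's content list alongside a text part, as in the example above.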
See SDK Reference for all chat methods.

OpenAI compatibility

Use the OpenAI Python library with your Adaptive deployment:
from openai import OpenAI

client = OpenAI(
    base_url=f"{ADAPTIVE_URL}/api/v1",
    api_key=ADAPTIVE_API_KEY,
)

response = client.chat.completions.create(
    model="project_key/model_key",
    messages=[{"role": "user", "content": "Hello!"}],
)
Set model to project_key/model_key. Use metadata instead of labels.
The multimodal image format differs between the Adaptive SDK and the OpenAI library:
# Adaptive format (flat string)
{"type": "image_url", "image_url": "data:image/png;base64,..."}

# OpenAI format (nested object)
{"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}

HTTP requests

Use any HTTP client to call the chat completions endpoint directly.
import requests

headers = {"Authorization": f"Bearer {ADAPTIVE_API_KEY}"}
payload = {
    "model": "project_key/model_key",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    "labels": {"project": "support-bot"},
}

response = requests.post(
    url=f"{ADAPTIVE_URL}/api/v1/chat/completions",
    json=payload,
    headers=headers,
)
response.raise_for_status()
completion_text = response.json()["choices"][0]["message"]["content"]
See API Reference for the full endpoint specification.