Models power your AI applications. Add a model to a project to deploy it and make it available for inference.

Deploy a model

Add a model to your project, then deploy it:
adaptive.models.add_to_project(model="llama-3.1-8b-instruct")
adaptive.models.deploy(model="llama-3.1-8b-instruct", wait=True)

# Or use attach() to do both in one call
adaptive.models.attach(model="llama-3.1-8b-instruct", wait=True)
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | str | Yes | Model key from the registry |
| wait | bool | No | Block until the model is online (default: False) |
| make_default | bool | No | Set as the default model for the project |
The model becomes available within a few minutes. Adaptive supports most transformer-based models including Llama, Qwen, Gemma, Mistral, and DeepSeek. See Integrations for proprietary models.

Run inference

response = adaptive.chat.create(
    model="llama-3.1-8b-instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    labels={"project": "my-app"},
)
print(response.choices[0].message.content)

# Get completion_id for metrics
completion_id = response.choices[0].completion_id
Requests are logged automatically. Use labels to organize and filter interactions; see Interactions for details. If you omit model, requests route to the project's default model.

# Streaming
stream = adaptive.chat.create(
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content, end="", flush=True)
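The accumulation loop above can be sketched with plain dicts that mirror the chunk shape (field names are taken from the example; the real SDK returns response objects rather than dicts):

```python
def collect_stream(chunks):
    """Join the delta content of streamed chat chunks into one string."""
    parts = []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if choices:
            delta = choices[0].get("delta", {}).get("content")
            if delta:
                parts.append(delta)
    return "".join(parts)

# Chunks shaped like the streaming response above
chunks = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": []},  # some chunks may carry no choices, hence the guard
    {"choices": [{"delta": {"content": "lo!"}}]},
]
print(collect_stream(chunks))  # Hello!
```

Guarding on empty choices mirrors the `if chunk.choices:` check in the example above.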

Vision models

Models with the Multimodal tag in the registry accept images alongside text in chat completions. Images must be base64-encoded data URIs (JPEG, PNG, WebP, or GIF, up to 10 MB each); external URLs are not supported.
import base64

with open("photo.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = adaptive.chat.create(
    model="your-vlm-key",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {"type": "image_url", "image_url": f"data:image/png;base64,{image_data}"},
            ],
        }
    ],
)
print(response.choices[0].message.content)
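The format and size limits above can be checked client-side before sending. A minimal sketch, assuming the documented constraints (the helper name and its validation logic are illustrative, not part of the SDK):

```python
import base64

# Accepted formats and per-image size cap, per the documented limits
ALLOWED_TYPES = {"image/jpeg", "image/png", "image/webp", "image/gif"}
MAX_BYTES = 10 * 1024 * 1024  # 10 MB

def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw image bytes as a base64 data URI, enforcing the limits."""
    if mime_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported image type: {mime_type}")
    if len(data) > MAX_BYTES:
        raise ValueError(f"image is {len(data)} bytes; limit is {MAX_BYTES}")
    encoded = base64.b64encode(data).decode()
    return f"data:{mime_type};base64,{encoded}"

uri = to_data_uri(b"\x89PNG\r\n", "image/png")
print(uri[:22])  # data:image/png;base64,
```

Failing fast here avoids a round trip to the API for an image the service would reject anyway.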

OpenAI compatibility

Use the OpenAI Python library with your Adaptive deployment:
from openai import OpenAI

client = OpenAI(
    base_url=f"{ADAPTIVE_URL}/api/v1",
    api_key=ADAPTIVE_API_KEY,
)

response = client.chat.completions.create(
    model="project_key/model_key",
    messages=[{"role": "user", "content": "Hello!"}],
)

Set model to project_key/model_key and use metadata instead of labels. The multimodal image format also differs between Adaptive and OpenAI:
# Adaptive format (flat string)
{"type": "image_url", "image_url": "data:image/png;base64,..."}

# OpenAI format (nested object)
{"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
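When porting messages between the two clients, a small converter can normalize image parts. A sketch assuming only the two part shapes shown above (these helpers are illustrative, not part of either SDK):

```python
def adaptive_to_openai(part: dict) -> dict:
    """Convert an Adaptive flat image_url part to the OpenAI nested form."""
    if part.get("type") == "image_url" and isinstance(part.get("image_url"), str):
        return {"type": "image_url", "image_url": {"url": part["image_url"]}}
    return part  # text parts and already-nested parts pass through unchanged

def openai_to_adaptive(part: dict) -> dict:
    """Convert an OpenAI nested image_url part to the Adaptive flat form."""
    if part.get("type") == "image_url" and isinstance(part.get("image_url"), dict):
        return {"type": "image_url", "image_url": part["image_url"]["url"]}
    return part

flat = {"type": "image_url", "image_url": "data:image/png;base64,AAAA"}
nested = adaptive_to_openai(flat)
print(nested["image_url"]["url"][:15])  # data:image/png;
```

Applying the two functions in sequence round-trips a part unchanged, so message lists can be mapped in either direction.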
See SDK Reference for all model and chat methods.