Chat completions

Send chat completions to deployed models using the Adaptive SDK, OpenAI Python library, or any HTTP client. If you omit `model`, requests route to the project's default model, or to a model in an active A/B test.

Interactions (prompt + completion pairs) are logged automatically. See Interactions for details.

The chat completion call accepts the following parameters:
| Parameter | Type | Description |
|---|---|---|
| `model` | str | Model key. Omit to use the project default. |
| `messages` | list | Chat messages with role and content. |
| `labels` | dict | Key-value pairs for filtering interactions. |
| `stream` | bool | Enable streaming (default: False). |
| `temperature` | float | Sampling temperature. |
| `max_tokens` | int | Maximum tokens to generate. |
| `stop` | list | Stop sequences. |
| `top_p` | float | Top-p sampling threshold. |
| `session_id` | str or UUID | Session ID for KV-cache reuse across turns. |
| `store` | bool | Whether to log the interaction (default: True). |
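A minimal sketch of a non-streaming call. The package name (`adaptive_sdk`), client class (`Adaptive`), and `chat.create` method below are assumptions for illustration; check the SDK reference for the exact import and method names.

```python
from adaptive_sdk import Adaptive  # assumed package and client names

# Assumed constructor parameters; use your deployment's URL and key.
client = Adaptive(base_url="https://YOUR_ADAPTIVE_HOST", api_key="YOUR_API_KEY")

response = client.chat.create(
    # model is omitted, so the request routes to the project default (or an active A/B test)
    messages=[{"role": "user", "content": "Give me one onboarding tip for new users."}],
    labels={"feature": "onboarding"},  # key-value pairs for filtering interactions later
    max_tokens=128,
)
print(response.choices[0].message.content)
```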
Streaming
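A sketch of streaming with `stream=True`, reusing the hypothetical client from the example above and assuming OpenAI-style chunks with `choices[0].delta.content`:

```python
stream = client.chat.create(
    messages=[{"role": "user", "content": "Write a haiku about deployment day."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content  # assumed chunk shape
    if delta:
        print(delta, end="", flush=True)
```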
Get the completion ID
Use `completion_id` to log Metrics against the response:
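A hedged sketch, assuming the ID is exposed as `response.id` and that metrics are logged through something like the hypothetical `client.feedback.log_metric` call below; see the Metrics page for the actual method.

```python
response = client.chat.create(
    messages=[{"role": "user", "content": "Draft a two-sentence welcome email."}],
)
completion_id = response.id  # assumed attribute holding the completion ID

# Hypothetical metric-logging call; substitute the real method from the Metrics docs.
client.feedback.log_metric(
    completion_id=completion_id,
    metric="user_rating",
    value=5,
)
```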
Vision requests

Models with the Multimodal tag accept images alongside text. Images must be base64-encoded data URIs (JPEG, PNG, WebP, or GIF, up to 10 MB each).
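A sketch of building the required data URI. The content-part shape below mirrors the OpenAI style and is an assumption; check the Image format difference note under OpenAI compatibility for the exact Adaptive shape.

```python
import base64

with open("receipt.jpg", "rb") as f:
    data_uri = "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()

response = client.chat.create(
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is the total on this receipt?"},
            # Assumed content-part shape; Adaptive's image format differs from OpenAI's.
            {"type": "image_url", "image_url": {"url": data_uri}},
        ],
    }],
)
```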
Structured output

Pass `response_format` to constrain a completion to a JSON Schema or a Pydantic model. For internal models, invalid tokens are masked at each generation step, so the response is structurally guaranteed to parse. For external providers, the schema is forwarded to the provider's native structured-output API.

`response_format` accepts a Pydantic BaseModel class, a raw JSON Schema envelope (`{"type": "json_schema", "json_schema": {"name": ..., "schema": ...}}`), or None (default). Pydantic models are auto-converted via `model_json_schema()` and patched for strict-mode compatibility (refs inlined, `additionalProperties: false` added).

The response is a JSON string in `response.choices[0].message.content`; the SDK does not auto-deserialize. Call `Model.model_validate_json(...)` to get a typed instance.
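For example, constraining a completion to a small Pydantic model might look like the sketch below, reusing the hypothetical `client.chat.create` call from the first example:

```python
from pydantic import BaseModel

class Invoice(BaseModel):
    vendor: str
    total: float
    paid: bool

response = client.chat.create(
    messages=[{"role": "user", "content": "ACME Corp billed us $1,200 and it is unpaid."}],
    response_format=Invoice,
)

# The content is a JSON string; deserialize it yourself.
invoice = Invoice.model_validate_json(response.choices[0].message.content)
print(invoice.total)
```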
Schema features supported
Constrained decoding compiles the schema to a token-mask grammar. The compiler supports:
- Types: `string`, `integer`, `number`, `boolean`, `null`, `object`, `array`, and union types via `["string", "null"]` syntax
- Composition: `oneOf`, `anyOf`, `allOf` (object merge only), `$ref` and `$defs` (inlined during SDK prep)
- Strings: `minLength`, `maxLength`, `pattern` (regex), `format`, `enum`, `const`
- Numbers: `minimum`, `maximum`, `exclusiveMinimum`, `exclusiveMaximum`. `multipleOf` requires explicit bounds; without bounds, it is silently ignored.
- Arrays: `items`, `minItems`, `maxItems`. Arrays must declare `items`.
- Objects: `properties`, `required`, `additionalProperties` (false / true / schema)
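As an illustration, a raw JSON Schema envelope exercising several of the features above (enum, maxLength, explicit numeric bounds, a typed array, and additionalProperties: false); the schema name and fields are made up:

```python
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "ticket",
        "schema": {
            "type": "object",
            "properties": {
                "priority": {"type": "string", "enum": ["low", "medium", "high"]},
                "summary": {"type": "string", "maxLength": 200},
                "affected_users": {"type": "integer", "minimum": 0, "maximum": 100000},
                "tags": {"type": "array", "items": {"type": "string"}, "maxItems": 5},
            },
            "required": ["priority", "summary"],
            "additionalProperties": False,
        },
    },
}
```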
`$ref`s are unrolled to depth 4; deeper nesting is truncated. Format keywords without a regex equivalent are dropped.

Streaming with structured output
`stream=True` and `response_format` work together. Each chunk delivers partial JSON; buffer until the stream closes, then parse:
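A sketch under the same assumptions as the earlier examples (hypothetical client, OpenAI-style chunks), buffering deltas and parsing only once the stream has finished:

```python
from pydantic import BaseModel

class Invoice(BaseModel):
    vendor: str
    total: float
    paid: bool

buffer = ""
stream = client.chat.create(
    messages=[{"role": "user", "content": "Extract the invoice fields: ACME Corp, $1,200, unpaid."}],
    response_format=Invoice,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content  # assumed chunk shape
    if delta:
        buffer += delta

invoice = Invoice.model_validate_json(buffer)  # parse only after the stream closes
```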
Failure modes

| Situation | Behavior |
|---|---|
| Schema fails to compile | The constraint is dropped with a warning in server logs; the model generates unconstrained. Validate your schema during development. |
| `max_tokens` exhausted before completion | `finish_reason` is `"length"`. The response is truncated JSON; `model_validate_json` will raise. Check `finish_reason` before parsing. |
| External provider doesn’t support structured output for the model | The constraint is dropped silently. Stick to providers and models that support structured output natively. |
| Recursive schema beyond depth 4 | Deeper levels are truncated at compile time. |
OpenAI compatibility
Use the OpenAI Python library with your Adaptive deployment. Set `model` to `project_key/model_key`, and use `metadata` instead of `labels`.
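A sketch with the OpenAI client; the base URL path is a placeholder, so substitute your deployment's OpenAI-compatible endpoint:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://YOUR_ADAPTIVE_HOST/v1",  # placeholder; use your deployment's OpenAI-compatible URL
    api_key="YOUR_ADAPTIVE_API_KEY",
)

response = client.chat.completions.create(
    model="my_project/my_model",  # project_key/model_key
    messages=[{"role": "user", "content": "Summarize our refund policy in one sentence."}],
    metadata={"team": "support"},  # metadata replaces Adaptive's labels
)
print(response.choices[0].message.content)
```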
Image format difference
Multimodal image format differs between Adaptive and OpenAI.
HTTP requests
Use any HTTP client to call the chat completions endpoint directly.
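A sketch using requests; the endpoint path and auth header are assumptions, so adjust them to your deployment:

```python
import requests

BASE_URL = "https://YOUR_ADAPTIVE_HOST/api/v1"  # assumed path; check your deployment's API docs
API_KEY = "YOUR_ADAPTIVE_API_KEY"

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},  # assumed auth scheme
    json={
        # "model" omitted: the request routes to the project default
        "messages": [{"role": "user", "content": "Hello!"}],
        "labels": {"source": "docs-example"},
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```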

