# Models

> Deploy and run inference on models in Adaptive

Models power your AI applications. Add a model to a use case to deploy it and make it available for inference.

## Deploy a model

Add a model to your use case, then deploy it:

```python
adaptive.models.add_to_use_case(model="llama-3.1-8b-instruct")
adaptive.models.deploy(model="llama-3.1-8b-instruct", wait=True)

# Or use attach() to do both in one call
adaptive.models.attach(model="llama-3.1-8b-instruct", wait=True)
```

| Parameter      | Type | Required | Description                                  |
| -------------- | ---- | -------- | -------------------------------------------- |
| `model`        | str  | Yes      | Model key from the registry                  |
| `wait`         | bool | No       | Block until model is online (default: False) |
| `make_default` | bool | No       | Set as default model for the use case        |

The model becomes available within a few minutes.

Adaptive supports most transformer-based models including Llama, Qwen, Gemma, Mistral, and DeepSeek. See [Integrations](/v0.12/advanced/integrations) for proprietary models.

## Run inference

```python
response = adaptive.chat.create(
    model="llama-3.1-8b-instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    labels={"project": "my-app"},
)

print(response.choices[0].message.content)

# Get completion_id for feedback
completion_id = response.choices[0].completion_id
```

Requests are logged automatically. Use `labels` to organize and filter interactions. See [Interactions](/v0.12/core/interactions) for details.

If you omit `model`, requests route to the use case's default model.
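Because `model` and `labels` are optional, it can help to assemble the call arguments in one place. The following is a minimal sketch, not part of the Adaptive SDK: `build_chat_kwargs` is a hypothetical helper that leaves `model` out when it is unset, so the request falls back to the use case's default model as described above.

```python
from typing import Any, Optional


def build_chat_kwargs(
    messages: list[dict[str, str]],
    model: Optional[str] = None,
    labels: Optional[dict[str, str]] = None,
) -> dict[str, Any]:
    """Assemble keyword arguments for a chat request.

    Hypothetical helper (an assumption, not an SDK function). Optional
    fields are omitted when unset, so a missing `model` lets the request
    route to the use case's default model.
    """
    kwargs: dict[str, Any] = {"messages": messages}
    if model is not None:
        kwargs["model"] = model
    if labels:
        kwargs["labels"] = labels
    return kwargs


# No model given: only `messages` is sent, so the default model is used.
default_call = build_chat_kwargs([{"role": "user", "content": "Hello!"}])

# Model pinned explicitly, with labels for filtering interactions later.
pinned_call = build_chat_kwargs(
    [{"role": "user", "content": "Hello!"}],
    model="llama-3.1-8b-instruct",
    labels={"project": "my-app"},
)

print(sorted(default_call))  # ['messages']
print(sorted(pinned_call))   # ['labels', 'messages', 'model']
```

The resulting dictionary could then be unpacked into the call shown earlier, for example `adaptive.chat.create(**pinned_call)`.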
```python
# Streaming
stream = adaptive.chat.create(
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

Use the OpenAI Python library with your Adaptive deployment:

```python
from openai import OpenAI

client = OpenAI(
    base_url=f"{ADAPTIVE_URL}/api/v1",
    api_key=ADAPTIVE_API_KEY,
)

response = client.chat.completions.create(
    model="use_case_key/model_key",
    messages=[{"role": "user", "content": "Hello!"}],
)
```

Set `model` to `use_case_key/model_key`. Use `metadata` instead of `labels`.

See [SDK Reference](/v0.12/reference/sdk) for all model and chat methods.

## Deploy a model

Navigate to your use case and open the **Models** tab. Click **Add Model** and select from the registry.

Fine-tuned adapter models appear indented under their base model.

## Run inference

Open your use case and click **Chat** to interact with deployed models directly in the browser.