> ## Documentation Index
> Fetch the complete documentation index at: https://docs.adaptive-ml.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Release notes

> Release notes for Adaptive Engine

This page contains release notes for versions of Adaptive Engine, with the most recent releases at the top.

<Update label="v0.14.0" description="2026-04-24">
  ### Breaking changes

  * **`Feedback` renamed to `Score`** across SDK, GraphQL API, and database.
    * `FeedbackType` → `ScoreType`
    * `feedback_key` → `score_key` (in recipe configs, SDK methods, GraphQL types)
    * Existing recipe JSON configs in the database are migrated automatically. SDK code referencing `FeedbackType` or `feedback_key` must be updated.
  * `feedback.*` SDK and GraphQL API removed — use `score.*` / `metrics.*` equivalents. `FeedbackAggregate` → `ScoreAggregate` in GraphQL.
  * Legacy `judge` entity removed. GraphQL queries `get_judge`, `list_judges`, `list_judge_versions` are gone, along with the underlying table. Migrate to graders (function, prebuilt, or judge-as-grader).
  * `promoteCheckpoint` mutation now returns `[Job]` instead of `[Model]`. Promotion is an asynchronous copy job; callers must poll the returned job(s) rather than using the models directly.
  * AB campaign `n_completions` parameter removed from the inference path. Existing integrations passing `n > 1` will silently ignore the parameter.
  * Integration topic subscriptions using the `use_case:` prefix still work via a backward-compat shim, but subscribers should migrate to the `project:` prefix.

  ### Structured output

  * New `response_format` field across the platform: pass plain text, a JSON Schema, or (in Python) a Pydantic model to constrain model output.
  * Python SDK: `chat.create(..., response_format=MyPydanticModel)`. Schema is auto-patched for strict-mode compatibility.
  * REST completion API: `response_format` plumbed through to all external providers (OpenAI, Anthropic, Gemini, legacy OpenAI).
  * UI chat workbench: Response Format selector in Model Settings, with a JSON-schema editor and three preset examples.
  * UI judge workbench: judgement schema passed as `response_format`, guaranteeing parseable output.

  ### Function graders

  * New grader type. Define a Python `grade` function that runs in a sandboxed environment.
  * UI: create / edit / duplicate flow in the New Grader dropdown, with a Python code editor and a Test Payload panel that validates against a sample from your dataset.
  * Python SDK: `create_function_grader`, `update_function_grader`, `get_grader`, `list_graders`, `validate_function_grader`.

  ### Checkpointing

  * **Promote checkpoint to model**: name a checkpoint from a run and promote it to a standalone model. Promotion runs as an asynchronous copy job (S3 server-side copy when possible) with full provenance — promoted models expose `sourceCheckpoint` in GraphQL.
  * Promoting a LoRA checkpoint automatically binds its backbone model to the project.
  * Duplicate promotions of the same checkpoint are rejected; re-promoting after soft-delete is supported.
  * **Multi-stage checkpointing**: checkpoint state tracked per training stage (SFT / PPO / GRPO / ENV\_PPO / ENV\_GRPO), with stage names surfaced in the UI and logger step saved/restored across resumes.

  ### User interface

  * Image gallery in chat and interaction detail; navigation to the next assistant turn from the gallery.
  * Model comparison UX overhaul.
  * New generic filter bar and operators applied consistently across Runs, Jobs, Users, Usage, and Interactions pages.
  * Runs: inline rename in breadcrumbs, delete action, resizable list; soft-deleted jobs hidden by default.
  * ISB: prefetching of neighbor interactions for instant navigation; queue time, prefill time, and model surfaced in the interaction detail panel; new Author column; "Open evaluation in ISB" shortcut.
  * Resizable split views across sidebar and sheet, with persisted user preference; automatic fallback to a sheet on smaller screens.
  * Dataset viewer: smart column selection; scoped picker; hints on empty recipe selector.
  * Plot UX improvements; navigation arrows on metric detail pages.
  * Icons and display support for new model families: Gemma 3, Llama 3, Mistral 3, Qwen 3, Qwen 3.5; Claude.
  * Model size always displayed in B.
  * Compute-pool handling: auto-selection across multiple pools; 0-GPU pools filtered out; clearer error messaging; model marked offline during deployment.

  ### Inference & model management

  * Qwen 3.5 and Qwen 3.6 families supported for inference and training (backward pass for dense and MoE text-only variants), with dedicated Triton GDN kernels and CUDA-graph dispatch for low-latency inference.
  * Multimodal inference hardening: webp/gif support, rejection of unsupported images, LoRA inference fixed on multimodal models, no special tokens leaked in stream.
  * Per-model chat templates: a model's template travels with the model directory, replacing the previous global template directory.
  * Fragment-first harmony API supporting function calling on external models.

  ### Training & runs

  * Checkpoint promotion and multi-stage checkpointing (see above).
  * SFT weighting overhaul: assistant turns are no longer force-weighted; `get_turn_weights` exposed on `StringThread` / `TokenizedThread`; thread visualization highlights trained turns in green; warnings surface when user turns have weight > 0 or when a thread has no trained turns.
  * RL recipes: explicit `epochs` parameter supported alongside `max_num_steps` across PPO, GRPO, ENV\_PPO, ENV\_GRPO, GSPO variants.
  * Runs: delete, rename, soft-delete, with matching header actions.

  ### Python SDK

  * Multimodal inference: `ChatMessage.content` widened to `Union[str, List[ContentPart]]`, with `TextContentPart` / `ImageContentPart` following the OpenAI content-parts format.
  * Structured output: `response_format` accepts text, JSON Schema, or a Pydantic model.
  * Function graders: full CRUD (`create_function_grader`, `update_function_grader`, `get_grader`, `list_graders`, `validate_function_grader`).
  * Dataset download: `client.datasets.download(...)`.
  * Update-role method; user teams and roles exposed on the user object.
  * New REST types: `ResponseFormat`, `Jobs.delete`, `Jobs.update`, `validate_function_grader`.

  ### Evaluation & data management

  * Function graders (see above).
  * Evaluation artifacts can include images.
  * Dataset upload flow simplified.

  ### Recipes

  * Custom `entrypoint` and `config_entrypoint` per recipe — no longer hardcoded to `main.py` / `config.py`.
  * Dependency installation supports `pyproject.toml` projects: `[project.dependencies]` parsed directly (build backend skipped); `src/` added to `sys.path`.
  * Tree-view file selector in the recipe editor.
  * Config form handles None defaults and enum coercion correctly.
  * Download a recipe as a zip archive (for non-prebuilt recipes).

  ### Performance

  * Large models load substantially faster on multi-GPU setups. Llama-3.3 70B loads roughly 4× faster on 4 GPUs.
  * Worker-driven speculative-decoding draft loop: per-step roundtrips collapsed into a single command, reducing spec-dec overhead.
  * New kernels: Deltanet; Qwen 3.5 backward (dense + MoE, text only); linear eagle3.
  * No gradient creation during inference.

  ### Administration & infrastructure

  * Audit logging made asynchronous.
  * Role editing: `updateRole` mutation and matching UI.
  * Registry sync resilient to individual model operation failures.
  * Pinned Docker image versions; Docker images trimmed.
</Update>

***

<Update label="v0.13.0" description="2026-03-02">
  ### Breaking changes

  * **"Use Cases" renamed to "Projects"** across SDK, GraphQL API, REST API, and URL paths.
    * `client.use_cases` → `client.projects` (`client.use_cases` deprecated but still works)
    * `client.default_use_case` / `client.set_default_use_case()` → `client.default_project` / `client.set_default_project()`
    * `models.add_to_use_case()` → `models.add_to_project()`
    * `models.detach(use_case=...)` → `models.detach(project=...)` — `project` is now required
    * `models.deploy(use_case=...)` → `models.deploy(project=...)`
    * `jobs.run(use_case=...)` / `jobs.list(use_case=...)` → `jobs.run(project=...)` / `jobs.list(project=...)`
    * `interactions.list(use_case=...)` → `interactions.list(project=...)`
    * `graders.list/get/delete/lock(use_case=...)` → `graders.list/get/delete/lock(project=...)`
    * `recipes.list(use_case=...)` → `recipes.list(project=...)`
    * `feedback.get_key(feedback_key)` → `feedback.get_key(project, feedback_key)` — `project` is now a required positional argument
    * `feedback.create_metric()` now requires `project` as a parameter
    * GraphQL: `useCase` / `UseCaseData` → `project` / `ProjectData`
    * Integration topic patterns: existing subscriptions using the old `use_case:` prefix (e.g. `use_case:*:job:*:*`) continue to work — they are automatically matched against the new `project:` topics. No action required.
  * `feedback.link()` and `feedback.unlink()` removed.

  ### User interface

  * Multimodal image input in chat and interaction store: attach images alongside text for models with vision capabilities.
  * Interactive recipe sessions visible in run history with progress reporting, artifact uploads, and cancellation.
  * Multi-file recipe viewer: browse and edit individual files within a recipe.
  * Analytics dashboard pre-populated with default plots for new projects.
  * Hourly granularity in the usage dashboard.
  * ISB column selector: choose which columns to display in the interaction table.
  * Filter interactions by session ID; copy session ID from any interaction detail panel.
  * Download a selection of interactions as a file for offline analysis.
  * Delete interactions from the interaction store.
  * Add multiple models to a project in a single action from the model registry.
  * Click-to-copy model key in the project model registry.
  * Parent training run link shown in model detail page.
  * Run logs visible on the run preview sheet.

  ### Metrics

  Feedback has been unified with grader and system metrics into a single Metrics system, now scoped to projects rather than the organization. Existing feedback data has been automatically migrated.

  * Three categories unified under one system: system metrics (auto-computed: TTFT, latency, token counts), grader metrics (scores produced by judges and custom graders), and user metrics (custom metrics, formerly "feedback").
  * Metrics are now project-scoped; the legacy `link`/`unlink` mechanism is removed.
  * New project Metrics page replaces the org-level Feedbacks page, with metric detail view, create/edit/delete dialogs, category filters, and search.
  * Metric pickers across the product group options by category (System / Grader / User).
  * Metrics can be deleted; system and built-in grader metrics are protected.

  ### Inference & model management

  * Speculative decoding: train an accelerated model with a built-in recipe to reduce inference latency.
  * Multimodal inference end-to-end: streaming for vision models, improved image encoder handling, completions API aligned with OpenAI image format.
  * Flash Attention 3 C++ backend with split-KV aware graph capture.

  ### Training & runs

  * KV-caching for training (diff KV cache): reduces activation sizes for long context training.
  * Recipe names and descriptions: set a human-readable name and description when uploading a recipe.
  * New built-in recipe for aligning a draft model to a target (speculative decoding).
  * DPO training supports dataset looping.

  ### SDK

  * `extra_params` exposed for external models in recipes and via `models.add_external()`.
  * New `client.create_service_account()` method.
  * Named interactive sessions via `name` parameter in `RecipeContext`.
  * `chat.create()` `store` parameter to control whether completions are saved.

  ### Evaluation & data management

  * AI Judge graders now return metadata alongside scores.
  * Dataset download preserves the original uploaded filename.

  ### Administration & infrastructure

  * Notifications: subscribe to webhook, Slack, or email notifications for platform events (job completion, failure) with topic-based filtering and pattern matching.
  * HIPAA audit logging: full audit trail for query events, file access events, and connection events, each with user identity, IP address, and timestamp.
  * User management: full CRUD for users at the organization level; first and last activity timestamps surfaced in the UI.
  * Service accounts: create dedicated bot users with API keys for CI/CD pipelines and automated workflows; API key usage is audited.
  * Permissions and roles visible in the UI.
  * Security: updated dependencies addressing CVEs.
</Update>

***

<Update label="v0.12.0" description="2026-01-27">
  ### Breaking changes

  * SDK: `jobs.run()` parameter reorder - `args` moved from required positional to optional keyword.
  * SDK: `interactions` filter format - `advancedFilter` → `advancedFilters`, label filter structure changed.
  * SDK: `ModelserviceStatus` enum - removed `DETACHED` value.
  * SDK: async methods removed - `async attach()` and `async deploy()` no longer available.
  * SDK: `models.detach()` - `use_case` parameter now required (was optional).
  * SDK: `models.update()` - removed `attached` parameter.
  * Recipes: grader `setup()` and `teardown()` calls removed from production recipes.
  * Sandbox enabled by default - opt-out with `ENABLE_SANDBOX=0`.

  ### User interface

  * New sidebar presentation with split use case overview page.
  * New model registry UI: model details sidebar, table grouping, filtering by name/size/status.
  * Chat comparison page for side-by-side interaction analysis.
  * Raw/formatted selector and copy buttons in completion details.
  * Artifact status reflected in runs and evaluations UI.
  * Recipe upload by file with improved recipe fields display in run form.
  * GPU count and GPU time displayed in run details; prefill on clone run.
  * Improved dataset upload dialog and labels display in interaction store.
  * Grader tooltip showing metric column header details.

  ### Inference & model management

  * Differentiable KV cache with CuTe-DSL Flash Attention kernels.
  * LoRA models are now trainable directly.
  * Organization-level model deletion.
  * OpenAI-compatible external models by endpoint.
  * Model/use case bindings with use case stored in model registry.
  * Model statuses: published and stable.
  * LoRA backbone reuse and deduplicated spawn logic.
  * Configurable max prefill size via `DEFAULT_MAX_TOKENS_IN_INF_BATCH`.
  * TP support for FP8 MoE.
  * Suggested tensor parallelism logic based on model size.
  * Multimodal prefill splitting across batches.
  * Static memory usage limits via `GPU_MEMORY_UTILIZATION`.
  * Session claim mechanism after disconnection.

  ### Training & runs

  * Logger parameter for PPO and GRPO classes.
  * Multimodal dataset support.
  * Recipe OOM errors reported to user.
  * Optional defaults for recipe launch with improved upload SDK.
  * `recipe_key` now optional (inferred from file/directory name).
  * New `entrypoint` parameter for custom entry points in directories.

  ### SDK

  * New `artifacts.download(artifact_id, destination_path)` method.
  * New `models.add_to_use_case()` method.
  * `models.list()` accepts optional `filter` parameter.
  * `models.deploy()` signature changed: new `use_case`, `placement`, `make_default` params.
  * `chat.create()` new `store` parameter to control completion storage.
  * `JobArtifactStatus` enum: PENDING, PROCESSING, READY, ERROR.

  ### Evaluation & data management

  * Text search API on completions and prompts.
  * Advanced filter API for completions.
  * Preference dataset processing improvements.
  * Dataset with feedback selection in recipes.
  * Dataset deletion runs as background task.

  ### Administration & infrastructure

  * HIPAA compliance: model and use case reporting for chat/completions.
  * JWT authentication on internal API for recipes.
  * TLS support in Redis and Mangrove.
  * Sandbox improvements: HuggingFace import fixes, git support, syscall filtering.
  * Job permissions support.
  * Multiple replicas cancellation support.
  * Configurable RAM limits on Sandkasten.
  * Library upgrades for CVE fixes.
</Update>

***

<Update label="v0.11.0" description="2025-12-15">
  ### Breaking changes

  * Model keys: model path is now used as the key in the model registry instead of other identifiers.

  ### Inference & model management

  * Support custom endpoint URL for OpenAI Completions API-compatible external models.
  * Make API keys optional for external models.
  * Support per-request tool override for chat completions.
  * New connectivity check endpoint (ADAPTIVE\_URL/health).
  * New `min_gen_len` setter method for `InferenceModel` and `TrainingModel`.
  * Support GPU inference partition resize in SDK.
  * Improve FP8 MoE inference.

  ### Training & runs

  * First [GSPO](https://arxiv.org/abs/2507.18071) (Group Sequence Policy Optimization) implementation.
  * Add support for context with assistant turn on multi-turn generation with `env_grpo`.
  * Improved model saving and checkpointing.
  * New `skip_nan_gradients` argument in `model.optim_step()`.
  * Multi-file recipe support with proper frontend handling.
  * Recipe schema improvements and better parsing.

  ### Evaluation & data management

  * Labels and feedback annotations in chat UI and interaction store.
  * Multimodal dataset support with image handling.
  * Ability to upload GB-sized datasets.
  * Enhanced dataset creation with immutable dataset files for reuse in recipes.
  * Improved dataset artifact management.

  ### Administration & infrastructure

  * Add preflight checks to ensure environment compatibility.
  * Better resource management and allocation tracking.
</Update>

***

<Update label="v0.10.0" description="2025-11-05">
  ### User interface

  * Global search bar with cmd+K.

  ### Inference & model management

  * Allow external model spawning via HTTP.
  * Support for detached models in chat.
  * Rich magic for better REPL experience.
  * Better timeout configuration for client SDK.

  ### Training & runs

  * Display active run in compute pool detail page.
  * Enrich the parameters of the SFT recipe.
  * Rejection sampling production recipe.
  * GRPO KL divergence fix.
  * Better handling of `env_grpo` sample loading.
  * Improved callback system for training.

  ### Evaluation & data management

  * AI Judge workbench v3 with enhanced UI.
  * Custom grader support in product.
  * Dataset viewer page introduction.
  * Improved dataset chunked upload in SDK.
  * Better dataset source tracking.
  * Visual improvements to interaction store browsing.
  * External feedback endpoint in evaluation wizard.

  ### Administration & infrastructure

  * Dynamic world size support (experimental).
  * GPU metrics and Redis connection management cleanup.
</Update>

***

<Update label="v0.9.0" description="2025-10-17">
  ### Breaking changes

  * Model output artifact changes for better organization.

  ### User interface

  * New *recipes*-centric use case navigation.
  * New *split view* to better navigate runs.

  ### Inference & model management

  * MCP (Model Context Protocol) with all turns support.
  * System prompt support in chat settings.
  * Temperature control in chat settings.
  * Better external model handling and API integration.

  ### Training & runs

  * Loss clamping support.
  * Callbacks and training recipe cleanup.
  * Better skip-token-masking loss computation.
  * Improved dataset shuffling with seeding.

  ### Evaluation & data management

  * Judge Playground v2 with enhanced UI.
  * Parse XML stream from response in chat.
  * User metadata support in interaction store.
  * Dataset generation from interaction store filters.
  * Metric aggregation controls in header.
  * Better evaluation result reporting.
  * Dataset artifacts with proper management.
  * Improved interaction state persistence in `localStorage`.

  ### Administration & infrastructure

  * Better resource management logging.
  * Add UI controls to reset and resize GPU inference partitions.
</Update>

***

<Update label="v0.8.0" description="2025-09-23">
  ### Breaking changes

  * New message format migration for completions.

  ### User interface

  * Better form input styles following Epoch Design System.

  ### Inference & model management

  * Add model search in model registry.
  * Model service configuration improvements.
  * Better DMA (Direct Memory Access) handling.
  * External model API key management improvements.
  * OpenAI Response API support.
  * Model conversion re-added with better handling.

  ### Training & runs

  * Better model initialization fixes.
  * Improved training callback system.

  ### Evaluation & data management

  * Pre-built criteria with documentation links.
  * Increase robustness of Amazon S3 support integration in custom recipes.
  * Better recipe editor UI.
  * Enhanced interaction store with tooltip for all turns.
  * Improve MLFlow integration: let users view their use cases runs.
  * Add utility to upload and update custom recipes.

  ### Administration & infrastructure

  * Error management improvements with new error pages.
  * Contract usage reporting.
</Update>

<Update label="v0.7.0" description="2025-08-29">
  ### Training & runs

  * Shuffling in GRPO with better batching.
  * Improved built-in recipes with better parameters.
  * Grader evaluation support in Harmony.

  ### Evaluation & data management

  * Tool providers CRUD operations.
  * Ability to link tool providers with model services.
  * Custom grader support with enhanced UI.
  * New summarization recipe.
  * Interaction store general UI refactor.
  * UX improvement in the AI judge workbench.
  * Evaluation error reporting in new evaluations page.

  ### Administration & infrastructure

  * Team permission selector in use case creation UI.
  * Job partition improvements.
  * Kill router on connection drop.
  * Add API to create and delete users ahead of their SSO registration.
</Update>

<Update label="v0.6.0" description="2025-06-30">
  ### Use interface

  * Introduce use case overview dropdown.
  * Read-only permission UI.

  ### Inference & model management

  * Extending integration of Google API models.
  * Expose `max_ttft` parameter at request level.

  ### Evaluation & data management

  * Refactoring judge & prompt playground.
  * Preset metric visibility in the side-by-side view.
  * Improvement to built-in AI judges.
  * New evaluation wizard.
  * New evaluation results table.
  * Better evaluation exports.
  * Adding support for grader evaluation in custom recipes.

  ### Training & runs

  * Custom recipes improvements.
  * Adding job partition concept, allowing to run on subset of available GPUs.
  * Adding Infiniband health check.
  * Better training arguments and world size requirements removed.

  ### Administration & infrastructure

  * Add team removal method in SDK.
  * Display available GPU partitions.
</Update>

<Update label="v0.5.0" description="2025-05-28">
  ### User interface

  * New design system.
  * Read-only permission UI.
  * Improve Hugging Face model import UI.

  ### Evaluation & data management

  * Adding `source` metadata to identify origin of datasets.
  * Extend evaluation to support more models evaluated in parallel.
  * UI for external feedback endpoints (RLEF).
  * Access individual records from feedback detail page.
  * Support optional metadata saving in data generation jobs.

  ### Training & runs

  * Increase KV cache length in GRPO recipe.
  * Create dedicated URLs for run detail pages.
  * Expose more RL parameters in the training API.
  * Support journaling & replay in reward servers (RLEF).
  * Add APIs for RAG dataset generation.
  * Multi-judge training SDK.
</Update>

<Update label="v0.4.0" description="2025-03-26">
  ### User interface

  * Add use case search.
  * New use case-centric navigation.

  ### Inference & model management

  * Add support for Anthropic and NVIDIA NIM external models.
  * Add compute configuration (placement) to model endpoints.
  * Improve tokenization speed.
  * Custom inference kernels for A100, L40S, H100, H200.
  * Add richer inference metadata: parameters, latencies.

  ### Evaluation & data management

  * Display interaction metadata in interaction detail page.
  * Export raw interactions (JSONL) from the interaction store.
  * Add ability to evaluate existing completions.

  ### Training & runs

  * Addition of GRPO.
  * Display validation in training run details.
  * Better OOM management.
  * Improve granular timestamp reporting.
  * Improve custom attention.
  * Remove sync points in training.
  * Improve job status UI.

  ### Administration & infrastructure

  * Grafana logs integration.
</Update>

<Update label="v0.3.0" description="2025-02-24">
  ### Inference & model management

  * Add support for inference autoscaling on Kubernetes.

  ### Evaluation & data management

  * Filter feedback by label.
  * Improve feedback display in interaction detail page.
  * Add support for annotation (scalar, boolean, text comments) in the interaction store.

  ### Training & runs

  * Add Tensorboard integration.
  * Improve built-in SFT recipe & add SFT-specific UI launcher.
  * Add reward servers (RLEF) in SDK.

  ### Administration & infrastructure

  * Expose concept of compute pools.
</Update>

<Update label="v0.2.2" description="2025-02-14">
  ### Inference & model management

  * Adding concept of deployment placement, to manage partitioning and distribution of resources.

  ### Training & runs

  * Extend Weight & Bias integration to multi-step jobs.
  * New UI to directly train on uploaded dataset.
</Update>

<Update label="v0.2.1" description="2025-01-30">
  ### Inference & model management

  * Integration with Azure OpenAI endpoints.

  ### Evaluation & data management

  * New UI to enter granular AI judge policies for evaluation and training.
  * New dataset upload & browsing page.

  ### Administration & infrastructure

  * Extend permission management APIs and default team behavior.
  * GPU memory management improvements.
</Update>