# Architecture

> Adaptive Engine system architecture

An Adaptive Engine cluster consists of five components:

1. **Control plane** - UI, permissions system, user-facing APIs
2. **Compute plane** - GPU servers for inference and training
3. **Shared storage** - Model files and training artifacts
4. **Postgres database** - Interactions, metadata, application state
5. **Redis** - Job events and usage statistics

<Frame caption="High-level overview of Adaptive Engine system architecture">
  <img src="https://mintcdn.com/adaptiveml/yzzpVsRf0ipPQg1A/static/arch-overview-redis.png?fit=max&auto=format&n=yzzpVsRf0ipPQg1A&q=85&s=88380fa661aaecceb05a655df2c42016" width="2360" height="998" data-path="static/arch-overview-redis.png" />
</Frame>

## Components

### Control plane

Low-latency containerized service that serves the UI and APIs. Runs on CPU (minimum 4 vCPU, 16 GB memory).

### Compute plane

Multi-GPU AI computing framework for training and inference.

**Compatible GPUs:** L4, L40, L40S, H100, H200, B200, GB200

**Requirements:**

* Multi-GPU or single-GPU servers supported
* NVLink/NVSwitch optional but recommended for training and large models
* PCIe-connected multi-GPU servers supported (e.g., EC2 g6.12xlarge)
* 300 GB+ local storage recommended for Docker images and model cache

<Warning>
  Adaptive Engine 0.11 and later does not support NVIDIA A100 GPUs.
</Warning>
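
As a pre-flight check, the compatibility list above can be encoded in a small helper. This is a minimal sketch and not part of Adaptive Engine: the name `is_supported_gpu` and the token-matching approach are assumptions, and the product-name strings your tooling reports may be formatted differently.

```python
import re

# GPU models listed as compatible in this document.
SUPPORTED_GPUS = {"L4", "L40", "L40S", "H100", "H200", "B200", "GB200"}


def is_supported_gpu(product_name: str) -> bool:
    """Return True if any token of the product name is a supported model.

    Hypothetical helper: splitting on whitespace, hyphens, underscores and
    slashes is an assumption about how the product name is formatted,
    e.g. "NVIDIA H100 80GB HBM3" or "NVIDIA A100-SXM4-80GB".
    """
    tokens = re.split(r"[\s\-_/]+", product_name.upper())
    return any(token in SUPPORTED_GPUS for token in tokens)
```

Matching on whole tokens rather than substrings keeps `L4` from accidentally matching `L40S`, and an A100 product name is rejected because `A100` is not in the set.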

### Shared storage

Model files and datasets. Any POSIX-compatible storage (local disk, NFS) or S3-compatible object storage.

### Database

PostgreSQL 16+. Stores interactions, permissions, and settings. Place it close to the control plane on the network: permission checks are blocking and run on every inference call, so database round-trip latency adds to inference latency.
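
The PostgreSQL 16+ requirement can be verified from the integer that PostgreSQL returns for `SHOW server_version_num;`. The following is a minimal sketch, assuming you have already fetched that value with your own client; the function names are hypothetical and not part of Adaptive Engine.

```python
MIN_POSTGRES_MAJOR = 16  # minimum major version stated in this document


def postgres_major(server_version_num: int) -> int:
    """Extract the major version from PostgreSQL's server_version_num.

    For PostgreSQL 10 and later the value is major * 10000 + minor,
    so 160004 corresponds to version 16.4.
    """
    return server_version_num // 10000


def meets_minimum(server_version_num: int) -> bool:
    """Return True if the server is PostgreSQL 16 or newer."""
    return postgres_major(server_version_num) >= MIN_POSTGRES_MAJOR
```

Comparing the numeric form avoids parsing the free-text version string, which can carry distribution-specific suffixes.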

### Redis

Transfers events between the compute and control planes, and stores usage statistics and real-time metadata. Can be local or managed (ElastiCache, Memorystore, etc.).

### Cloud service examples

| Component         | AWS                    | GCP                | Azure                         |
| ----------------- | ---------------------- | ------------------ | ----------------------------- |
| Control & compute | EC2, EKS               | GCE, GKE           | Azure VM, AKS                 |
| Storage           | S3, FSx for Lustre     | GCS, Filestore     | Azure Files NFS               |
| Database          | RDS PostgreSQL, Aurora | Cloud SQL, AlloyDB | Azure Database for PostgreSQL |
| Redis             | ElastiCache            | Memorystore        | Azure Managed Redis           |

## Compute plane

<Frame caption="Detailed architecture of an Amazon EKS deployment with compute pools for training and inference">
  <img src="https://mintcdn.com/adaptiveml/z4z70NKCV_h4-Dnh/static/arch-deep-dive.png?fit=max&auto=format&n=z4z70NKCV_h4-Dnh&q=85&s=4d43695ec2d432aa00046b709553e861" width="3276" height="1238" data-path="static/arch-deep-dive.png" />
</Frame>

### Compute pools

A compute pool is a set of homogeneous GPU servers that receives workloads.

A cluster can have multiple compute pools representing different GPU types or capacity reservations.

### GPU workloads

1. **Inference endpoints** - Low-latency, high-throughput model serving
2. **Recipe runs** - Scripted batch tasks (training, fine-tuning, evaluation)
3. **Interactive sessions** - Real-time remote execution via Secure WebSockets

**Pool targeting:**

* Inference endpoints can span multiple inference or static pools
* Recipe runs and interactive sessions each run on a single compute pool
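
The pool targeting rules above can be expressed as a small validation step. This is a sketch under stated assumptions, not Adaptive Engine's API: the `Workload` type, the `WorkloadKind` enum, and `validate_targeting` are hypothetical names, the "at least one pool" rule is an assumption, and pool types (inference, static) are not modeled.

```python
from dataclasses import dataclass
from enum import Enum


class WorkloadKind(Enum):
    INFERENCE_ENDPOINT = "inference_endpoint"
    RECIPE_RUN = "recipe_run"
    INTERACTIVE_SESSION = "interactive_session"


@dataclass(frozen=True)
class Workload:
    kind: WorkloadKind
    pools: tuple[str, ...]  # names of the targeted compute pools (hypothetical)


def validate_targeting(workload: Workload) -> list[str]:
    """Return a list of rule violations; an empty list means valid."""
    errors: list[str] = []
    # Assumption: every workload needs somewhere to run.
    if not workload.pools:
        errors.append("workload must target at least one compute pool")
    # From the rules above: only inference endpoints may span several pools.
    single_pool_only = workload.kind is not WorkloadKind.INFERENCE_ENDPOINT
    if single_pool_only and len(workload.pools) > 1:
        errors.append(
            f"{workload.kind.value} can target only one compute pool, "
            f"got {len(workload.pools)}"
        )
    return errors
```

For example, an inference endpoint targeting `("pool-a", "pool-b")` passes, while a recipe run with the same two pools produces one violation.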

## Deployment notes

* Built on open standards (Linux, Postgres, Rust, Docker) for portability
* All 5 components can be collocated on a single server
* For production, we recommend using an external database and external storage so that each can be scaled and kept available independently
