Datasets

Datasets contain examples for training and evaluating models. Upload them as JSONL files.

Upload a dataset

# `adaptive` is an already-initialized SDK client
adaptive.datasets.upload(
    file_path="training-data.jsonl",
    dataset_key="customer-support-v1",
)

Parameter	Type	Required	Description
`file_path`	str	Yes	Path to JSONL file
`dataset_key`	str	Yes	Unique identifier
`name`	str	No	Display name (defaults to `dataset_key`)

Dataset formats

Each line in your JSONL file must follow one of these schemas:

Prompts and completions (most common):

{"messages": [{"role": "user", "content": "Hello"}], "completion": "Hi there!"}

Prompts only (for evaluation with generated completions):

{"messages": [{"role": "user", "content": "Hello"}]}

With feedback metrics (for training on ratings):

{"messages": [...], "completion": "...", "feedbacks": {"quality": 0.8, "helpful": true}}

With preferences (for RLHF/DPO training):

{"messages": [...], "preferred_completion": "Good answer", "other_completion": "Bad answer", "feedback_key": "quality"}

Add optional labels or metadata fields to any format for filtering or custom graders.
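As a rough illustration of the four formats above, the sketch below classifies a record by the fields it carries and serializes a list of records to JSONL text. The helper names `classify_record` and `to_jsonl` are hypothetical, and the checks are inferred from the examples on this page; they are not the platform's own validation rules.

```python
import json


def classify_record(record: dict) -> str:
    """Guess which dataset format a record follows (rules inferred from the examples)."""
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        raise ValueError("record needs a non-empty 'messages' list")

    # Preference records carry two competing completions.
    if "preferred_completion" in record or "other_completion" in record:
        missing = {"preferred_completion", "other_completion"} - record.keys()
        if missing:
            raise ValueError(f"preference record is missing {sorted(missing)}")
        return "preference"

    # Feedback records attach metrics to a single completion.
    if "feedbacks" in record:
        if "completion" not in record:
            raise ValueError("feedback record needs a 'completion'")
        return "feedback"

    if "completion" in record:
        return "completion"
    return "prompt"


def to_jsonl(records: list) -> str:
    """Serialize records as JSONL: one JSON object per line."""
    lines = []
    for record in records:
        classify_record(record)  # raises ValueError on a malformed record
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


records = [
    {"messages": [{"role": "user", "content": "Hello"}], "completion": "Hi there!"},
    {"messages": [{"role": "user", "content": "Hello"}]},
]
jsonl_text = to_jsonl(records)
```

Writing `jsonl_text` to a file such as `training-data.jsonl` gives a file in the shape that the upload call expects.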

Multimodal content

In any of the formats above, `content` can be either a plain string or a list of content fragments that interleaves text and images:

{"messages": [{"role": "user", "content": [
  {"type": "text", "text": "Describe this image"},
  {"type": "image", "url": "data:image/jpeg;base64,/9j/4AAQ..."}
]}], "completion": "A photo of a cat sitting on a desk."}

Each image must be a base64 data URI (JPEG, PNG, WebP, or GIF, up to 10 MB). External URLs are not supported.
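One way to build such a fragment from raw image bytes is sketched below. The helper name `image_fragment` is hypothetical, and the size check assumes the 10 MB limit means 10 × 1024 × 1024 bytes of raw image data; the page does not say whether the limit applies before or after base64 encoding.

```python
import base64

# Assumption: "10 MB" is read as 10 MiB of raw (unencoded) image bytes.
MAX_IMAGE_BYTES = 10 * 1024 * 1024
SUPPORTED_MIME_TYPES = {"image/jpeg", "image/png", "image/webp", "image/gif"}


def image_fragment(image_bytes: bytes, mime_type: str) -> dict:
    """Build a dataset-format image fragment with a base64 data URI."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported image type: {mime_type}")
    if len(image_bytes) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the assumed 10 MB limit")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"type": "image", "url": f"data:{mime_type};base64,{encoded}"}


# The first three bytes of a JPEG file encode to "/9j/".
fragment = image_fragment(b"\xff\xd8\xff", "image/jpeg")
```

The resulting dictionary can be placed directly in a message's `content` list next to `{"type": "text", ...}` fragments.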

Dataset vs chat API image format

Datasets and the chat completions API use different image fragment schemas:

# Dataset / Harmony format
{"type": "image", "url": "data:image/jpeg;base64,..."}

# Chat completions API format
{"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
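If you already have messages in the chat completions shape, a small conversion step can rewrite their image fragments into the dataset shape. This is a minimal sketch based only on the two schemas shown above; the function names are hypothetical and not part of the SDK.

```python
def chat_fragment_to_dataset(fragment: dict) -> dict:
    """Convert one chat-API content fragment to the dataset fragment schema."""
    if fragment.get("type") == "image_url":
        return {"type": "image", "url": fragment["image_url"]["url"]}
    # Text fragments (and anything unrecognized) pass through unchanged.
    return fragment


def chat_message_to_dataset(message: dict) -> dict:
    """Convert a chat-API message, leaving plain-string content untouched."""
    content = message["content"]
    if isinstance(content, str):
        return message
    return {**message, "content": [chat_fragment_to_dataset(f) for f in content]}


chat_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,/9j/"}},
    ],
}
dataset_message = chat_message_to_dataset(chat_message)
```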

See SDK Reference for all dataset methods.
