Datasets contain examples for training and evaluating models. Upload them as JSONL files.

Upload a dataset

adaptive.datasets.upload(
    file_path="training-data.jsonl",
    dataset_key="customer-support-v1",
)
Parameter    Type  Required  Description
file_path    str   Yes       Path to JSONL file
dataset_key  str   Yes       Unique identifier
name         str   No        Display name (defaults to dataset_key)
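
The file passed as file_path has to be JSONL, meaning one JSON object per line. The following is a minimal sketch of producing such a file with Python's standard library, using the prompt-and-completion record shape described under Dataset formats. The helper names write_jsonl and make_record are hypothetical and not part of the SDK, and the sample content and temporary output path are only for illustration.

```python
import json
import tempfile
from pathlib import Path


def make_record(prompt, completion):
    """Build one prompt-and-completion record (hypothetical helper)."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "completion": completion,
    }


def write_jsonl(records, path):
    """Write one JSON object per line (hypothetical helper)."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            # Without indent, json.dumps keeps each record on a single line;
            # newlines inside strings are escaped as \n.
            f.write(json.dumps(record) + "\n")


records = [
    make_record("Hello", "Hi there!"),
    make_record("Line one\nLine two", "Got both lines."),
]

# Written to a temporary directory here; in practice, write to the path
# that will be passed as file_path.
out_path = Path(tempfile.mkdtemp()) / "training-data.jsonl"
write_jsonl(records, out_path)
```

Serializing with json.dumps rather than formatting strings by hand avoids quoting mistakes and guarantees that a multi-line prompt does not split a record across lines.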

Dataset formats

Each line in your JSONL file must follow one of these schemas:
Prompts and completions (most common):
{"messages": [{"role": "user", "content": "Hello"}], "completion": "Hi there!"}
Prompts only (for evaluation with generated completions):
{"messages": [{"role": "user", "content": "Hello"}]}
With feedback metrics (for training on ratings):
{"messages": [...], "completion": "...", "feedbacks": {"quality": 0.8, "helpful": true}}
With preferences (for RLHF/DPO training):
{"messages": [...], "preferred_completion": "Good answer", "other_completion": "Bad answer", "feedback_key": "quality"}
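
Because each line must match one of these schemas, a quick local check before uploading can catch malformed records early. The sketch below infers the format from which keys are present. The rules are assumptions drawn only from the examples above (for instance, that a preference record needs both completions), the function name classify_record and its labels are hypothetical, and the service may validate differently.

```python
import json


def classify_record(record):
    """Guess which schema a parsed JSONL record follows (hypothetical helper).

    The checks mirror the example records only; they are not the
    service's actual validation rules.
    """
    if "messages" not in record:
        raise ValueError("record has no 'messages' field")

    # Preference records carry two completions instead of one.
    if "preferred_completion" in record or "other_completion" in record:
        missing = {"preferred_completion", "other_completion"} - record.keys()
        if missing:
            raise ValueError(f"preference record is missing {sorted(missing)}")
        return "preference"

    # Feedback records attach metrics to a single completion.
    if "feedbacks" in record:
        if "completion" not in record:
            raise ValueError("feedback record has no 'completion' field")
        return "feedback"

    if "completion" in record:
        return "completion"
    return "prompt_only"


sample_lines = [
    '{"messages": [{"role": "user", "content": "Hello"}], "completion": "Hi there!"}',
    '{"messages": [{"role": "user", "content": "Hello"}]}',
]
kinds = [classify_record(json.loads(line)) for line in sample_lines]
```

Running the check over every line of a file and counting the labels is a cheap way to confirm that a dataset does not mix formats by accident.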
With images (multimodal):
{"messages": [{"role": "user", "content": [{"type": "text", "text": "Describe this image"}, {"type": "image", "url": "data:image/jpeg;base64,/9j/4AAQ..."}]}], "completion": "A photo of a cat sitting on a desk."}
To interleave text and images, set content to a list of parts instead of a string. Each image must be a base64 data URI (JPEG, PNG, WebP, or GIF, up to 10 MB each). External URLs are not supported.
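
A minimal sketch of building such an image part from raw bytes with Python's standard library follows. The helper names image_part and user_message are hypothetical. The sketch also assumes that the 10 MB limit applies to the raw image bytes and that 1 MB means 1024 * 1024 bytes; the source states neither.

```python
import base64

# Assumption: limit measured on raw bytes, with 1 MB = 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 10 * 1024 * 1024
ALLOWED_MIME_TYPES = {"image/jpeg", "image/png", "image/webp", "image/gif"}


def image_part(data, mime_type):
    """Encode raw image bytes as a dataset image part (hypothetical helper)."""
    if mime_type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"unsupported image type: {mime_type}")
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the 10 MB limit")
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image", "url": f"data:{mime_type};base64,{encoded}"}


def user_message(text, images):
    """Build a user message with a text part followed by image parts.

    `images` is a list of (bytes, mime_type) pairs.
    """
    parts = [{"type": "text", "text": text}]
    parts += [image_part(data, mime) for data, mime in images]
    return {"role": "user", "content": parts}


# The first three bytes of a JPEG file, used here as stand-in image data.
message = user_message("Describe this image", [(b"\xff\xd8\xff", "image/jpeg")])
```

In practice the bytes would come from reading the image file in binary mode, for example with pathlib's Path.read_bytes().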
Datasets and the chat API use different image schemas:
# Dataset format
{"type": "image", "url": "data:image/jpeg;base64,..."}

# Chat API format (see Models page)
{"type": "image_url", "image_url": "data:image/jpeg;base64,..."}
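
Since the two schemas differ only in key names, content parts can be converted mechanically when reusing the same data in both places. The sketch below is based solely on the two shapes shown above; the function names are hypothetical, and any additional fields the chat API might accept are not handled.

```python
def to_chat_part(part):
    """Convert a dataset content part to the chat API shape (hypothetical helper)."""
    if part.get("type") == "image":
        return {"type": "image_url", "image_url": part["url"]}
    # Text parts and anything else are copied unchanged.
    return dict(part)


def to_dataset_part(part):
    """Convert a chat API content part to the dataset shape (hypothetical helper)."""
    if part.get("type") == "image_url":
        return {"type": "image", "url": part["image_url"]}
    return dict(part)


dataset_part = {"type": "image", "url": "data:image/jpeg;base64,/9j/"}
chat_part = to_chat_part(dataset_part)
round_trip = to_dataset_part(chat_part)
```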
Add optional labels or metadata fields to any format for filtering or custom graders.
See SDK Reference for all dataset methods.