- Adapt using existing feedback - fine-tunes a model to improve an outcome you have previously logged via the UI or SDK.
- Teach behaviour with natural language guidelines - provide simple textual guidelines that define what constitutes a good or bad completion for your use case; an AI judge uses them to align your model with the desired behaviour. Reference completions and existing feedback are not required; only prompts are used.
- Reward with external feedback endpoint - set up an external endpoint that provides feedback on completions during training. This enables any custom reward function and is particularly useful for tasks where execution feedback is available, such as database queries or code execution. Read more about how to configure a reward server; a minimal endpoint sketch follows this list.
- Supervised fine-tuning - standard SFT; fine-tunes your model using reference completions, no reinforcement learning involved.
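To illustrate the external feedback endpoint option, here is a minimal sketch of a reward server. The route, request fields (`prompt`, `completion`), and response field (`reward`) are illustrative assumptions; the actual schema is described in the reward server configuration guide.

```python
# Illustrative reward endpoint sketch; the real request/response schema
# is defined in the reward server configuration guide.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class RewardRequest(BaseModel):
    prompt: str
    completion: str

class RewardResponse(BaseModel):
    reward: float

@app.post("/reward", response_model=RewardResponse)
def score(request: RewardRequest) -> RewardResponse:
    # Example: reward completions that look like valid SQL queries.
    # Replace with any custom logic (code execution, database checks, etc.).
    reward = 1.0 if request.completion.strip().lower().startswith("select") else 0.0
    return RewardResponse(reward=reward)
```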
Create an `Adaptive` client first.
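A minimal sketch of client setup, assuming the Python SDK exposes an `Adaptive` client constructed from your Engine URL and an API key (the import path and parameter names here are illustrative; check the SDK Reference for the actual setup):

```python
# Illustrative only: import path and constructor arguments are assumptions.
from adaptive_sdk import Adaptive

client = Adaptive(
    base_url="https://your-engine.example.com",  # your Adaptive Engine URL
    api_key="YOUR_API_KEY",
)
```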
To adapt on existing feedback (from a dataset), build a training `config` and launch the job with the client.
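A minimal sketch of what this looks like, assuming the client exposes a `training.jobs.create` method and a dict-style config (the field names and method path are illustrative):

```python
# Illustrative only: field names and the method path are assumptions,
# not the exact SDK surface.
config = {
    "base_model": "llama-3.1-8b-instruct",   # model to adapt
    "dataset": "support-conversations-v1",   # dataset of previously logged interactions
    "feedback_key": "thumbs_up",             # the logged outcome to optimize
}

new_model = client.training.jobs.create(config=config)
```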
See the SDK Reference for the full training config specification.
`.create` will create and register a new model you can deploy for inference, or A/B test against the base model or others for validation.