
The Monitoring tab surfaces training and RL telemetry — loss curves, reward signals, gradient norms, validation metrics — streamed live from a running job and persisted for the run’s lifetime. Select multiple runs to compare them side by side on the same axes. Monitoring covers training-time signals only; for inference-time signals (TTFT, latency, token counts) and post-completion feedback, see Metrics.

Default metrics

Pre-built training recipes auto-emit the metrics below — no extra logging code or run-config changes needed:
Recipe | Auto-emitted metrics
sft | train/loss, train/gradient_norm, val/loss (when validation is enabled), val/<grader_key> (when a grader is configured)
preference_rlhf, metric_rlhf, rl | train/loss, train/reward, train/kl, train/gradient_norm, plus stage-specific metrics for multi-stage runs
eval | Per-grader scalar scores aggregated across the dataset
Metric streams flush to the platform every 0.5 seconds and appear in the UI within seconds of being emitted.
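
For orientation, and assuming the pre-built recipes emit through the same logging call documented below, each step's payload is a flat mapping from metric key to scalar value. A hypothetical sft step (values invented):

step_metrics = {
    "train/loss": 0.412,
    "train/gradient_norm": 1.27,
    "val/loss": 0.455,  # present only when validation is enabled
}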

Log custom metrics from a Harmony recipe

For custom recipes, get a logger from the recipe context and call it like a function:
from adaptive_harmony.metric_logger import get_prod_logger

@recipe_main
async def my_recipe(config: MyConfig, ctx: RecipeContext):
    # Bind a logger to this run's context
    logger = get_prod_logger(ctx)

    # Wire the dashboard URL into the run record so users can click through
    if logger.training_monitoring_link:
        ctx.job.set_monitoring_link(logger.training_monitoring_link)

    # num_steps, train_one_step, and current_lr are placeholders for your own loop
    for step in range(num_steps):
        loss = await train_one_step(...)
        logger({"train/loss": loss, "train/lr": current_lr})
The logger accepts any Mapping[str, int | float | Table]. Each call advances the internal step counter by one.
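
One practical consequence of the one-call, one-step rule (a sketch, assuming values passed in a single call are all recorded at that call's step):

# Both metrics land on the same step, since this is one call:
logger({"train/loss": 0.91, "train/lr": 3e-4})

# A second call advances the counter, so this lands on the next step:
logger({"train/loss": 0.87, "train/lr": 3e-4})

# To keep related curves aligned on the x-axis, batch a step's metrics
# into a single call rather than issuing one call per metric.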

Logging tables

For structured per-step data — sample completions, gradient breakdowns, layer statistics — log a Table:
from adaptive_harmony.logging_table import Table

samples = Table()
samples.add_row(["prompt-1", completion_text, score])
samples.add_row(["prompt-2", completion_text_2, score_2])

logger({"eval/samples": samples})
Tables appear in the run detail view, paginated and searchable.
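
Because the mapping accepts scalars and Tables together, a step's table can be logged alongside its scalar metrics in a single call so they share a step index. A sketch reusing names from the snippets above:

# One call, one step: the table and the scalars stay aligned.
logger({
    "train/loss": loss,
    "eval/samples": samples,
})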
get_prod_logger auto-selects a logging backend based on the environment variables present in the recipe sandbox, in priority order: WandB → MLflow → TensorBoard → stdout. The Adaptive monitoring backend is added in addition (not as a replacement) when ADAPTIVE_BASE_URL and ADAPTIVE_API_KEY are set, so metrics always reach the platform UI even if you also log to a third-party tracker. Set ADAPTIVE_MONITORING_DISABLED=1 to opt out of the Adaptive backend (e.g., for local-only development).
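
A minimal sketch of the local-only opt-out, assuming the variable is read when get_prod_logger constructs the logger and so must be set beforehand:

import os

from adaptive_harmony.metric_logger import get_prod_logger

# Disable the Adaptive backend for this process only (assumption: the
# variable is consulted at logger construction time).
os.environ["ADAPTIVE_MONITORING_DISABLED"] = "1"

logger = get_prod_logger(ctx)  # falls back to WandB/MLflow/TensorBoard/stdout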
See the SDK Reference for monitoring methods.

Retention

Monitoring metrics are tied to the run’s lifetime — they live as long as the run record does. Deleting a run deletes its metrics. There is no separate TTL for metrics.