adaptive_harmony
The library includes methods to help you generate structured outputs with LLMs that adhere to a specific JSON schema, as well as to render simplified, LLM-readable JSON schemas you can include in your prompt to describe the expected output structure. You can achieve both with annotated Pydantic models such as the following:
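As a sketch of such a model for the binary judge discussed below (the field names and descriptions are illustrative, not prescribed by the library):

```python
from typing import Literal

from pydantic import BaseModel, Field

# Illustrative judge output model: field names and descriptions are
# examples, not a fixed schema required by the library.
class BinaryJudgeOutput(BaseModel):
    reasoning: str = Field(
        description=(
            "A short paragraph reasoning about whether the "
            "interaction respects the criterion."
        )
    )
    grade: Literal["PASS", "FAIL"] = Field(
        description="The final grade for the AI's response."
    )
```

The `Field` descriptions double as documentation in the rendered schema, telling the LLM what each key should contain, while the `Literal` annotation constrains the grade to exactly "PASS" or "FAIL" at validation time.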
Instruct LLM to follow desired output structure
Expanding on the example above, BinaryJudgeOutput
is the response structure we would expect from an LLM judge that classifies completions as "PASS" or "FAIL" according to some user-defined evaluation criterion.
A simple system prompt we could use for this judge would be something like:
You are an evaluator of human to AI interactions.
You will be given a full interaction between a human and an AI model, as well as an evaluation criterion.
Your task is to evaluate the AI’s response against the criterion. If the response respects and complies with the criterion, you must grade it with a “PASS”, otherwise you must grade it with a “FAIL”.
You must reason about the interaction and whether it respects the criterion in a short paragraph before you decide on the final grade.
You must return your output as a valid JSON string that strictly adheres to the following schema, with no preamble or postamble:
{json_schema}
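The `{json_schema}` placeholder above is where the simplified schema goes. The library ships its own renderer; as a hand-rolled approximation of what such a simplified, LLM-readable schema might look like (the `render_simplified_schema` helper below is hypothetical, not the library's API):

```python
from typing import Literal, get_args, get_origin

from pydantic import BaseModel, Field


class BinaryJudgeOutput(BaseModel):
    reasoning: str = Field(description="Short paragraph of reasoning.")
    grade: Literal["PASS", "FAIL"] = Field(description="Final grade.")


def render_simplified_schema(model: type[BaseModel]) -> str:
    """Render a compact, LLM-readable pseudo-JSON schema from a Pydantic model.

    This is illustrative pseudo-JSON for prompting, not a strict JSON document.
    """
    lines = ["{"]
    for name, field in model.model_fields.items():
        ann = field.annotation
        if get_origin(ann) is Literal:
            # Enumerate allowed literal values, e.g. 'PASS' | 'FAIL'.
            type_hint = " | ".join(repr(v) for v in get_args(ann))
        else:
            type_hint = getattr(ann, "__name__", str(ann))
        desc = f"  // {field.description}" if field.description else ""
        lines.append(f'  "{name}": <{type_hint}>,{desc}')
    lines.append("}")
    return "\n".join(lines)
```

Formatted into the system prompt, this yields a schema like `"grade": <'PASS' | 'FAIL'>` with the field descriptions inlined as comments, which is easier for an LLM to follow than a full JSON Schema document.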
Render simplified JSON schema
You could then create a simplified JSON schema from your model definition.

Generate text and validate Pydantic model
You can then format the schema into the system prompt, and use model.generate_and_validate to generate with your LLM and get back a BinaryJudgeOutput object.
By default, .generate_and_validate() retries generation once with correction instructions if the LLM fails to comply with the specified format. You can control how many retries are attempted by passing max_parsing_retries to the method.
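The exact signature of generate_and_validate is not reproduced here; as a standalone illustration of the retry-on-parse-failure pattern it implements (the function name and everything except the max_parsing_retries parameter are hypothetical), assuming a generate callable that sends a prompt to an LLM and returns its raw text:

```python
from typing import Callable, Literal

from pydantic import BaseModel, Field, ValidationError


class BinaryJudgeOutput(BaseModel):
    reasoning: str
    grade: Literal["PASS", "FAIL"]


def generate_validated(
    generate: Callable[[str], str],
    prompt: str,
    max_parsing_retries: int = 1,
) -> BinaryJudgeOutput:
    """Generate with an LLM and validate the output against the model.

    On a parsing/validation failure, re-prompt with correction
    instructions up to max_parsing_retries times (default: one retry),
    mirroring the behavior described above.
    """
    attempt_prompt = prompt
    for attempt in range(max_parsing_retries + 1):
        raw = generate(attempt_prompt)
        try:
            return BinaryJudgeOutput.model_validate_json(raw)
        except ValidationError as err:
            if attempt == max_parsing_retries:
                # Out of retries: surface the validation error to the caller.
                raise
            # Retry with correction instructions appended to the prompt.
            attempt_prompt = (
                prompt
                + "\n\nYour previous output was invalid:\n"
                + str(err)
                + "\nReturn only valid JSON matching the schema."
            )
    raise RuntimeError("unreachable")
```

Validation failures are fed back to the model verbatim, which in practice is often enough for it to self-correct on the next attempt.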