StringThread is the atomic data element in adaptive_harmony. A thread is a sequence of turns (role + content), combined with turn weights for training and optional metadata (metric feedback, ground truth labels, or any custom key-value pairs).
Create a StringThread
from adaptive_harmony import StringThread
thread = StringThread(
turns=[
("user", "Hello, who are you?"),
("assistant", "I am a large language model. How can I help you today?"),
]
)
thread_with_metadata = StringThread(
turns=[
("user", "Hello, who are you?"),
("assistant", "I am a large language model. How can I help you today?"),
],
metadata={"label": "polite"}
)
Builder methods
Each method returns a new StringThread with the turn appended:
thread = StringThread([])
thread = thread.system("Your name is Adaptive.")
thread = thread.user("Hello there!")
thread = thread.assistant("Hello, I'm Adaptive, and here to help you!")
thread = thread.tool('{"result": "success"}')
Access turns and content
# All turns as (role, str content) tuples
all_turns = thread.get_turns()
# All turns except the last assistant turn
messages = thread.messages()
# The last assistant turn's content, or None
completion = thread.completion()
# Content of the very last turn (any role)
last = thread.last_content()
get_turns() returns every turn as a (role, content) tuple. For multimodal turns, images are represented as <|image|> in the string content.
messages() returns all turns except the final one if it has the assistant role. This is useful when you need to split a thread into prompt and completion.
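The prompt/completion split performed by messages() and completion() can be sketched with plain (role, content) tuples. This mirrors the behavior described above; it is not the library implementation, and split_thread is a name of our choosing:

```python
# Sketch of the prompt/completion split that messages() and completion()
# perform, using plain (role, content) tuples.

def split_thread(turns: list) -> tuple:
    """Return (messages, completion) for a list of (role, content) turns."""
    if turns and turns[-1][0] == "assistant":
        # The last turn is an assistant turn: it becomes the completion.
        return turns[:-1], turns[-1][1]
    # No trailing assistant turn: everything is prompt, no completion.
    return turns, None

turns = [
    ("system", "You are helpful."),
    ("user", "Hello, who are you?"),
    ("assistant", "I am a large language model."),
]
messages, completion = split_thread(turns)
# messages -> the system and user turns; completion -> the assistant content
```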
Multimodal StringThread
The difference from a text-only thread is that turn content becomes a list of fragments instead of a plain string. There are two fragment types:
| Type | Content key | Example |
|---|---|---|
| TextFragment | text | {"type": "text", "text": "Describe this image."} |
| ImageFragment | url | {"type": "image", "url": "data:image/png;base64,..."} |
Use StringThread.from_fragments() to create a multimodal thread. Don’t forget the await! from_fragments is async because it loads and decodes images.
from adaptive_harmony import StringThread, TextFragment, ImageFragment
thread = await StringThread.from_fragments([
("user", [
TextFragment(text="What's in this image?"),
ImageFragment(url=f"data:image/png;base64,{image_data}"),
]),
])
You can also pass fragments as plain dictionaries:
thread = await StringThread.from_fragments([
("user", [
{"type": "text", "text": "What's in this image?"},
{"type": "image", "url": f"data:image/png;base64,{image_data}"},
]),
])
A text-only StringThread is equivalent to a fragment thread with a single TextFragment:
# These two are equivalent
thread = StringThread([("user", "Hello world!")])
thread = await StringThread.from_fragments([
("user", [{"type": "text", "text": "Hello world!"}])
])
Only user and system roles can contain images. The assistant role must be text-only.
The fragment format in Harmony differs from the chat completions API. In Harmony, image fragments use {"type": "image", "url": "..."}, while the chat completions API uses {"type": "image_url", "image_url": {"url": "..."}}.
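If you need to move fragments between the two formats, the mapping is a small dict transformation. The shapes below are taken from the text above; the function name is hypothetical:

```python
# Hypothetical helper converting a Harmony-style image fragment dict to the
# chat completions equivalent. Text fragments share the same shape in both.

def harmony_to_chat_completions(fragment: dict) -> dict:
    if fragment["type"] == "image":
        # Harmony:          {"type": "image", "url": ...}
        # Chat completions: {"type": "image_url", "image_url": {"url": ...}}
        return {"type": "image_url", "image_url": {"url": fragment["url"]}}
    return fragment  # text fragments pass through unchanged

frag = {"type": "image", "url": "data:image/png;base64,AAAA"}
harmony_to_chat_completions(frag)
# -> {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}}
```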
Image encoding
ImageFragment expects a url field with the full data URI. To base64-encode a local image, use the built-in helper:
from adaptive_harmony.core.image_utils import image_to_base64
b64 = image_to_base64("photo.png")
fragment = ImageFragment(url=f"data:image/png;base64,{b64}")
image_to_base64 returns the raw base64 string (without the data:... prefix). It also allows you to resize images and convert to grayscale:
| Parameter | Type | Description |
|---|---|---|
| image_path | str or Path | Path to the image file |
| format | str | Output format (default: "PNG") |
| longest_side_max_size | int or None | Resize so the longest side fits this limit |
| black_and_white | bool | Convert to grayscale (default: False) |
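For reference, the encoding step alone can be reproduced with the standard library. This is a minimal sketch, not the library helper: resizing (longest_side_max_size) and grayscale conversion require an imaging library and are omitted here, which is why image_to_base64 is the recommended path.

```python
# Minimal sketch of the base64-encoding step only, using the standard library.
import base64
from pathlib import Path

def encode_image(image_path):
    """Return the raw base64 string for an image file (no data: prefix)."""
    return base64.b64encode(Path(image_path).read_bytes()).decode("ascii")

# Build the full data URI expected by ImageFragment:
# url = f"data:image/png;base64,{encode_image('photo.png')}"
```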
The formats accepted depend on the context:
- In recipes (adaptive_harmony): most image formats are supported (PNG, JPEG, GIF, WebP, BMP, TIFF, etc.). Images can be loaded from file paths, HTTP URLs, or data: URIs.
- Via the chat completions API (SDK / OpenAI client): only PNG, JPEG, GIF, and WebP are accepted, and only as data: URIs; HTTP URLs are rejected.
We recommend using PNG or JPEG for maximum compatibility.
Turn weighting
During training, turn weights control how much each turn contributes to the loss. A weight of 0.0 means the model does not learn from that turn, while 1.0 means it contributes fully. This is how you tell the model which parts of a conversation to learn from: typically you want the model to learn from assistant responses, not from user prompts or system messages.
By default, turns added with .assistant() get a weight of 1.0 and all other roles get 0.0. When you load a dataset that contains completions, with_weight_last_assistant_turn() is applied automatically: only the final assistant turn is weighted. You can override this after loading using one of the methods below.
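The two default rules can be sketched over plain (role, content) tuples. This illustrates the documented behavior only; it is not the library implementation, and both function names are ours:

```python
# Builder default: assistant turns get weight 1.0, every other role 0.0.
def default_weights(turns):
    return [1.0 if role == "assistant" else 0.0 for role, _ in turns]

# Dataset-loading default: only the final assistant turn is weighted.
def last_assistant_weights(turns):
    weights = [0.0] * len(turns)
    for i in range(len(turns) - 1, -1, -1):
        if turns[i][0] == "assistant":
            weights[i] = 1.0
            break
    return weights

turns = [
    ("system", "You are helpful."),
    ("user", "Hi!"),
    ("assistant", "Hello!"),
    ("user", "How are you?"),
    ("assistant", "Great, thanks."),
]
default_weights(turns)         # -> [0.0, 0.0, 1.0, 0.0, 1.0]
last_assistant_weights(turns)  # -> [0.0, 0.0, 0.0, 0.0, 1.0]
```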
Weighting methods
Each method returns a new StringThread with updated weights:
| Method | Behavior |
|---|---|
| with_weight(w) | Set weight w on all turns |
| with_weight_all_assistant_turns() | Weight 1.0 on all assistant turns, 0.0 on others |
| with_weight_last_assistant_turn() | Weight 1.0 on the last assistant turn only, 0.0 on others |
| with_weight_assistant_turns_from_index(i) | Weight 1.0 on assistant turns starting from the i-th assistant turn |
# Train on all assistant responses in a multi-turn conversation
thread = thread.with_weight_all_assistant_turns()
# Train only on the final assistant response
thread = thread.with_weight_last_assistant_turn()
# Train on assistant turns starting from the 2nd one (index 1)
thread = thread.with_weight_assistant_turns_from_index(1)
# Override default weighting after loading a dataset
dataset = await config.dataset.load(ctx)
dataset = [thread.with_weight_all_assistant_turns() for thread in dataset]
Inspect weights
# Returns a list with the weight of each turn
turns_with_weights = thread.get_turn_weights()