Structured Response
Enforce a structured response from the model using Pydantic (Python), Zod (TypeScript), or JSON Schema
You can enforce a particular response format from an LLM by providing a schema (JSON or zod) to the .respond() method. This guarantees that the model's output conforms to the schema you provide.
Enforce Using a zod Schema
To have the model generate JSON that satisfies a given schema, the recommended approach is to provide the schema using zod. When a zod schema is provided, the prediction result contains an extra field, parsed, which holds the parsed, validated, and typed result.
Define a zod Schema
import { z } from "zod";
// A zod schema for a book
const bookSchema = z.object({
  title: z.string(),
  author: z.string(),
  year: z.number().int(),
});
Generate a Structured Response
const result = await model.respond("Tell me about The Hobbit.", {
  structured: bookSchema,
  maxTokens: 100, // Recommended to avoid getting stuck
});
const book = result.parsed;
console.info(book);
//           ^
// Note that `book` is now correctly typed as { title: string, author: string, year: number }
Enforce Using a JSON Schema
You can also enforce a structured response using a JSON schema.
Define a JSON Schema
// A JSON schema for a book
const schema = {
  type: "object",
  properties: {
    title: { type: "string" },
    author: { type: "string" },
    year: { type: "integer" },
  },
  required: ["title", "author", "year"],
};
Generate a Structured Response
const result = await model.respond("Tell me about The Hobbit.", {
  structured: {
    type: "json",
    jsonSchema: schema,
  },
  maxTokens: 100, // Recommended to avoid getting stuck
});
const book = JSON.parse(result.content);
console.info(book);
Heads Up
Structured generation works by constraining the model to only generate tokens that conform to the provided schema. This ensures valid output in normal cases, but comes with two important limitations:
- Models (especially smaller ones) may occasionally get stuck in an unclosed structure (such as an open bracket) when they "forget" they are inside it and cannot stop because of the schema requirements. For this reason, always include a maxTokens parameter to prevent infinite generation.
- Schema compliance is only guaranteed for complete, successful generations. If generation is interrupted (by cancellation, reaching the maxTokens limit, or other reasons), the output will likely violate the schema. With zod schema input, this raises an error; with a JSON schema, you receive an invalid string that does not satisfy the schema.
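Because an interrupted generation can leave the JSON schema path with an invalid string, it may help to guard the JSON.parse call instead of trusting result.content directly. The following is a minimal sketch under that assumption. The names Book, isBook, and parseBook are hypothetical helpers written for illustration and are not part of the SDK; the sample strings stand in for result.content.

```typescript
// Hypothetical helpers (not part of the SDK) for checking the raw string
// returned when a JSON schema, rather than a zod schema, is used.
interface Book {
  title: string;
  author: string;
  year: number;
}

// Type guard mirroring the book JSON schema: three required fields,
// with `year` restricted to finite integers.
function isBook(value: unknown): value is Book {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const year = v.year;
  return (
    typeof v.title === "string" &&
    typeof v.author === "string" &&
    typeof year === "number" &&
    isFinite(year) &&
    Math.floor(year) === year
  );
}

// Returns a typed Book, or null when the text is truncated or malformed.
function parseBook(content: string): Book | null {
  try {
    const parsed: unknown = JSON.parse(content);
    return isBook(parsed) ? parsed : null;
  } catch {
    // JSON.parse throws a SyntaxError on incomplete output
    return null;
  }
}

// Sample strings standing in for result.content
const complete = parseBook(
  '{"title":"The Hobbit","author":"J.R.R. Tolkien","year":1937}',
);
const truncated = parseBook('{"title":"The Hobbit","author":"J.R.R. Tol');
console.info(complete, truncated);
```

A null return signals that the generation should be retried or reported, for example after the maxTokens limit cut the output short. With a zod schema this extra check is less relevant, since the text above states that invalid output raises an error instead.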