For any production application that consumes LLM output programmatically, structured output is not optional. You cannot reliably parse free-form text at scale. A prompt that yields valid JSON 99% of the time still breaks your parser on 1% of calls, which at 100,000 calls per day means 1,000 failures. Getting structured output reliably requires choosing the right method for your use case.
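The arithmetic above generalizes to any call volume and success rate. A minimal sketch, where the function name expected_daily_failures is made up for illustration:

```python
def expected_daily_failures(calls_per_day: int, success_rate: float) -> int:
    """Expected number of responses per day that fail to parse."""
    # round() absorbs the tiny floating-point error in (1.0 - success_rate).
    return round(calls_per_day * (1.0 - success_rate))


# The figures from the paragraph above: 99% success at 100,000 calls per day.
print(expected_daily_failures(100_000, 0.99))  # prints 1000
```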
The three methods, in order from least to most reliable: prompt-based JSON requests, function calling or tool use, and schema validation libraries like instructor or Zod with the Vercel AI SDK.
Why Free-Form Output Fails at Scale
When you ask an LLM to "respond with JSON," you are making a statistical request. The model generates JSON most of the time, but not all the time. Common failure modes:
- The model adds an explanation sentence before the JSON block ("Here is the JSON you requested: {...")
- The model wraps the JSON in a Markdown code fence (three backticks followed by json) when you expected bare JSON
- String values contain unescaped quotes that break JSON parsing
- Nested objects get flattened or restructured
- Arrays with one element are returned as a bare value instead of a single-element array
- Numbers are returned as strings ("42" instead of 42)
At low volume, you catch these with a try-catch and log an error. At scale, you need a method that eliminates the failure mode rather than handling it.
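The low-volume approach can be sketched as a tolerant parser that strips a leading sentence or a Markdown fence before parsing, then logs anything it still cannot read. This is an illustration only: parse_model_json is a hypothetical helper, and it assumes the reply contains a single JSON object.

```python
import json
import re

# Built from single backticks so this example does not contain a literal fence.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def parse_model_json(raw: str):
    """Best-effort parse of a model reply. Returns the parsed value, or None on failure."""
    text = raw.strip()
    fenced = FENCED_BLOCK.search(text)
    if fenced:
        # Keep only the body of the Markdown code fence.
        text = fenced.group(1)
    else:
        # Drop any sentence before the first brace and anything after the last one.
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            text = text[start:end + 1]
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # At low volume, logging the failure and moving on is enough.
        print(f"could not parse model output: {err}")
        return None


print(parse_model_json('Here is the JSON you requested: {"priority": 5}'))
print(parse_model_json(FENCE + 'json\n{"priority": 5}\n' + FENCE))
print(parse_model_json('{"summary": "He said "hi" twice"}'))
```

The first two calls recover the object. The third, with unescaped quotes inside a string, cannot be repaired this way and returns None, which is why cleanup alone does not scale.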
Method 1: Prompt-Based JSON Request
Ask for JSON directly and specify the schema in the prompt. This works roughly 80 to 90% of the time with current frontier models, depending on the complexity of the schema and the model.
Basic example:
Extract the key information from the following support ticket and return it as a JSON object with exactly these fields:
- ticket_id: string or null
- customer_email: string or null
- issue_category: one of ["billing", "technical", "account", "feature_request", "other"]
- priority: integer from 1 (low) to 5 (critical)
- summary: string, maximum 100 characters
Return only the JSON object. No explanation, no markdown fencing.
Ticket: "Hi, I'm trying to log in but keep getting 'invalid credentials' even though I just reset my password 5 minutes ago. My email is jane@example.com. This is urgent, I have a client presentation in 20 minutes."
Expected output:
{
  "ticket_id": null,
  "customer_email": "jane@example.com",
  "issue_category": "account",
  "priority": 5,
  "summary": "Customer cannot log in after password reset, urgent due to upcoming client presentation"
}
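Because a prompt is only a request, it is worth checking each reply against the same rules the prompt states. The sketch below hand-codes those rules with the standard library. The name validate_ticket is hypothetical, and ticket_id is treated as nullable because the expected output above returns null for it.

```python
import json

ALLOWED_CATEGORIES = {"billing", "technical", "account", "feature_request", "other"}
EXPECTED_FIELDS = {"ticket_id", "customer_email", "issue_category", "priority", "summary"}


def validate_ticket(data: dict) -> list:
    """Return a list of schema violations; an empty list means the object passed."""
    errors = []
    missing = EXPECTED_FIELDS - set(data)
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    for field in ("ticket_id", "customer_email"):
        value = data.get(field)
        if value is not None and not isinstance(value, str):
            errors.append(f"{field} must be a string or null")
    category = data.get("issue_category")
    if not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
        errors.append("issue_category is not one of the allowed values")
    priority = data.get("priority")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(priority, int) or isinstance(priority, bool) or not 1 <= priority <= 5:
        errors.append("priority must be an integer from 1 to 5")
    summary = data.get("summary")
    if not isinstance(summary, str) or len(summary) > 100:
        errors.append("summary must be a string of at most 100 characters")
    return errors


expected_output = json.loads("""
{
  "ticket_id": null,
  "customer_email": "jane@example.com",
  "issue_category": "account",
  "priority": 5,
  "summary": "Customer cannot log in after password reset, urgent due to upcoming client presentation"
}
""")
print(validate_ticket(expected_output))  # prints []
```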
Tips to improve reliability:
- Say "Return only the JSON object" and "No explanation" explicitly
- List allowed values for enum fields
- Specify types explicitly ("string or null" not just "the email")
- Keep the schema flat where possible; deep nesting increases errors
- If using the OpenAI API, set response_format: { type: "json_object" } to engage JSON mode, which guarantees syntactically valid JSON but not that the object follows your schema
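One way to apply these tips consistently is to generate the prompt from a flat field specification instead of writing it by hand each time. A minimal sketch, assuming a hypothetical helper named build_extraction_prompt and the wording of the basic example above:

```python
def build_extraction_prompt(fields: dict, ticket: str) -> str:
    """Assemble a JSON-extraction prompt from a flat {field name: type description} mapping."""
    lines = [
        "Extract the key information from the following support ticket and "
        "return it as a JSON object with exactly these fields:"
    ]
    # One line per field, with the type and allowed values spelled out.
    lines += [f"- {name}: {description}" for name, description in fields.items()]
    lines.append("Return only the JSON object. No explanation, no markdown fencing.")
    lines.append(f'Ticket: "{ticket}"')
    return "\n".join(lines)


ticket_fields = {
    "customer_email": "string or null",
    "issue_category": 'one of ["billing", "technical", "account", "feature_request", "other"]',
    "priority": "integer from 1 (low) to 5 (critical)",
    "summary": "string, maximum 100 characters",
}
print(build_extraction_prompt(ticket_fields, "I cannot log in after resetting my password."))
```

Keeping the field descriptions in one mapping also makes it harder for the prompt and your downstream checks to drift apart.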
Method 2: Function Calling and Tool Use
Function calling (OpenAI) and tool use (Anthropic) are API features that let you describe a function with a JSON Schema and have the model answer by filling in its arguments. Instead of free text, the model emits a structured function call argument. This is the most reliable method for structured output without a validation library.
OpenAI function calling example (TypeScript):
import OpenAI from "openai";

// Reads OPENAI_API_KEY from the environment.
const openai = new OpenAI();

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: "Extract ticket info from: 'Cannot log in after password reset, email jane@example.com, urgent, client presentation in 20 minutes'"
    }
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "extract_ticket",
        description: "Extract structured information from a support ticket",
        parameters: {
          type: "object",
          properties: {
            // JSON Schema has no "nullable" keyword (that is OpenAPI); use a type union.
            customer_email: { type: ["string", "null"] },
            issue_category: {
              type: "string",
              enum: ["billing", "technical", "account", "feature_request", "other"]
            },
            priority: { type: "integer", minimum: 1, maximum: 5 },
            summary: { type: "string", maxLength: 100 }
          },
          required: ["issue_category", "priority", "summary"]
        }
      }
    }
  ],
  // Force the model to call extract_ticket rather than reply in free text.
  tool_choice: { type: "function", function: { name: "extract_ticket" } }
});
// The arguments arrive as a JSON string on the first entry of
// response.choices[0].message.tool_calls and still need JSON.parse.
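With tool_choice set to force the function call, the model always responds with a call to extract_ticket instead of free text, so explanation text cannot appear. Schema adherence is a separate matter: by default the arguments are not guaranteed to validate against the schema, or even to be complete JSON if the response is cut off at the token limit. OpenAI only constrains decoding to the schema when the function definition sets strict: true (Structured Outputs), which supports a restricted subset of JSON Schema. Either way, parse and validate the arguments before using them.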
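Reading the result means pulling the arguments out of the tool call and decoding them. The sketch below uses a hand-written dict that mirrors the shape of a Chat Completions response. The values are illustrative rather than real API output, read_tool_arguments is a hypothetical helper, and the official SDKs return objects with attributes rather than plain dicts.

```python
import json

# Hand-written sample standing in for an API response with one forced tool call.
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_example",
                        "type": "function",
                        "function": {
                            "name": "extract_ticket",
                            "arguments": '{"customer_email": "jane@example.com", '
                                         '"issue_category": "account", "priority": 5, '
                                         '"summary": "Cannot log in after password reset"}',
                        },
                    }
                ],
            }
        }
    ]
}


def read_tool_arguments(response: dict, expected_name: str) -> dict:
    """Decode the arguments of the first tool call, checking the function name."""
    tool_calls = response["choices"][0]["message"].get("tool_calls") or []
    if not tool_calls:
        raise ValueError("model returned no tool call")
    call = tool_calls[0]["function"]
    if call["name"] != expected_name:
        raise ValueError(f"unexpected function: {call['name']}")
    # The arguments field is a JSON-encoded string, so it still needs parsing.
    return json.loads(call["arguments"])


ticket = read_tool_arguments(sample_response, "extract_ticket")
print(ticket["issue_category"], ticket["priority"])  # prints: account 5
```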
Anthropic tool use equivalent:
import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment.
const anthropic = new Anthropic();

const response = await anthropic.messages.create({
  model: "claude-opus-4-5",
  max_tokens: 1024,
  tools: [
    {
      name: "extract_ticket",
      description: "Extract structured information from a support ticket",
      input_schema: {
        type: "object",
        properties: {
          customer_email: { type: "string" },
          issue_category: {
            type: "string",
            enum: ["billing", "technical", "account", "feature_request", "other"]
          },
          priority: { type: "integer" },
          summary: { type: "string" }
        },
        required: ["issue_category", "priority", "summary"]
      }
    }
  ],
  // Force the model to call extract_ticket rather than reply in free text.
  tool_choice: { type: "tool", name: "extract_ticket" },
  messages: [{ role: "user", content: "Extract ticket info from: ..." }]
});
// The arguments arrive as a parsed object on the content block whose type is "tool_use".
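Unlike OpenAI's JSON-encoded arguments string, Anthropic returns the tool input as an already-parsed object inside a content block. A minimal sketch of reading it, using a hand-written dict in place of a real response (the values are illustrative and read_tool_input is a hypothetical helper):

```python
# Hand-written sample standing in for a Messages API response with a forced tool call.
sample_message = {
    "role": "assistant",
    "stop_reason": "tool_use",
    "content": [
        {
            "type": "tool_use",
            "id": "toolu_example",
            "name": "extract_ticket",
            "input": {
                "customer_email": "jane@example.com",
                "issue_category": "account",
                "priority": 5,
                "summary": "Cannot log in after password reset",
            },
        }
    ],
}


def read_tool_input(message: dict, expected_name: str) -> dict:
    """Return the input of the first tool_use block for the named tool."""
    for block in message["content"]:
        if block.get("type") == "tool_use" and block.get("name") == expected_name:
            # Already a parsed object, so no JSON decoding step is needed.
            return block["input"]
    raise ValueError(f"no tool_use block for {expected_name}")


ticket = read_tool_input(sample_message, "extract_ticket")
print(ticket["issue_category"], ticket["priority"])  # prints: account 5
```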
Function calling is the right choice for production systems where JSON parsing failures are not acceptable.
Method 3: Schema Validation Libraries
For TypeScript, the Vercel AI SDK combined with Zod gives you schema-validated structured output with type inference:
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const ticketSchema = z.object({
  customer_email: z.string().email().nullable(),
  issue_category: z.enum(["billing", "technical", "account", "feature_request", "other"]),
  priority: z.number().int().min(1).max(5),
  summary: z.string().max(100)
});

const { object } = await generateObject({
  model: openai("gpt-4o"),
  schema: ticketSchema,
  prompt: "Extract ticket info from: ..."
});
// object is typed as z.infer<typeof ticketSchema>
// TypeScript knows customer_email is string | null, priority is number, etc.
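The Vercel AI SDK relies on the provider's tool calling or JSON schema mode under the hood and validates the result against the Zod schema. If the output does not match, generateObject throws an error instead of returning an invalid object. Its maxRetries option covers failed API calls rather than schema mismatches, so catch the error and re-prompt if you want schema-level retries. You get the reliability of function calling plus TypeScript type safety without manual schema-to-JSON-schema conversion.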
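A validate-and-retry loop of the kind discussed above can be sketched without any SDK. The version below is an illustration, not the Vercel AI SDK's or instructor's actual implementation: generate_validated, fake_model, and check_priority are hypothetical names, and the model call is a stub that fails once before returning valid output.

```python
import json
from typing import Callable


def generate_validated(
    call_model: Callable[[str], str],
    validate: Callable[[dict], list],
    prompt: str,
    max_attempts: int = 3,
) -> dict:
    """Call the model until its reply parses and validates, or attempts run out."""
    feedback = ""
    for _ in range(max_attempts):
        raw = call_model(prompt + feedback)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as err:
            problems = [f"invalid JSON: {err.msg}"]
        else:
            problems = validate(data) if isinstance(data, dict) else ["expected a JSON object"]
            if not problems:
                return data
        # Tell the model what was wrong so the next attempt can correct it.
        feedback = "\nYour previous reply was rejected: " + "; ".join(problems)
    raise ValueError(f"no valid output after {max_attempts} attempts")


# Stub standing in for a real model call: wrong type first, then a valid reply.
replies = iter(['{"priority": "high"}', '{"priority": 5}'])


def fake_model(prompt: str) -> str:
    return next(replies)


def check_priority(data: dict) -> list:
    ok = isinstance(data.get("priority"), int)
    return [] if ok else ["priority must be an integer"]


print(generate_validated(fake_model, check_priority, "Extract ticket info from: ..."))
# prints {'priority': 5}
```

Capping the attempts matters: each retry is another paid call, and a schema the model cannot satisfy would otherwise loop forever.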
For Python, the instructor library (by Jason Liu) provides equivalent functionality:
import instructor
from openai import OpenAI
from pydantic import BaseModel
from typing import Literal

client = instructor.from_openai(OpenAI())

class TicketExtraction(BaseModel):
    customer_email: str | None
    issue_category: Literal["billing", "technical", "account", "feature_request", "other"]
    priority: int
    summary: str

ticket = client.chat.completions.create(
    model="gpt-4o",
    response_model=TicketExtraction,
    messages=[{"role": "user", "content": "Extract ticket info from: ..."}]
)
# ticket is a TicketExtraction instance with validated fields
Common Failure Modes and How to Handle Them
Model adds explanation text before JSON. Add "Return only the JSON object. No explanation text before or after." If using JSON mode or function calling with a forced tool choice, this does not occur.
Nested objects get confused. Flatten the schema where possible. "address_street" and "address_city" as top-level fields is more reliable than an "address" object with "street" and "city" fields.
Arrays become comma-separated strings. Be explicit: "Return a JSON array (using square brackets) of strings, even if there is only one item." Or use function calling.
Numbers returned as strings. Specify types explicitly. "priority: integer from 1 to 5" not "priority: how urgent this is."
Model invents fields not in your schema. Some models (especially smaller ones) add extra fields. Add: "Include only the fields specified. Do not add additional fields."
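Prompt wording reduces these failures but does not remove them, so prompt-based pipelines often add a small repair step as a backstop. The sketch below handles three of the cases above: numeric strings, arrays that arrive as a bare or comma-separated value, and invented fields. The helper normalize_fields and the tags field are hypothetical and not part of the ticket schema used earlier.

```python
def normalize_fields(data: dict, int_fields: set, array_fields: set, allowed: set) -> dict:
    """Repair common near-misses in a parsed model reply."""
    cleaned = {}
    for key, value in data.items():
        if key not in allowed:
            continue  # drop fields the schema does not define
        if key in int_fields and isinstance(value, str):
            try:
                value = int(value)  # "42" becomes 42
            except ValueError:
                pass  # leave it unchanged for a validator to reject
        elif key in array_fields and isinstance(value, str):
            # "a, b" becomes ["a", "b"]; a single bare string becomes a one-item list.
            value = [item.strip() for item in value.split(",") if item.strip()]
        elif key in array_fields and not isinstance(value, list):
            value = [value]
        cleaned[key] = value
    return cleaned


reply = {"priority": "5", "tags": "login, urgent", "notes": "not in the schema"}
print(normalize_fields(reply, int_fields={"priority"}, array_fields={"tags"},
                       allowed={"priority", "tags"}))
# prints {'priority': 5, 'tags': ['login', 'urgent']}
```

Repairs like these only paper over the problem, which is the argument for function calling or a validation library once volume grows.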
When to Use Which Method
Prompt-based JSON: Quick scripts, internal tools where occasional failures are acceptable, tasks where you want to inspect what the model is doing without decoding a function call.
Function calling / tool use: Any production system, any user-facing feature, any pipeline where a parsing failure causes a visible error.
Zod + Vercel AI SDK or instructor: TypeScript or Python production code where you want type safety alongside reliability. The added abstraction is worth it when the schema is complex or changes frequently.