Getting Structured Output From LLMs: JSON, XML, and Format Control

Getting reliable structured output from LLMs requires more than asking for JSON. Three methods from least to most reliable, with real prompt examples and failure modes for each.

Mahmudul Haque Qudrati

CEO & ML Engineer

May 17, 2026

9 min read



For any production application that consumes LLM output programmatically, structured output is not optional. You cannot reliably parse free-form text at scale. A pipeline that returns valid JSON 99% of the time still breaks your parser on the other 1%, which at 100,000 calls per day means 1,000 failures. Getting structured output reliably requires choosing the right method for your use case.

The three methods, in order from least to most reliable: prompt-based JSON requests, function calling or tool use, and schema validation libraries like instructor or Zod with the Vercel AI SDK.

Why Free-Form Output Fails at Scale

When you ask an LLM to "respond with JSON," you are making a statistical request. The model generates JSON most of the time, but not all the time. Common failure modes:

The model adds an explanation sentence before the JSON block ("Here is the JSON you requested: {...")
The model wraps JSON in a Markdown code fence (three backticks followed by "json") when you expected bare JSON
String values contain unescaped quotes that break JSON parsing
Nested objects get flattened or restructured
Arrays with one element are returned as a bare value instead of a single-element array
Numbers are returned as strings ("42" instead of 42)

At low volume, you catch these with a try-catch and log an error. At scale, you need a method that eliminates the failure mode rather than handling it.
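The low-volume approach can be sketched as a tolerant parser that handles the first two failure modes listed above (a preamble sentence and a Markdown fence). This is a minimal illustration using only the standard library; the function name extract_json_object is hypothetical, and it is a stopgap rather than a fix for the underlying problem.

```python
import json
import re

# A Markdown code fence is three backticks; it is built here with "*" only to
# keep the example easy to embed in other documents.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def extract_json_object(raw: str) -> dict:
    """Best-effort recovery of one JSON object from a chatty model reply."""
    match = FENCE_RE.search(raw)
    # Prefer the fenced content if there is a fence, else use the whole reply.
    candidate = match.group(1) if match else raw
    start = candidate.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses a single JSON value and ignores any trailing text.
    # Malformed JSON raises json.JSONDecodeError, a subclass of ValueError.
    obj, _end = json.JSONDecoder().raw_decode(candidate[start:])
    return obj
```

Wrapping the call in try/except ValueError and logging the raw reply gives the "catch and log" behavior described above. It does nothing for wrong types or restructured objects, which is why the later methods matter.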

Method 1: Prompt-Based JSON Request

Ask for JSON directly and specify the schema in the prompt. This works roughly 80 to 90% of the time with current frontier models, depending on the complexity of the schema and the model.

Basic example:

Extract the key information from the following support ticket and return it as a JSON object with exactly these fields:
- ticket_id: string or null
- customer_email: string or null
- issue_category: one of ["billing", "technical", "account", "feature_request", "other"]
- priority: integer from 1 (low) to 5 (critical)
- summary: string, maximum 100 characters

Return only the JSON object. No explanation, no markdown fencing.

Ticket: "Hi, I'm trying to log in but keep getting 'invalid credentials' even though I just reset my password 5 minutes ago. My email is jane@example.com. This is urgent, I have a client presentation in 20 minutes."

Expected output:

{
  "ticket_id": null,
  "customer_email": "jane@example.com",
  "issue_category": "account",
  "priority": 5,
  "summary": "Customer cannot log in after password reset, urgent due to upcoming client presentation"
}
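Because a prompt-based request offers no guarantee, it helps to check the parsed object against the same rules the prompt states. The sketch below is one way to do that with the standard library. The name validate_ticket is hypothetical, and it assumes ticket_id may be null when the ticket text contains no ID, as in the expected output above.

```python
ALLOWED_CATEGORIES = {"billing", "technical", "account", "feature_request", "other"}
EXPECTED_FIELDS = {"ticket_id", "customer_email", "issue_category", "priority", "summary"}


def validate_ticket(data: dict) -> list[str]:
    """Return a list of problems; an empty list means the object passed."""
    errors = []
    missing = EXPECTED_FIELDS - set(data)
    extra = set(data) - EXPECTED_FIELDS
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    if extra:
        errors.append(f"unexpected fields: {sorted(extra)}")

    # Assumption: both of these fields are nullable strings.
    for field in ("ticket_id", "customer_email"):
        value = data.get(field)
        if value is not None and not isinstance(value, str):
            errors.append(f"{field} must be a string or null")

    category = data.get("issue_category")
    if not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
        errors.append("issue_category is not one of the allowed values")

    priority = data.get("priority")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(priority, int) or isinstance(priority, bool) or not 1 <= priority <= 5:
        errors.append("priority must be an integer from 1 to 5")

    summary = data.get("summary")
    if not isinstance(summary, str) or len(summary) > 100:
        errors.append("summary must be a string of at most 100 characters")
    return errors
```

Returning a list of messages rather than raising on the first problem makes it easy to log every violation at once, or to feed them back to the model in a follow-up request.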

Tips to improve reliability:

Say "Return only the JSON object" and "No explanation" explicitly
List allowed values for enum fields
Specify types explicitly ("string or null" not just "the email")
Keep the schema flat where possible; deep nesting increases errors
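If using the OpenAI API, set response_format: { type: "json_object" } to engage JSON mode. JSON mode guarantees syntactically valid JSON, not conformance to your schema, and the prompt must still mention JSON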
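As a rough illustration of the JSON mode tip, the helper below passes response_format to the OpenAI chat completions endpoint and parses the reply. It is a sketch under assumptions: extract_with_json_mode is a hypothetical name, the system prompt wording is illustrative, and the client is passed in (for example OpenAI() from the openai package) so the function can be exercised with a stub.

```python
import json


def extract_with_json_mode(client, ticket_text: str, model: str = "gpt-4o") -> dict:
    """Ask for a JSON object using OpenAI JSON mode and parse the reply."""
    response = client.chat.completions.create(
        model=model,
        # JSON mode: the reply is syntactically valid JSON, but the API does
        # not check it against any schema.
        response_format={"type": "json_object"},
        messages=[
            # The API expects the word "JSON" to appear somewhere in the messages.
            {"role": "system", "content": "Extract the ticket fields and reply with a JSON object only."},
            {"role": "user", "content": ticket_text},
        ],
    )
    return json.loads(response.choices[0].message.content)
```

The parsed object still needs field-level validation, since JSON mode only addresses syntax.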

Method 2: Function Calling and Tool Use

Function calling (OpenAI) and tool use (Anthropic) are API features that have the model answer by filling in the arguments of a function whose schema you define. Instead of free text, the model generates a structured function call argument. This is the most reliable method for structured output without a validation library.

OpenAI function calling example (TypeScript):

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: "Extract ticket info from: 'Cannot log in after password reset, email jane@example.com, urgent, client presentation in 20 minutes'"
    }
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "extract_ticket",
        description: "Extract structured information from a support ticket",
        parameters: {
          type: "object",
          properties: {
            customer_email: { type: ["string", "null"] },
            issue_category: {
              type: "string",
              enum: ["billing", "technical", "account", "feature_request", "other"]
            },
            priority: { type: "integer", minimum: 1, maximum: 5 },
            summary: { type: "string", maxLength: 100 }
          },
          required: ["issue_category", "priority", "summary"]
        }
      }
    }
  ],
  tool_choice: { type: "function", function: { name: "extract_ticket" } }
});

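With tool_choice set to force the function call, the model must respond with a call to extract_ticket, so it cannot return explanation text instead. Forcing the call does not by itself guarantee that the arguments match the schema. Unless you also enable OpenAI's strict mode (strict: true on the function definition, which constrains decoding to the schema), the arguments can occasionally be malformed or omit fields, so validate them before use. Strict mode places extra requirements on the schema, so check the current documentation before enabling it.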
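The structured result arrives as a JSON string on the tool call, not as message text. A minimal Python sketch of reading it follows. It assumes the Python SDK's response object, which exposes the same fields as the TypeScript one (choices, message.tool_calls, function.arguments); read_tool_arguments is a hypothetical helper name.

```python
import json


def read_tool_arguments(response, expected_name: str) -> dict:
    """Return the parsed arguments of the first call to expected_name."""
    message = response.choices[0].message
    # tool_calls is None when the model answered with plain text instead.
    for call in message.tool_calls or []:
        if call.function.name == expected_name:
            # arguments is a JSON-encoded string, so it still has to be parsed.
            return json.loads(call.function.arguments)
    raise ValueError(f"model did not call {expected_name}")
```

With the forced tool_choice shown above, the ValueError branch should be rare, but keeping it makes the failure explicit if the request configuration changes later.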

Anthropic tool use equivalent:

const response = await anthropic.messages.create({
  model: "claude-opus-4-5",
  max_tokens: 1024,
  tools: [
    {
      name: "extract_ticket",
      description: "Extract structured information from a support ticket",
      input_schema: {
        type: "object",
        properties: {
          customer_email: { type: "string" },
          issue_category: {
            type: "string",
            enum: ["billing", "technical", "account", "feature_request", "other"]
          },
          priority: { type: "integer" },
          summary: { type: "string" }
        },
        required: ["issue_category", "priority", "summary"]
      }
    }
  ],
  tool_choice: { type: "tool", name: "extract_ticket" },
  messages: [{ role: "user", content: "Extract ticket info from: ..." }]
});
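Anthropic returns the result in a different shape: the response content is a list of blocks, and the tool call is a block of type tool_use whose input is already a parsed object. The sketch below assumes the Python SDK's response object with those attributes; read_tool_input is a hypothetical helper name.

```python
def read_tool_input(response, expected_name: str) -> dict:
    """Return the input of the first tool_use block that targets expected_name."""
    for block in response.content:
        # Text blocks can appear alongside tool_use blocks, so filter by type.
        if block.type == "tool_use" and block.name == expected_name:
            # Unlike OpenAI's arguments string, input is already a dict.
            return block.input
    raise ValueError(f"model did not use the {expected_name} tool")
```

Note that the Anthropic schema above omits the numeric range and length limits, so the returned dict still needs those checks applied in application code.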

Function calling is the right choice for production systems where JSON parsing failures are not acceptable.

Method 3: Schema Validation Libraries

For TypeScript, the Vercel AI SDK combined with Zod gives you schema-validated structured output with type inference:

import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const ticketSchema = z.object({
  customer_email: z.string().email().nullable(),
  issue_category: z.enum(["billing", "technical", "account", "feature_request", "other"]),
  priority: z.number().int().min(1).max(5),
  summary: z.string().max(100)
});

const { object } = await generateObject({
  model: openai("gpt-4o"),
  schema: ticketSchema,
  prompt: "Extract ticket info from: ..."
});

// object is typed as z.infer<typeof ticketSchema>
// TypeScript knows customer_email is string | null, priority is number, etc.

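The Vercel AI SDK handles the structured-output request for you (function calling or the provider's native JSON schema mode, depending on the provider and settings) and validates the result against the Zod schema. If the output does not match the schema, generateObject throws an error rather than returning an invalid object, so bad data never reaches your code; wrap the call in your own retry if you want another attempt. This gives you both the reliability of function calling and TypeScript type safety without manual schema-to-JSON-schema conversion.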

For Python, the instructor library (by Jason Liu) provides equivalent functionality:

import instructor
from openai import OpenAI
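from pydantic import BaseModel, Field
from typing import Literal

client = instructor.from_openai(OpenAI())

class TicketExtraction(BaseModel):
    customer_email: str | None
    issue_category: Literal["billing", "technical", "account", "feature_request", "other"]
    priority: int = Field(ge=1, le=5)  # same 1-5 range as the prompt and Zod schemas
    summary: str = Field(max_length=100)  # same 100-character cap as the other schemas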

ticket = client.chat.completions.create(
    model="gpt-4o",
    response_model=TicketExtraction,
    messages=[{"role": "user", "content": "Extract ticket info from: ..."}]
)
# ticket is a TicketExtraction instance with validated fields
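The pattern behind these libraries (generate, validate, and ask again with the error on failure) can be sketched without either of them. This is an illustration of the idea only, not how instructor or the Vercel AI SDK is implemented. The names generate_validated, call_model, and validate are hypothetical: call_model takes a message list and returns the raw reply text, and validate returns the checked object or raises ValueError.

```python
import json


def generate_validated(call_model, validate, prompt: str, max_retries: int = 2):
    """Call the model until its JSON reply passes validation or retries run out."""
    messages = [{"role": "user", "content": prompt}]
    for _attempt in range(max_retries + 1):
        raw = call_model(messages)
        try:
            # json.JSONDecodeError subclasses ValueError, so one except
            # clause covers both syntax errors and schema violations.
            return validate(json.loads(raw))
        except ValueError as exc:
            # Show the model its own output and the reason it was rejected.
            messages.append({"role": "assistant", "content": raw})
            messages.append({
                "role": "user",
                "content": f"That output was invalid: {exc}. Return corrected JSON only.",
            })
    raise RuntimeError(f"no valid output after {max_retries + 1} attempts")
```

Feeding the validation error back gives the model a concrete correction target, which tends to work better than repeating the original prompt unchanged. Each retry costs another API call, so keep the limit small.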

Common Failure Modes and How to Handle Them

Model adds explanation text before JSON. Add "Return only the JSON object. No explanation text before or after." If using JSON mode or function calling, this does not occur.

Nested objects get confused. Flatten the schema where possible. "address_street" and "address_city" as top-level fields is more reliable than an "address" object with "street" and "city" fields.

Arrays become comma-separated strings. Be explicit: "Return a JSON array (using square brackets) of strings, even if there is only one item." Or use function calling.

Numbers returned as strings. Specify types explicitly. "priority: integer from 1 to 5" not "priority: how urgent this is."

Model invents fields not in your schema. Some models (especially smaller ones) add extra fields. Add: "Include only the fields specified. Do not add additional fields."
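Prompt wording reduces these failures but does not remove them, so a small normalization pass before validation can serve as a second line of defense. The helpers below are a sketch with hypothetical names. They assume that a comma inside a string means the model flattened an array, which will be wrong for values that legitimately contain commas.

```python
def coerce_int(value):
    """Accept 42 or "42"; reject everything else, including booleans and floats."""
    if isinstance(value, bool):
        raise ValueError(f"cannot interpret {value!r} as an integer")
    if isinstance(value, int):
        return value
    if isinstance(value, str) and value.strip().lstrip("-").isdigit():
        return int(value.strip())
    raise ValueError(f"cannot interpret {value!r} as an integer")


def ensure_list(value):
    """Repair a bare value or a comma-separated string into a list."""
    if isinstance(value, list):
        return value
    # Assumption: a comma in a string means the array was flattened.
    if isinstance(value, str) and "," in value:
        return [part.strip() for part in value.split(",") if part.strip()]
    return [value]


def drop_unknown_fields(data: dict, allowed: set[str]) -> dict:
    """Discard fields the model invented that are not part of the schema."""
    return {key: value for key, value in data.items() if key in allowed}
```

Silent repair can hide a degrading prompt, so it is worth logging each time one of these helpers changes a value.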

When to Use Which Method

Prompt-based JSON: Quick scripts, internal tools where occasional failures are acceptable, tasks where you want to inspect what the model is doing without decoding a function call.

Function calling / tool use: Any production system, any user-facing feature, any pipeline where a parsing failure causes a visible error.

Zod + Vercel AI SDK or instructor: TypeScript or Python production code where you want type safety alongside reliability. The added abstraction is worth it when the schema is complex or changes frequently.

Keep Reading

Prompt Chaining: How to Break Complex Tasks Into Reliable Steps — Structured output is essential when prompt chains pass data between steps
Prompt Engineering Complete Guide 2026 — Structured output in the context of every other major prompting technique
How to Build an AI Agent — Agents depend on structured output to parse tool results; the methods here are directly applicable
