Why JSON Mode Is Not Enough
OpenAI's JSON mode guarantees syntactically valid JSON — but it does not guarantee the JSON matches your schema. You get {"name": null} when you expected {"name": "Alice"}, or an extra key that breaks your downstream parser. There is no retry, no validation, no error message. You are left writing bespoke parsing logic for every model and every use case.
Instructor solves this by combining Pydantic models with automatic retry-on-validation-error. Define what you want, and Instructor loops until the LLM produces it — or raises after max_retries attempts.
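The validate-and-retry idea can be sketched with the standard library alone. This is a conceptual illustration, not Instructor's actual implementation: `validate_person`, `extract_with_retries`, and the canned `replies` standing in for an LLM are all hypothetical names introduced here.

```python
import json
from typing import Callable


def validate_person(data: dict) -> list:
    """Return a list of schema problems; an empty list means the data is valid."""
    errors = []
    if not isinstance(data.get("name"), str):
        errors.append("name must be a string")
    age = data.get("age")
    if not isinstance(age, int) or isinstance(age, bool):
        errors.append("age must be an integer")
    extra = set(data) - {"name", "age", "email"}
    if extra:
        errors.append(f"unexpected keys: {sorted(extra)}")
    return errors


def extract_with_retries(ask_model: Callable[[str], str], prompt: str, max_retries: int = 3) -> dict:
    """Call the model, validate its JSON, and re-ask with the errors on failure."""
    message = prompt
    errors = []
    for _ in range(max_retries):
        raw = ask_model(message)
        try:
            data = json.loads(raw)
            if isinstance(data, dict):
                errors = validate_person(data)
            else:
                errors = ["top level must be a JSON object"]
        except json.JSONDecodeError as exc:
            errors = [f"invalid JSON: {exc}"]
        if not errors:
            return data
        # Feed the validation errors back so the next attempt can correct them.
        message = f"{prompt}\n\nYour last answer was rejected: {'; '.join(errors)}. Please fix it."
    raise ValueError(f"no valid response after {max_retries} attempts: {errors}")


# Hypothetical stand-in for an LLM call: wrong on the first try, right on the second.
replies = iter(['{"name": null, "age": 34}', '{"name": "John Doe", "age": 34}'])
result = extract_with_retries(lambda _msg: next(replies), "Extract the person.")
print(result)  # {'name': 'John Doe', 'age': 34}
```

The first canned reply is valid JSON but fails the schema check, which is exactly the case JSON mode alone does not catch. The second reply passes and is returned.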
Installation
pip install instructor
Basic Extraction
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Wrap the OpenAI client so that create() accepts response_model
client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    email: str | None = None

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "John Doe is 34 and works at john@acme.com"}],
)

print(person)  # e.g. name='John Doe' age=34 email='john@acme.com'
The return value is a fully validated Pydantic model — not a dict, not a string.
Automatic Retry on Validation Errors
Add field-level validators and Instructor handles retries automatically:
from pydantic import field_validator

class CVData(BaseModel):
    name: str
    years_experience: int
    skills: list[str]

    @field_validator("years_experience")
    @classmethod
    def must_be_non_negative(cls, v: int) -> int:
        # Zero is allowed; only negative values are rejected
        if v < 0:
            raise ValueError("years_experience must be non-negative")
        return v

cv = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=CVData,
    messages=[{"role": "user", "content": "Jane has 5 years exp in Python, SQL, and ML."}],
    max_retries=3,
)
If the model returns -5 for years_experience, Instructor sends the Pydantic error back to the model and asks it to fix the value — up to 3 times.
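To see the kind of error text involved, the validator can be run locally with Pydantic alone, without calling a model. This is a minimal sketch assuming Pydantic v2; `validation_feedback` is a hypothetical helper, and exactly how Instructor wraps this text into its follow-up prompt is not shown here.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, field_validator


class CVData(BaseModel):
    name: str
    years_experience: int
    skills: list[str]

    @field_validator("years_experience")
    @classmethod
    def must_be_non_negative(cls, v: int) -> int:
        if v < 0:
            raise ValueError("years_experience must be non-negative")
        return v


def validation_feedback(raw_json: str) -> Optional[str]:
    """Return the Pydantic error text for a bad payload, or None if it validates."""
    try:
        CVData.model_validate_json(raw_json)
    except ValidationError as exc:
        return str(exc)
    return None


bad = '{"name": "Jane", "years_experience": -5, "skills": ["Python"]}'
good = '{"name": "Jane", "years_experience": 5, "skills": ["Python"]}'

print(validation_feedback(bad))   # multi-line error naming the years_experience field
print(validation_feedback(good))  # None
```

The error names the failing field and carries the validator's own message, so a clear message in `raise ValueError(...)` gives the model more to work with on the next attempt.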
Multi-Provider Support
Instructor is not limited to OpenAI-compatible clients. It also ships wrappers for other providers' SDKs, such as Anthropic's:
pip install instructor anthropic
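import anthropic
import instructor

# Named separately so it does not shadow the OpenAI-backed client used elsewhere
anthropic_client = instructor.from_anthropic(anthropic.Anthropic())
# response_model works the same way; note that Anthropic also requires max_tokens on each call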
Instructor also works with Groq, Ollama (via the openai client with a custom base_url), Google Gemini, Mistral, and more.
Partial Streaming
Stream partial Pydantic objects as they are generated:
for partial_person in client.chat.completions.create_partial(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 28, alice@example.com"}],
):
    print(partial_person)  # name='Alice' age=None email=None → ... → fully populated
Practical Example: Search Query Extraction
class SearchQuery(BaseModel):
    intent: str
    keywords: list[str]
    date_range: str | None = None
    max_results: int = 10

query = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=SearchQuery,
    messages=[{"role": "user", "content": "Find Python ML papers from last year, top 5"}],
)
# Exact values vary by model and run, e.g.:
# SearchQuery(intent='research', keywords=['Python', 'ML'], date_range='last year', max_results=5)
Full documentation at python.useinstructor.com.