Best Practices
Structured Outputs and JSON Schema
Enforcing exact output formats from LLMs — JSON schema validation, grammar-constrained decoding, and production data extraction patterns
Published: 2026-04-08 · Last updated: 2026-04-13
LLMs naturally produce unstructured text, but many applications need reliable, parseable data. Structured output techniques ensure the model's response conforms to a specific schema, enabling downstream processing without fragile text parsing.
Why Structured Outputs Matter
```python
# ❌ Unstructured output — fragile to parse
response = "The user has 3 orders. The most recent one was placed on April 1st, 2026 " \
           "and costs $149.99. It's currently being shipped."
# How do you extract the date? Regex? The amount? What if the format changes?

# ✅ Structured output — reliable
response = {
    "order_count": 3,
    "latest_order": {
        "date": "2026-04-01",
        "amount": 149.99,
        "status": "shipping",
    },
}
```
Method 1: Prompt-Based JSON Output
The simplest approach — ask the model to return JSON:
```python
import json

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": """Extract order information from:
"I placed 3 orders with you. The latest one was on April 1st for $149.99
and the tracking shows it's shipping."

Return ONLY valid JSON matching this schema:
{
    "order_count": number,
    "latest_order": {
        "date": "YYYY-MM-DD",
        "amount": number,
        "status": "pending|processing|shipping|delivered"
    }
}"""
    }],
    response_format={"type": "json_object"},  # OpenAI: enforces JSON syntax
)

data = json.loads(response.choices[0].message.content)
```
Limitations: JSON mode constrains syntax, not structure. Fields can still be missing, misnamed, or mistyped, and nothing is validated against your schema at generation time.
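Because of this, prompt-based JSON output deserves a defensive parsing layer. A minimal sketch of one (the fence-stripping and substring heuristics are common workarounds, not part of any SDK):

```python
import json
import re

def parse_json_response(text: str) -> dict:
    """Parse model output as JSON, tolerating common formatting slips."""
    # Strip markdown code fences the model sometimes adds despite instructions
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the first {...} span embedded in surrounding prose
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if match:
            return json.loads(match.group(0))
        raise

data = parse_json_response('```json\n{"order_count": 3}\n```')
```

Pair this with schema validation downstream; it only rescues formatting slips, not structural errors.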
Method 2: OpenAI JSON Schema Mode
OpenAI supports enforcing a specific JSON schema:
```python
schema = {
    "type": "object",
    "properties": {
        "order_count": {"type": "integer"},
        "latest_order": {
            "type": "object",
            "properties": {
                "date": {"type": "string", "format": "date"},
                "amount": {"type": "number"},
                "status": {
                    "type": "string",
                    "enum": ["pending", "processing", "shipping", "delivered"]
                }
            },
            "required": ["date", "amount", "status"]
        }
    },
    "required": ["order_count", "latest_order"]
}

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Extract order info: I placed 3 orders..."
    }],
    response_format={"type": "json_schema", "json_schema": {"name": "order", "schema": schema}},
)
```
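Even with schema enforcement at the API level, it is cheap to re-check critical invariants after parsing, as defense in depth against provider quirks or schema drift. A stdlib-only sketch (field names follow the schema above):

```python
ALLOWED_STATUSES = {"pending", "processing", "shipping", "delivered"}

def check_order_payload(payload: dict) -> list[str]:
    """Re-verify the invariants the schema promises; returns a list of errors."""
    errors = []
    if not isinstance(payload.get("order_count"), int):
        errors.append("order_count must be an integer")
    order = payload.get("latest_order", {})
    for field in ("date", "amount", "status"):
        if field not in order:
            errors.append(f"latest_order missing {field}")
    if order.get("status") not in ALLOWED_STATUSES:
        errors.append(f"invalid status: {order.get('status')!r}")
    return errors

payload = {"order_count": 3,
           "latest_order": {"date": "2026-04-01", "amount": 149.99, "status": "shipping"}}
assert check_order_payload(payload) == []
```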
Method 3: Instructor (Pydantic Validation)
The instructor library adds Pydantic validation with automatic retry on schema violations:
```python
import instructor
from pydantic import BaseModel, Field
from typing import Literal
from openai import OpenAI

client = instructor.patch(OpenAI())

class Order(BaseModel):
    date: str = Field(pattern=r"\d{4}-\d{2}-\d{2}")
    amount: float = Field(gt=0)
    status: Literal["pending", "processing", "shipping", "delivered"]

class OrderSummary(BaseModel):
    order_count: int = Field(gt=0)
    latest_order: Order

result = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Extract: I placed 3 orders..."
    }],
    response_model=OrderSummary,
    max_retries=2,  # Auto-retry if validation fails
)

print(result.latest_order.amount)  # 149.99 (a float, not a string)
```
Method 4: Grammar-Constrained Decoding
For open-source models, grammar-constrained decoding forces the model to ONLY generate valid JSON:
```python
from outlines import models, generate

model = models.transformers("meta-llama/Llama-3.2-3B")

schema = """
{
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "email": {"type": "string"}
    },
    "required": ["name", "age", "email"]
}
"""

generator = generate.json(model, schema)
result = generator("Extract from: John is 30 years old, email john@example.com")
# Guaranteed valid JSON — the decoder physically cannot produce invalid tokens
```
How it works: The decoder's vocabulary is filtered at each step to only allow tokens that keep the output valid according to the grammar/schema.
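The filtering idea can be illustrated with a toy decoder: at each step, the vocabulary is masked to tokens whose addition leaves the output a valid prefix of some grammar-legal string, and the highest-scoring legal token is chosen. A deliberately simplified sketch with a hand-built vocabulary and an enum standing in for the grammar:

```python
VOCAB = ["pend", "process", "ship", "ping", "ing", "deliver", "ed", "xyz"]
VALID_OUTPUTS = {"pending", "processing", "shipping", "delivered"}

def allowed_tokens(partial: str) -> list[str]:
    """Vocabulary entries that keep `partial` a prefix of some valid output."""
    return [t for t in VOCAB
            if any(v.startswith(partial + t) for v in VALID_OUTPUTS)]

def constrained_decode(score) -> str:
    """Greedy decoding with the vocabulary masked at every step."""
    out = ""
    while out not in VALID_OUTPUTS:
        out += max(allowed_tokens(out), key=score)  # highest-scoring *legal* token
    return out

# A dummy scoring function standing in for the model's logits:
print(constrained_decode(lambda t: len(t) if t in ("ship", "ping") else 0))  # shipping
```

Real implementations like Outlines precompile the schema into a finite-state machine over token IDs so the mask is a cheap lookup rather than a prefix scan, but the principle is the same.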
Method 5: LMQL (Language Model Query Language)
LMQL interleaves prompt text with typed holes that are filled under token-level constraints. An illustrative query (syntax abridged; details vary by LMQL version):

```
argmax
    "Extract information: John is 30, john@example.com\n"
    "Name: [NAME]\n"
    "Age: [AGE]\n"
    "Email: [EMAIL]"
where
    STOPS_AT(NAME, "\n") and INT(AGE) and STOPS_AT(EMAIL, "\n")
```
Production Data Extraction Pipeline
```python
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator

client = instructor.patch(OpenAI())

class ExtractedEntity(BaseModel):
    text: str
    label: str
    confidence: float = Field(ge=0.0, le=1.0)

class DocumentExtraction(BaseModel):
    entities: list[ExtractedEntity]
    summary: str
    language: str
    has_pii: bool

    @field_validator("entities")
    @classmethod
    def validate_entities(cls, v):
        if len(v) > 100:
            raise ValueError("Too many entities extracted")
        return v

def extract_from_document(doc_text: str) -> DocumentExtraction:
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"Extract all named entities, PII, and summarize:\n\n{doc_text}"
        }],
        response_model=DocumentExtraction,
        max_retries=3,
    )

# Usage
result = extract_from_document(long_contract_text)
for entity in result.entities:
    print(f"{entity.text} → {entity.label} ({entity.confidence:.2f})")
```
Common Extraction Patterns
Entity Extraction
```python
from typing import Optional

class Person(BaseModel):
    name: str
    title: Optional[str] = None
    organization: Optional[str] = None
    email: Optional[str] = None

class DocumentAnalysis(BaseModel):
    people: list[Person]
    dates: list[str]
    monetary_amounts: list[float]
    key_topics: list[str]
```
Sentiment Analysis
```python
class SentimentAnalysis(BaseModel):
    overall: Literal["positive", "negative", "neutral"]
    confidence: float
    aspects: list[dict]  # {"aspect": str, "sentiment": str, "evidence": str}
```
Classification
```python
class TicketClassification(BaseModel):
    category: Literal["billing", "technical", "account", "feature_request"]
    priority: Literal["low", "medium", "high", "urgent"]
    requires_human: bool
    suggested_response: str
```
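Once a ticket is classified into a fixed schema, downstream routing becomes a plain lookup rather than text parsing. A minimal sketch (the queue names are invented for illustration):

```python
def route_ticket(category: str, priority: str, requires_human: bool) -> str:
    """Map a validated classification onto a work queue (names are made up)."""
    if priority == "urgent" or requires_human:
        return f"human-{category}"
    if priority in ("low", "medium"):
        return "auto-responder"
    return f"queue-{category}"

print(route_ticket("billing", "urgent", False))  # human-billing
```

Because `category` and `priority` come from `Literal` fields, the routing logic never has to handle free-text variants like "URGENT!!" or "tech support".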
Validation in Production
Always validate model outputs even with schema enforcement:
```python
def validate_extraction(result: DocumentExtraction) -> list[str]:
    """Additional business-logic validation beyond schema."""
    errors = []
    if result.has_pii and len(result.entities) == 0:
        errors.append("PII detected but no entities extracted")
    if len(result.summary) > 500:
        errors.append("Summary exceeds length limit")
    for entity in result.entities:
        if entity.confidence < 0.5:
            errors.append(f"Low confidence on entity: {entity.text}")
    return errors
```
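Business-logic checks like this compose naturally with extraction in a retry loop. A sketch with the extractor injected as a callable so it can be stubbed in tests (the function names here are illustrative, not from any library):

```python
def extract_with_validation(extract, validate, max_attempts: int = 3):
    """Re-run `extract` until `validate` reports no errors, or give up."""
    errors: list[str] = []
    for _ in range(max_attempts):
        result = extract()
        errors = validate(result)
        if not errors:
            return result
    raise ValueError(f"extraction failed after {max_attempts} attempts: {errors}")

# Stubbed demo: the first draft fails the length check, the retry passes
attempts = iter([{"summary": "x" * 600}, {"summary": "ok"}])
result = extract_with_validation(
    extract=lambda: next(attempts),
    validate=lambda r: ["summary too long"] if len(r["summary"]) > 500 else [],
)
assert result == {"summary": "ok"}
```

In production you would typically also feed the validation errors back into the next prompt, which is what Instructor's `max_retries` does for schema violations.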
Key Takeaways
- OpenAI's `response_format` enforces JSON output at the API level
- Pydantic + Instructor adds type-safe validation with automatic retries
- Grammar-constrained decoding (Outlines, LMQL) guarantees valid output for open-source models
- Always add business-logic validation on top of schema validation
- Structured outputs enable reliable downstream processing and integration
Related Documentation
- Function Calling — Combining tools with structured output
- Evaluation Metrics — Measuring extraction quality
- Prompt Engineering — Designing prompts for structured output