Structured Outputs and JSON Schema

Enforcing exact output formats from LLMs — JSON schema validation, grammar-constrained decoding, and production data extraction patterns

Published: 2026-04-08 · Last updated: 2026-04-13

LLMs naturally produce unstructured text, but many applications need reliable, parseable data. Structured output techniques ensure the model's response conforms to a specific schema, enabling downstream processing without fragile text parsing.

Why Structured Outputs Matter

# ❌ Unstructured output — fragile to parse
response = "The user has 3 orders. The most recent one was placed on April 1st, 2026 " \
           "and costs $149.99. It's currently being shipped."

# How do you extract the date? Regex? The amount? What if the format changes?

# ✅ Structured output — reliable
response = {
    "order_count": 3,
    "latest_order": {
        "date": "2026-04-01",
        "amount": 149.99,
        "status": "shipping"
    }
}
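
To make the fragility concrete, here is a small sketch of regex-based parsing. The pattern and the rephrased sentence are illustrative assumptions, not taken from a real system. The point is only that a pattern tuned to one phrasing silently returns nothing when the wording shifts, while a structured lookup does not depend on wording.

```python
import re
from typing import Optional

# Hypothetical pattern tuned to the exact phrasing of the first response.
AMOUNT_PATTERN = re.compile(r"costs \$(\d+\.\d{2})")

def parse_amount(text: str) -> Optional[float]:
    """Return the dollar amount if the text matches the expected phrasing."""
    match = AMOUNT_PATTERN.search(text)
    return float(match.group(1)) if match else None

original = "The most recent one was placed on April 1st, 2026 and costs $149.99."
rephrased = "The most recent one, placed on April 1st, 2026, totals 149.99 USD."

print(parse_amount(original))   # matches the expected phrasing
print(parse_amount(rephrased))  # None: same facts, but the regex no longer matches

# With structured output, the lookup is a plain key access.
structured = {"latest_order": {"date": "2026-04-01", "amount": 149.99}}
print(structured["latest_order"]["amount"])
```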

Method 1: Prompt-Based JSON Output

The simplest approach — ask the model to return JSON:

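from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

response = client.chat.completions.create(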
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": """Extract order information from:
        "I placed 3 orders with you. The latest one was on April 1st for $149.99 
         and the tracking shows it's shipping."
        
        Return ONLY valid JSON matching this schema:
        {
            "order_count": number,
            "latest_order": {
                "date": "YYYY-MM-DD",
                "amount": number,
                "status": "pending|processing|shipping|delivered"
            }
        }"""
    }],
    response_format={"type": "json_object"},  # OpenAI JSON mode: valid JSON syntax, but the schema is not enforced
)

import json
data = json.loads(response.choices[0].message.content)

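Limitations: JSON mode only guarantees syntactically valid JSON, and even that can fail if the reply is cut off at the token limit. Nothing at the generation level checks the output against your schema, so missing keys, wrong types, and invented status values can all get through.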
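
Because of these limitations, prompt-based output is usually wrapped in a defensive parser. The following is a minimal sketch using only the standard library; the function name `parse_model_json` and the `REQUIRED_KEYS` set are hypothetical, chosen to match the order schema in the prompt above.

```python
import json

# Top-level keys the prompt asked for (assumption: mirrors the schema above).
REQUIRED_KEYS = {"order_count", "latest_order"}

def parse_model_json(raw: str) -> dict:
    """Parse a model reply as JSON and check the top-level keys.

    Raises ValueError with a readable message so the caller can retry or log.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Reply is valid JSON but not an object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"Reply is missing keys: {sorted(missing)}")
    return data

good = '{"order_count": 3, "latest_order": {"date": "2026-04-01", "amount": 149.99, "status": "shipping"}}'
truncated = '{"order_count": 3, "latest_order": {"date": "2026-04-'

print(parse_model_json(good)["order_count"])
try:
    parse_model_json(truncated)
except ValueError as err:
    print(err)  # a truncated reply is reported instead of crashing the pipeline
```

This only checks top-level keys. Nested fields and value types still need a real schema validator, which is what the following methods provide.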

Method 2: OpenAI JSON Schema Mode

OpenAI can enforce a specific JSON schema during generation. Enforcement applies when strict is set to True, which in turn requires every object to list all of its properties under required and to set additionalProperties to False:

schema = {
    "type": "object",
    "properties": {
        "order_count": {"type": "integer"},
        "latest_order": {
            "type": "object",
            "properties": {
                "date": {"type": "string", "format": "date"},
                "amount": {"type": "number"},
                "status": {
                    "type": "string",
                    "enum": ["pending", "processing", "shipping", "delivered"]
                }
            },
            "required": ["date", "amount", "status"],
            "additionalProperties": False
        }
    },
    "required": ["order_count", "latest_order"],
    "additionalProperties": False
}

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Extract order info: I placed 3 orders..."
    }],
    # strict=True turns the schema from a hint into a decoding constraint
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "order", "strict": True, "schema": schema},
    },
)
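
OpenAI's strict mode expects every object in the schema to mark all of its properties as required and to forbid extra keys. Writing that by hand for nested schemas is error-prone, so a small helper can derive it. The sketch below is an assumption-laden convenience, not part of the OpenAI SDK: `to_strict_schema` is a hypothetical name, and it only handles plain nested objects and arrays.

```python
import copy

def to_strict_schema(schema: dict) -> dict:
    """Return a copy where every object requires all of its properties
    and sets additionalProperties to False."""
    strict = copy.deepcopy(schema)  # leave the caller's schema untouched

    def visit(node):
        if isinstance(node, dict):
            if node.get("type") == "object" and "properties" in node:
                node["required"] = list(node["properties"])
                node["additionalProperties"] = False
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(strict)
    return strict

schema = {
    "type": "object",
    "properties": {
        "order_count": {"type": "integer"},
        "latest_order": {
            "type": "object",
            "properties": {"amount": {"type": "number"}},
        },
    },
}

strict = to_strict_schema(schema)
print(strict["required"])
print(strict["properties"]["latest_order"]["additionalProperties"])
```

Note that this makes every field mandatory. A field that is genuinely optional is usually expressed in strict mode by allowing null as a value rather than by omitting the key.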

Method 3: Instructor (Pydantic Validation)

The instructor library adds Pydantic validation with automatic retry on schema violations:

import instructor
from pydantic import BaseModel, Field
from typing import Literal
from openai import OpenAI

client = instructor.from_openai(OpenAI())  # current entry point; instructor.patch is the older form

class Order(BaseModel):
    date: str = Field(pattern=r"^\d{4}-\d{2}-\d{2}$")  # anchored so extra text around the date is rejected
    amount: float = Field(gt=0)
    status: Literal["pending", "processing", "shipping", "delivered"]

class OrderSummary(BaseModel):
    order_count: int = Field(gt=0)
    latest_order: Order

result = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Extract: I placed 3 orders..."
    }],
    response_model=OrderSummary,
    max_retries=2,  # Auto-retry if validation fails
)

print(result.latest_order.amount)  # 149.99 (as float, not string!)
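
Conceptually, the retry works by validating each reply and, on failure, sending the validation error back to the model so the next attempt can correct it. The sketch below illustrates that loop with the standard library only. It is not Instructor's actual implementation: `validate_order` is a hand-rolled stand-in for Pydantic validation, and `fake_model` is a hypothetical stub that replaces the API call.

```python
import json
from typing import Callable

def validate_order(data) -> dict:
    """Stand-in for Pydantic validation; raises ValueError on bad data."""
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    amount = data.get("amount")
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        raise ValueError("amount must be a number, not a string")
    if amount <= 0:
        raise ValueError("amount must be greater than 0")
    return data

def extract_with_retries(call_model: Callable[[list], str], prompt: str,
                         max_retries: int = 2) -> dict:
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_retries + 1):
        reply = call_model(messages)
        try:
            # json.JSONDecodeError subclasses ValueError, so one except covers both
            return validate_order(json.loads(reply))
        except ValueError as err:
            # Feed the failure back so the next attempt can correct it.
            messages.append({"role": "assistant", "content": reply})
            messages.append({"role": "user",
                             "content": f"Validation failed: {err}. Please fix and resend."})
    raise RuntimeError(f"No valid reply after {max_retries + 1} attempts")

def fake_model(messages: list) -> str:
    """Returns a bad reply first, then a corrected one once it sees feedback."""
    if len(messages) == 1:
        return '{"amount": "149.99"}'  # amount as a string: fails validation
    return '{"amount": 149.99}'

result = extract_with_retries(fake_model, "Extract: I placed 3 orders...")
print(result)
```

Each retry is a full extra model call, so max_retries trades latency and cost against the chance of recovering from a bad reply.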

Method 4: Grammar-Constrained Decoding

For open-source models that you run yourself, grammar-constrained decoding restricts generation so that the model can only emit tokens that keep the output valid against a grammar or JSON schema:

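# Uses the outlines.generate interface from pre-1.0 Outlines releases; newer releases changed this API
from outlines import models, generate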

model = models.transformers("meta-llama/Llama-3.2-3B")

schema = """
{
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "email": {"type": "string"}
    },
    "required": ["name", "age", "email"]
}
"""

generator = generate.json(model, schema)
result = generator("Extract from: John is 30 years old, email john@example.com")
# The result always conforms to the schema, provided generation is not cut off by a token limit

How it works: The decoder's vocabulary is filtered at each step to only allow tokens that keep the output valid according to the grammar/schema.
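
The filtering step can be shown with a toy decoder. This is a deliberately simplified sketch, not how Outlines is implemented: the "grammar" is just a list of allowed strings, the vocabulary has nine made-up tokens, and `TOKEN_SCORES` holds fixed numbers standing in for model logits. Real implementations compile the schema into an automaton and mask logits over the full tokenizer vocabulary.

```python
# The only outputs the "grammar" accepts: one quoted status value.
ALLOWED = ['"pending"', '"processing"', '"shipping"', '"delivered"']

# Fixed scores standing in for model logits; "sorry" is the model's favorite.
TOKEN_SCORES = {
    "sorry": 5.0, "ship": 4.0, "ing": 3.5, '"': 3.0, "ping": 2.5,
    "pend": 2.0, "process": 1.5, "deliver": 1.0, "ed": 0.5,
}

def keeps_output_valid(prefix: str, token: str) -> bool:
    """True if prefix + token can still be completed into an allowed string."""
    candidate = prefix + token
    return any(value.startswith(candidate) for value in ALLOWED)

def constrained_decode(max_steps: int = 10) -> str:
    output = ""
    for _ in range(max_steps):
        # Filter the vocabulary down to tokens that keep the output valid.
        allowed_tokens = [t for t in TOKEN_SCORES if keeps_output_valid(output, t)]
        if not allowed_tokens:
            break
        # Greedy choice among the surviving tokens only.
        output += max(allowed_tokens, key=TOKEN_SCORES.get)
        if output in ALLOWED:
            break
    return output

def unconstrained_decode(steps: int = 2) -> str:
    """Greedy decoding with no filter, for contrast."""
    return "".join(max(TOKEN_SCORES, key=TOKEN_SCORES.get) for _ in range(steps))

print(unconstrained_decode())  # the top-scoring token wins every step
print(constrained_decode())    # always one of the ALLOWED strings
```

The model's preferences still matter: among the valid continuations, the highest-scoring one is chosen. The constraint removes invalid options but does not make the chosen value correct.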

Method 5: LMQL (Language Model Query Language)

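# LMQL constrains output at the token level.
# [NAME], [AGE], and [EMAIL] are holes the model fills; each where clause
# limits what its hole may contain. Exact syntax can differ between LMQL versions.
"Extract information: John is 30, john@example.com\n"
"Name: [NAME]\n" where STOPS_BEFORE(NAME, "\n")
"Age: [AGE]\n" where INT(AGE)
"Email: [EMAIL]" where STOPS_BEFORE(EMAIL, "\n")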

Production Data Extraction Pipeline

from typing import Optional

import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator

client = instructor.from_openai(OpenAI())

class ExtractedEntity(BaseModel):
    text: str
    label: str
    confidence: float = Field(ge=0.0, le=1.0)  # self-reported by the model

class DocumentExtraction(BaseModel):
    entities: list[ExtractedEntity]
    summary: str
    language: str
    has_pii: bool

    # Pydantic v2 style; @validator is the deprecated v1 decorator
    @field_validator("entities")
    @classmethod
    def validate_entities(cls, v):
        # Reject runaway extractions; a failure here triggers a retry
        if len(v) > 100:
            raise ValueError("Too many entities extracted")
        return v

def extract_from_document(doc_text: str) -> DocumentExtraction:
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"Extract all named entities, PII, and summarize:\n\n{doc_text}"
        }],
        response_model=DocumentExtraction,
        max_retries=3,
    )

# Usage (long_contract_text is assumed to hold the document text, loaded elsewhere)
result = extract_from_document(long_contract_text)
for entity in result.entities:
    print(f"{entity.text} -> {entity.label} ({entity.confidence:.2f})")

Common Extraction Patterns

Entity Extraction

class Person(BaseModel):
    name: str
    title: Optional[str] = None         # default needed; Optional alone is still required in Pydantic v2
    organization: Optional[str] = None
    email: Optional[str] = None

class DocumentAnalysis(BaseModel):
    people: list[Person]
    dates: list[str]
    monetary_amounts: list[float]
    key_topics: list[str]

Sentiment Analysis

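class AspectSentiment(BaseModel):
    aspect: str
    sentiment: Literal["positive", "negative", "neutral"]
    evidence: str

class SentimentAnalysis(BaseModel):
    overall: Literal["positive", "negative", "neutral"]
    confidence: float = Field(ge=0.0, le=1.0)
    aspects: list[AspectSentiment]  # typed model instead of a bare dict, so each aspect is validated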

Classification

class TicketClassification(BaseModel):
    category: Literal["billing", "technical", "account", "feature_request"]
    priority: Literal["low", "medium", "high", "urgent"]
    requires_human: bool
    suggested_response: str

Validation in Production

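Schema enforcement guarantees the shape of the output, not that the values are correct or consistent with each other, so validate model outputs even when a schema is enforced: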

def validate_extraction(result: DocumentExtraction) -> list[str]:
    """Additional business-logic validation beyond schema."""
    errors = []
    
    if result.has_pii and len(result.entities) == 0:
        errors.append("PII detected but no entities extracted")
    
    if len(result.summary) > 500:
        errors.append("Summary exceeds length limit")
    
    for entity in result.entities:
        if entity.confidence < 0.5:
            errors.append(f"Low confidence on entity: {entity.text}")
    
    return errors

Key Takeaways

  • OpenAI's response_format guarantees syntactically valid JSON in json_object mode and schema-conformant JSON in json_schema mode with strict enabled
  • Pydantic + Instructor adds type-safe validation with automatic retries
  • Grammar-constrained decoding (Outlines, LMQL) keeps output valid against the grammar or schema for open-source models, as long as generation is not truncated
  • Always add business-logic validation on top of schema validation
  • Structured outputs enable reliable downstream processing and integration
