Every major LLM now has a 'give me valid JSON' mode. They're not created equal. A practical breakdown of how Claude, GPT, and Gemini handle structured output — with real code and the gotchas nobody puts in the docs.

Structured Output Without the Hallucination Hangover: JSON Schema Modes Compared

Every major LLM provider now has a dedicated mode for getting structured, schema-valid JSON out of a model. They all sound the same in marketing copy. They are absolutely not the same in practice. Here's what actually works, what breaks in production, and how to pick the right one for your use case.

The Problem With 'Just Prompt It'

Asking an LLM for JSON in a plain prompt is a gamble. The model can hallucinate field names, miss required fields, and occasionally just... output a code block with a short story inside it. For prototypes this is fine. For anything going near a schema validator, a payment processor, or a UI component that expects specific fields, it's a liability.
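To make the failure modes concrete, here is a minimal sketch of the defensive parsing that prompt-only JSON tends to require. The helper name `parse_prompted_json` and the sample replies are illustrative assumptions, not part of any provider SDK.

```python
import json
import re

# Build the fence marker programmatically so this example stays fence-safe.
FENCE = "`" * 3
FENCE_RE = re.compile(
    r"^" + FENCE + r"[a-zA-Z]*\s*\n(.*?)\n" + FENCE + r"\s*$", re.DOTALL
)


def parse_prompted_json(raw: str, required: list) -> dict:
    """Parse JSON from a free-form model reply and check required keys."""
    text = raw.strip()
    match = FENCE_RE.match(text)
    if match:
        # The model wrapped its answer in a markdown code block.
        text = match.group(1)
    data = json.loads(text)  # raises json.JSONDecodeError on non-JSON text
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return data


reply = FENCE + 'json\n{"name": "Ada", "email": "ada@example.com"}\n' + FENCE
profile = parse_prompted_json(reply, required=["name", "email"])
```

Every branch in that function is a failure you have to handle yourself. The structured output modes below exist to make most of them unnecessary.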

The providers know this. They've each shipped a structured output mode. The implementations differ meaningfully.

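Claude (Anthropic) — tool use with `input_schema`

Claude doesn't have a dedicated "JSON mode" toggle. Instead, the well-established route is tool use: you define a tool whose input_schema is your JSON Schema, then force the model to call it with tool_choice. The structured result comes back as the arguments of that tool call.

```python
from anthropic import Anthropic

client = Anthropic()

# The JSON Schema goes in the tool's input_schema.
# "record_profile" is an illustrative tool name; pick your own.
profile_tool = {
    "name": "record_profile",
    "description": "Record the user profile extracted from the text.",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string", "description": "Full name"},
            "email": {"type": "string", "format": "email"},
            "role": {"type": "string", "enum": ["admin", "member", "viewer"]},
            "metadata": {"type": "object", "additionalProperties": True}
        },
        "required": ["name", "email", "role"]
    },
}

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Extract the user profile from this text..."}],
    tools=[profile_tool],
    # Force Claude to answer by calling the tool.
    tool_choice={"type": "tool", "name": "record_profile"},
)

# response.content is a list of blocks; the data is the tool call's input.
profile = next(block.input for block in response.content if block.type == "tool_use")
```

The upside: The output is reliably structured. Because the model is forced to call the tool, you get a JSON object shaped by your schema rather than prose: no stray markdown, no wrapping text. Anthropic also documents a structured-outputs feature that uses constrained decoding on supported models; check the current docs for availability before you count on a hard guarantee.

The gotcha: The result lives in a tool_use content block, not in the message text, so response.content is a list you have to search. A forced tool call also takes over the turn: the same request can't leave Claude free to choose among your other tools, so it's one or the other. And if your schema leans on $defs and references, test it before relying on it; flattening is the conservative choice.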
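Flattening a schema that reuses definitions can be done mechanically. The following is a minimal sketch, assuming only local `#/$defs/...` references and no recursive definitions; `inline_refs` is a hypothetical helper, not part of the Anthropic SDK.

```python
import copy


def inline_refs(schema: dict) -> dict:
    """Return a copy of `schema` with local $defs references inlined.

    Assumes references look like "#/$defs/<name>" and that no definition
    refers to itself (a recursive schema would loop forever here).
    Keys that sit next to a $ref in the same object are dropped.
    """
    defs = schema.get("$defs", {})

    def resolve(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith("#/$defs/"):
                name = ref[len("#/$defs/"):]
                # Resolve the target too, in case it references other defs.
                return resolve(copy.deepcopy(defs[name]))
            return {k: resolve(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [resolve(item) for item in node]
        return node

    return resolve(schema)


nested = {
    "type": "object",
    "properties": {
        "billing": {"$ref": "#/$defs/address"},
        "shipping": {"$ref": "#/$defs/address"},
    },
    "$defs": {
        "address": {"type": "object", "properties": {"city": {"type": "string"}}}
    },
}
flat = inline_refs(nested)
```

The flattened schema is larger but self-contained, which is the safer shape to hand to any provider whose reference support you haven't verified.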

GPT (OpenAI) — `response_format` with `json_schema`

OpenAI's approach is explicit and well-named. You pass a response_format object with type: "json_schema" and a json_schema definition.

```python
import json

from openai import OpenAI

client = OpenAI()

# response_format is a Chat Completions parameter, so this uses that endpoint.
response = client.chat.completions.create(
    model="gpt-4o",  # or a dated snapshot that supports structured outputs
    messages=[{"role": "user", "content": "Extract the order details from this text..."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "order_details",
            "schema": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string"},
                    "items": {
                        "type": "array",
                        "items": {"type": "string"}
                    },
                    "total": {"type": "number", "minimum": 0},
                    "currency": {"type": "string", "enum": ["USD", "EUR", "GBP"]}
                },
                "required": ["order_id", "items", "total", "currency"],
                "additionalProperties": False
            },
            "strict": True
        }
    },
    max_completion_tokens=1024
)

# The message content is a JSON string; parse it yourself.
order = json.loads(response.choices[0].message.content)
```

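The upside: The strict: true flag enforces the schema with constrained decoding, so the reply always parses and always matches the schema. OpenAI also documents exactly which JSON Schema keywords strict mode supports. It is a subset, not the full spec, and it comes with two structural rules: every object needs additionalProperties: false, and every property must appear in required.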

The gotcha: The model still outputs a JSON string, not a parsed object, so you need to json.loads() it. More importantly, response_format only shapes the final message; tool-call arguments are governed separately, by strict on each function definition. And on the o-series models, structured output costs more, because those models spend reasoning tokens before they emit the JSON.
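Strict mode's structural rules are easy to check before you send a request. As documented at the time of writing, every object must set additionalProperties to false and list every property in required. The sketch below checks just those two rules; `strict_mode_problems` is a hypothetical helper, and it does not cover the rest of OpenAI's supported-keyword list.

```python
def strict_mode_problems(schema: dict, path: str = "$") -> list:
    """List places where a schema breaks two strict-mode rules.

    Rule 1: every object sets additionalProperties to False.
    Rule 2: every declared property appears in required.
    """
    problems = []
    if schema.get("type") == "object":
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        props = schema.get("properties", {})
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {missing}")
        for name, sub in props.items():
            problems.extend(strict_mode_problems(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(strict_mode_problems(schema["items"], f"{path}[]"))
    return problems


loose_schema = {
    "type": "object",
    "properties": {
        "a": {"type": "string"},
        "b": {"type": "object", "properties": {"c": {"type": "integer"}}},
    },
    "required": ["a"],
    "additionalProperties": False,
}
problems = strict_mode_problems(loose_schema)
```

Running a check like this in CI catches the most common reason a schema is rejected, before it costs you a failed API call.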

Gemini (Google) — `response_schema` in the API

Gemini takes a different approach. Instead of a dedicated format object, you set response_schema and response_mime_type in the generation config, with application/json as the MIME type for schema-shaped output.

```python
from google import genai

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Extract the invoice data from this text...",
    config={
        # Both settings live in the generation config.
        "response_mime_type": "application/json",
        "response_schema": {
            "type": "object",
            "properties": {
                "invoice_number": {"type": "string"},
                "vendor": {"type": "string"},
                "amount": {"type": "number"},
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "quantity": {"type": "integer"},
                            "unit_price": {"type": "number"}
                        }
                    }
                }
            },
            "required": ["invoice_number", "vendor", "amount"]
        }
    }
)

# The SDK parses the JSON text for you.
invoice = response.parsed
```

The upside: response.parsed is already a native object, with no manual json.loads(). Gemini's schema support is solid, and Gemini 2.5 Flash in particular is fast and cheap.

The gotcha: Gemini's schema enforcement is good but not as strict as the others for complex nested schemas — you may still get unexpected additional fields in edge cases. Also, the response_schema parameter is not available on all model versions; check you're on a version that supports it.
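If stray fields are a concern, one option is to prune the parsed result down to what the schema declares before it goes anywhere else. This is a minimal sketch under the assumption that you want to drop unknown keys silently rather than reject the response; `prune_to_schema` is a hypothetical helper, not part of the Gemini SDK.

```python
def prune_to_schema(value, schema: dict):
    """Drop keys the schema does not declare, recursing into nested values.

    Objects without a "properties" entry are treated as free-form and
    returned unchanged.
    """
    if schema.get("type") == "object" and isinstance(value, dict):
        if "properties" not in schema:
            return value
        props = schema["properties"]
        return {
            key: prune_to_schema(item, props[key])
            for key, item in value.items()
            if key in props
        }
    if schema.get("type") == "array" and isinstance(value, list):
        item_schema = schema.get("items", {})
        return [prune_to_schema(item, item_schema) for item in value]
    return value


invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"description": {"type": "string"}},
            },
        },
    },
}
raw_invoice = {
    "invoice_number": "INV-1",
    "notes": "not in the schema",
    "line_items": [{"description": "Widget", "sku": "W-9"}],
}
clean_invoice = prune_to_schema(raw_invoice, invoice_schema)
```

Pruning hides the problem rather than surfacing it, so if unexpected fields signal a deeper issue in your pipeline, log or reject them instead.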

The Practical Decision Framework

Here's how to pick in practice:

| Provider | Best For | Watch Out For |
| --- | --- | --- |
| Claude | Reliable structured output, clean schema handling | No free tool choice alongside the schema, conservative JSON Schema feature set |
| GPT | Hard `strict` enforcement, well-documented keyword support | Output is a string (needs parsing), more expensive on o-series |
| Gemini | Speed, cost, `response.parsed` convenience | Slightly looser schema enforcement on edge cases |

One Pattern That Works Everywhere

If you need schema-validated output from any provider, and you're willing to accept one extra step, this pattern is the most reliable:

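```python
import json
import jsonschema

class ProviderError(Exception):
    """Raised by a provider wrapper when its API call fails."""

# Fill with your own wrappers: name -> function(text, schema) returning a dict.
provider_map = {}

def extract_with_fallback(text: str, schema: dict, providers=("claude", "openai", "gemini")):
    for provider in providers:
        try:
            result = provider_map[provider](text, schema)  # call the appropriate API
            jsonschema.validate(result, schema)  # raises ValidationError on mismatch
            return result
        except (jsonschema.ValidationError, json.JSONDecodeError, ProviderError):
            continue  # try the next provider
    raise ValueError("All providers failed schema validation")
```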

Schema validation as a fallback is not ideal: you want constrained decoding doing the work. But when you're in a multi-provider setup and one model has a bad day, this gives you a clean failure mode instead of corrupted data sliding through.
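To see the fallback order work without calling any API, here is a sketch that uses stub providers and records why each one was skipped. Everything in it is an assumption made for illustration: `first_valid`, the stub functions, and the simple key check that stands in for full schema validation.

```python
from typing import Callable, Dict


def first_valid(text: str, providers: Dict[str, Callable], is_valid: Callable):
    """Return (provider name, result, failures) for the first valid result."""
    failures = {}
    for name, call in providers.items():
        try:
            result = call(text)
        except Exception as exc:  # a real wrapper would catch narrower errors
            failures[name] = f"call failed: {exc}"
            continue
        if is_valid(result):
            return name, result, failures
        failures[name] = "failed validation"
    raise ValueError(f"All providers failed: {failures}")


# Stub providers standing in for real API wrappers.
def broken(text):
    raise RuntimeError("timeout")


def sloppy(text):
    return {"name": "Ada"}  # missing the email field


def good(text):
    return {"name": "Ada", "email": "ada@example.com"}


def has_required_keys(data):
    return {"name", "email"} <= data.keys()


winner, result, failures = first_valid(
    "Ada <ada@example.com>",
    {"claude": broken, "openai": sloppy, "gemini": good},
    has_required_keys,
)
```

Keeping the failure reasons around is what lets you tell a flaky provider apart from a schema that no model can satisfy.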

The Actual Recommendation

For new projects: start with Gemini 2.5 Flash for cost and speed, and use response_schema + response.parsed. If you're in an Anthropic-heavy stack: use Claude with input_schema and keep your schemas flat. If you need hard schema enforcement with a well-documented keyword set: go with GPT-4o, json_schema, and strict: true.

None of them are perfect. All of them are significantly better than prompt-based JSON extraction. The hallucination hangover? Largely gone — but know your provider's edge cases before you ship.

*Claude input_schema, OpenAI response_format.json_schema with strict, Gemini response_schema + response.parsed. Constrained decoding beats prompt engineering for structured output. Pick based on your schema complexity, provider lock-in tolerance, and cost sensitivity.*
