Open Source · 2026-06-04

BAML Is the Only LLM Library That Treats Prompts Like Code, and the Rest of the Stack Should Be Embarrassed

Every structured-output library I have used in the last two years — Instructor, Outlines, DSPy, Pydantic AI, LangChain's parsers — is a band-aid on the same wound. BAML stops pretending. It ships a real compiler for prompts, and that is the only honest answer.

The Wound Everyone Keeps Bandaging

I have shipped the same Pydantic model, the same Instructor wrapper, and the same fragile `response.choices[0].message.tool_calls[0].function.arguments` parser in seven production codebases over the last eighteen months. Every team has its own version of the same bandage: a decorator here, a JSON schema there, a regex in the worst case, and a unit test that hopes the model behaves today. It is the worst engineering I have shipped in a decade.

BAML is the first tool I have used that treats the problem like an engineer. It made me delete a thousand lines of glue code on a Tuesday afternoon.

What BAML Actually Is

BAML is a domain-specific language with its own files, its own syntax, its own LSP, and a real Rust compiler that emits a typed client in Python, TypeScript, Ruby, Go, Rust, Java, or C#. You write .baml files. The compiler emits a client library. Your application calls the client like any other function.

```baml
class Resume {
  name string
  email string @pattern("^[^@]+@[^@]+\\.[a-z]+$")
  years_experience int @assert(>= 0, <= 60)
  education Education[]
}

class Education {
  school string
  degree string
  graduation_year int
}

function ExtractResume(resume_text: string) -> Resume {
  client "openai/gpt-4o"
  prompt #"
    Extract the candidate's information.

    {{ resume_text }}

    {{ ctx.output_format }}
  "#
}
```

That is a BAML file. Not a string in a Python file. A function declaration in a language designed for one job. The schema is checked against the prompt. The output type is checked at every call site. The `email` field has a regex constraint compiled into the prompt. `ctx.output_format` is a compiler-injected rendering of the schema.

Grep for `ExtractResume` across your codebase and you find every call site. Rename it and the type checker lights up. Unit-test the prompt render without touching a model. The prompt is a function. The schema is a contract. The compiler enforces both.
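To make "unit-test the prompt render" concrete, here is a minimal sketch of the idea in plain Python. It is not BAML's API: `Resume`, `render_output_format`, and `render_extract_resume` are hypothetical stand-ins for what a compiler would generate, and the schema-hint wording is invented for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Resume:
    # Hypothetical mirror of part of the Resume class from the .baml file.
    name: str
    email: str
    years_experience: int


def render_output_format(schema: type) -> str:
    """Render a dataclass as a schema hint (stand-in for ctx.output_format)."""
    lines = [f"  {f.name}: {f.type.__name__}," for f in fields(schema)]
    return "Answer in JSON using this schema:\n{\n" + "\n".join(lines) + "\n}"


def render_extract_resume(resume_text: str) -> str:
    """Build the full prompt as a pure function of its inputs."""
    return (
        "Extract the candidate's information.\n\n"
        f"{resume_text}\n\n"
        f"{render_output_format(Resume)}"
    )


def test_prompt_mentions_every_field() -> None:
    prompt = render_extract_resume("Jane Doe, jane@example.com, 7 years")
    # No model call: the test only inspects the rendered string.
    assert "Jane Doe" in prompt
    for f in fields(Resume):
        assert f.name in prompt
```

Because the render is a pure function, the test runs in milliseconds and fails the moment someone adds a field to the schema without it reaching the prompt.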

The Generated Client Is the Trick

After `baml-cli generate` you get a `baml_client` package:

```python
from baml_client import b

resume = b.ExtractResume(resume_text=pasted_text)
print(resume.email, resume.years_experience)
```

`b.ExtractResume` returns a `Resume` or raises a typed validation error. No `try: json.loads(...)`. No `if not response.choices: ...`. Schema compilation, prompt rendering, parsing, validation, retry, streaming, fallback: all behind one typed function call.
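For a sense of what that one call is hiding, here is a rough hand-written equivalent, assuming a strict JSON reply and a naive retry. Everything in it is hypothetical (`OutputValidationError`, `parse_resume`, `extract_resume`, and the injected `call_model` callable); it is the kind of glue I used to write, not what BAML generates.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional


class OutputValidationError(Exception):
    """Raised when model output cannot be coerced into the target type."""


@dataclass
class Resume:
    name: str
    email: str
    years_experience: int


def parse_resume(raw: str) -> Resume:
    """Parse and validate raw model text into a Resume."""
    try:
        data = json.loads(raw)
        resume = Resume(
            name=str(data["name"]),
            email=str(data["email"]),
            years_experience=int(data["years_experience"]),
        )
    except (json.JSONDecodeError, KeyError, TypeError, ValueError) as exc:
        raise OutputValidationError(f"unparseable output: {raw!r}") from exc
    if not 0 <= resume.years_experience <= 60:
        raise OutputValidationError("years_experience out of range")
    return resume


def extract_resume(
    resume_text: str,
    call_model: Callable[[str], str],
    max_attempts: int = 2,
) -> Resume:
    """Render, call, parse, validate, and retry behind one typed function."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    prompt = f"Extract the candidate's information.\n\n{resume_text}"
    last_error: Optional[OutputValidationError] = None
    for _ in range(max_attempts):
        try:
            return parse_resume(call_model(prompt))
        except OutputValidationError as exc:
            last_error = exc  # naive retry: same prompt, fresh sample
    assert last_error is not None
    raise last_error
```

Multiply that by every function and every schema in the codebase and you have the thousand lines I deleted.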

The generated client is the API surface. BAML does not ask you to learn its own internal API the way Instructor asks you to learn `instructor.from_openai(client).chat.completions.create(..., response_model=Model)`. There is no BAML idiom in your application code, only ordinary function calls. Rip BAML out of the project and you reimplement those functions with hand-written OpenAI calls while the call sites stay as they are. You do not refactor your application to unlearn an idiom.

What Works Across All Models

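BAML's structured generation runs against OpenAI, Anthropic, Gemini, Mistral, Ollama, vLLM, Together, Groq, and a dozen others through one interface. It does not depend on a provider's native structured-output mode or on constrained decoding. BAML renders the schema into the prompt and runs whatever text the model returns through its schema-aligned parser, which coerces near-miss output (stray prose, Markdown fences, trailing commas) into the declared type. That is why the same function works on models with no tool-calling support at all.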

A repo I worked on had four branches: Instructor for OpenAI, a hand-rolled regex parser for vLLM, a JSON-mode branch for Gemini, and an `if model.startswith("ollama"): fallback_logic()` ladder I am not proud of. BAML replaced all four with a single client declaration. The model swap is a string change.
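The shape of that refactor, sketched in plain Python: one dispatch table keyed on a `provider/model` string instead of a branch per vendor. The adapters and the `complete` helper are hypothetical and only echo their input; real adapters would call each vendor's SDK, and none of this is BAML's implementation.

```python
from typing import Callable, Dict


def _openai(model: str, prompt: str) -> str:
    # Hypothetical adapter; a real one would call the OpenAI SDK.
    return f"[openai:{model}] {prompt}"


def _ollama(model: str, prompt: str) -> str:
    # Hypothetical adapter; a real one would call a local Ollama server.
    return f"[ollama:{model}] {prompt}"


PROVIDERS: Dict[str, Callable[[str, str], str]] = {
    "openai": _openai,
    "ollama": _ollama,
}


def complete(client: str, prompt: str) -> str:
    """Dispatch on a 'provider/model' string instead of an if/elif ladder."""
    provider, sep, model = client.partition("/")
    if not sep or provider not in PROVIDERS:
        raise ValueError(f"unknown client: {client!r}")
    return PROVIDERS[provider](model, prompt)
```

Switching from `complete("openai/gpt-4o", prompt)` to `complete("ollama/llama3", prompt)` touches one string and no parsing code.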

The Benchmark That Sold Me

The Data Quarry benchmarked BAML against DSPy on four real-world structured-output tasks: clinical notes, financial entities, insurance claims, and PII. BAML beats DSPy on smaller and mid-tier models and on nested schemas past one level of depth. DSPy wins when the model is large enough to absorb its verbose prompt templates. Schema engineering beats prompt engineering for anything that has to ship.

What Is Actually Wrong With BAML

You have to learn a new language. Small, well-documented, excellent LSP — but a new language. Cost paid in the first week; payoff in every prompt after.

The compiler is opinionated. It will complain about ambiguous schemas and prompts that do not render cleanly. Some complaints are bugs, some are opinions. Error messages are good, not great.

The ecosystem is small. ~10K stars, a few hundred forks.

Streaming is rough at the edges. For deeply nested streams with optional fields, you are doing more work than you would like.
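Here is the kind of extra work I mean, as a self-contained sketch. `PartialResume` and `PartialEducation` are hypothetical partial types where every field may still be missing, and `completed_schools` assumes list entries fill in order; this illustrates the consumer-side bookkeeping, not BAML's generated streaming types.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional


@dataclass
class PartialEducation:
    school: Optional[str] = None
    graduation_year: Optional[int] = None


@dataclass
class PartialResume:
    name: Optional[str] = None
    education: List[PartialEducation] = field(default_factory=list)


def completed_schools(stream: Iterable[PartialResume]) -> Iterator[str]:
    """Yield each school once, only after its graduation_year has arrived."""
    seen = 0
    for partial in stream:
        # Assumes entries fill in order, so skip the ones already emitted.
        for entry in partial.education[seen:]:
            if entry.school is None or entry.graduation_year is None:
                break  # still streaming this entry; wait for the next partial
            seen += 1
            yield entry.school
```

Every optional field adds another `is None` check, and every extra level of nesting adds another cursor like `seen`.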

The Take

If you have more than five LLM calls in production and you are not using BAML, you are spending engineering time on a problem that has a better answer. If you are greenfielding a new agent system in 2026, picking up BAML is the highest-leverage decision you can make in the prompt engineering layer.

The thing BAML gets right that nobody else will say out loud: a prompt is not a string, a prompt is a function; an output is not a string, an output is a type. The boundary between your model and your application should be a typed function call, not a fragile parsing pipeline dressed up as an SDK. Every other library in this space is a workaround for the fact that nobody wanted to write a real compiler. Boundary wrote the compiler. The rest of the stack should be embarrassed.

Mr. Technology


*BAML: github.com/boundaryml/baml — v0.10.x, ~10K stars, Apache 2.0. Built by Boundary, YC W23. Rust compiler with generated clients in Python, TypeScript, Ruby, Go, Rust, Java, and C#. Benchmark: thedataquarry/structured-outputs. Install: pip install baml-py plus baml-cli generate.*
