Opinion · 2026-06-03 · 4 min read

Function Calling Is a Crutch, Not a Feature, and the Industry Bet the Agentic Future on It

Every agentic platform ships with the same pitch: 'give the model your tools, watch the magic happen.' What actually happens is the model calls the right tool with the wrong parameters 5-10% of the time, and nobody catches it until the customer does. Function calling is a crutch. Stop building on it.

Let me say it clearly: function calling is the most overrated capability in modern LLMs, and the industry has built the entire "agentic AI" stack on top of it. That bet is going to age badly.

I have watched too many engineering teams ship production systems on function calling. Every one hit the same wall: function calling is the model's way of failing in ways that look exactly like success. The tool gets called. The parameters get passed. The system does the thing. And the thing is wrong in a way no test will catch until the customer does.

The Abstraction That Lied

The pitch was clean. Give the model a JSON schema describing your tools, and it will pick the right tool, with the right parameters, at the right time. A deterministic interface between a probabilistic system and the deterministic world.

What we got is a probabilistic system that picks the right tool roughly 95% of the time and gets the parameters right roughly 95% of the time. When it misses on parameters, it does not miss loudly: the call goes out with values that are confidently wrong. The model does not ask for clarification. It does not hesitate. It executes. The schema is not a contract. It is a suggestion the model has been trained to respect most of the time.

This is not a deterministic interface. It is a probabilistic interface wearing a deterministic costume.
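
To see why "roughly 95%" is less comfortable than it sounds, here is a back-of-the-envelope sketch. It assumes the two rough 95% figures above are independent per-call probabilities and that errors across calls are independent too; both are simplifying assumptions of mine, not measurements. The function name `chain_success_rate` is made up for illustration.

```python
def chain_success_rate(tool_rate: float, param_rate: float, steps: int) -> float:
    """Probability that every call in a chain of `steps` calls is fully correct.

    Assumes tool choice and parameter filling fail independently, and that
    each call in the chain fails independently of the others.
    """
    per_call = tool_rate * param_rate  # one call: right tool AND right parameters
    return per_call ** steps


if __name__ == "__main__":
    # Rough figures from the text, treated here as illustrative inputs only.
    for steps in (1, 5, 10):
        rate = chain_success_rate(0.95, 0.95, steps)
        print(f"{steps:>2} calls: {rate:.1%} chance every call is correct")
```

Under those assumptions a single call is fully correct about 90% of the time, a five-call chain about 60% of the time, and a ten-call chain about 36% of the time. Real failure rates will differ, but the shape of the curve is the point.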

The Tool Calling Circus

I have read more production function-calling traces than I care to admit. The model calls `send_email` with the recipient swapped from a different conversation. It calls `query_database` with a hallucinated table name smuggled into the filter clause. It calls `update_user_record` with a user ID that exists but belongs to a different tenant. Every call is syntactically valid. Every call does real damage.
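
The cross-tenant case is worth making concrete. The sketch below is a minimal illustration, not anyone's production code: `ToolCall`, `USER_TENANTS`, `is_schema_valid`, and `is_authorized` are all hypothetical names, and the in-memory dictionary stands in for a real user table. It shows a call that passes a shape check of the kind a JSON schema provides and still targets the wrong tenant.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a user table: user_id -> tenant_id.
USER_TENANTS = {"u-100": "tenant-a", "u-200": "tenant-b"}


@dataclass(frozen=True)
class ToolCall:
    name: str
    arguments: dict


def is_schema_valid(call: ToolCall) -> bool:
    """Shape check only: right tool name, right argument types."""
    args = call.arguments
    return (
        call.name == "update_user_record"
        and isinstance(args.get("user_id"), str)
        and isinstance(args.get("fields"), dict)
    )


def is_authorized(call: ToolCall, session_tenant: str) -> bool:
    """Semantic check: the target user must belong to the caller's tenant."""
    return USER_TENANTS.get(call.arguments.get("user_id")) == session_tenant


# A model-produced call made during a session that belongs to tenant-a.
call = ToolCall("update_user_record", {"user_id": "u-200", "fields": {"plan": "pro"}})
schema_ok = is_schema_valid(call)
authorized = is_authorized(call, session_tenant="tenant-a")
print(schema_ok, authorized)  # prints: True False
```

The schema check has nothing to say about whose record `u-200` is. Only a check that lives outside the model, keyed on the authenticated session, can reject this call.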

The industry response is always the same: add more validation, add more guardrails, add a second model to verify the first model's tool calls. We have built an entire secondary industry of function-call validators. A multi-billion-dollar scaffolding layer to fix the problems function calling was supposed to solve.

That is the smell. A real primitive does not require another primitive to babysit it. Function calling does. That is the definition of a crutch.

Why This Keeps Happening

Language models are not function callers. They are text generators trained on examples of function calls. There is a meaningful difference, and the industry has spent three years pretending there isn't.

A real function-calling system understands the tool, the parameters, and the consequences of invoking it. A language model has been trained on millions of JSON blobs that look like function calls and produces output that statistically resembles good function calls. When the situation is novel, the model falls back to what looks like a function call rather than what actually is one.

This is fine for demos. This is a disaster for the "agentic" workflows being sold to enterprises as the future of automation.

What Should Replace It

Less of it. Fewer tools. Narrower scope. Deterministic code paths for the things that should be deterministic, and language models for the things that benefit from language understanding.

The teams that stopped trying to build agents with twenty tools and started building systems with one or two carefully scoped tool calls per task are shipping reliable products. The teams still trying to build the twenty-tool agent are debugging hallucinations at 3am and rebuilding their validation layer for the third time this quarter.

Build the deterministic scaffolding first. Add language understanding inside the system, not as the spine. The agentic future is not a pile of function calls. It is carefully scoped language understanding wrapped in a deterministic shell.
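
Here is one way the "deterministic shell" could look, as a minimal sketch rather than a reference design. Everything named here is hypothetical: the `Intent` labels, the `handle` router, and `fake_model`, which stands in for a real LLM call. The model is asked for exactly one thing, a label from a fixed list. Code chooses the action, and the order ID is assumed to come from the authenticated session, never from model output.

```python
from enum import Enum
from typing import Callable


class Intent(Enum):
    REFUND = "refund"
    ORDER_STATUS = "order_status"
    UNKNOWN = "unknown"


def classify_intent(message: str, model: Callable[[str], str]) -> Intent:
    """Ask the model for one label; anything off the allow-list becomes UNKNOWN."""
    raw = model(message).strip().lower()
    try:
        return Intent(raw)
    except ValueError:
        return Intent.UNKNOWN


def handle(message: str, order_id: str, model: Callable[[str], str]) -> str:
    """Deterministic shell: code, not the model, picks the action and its arguments."""
    intent = classify_intent(message, model)
    if intent is Intent.REFUND:
        return f"refund_queued:{order_id}"
    if intent is Intent.ORDER_STATUS:
        return f"status_lookup:{order_id}"
    return "escalate_to_human"


def fake_model(message: str) -> str:
    """Stand-in for an LLM call; deliberately returns a bogus label sometimes."""
    return "refund" if "money back" in message.lower() else "delete_everything"


print(handle("I want my money back", "o-1", fake_model))
print(handle("Wipe my account now", "o-1", fake_model))
```

The model can still be wrong about the label, but the worst case is a misrouted request or an escalation to a human. It cannot invent a tool, a table name, or a user ID, because it never gets to supply one.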

The Take

Function calling is a crutch. The industry has bet the agentic future on it, and the bill is coming due. Stop building on top of it. Build the scaffolding. Add language understanding where it actually helps. The teams that figure this out will ship. The teams that don't will keep writing validator layers and calling it progress.
