
Hey guys, Mr. Technology here.
The TLDR May 11 issue dropped a research result that the entire agent-builder community needs to sit with. Wix ran 250 evaluations to test whether AI agent skills outperform documentation, and documentation won on net. The TLDR coverage added two more threads that read as separate stories but actually fit the same pattern: scenario models for agents need runtime guardrails, and Rust is becoming the default systems layer for serious AI infrastructure. The unifying question is: how do you build software that AI agents can use reliably? The honest answer in May 2026 is: with great documentation, runtime guardrails, and Rust where you can.
The full writeup is at Wix Engineering. The setup: Wix's developer productivity team gave AI agents two interfaces for the same tasks — one a skill (a packaged capability with structured input/output, tool definitions, and an execution contract) and one a documentation page (Markdown describing how to use the underlying API or service). They ran 250 evaluations across both interfaces. The result, in their own words: skills are not a clean win.
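To make the two interfaces concrete, here is a minimal sketch of what a skill and a docs page for the same task might look like. This is not Wix's format: the `Skill` class, the `create_page` task, and the `POST /pages` endpoint are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Skill:
    """Hypothetical shape of a packaged skill: a declared schema plus a handler."""
    name: str
    description: str
    input_schema: dict[str, type]  # parameter name -> expected Python type
    handler: Callable[..., Any]    # the code that actually performs the task

    def invoke(self, **kwargs: Any) -> Any:
        # Execution contract: reject calls that do not match the declared schema.
        missing = set(self.input_schema) - set(kwargs)
        extra = set(kwargs) - set(self.input_schema)
        if missing or extra:
            raise ValueError(
                f"bad arguments: missing={sorted(missing)}, extra={sorted(extra)}"
            )
        for key, expected in self.input_schema.items():
            if not isinstance(kwargs[key], expected):
                raise TypeError(f"{key} must be {expected.__name__}")
        return self.handler(**kwargs)


# The documentation interface for the same task is just text the agent reads.
# (Hypothetical endpoint, not a real Wix API.)
CREATE_PAGE_DOC = """\
## Create a page
POST /pages with JSON body {"title": string, "draft": boolean}.
"""


def create_page(title: str, draft: bool) -> dict[str, Any]:
    # Stand-in for a real API call; returns what the request body would be.
    return {"title": title, "draft": draft}


create_page_skill = Skill(
    name="create_page",
    description="Create a page on the site.",
    input_schema={"title": str, "draft": bool},
    handler=create_page,
)
```

Notice what happens if the API grows a new required field: the skill needs its schema and handler changed together, while the docs interface is a text edit.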
The uncomfortable findings:
The synthesis that the agent-builder community has been avoiding: a skill is a maintenance liability. It is a code dependency that lives in a special directory, has its own version, and breaks when the API changes. A documentation page is a Markdown file that lives in your docs site and can be updated in the same PR as the API change. For most agent consumers, the latter is the lower-friction path.
This does not mean skills are useless. Skills win in three cases:
1. The API is so complex that no agent can figure it out from docs. This is the rare case. Most APIs are not that complex.
2. The skill wraps stateful behavior that the agent should not have to reconstruct. Examples: database connection pooling, multi-step authentication flows, transaction handling.
3. The skill enforces policy the agent should not be able to override. Examples: rate limits, audit logging, PII redaction.
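Case 3 is the easiest one to picture in code. Below is a rough sketch, assuming a hypothetical customer-lookup task: the agent only ever gets to call `lookup`, so the audit entry and the redaction happen whether the agent wants them or not. The class name, the `fetch` callable, and the email-only regex are illustrative assumptions; real PII redaction covers far more than email addresses.

```python
import re
from typing import Callable

# Deliberately simple pattern for illustration; it only catches email addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


class CustomerLookupSkill:
    """Hypothetical skill that wraps a data fetch with policy the agent cannot skip."""

    def __init__(self, fetch: Callable[[str], str], audit_log: list[str]) -> None:
        self._fetch = fetch          # the underlying service call (assumed)
        self._audit_log = audit_log  # append-only record of what the agent did

    def lookup(self, customer_id: str) -> str:
        # Policy 1: every call is logged before it runs.
        self._audit_log.append(f"lookup customer_id={customer_id}")
        raw = self._fetch(customer_id)
        # Policy 2: the agent never sees the unredacted record.
        return redact_emails(raw)


def fake_fetch(customer_id: str) -> str:
    # Stand-in for a real backend; returns a record containing PII.
    return "Jane Doe <jane@example.com>, plan=pro"


audit_log: list[str] = []
skill = CustomerLookupSkill(fetch=fake_fetch, audit_log=audit_log)
result = skill.lookup("c-42")
```

A docs page can tell an agent to redact PII, but it cannot make it. That is the difference this case is about.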
Outside of those three cases, the Wix data says: write good docs.
The second story in the TLDR issue is the technical answer to "what if the agent is going to do something irreversible." As agents get more capable, the worst-case scenarios stop being hypothetical. An agent that can send email, move money, deploy code, and modify production data can cause real damage. The traditional alignment approach — pre-deployment safety training — is necessary but not sufficient. Once the model is running with tools, you need runtime guardrails.
The pattern that is emerging: scenario models — small, specialized models that monitor the main agent's actions in real time and flag or block specific behaviors. The classic example is a model that watches every outbound email an agent composes and refuses to let it send anything that looks like a phishing payload. Another is a model that watches every tool call and refuses to let an agent delete more than N records per minute. The scenario model runs in parallel to the main agent and has veto power on specific action classes.
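The delete-limit example can be sketched in a few lines. One loud caveat: this uses a plain deterministic rule as a stand-in for the small specialized model, because the veto mechanics are the same either way. The `DeleteRateGuard` name and its `allow` method are hypothetical, and the clock is passed in as an argument to keep the sketch easy to test (a real deployment would more likely read something like `time.monotonic()`).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DeleteRateGuard:
    """Hypothetical guardrail: veto deletes beyond N records per sliding window."""
    max_records_per_minute: int
    window_seconds: float = 60.0
    # Each entry is (timestamp, record_count) for a delete that was allowed.
    _events: deque = field(default_factory=deque, init=False, repr=False)

    def allow(self, action: str, record_count: int, now: float) -> bool:
        # This guard only has veto power over one action class.
        if action != "delete":
            return True
        # Forget deletes that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window_seconds:
            self._events.popleft()
        used = sum(count for _, count in self._events)
        if used + record_count > self.max_records_per_minute:
            return False  # veto: the tool call must not run
        self._events.append((now, record_count))
        return True


guard = DeleteRateGuard(max_records_per_minute=100)
decisions = [
    guard.allow("delete", 60, now=0.0),     # within budget
    guard.allow("delete", 50, now=10.0),    # would reach 110, so vetoed
    guard.allow("read", 10_000, now=10.0),  # not a delete, not this guard's job
    guard.allow("delete", 40, now=20.0),    # reaches exactly 100
    guard.allow("delete", 50, now=61.0),    # the first delete has aged out
]
```

The agent runtime would call `allow` before executing each tool call and skip the call on a `False`. Vetoed deletes are not counted against the budget, which is a design choice: counting attempts instead would punish an agent that retries.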
StateTech Magazine's January 2026 piece made the case that runtime guardrails are no longer optional in 2026. Menlo Security's March 2026 post on the same topic argued that agentic action at machine speed has outpaced pre-deployment evaluation: by the time you have a test suite for a new attack pattern, an agent has already attempted it 10,000 times in production.
The practical stack that is forming:
The vendors in this space are mostly second-generation — Vendia, Robust Intelligence, Calypso AI, Lakera, and a dozen newer entrants. The interesting move is that the major agent platforms are now shipping built-in guardrail hooks: LangChain's policy middleware, CrewAI's action review layer, and Anthropic's Claude Code Review being the most visible examples.
The third thread is the easiest to miss and the most consequential. Rust is not a popular language for AI application code — Python is and will be for a long time. But Rust is becoming the default for the systems layer underneath: inference servers, agent runtimes, vector database internals, model serving infrastructure, and the new generation of agentic code-execution sandboxes.
A few signals:
text-embeddings-inference, the fastest open-source embedding server, is Rust.
The Reddit r/rust discussion from May 2026 captured the trend. The reasons are not subtle: Rust gives you memory safety without a garbage collector, predictable latency under load, and a binary that is small enough to embed in a sidecar. For AI infrastructure, those three properties are worth the steeper learning curve.
If you are building AI infrastructure (model servers, agent runtimes, vector stores, inference hardware adapters), Rust is now the table-stakes choice. The performance gap between a Rust implementation and a pure-Python implementation at the systems layer can be 10x-100x on CPU-bound paths. The reliability gap (no GC pauses, and no memory-access bugs in safe Rust code) is the reason production teams keep picking Rust for new components.
The corollary is that the AI infrastructure hiring market is bifurcating. Application-layer engineers write Python. Systems-layer engineers write Rust. If you are a Rust engineer reading this, your skills are about to be very valuable. If you are a Python engineer who has been putting off learning Rust, the next 18 months are a good window to add it.
Three threads, one underlying shift: the agent-builder ecosystem is maturing past the "build something that works for the demo" phase into the "build something that works in production for two years" phase. That means:
If you are picking a stack in 2026, that is the rule set. If you are hiring, that is the skill map. The agent era is not "Python everything" the way the web era was "JavaScript everything." It is Python at the top, Rust at the bottom, and runtime guardrails in the middle.
— Mr. Technology
Sources: Wix Engineering — We Ran 250 AI Agent Evals to Find Out if Skills Beat Docs, StateTech Magazine — AI Guardrails Will Stop Being Optional in 2026, Menlo Security — When AI Acts: Why Guardrails Must Move Into the Runtime, Anthropic Engineering Blog, Epsilla — The $25 Code Review Tax, Reddit r/rust — Rust is quietly becoming the foundation layer for AI tooling, MCP server SDKs, Model Context Protocol.