ai · 2026-06-02

Gemini Flash Gets Pricey, AI Act Delays, Agents Drive Online

Google priced Gemini 3.5 Flash at roughly 3x its predecessor, the EU delayed high-risk AI Act compliance to December 2027, AI-driven internet traffic nearly tripled in 2025 per Human Security (with agent traffic up 80x), and Andrew Ng made the case for AI Engineers over FDEs in the agent era.

The Batch's May 29, 2026 issue has three stories that hit different parts of the AI stack at the same moment: Google's mid-tier model just got three times more expensive, Brussels blinked on AI Act timelines, and a new study says AI agents already account for 80x more internet traffic than a year ago. Any one of those would be a story. Together they describe a market where frontier capability keeps climbing, regulation keeps sliding, and the network underneath is starting to feel it.

What You Need to Know: Google released Gemini 3.5 Flash at three times the per-token price of its predecessor, the EU agreed to push high-risk AI Act compliance to December 2027, and Human Security reported that AI-driven internet traffic nearly tripled in 2025, with agent traffic up roughly 80x year over year. Andrew Ng also made the case that "AI Forward Deployed Engineer" is a real role — but a much smaller one than the AI Engineer title that will grow around it.

Why It Matters

  • Per-token pricing is no longer monotonically falling. Gemini 3.5 Flash at $1.50 input / $9.00 output per million tokens is more expensive than Claude Haiku or GPT-5 mini on most workloads. If you're sizing an agentic budget, the cheap-Flash era is over.
  • Compliance windows are sliding right. The high-risk deadline moved from August 2026 to December 2027. That buys you time, but it also means a lot of mid-stage startups that were "going to figure out EU AI Act in Q3" just got bumped 16 months out — which is good news for engineering, bad news for any GTM motion that was leaning on the regulation.
  • Your traffic graph is lying to you. Human Security found that 95% of AI-driven traffic lands in retail, streaming, and travel. If you operate in one of those sectors, you are already on the front line of the agent-data arms race: scraper volume grew 7x, and malicious scraping grew 47%.
  • The "FDE vs AI Engineer" framing actually matters for hiring. Andrew Ng's point: a company will hire a few FDEs but many more internal AI Engineers. If your eng org is leaning on vendor FDEs for the heavy lifting, you are trading away optionality.
  • Agent traffic is the canary. 1.7% of AI-driven traffic in December, up 80x in a year, and 77% of agent interactions happen on product and search pages. That's not a footnote — that's your conversion funnel being navigated by something other than a human.

What Actually Happened

Gemini 3.5 Flash: A Mid-Tier Model With a Top-Tier Bill

Google launched Gemini 3.5 Flash at Google I/O 2026 as the direct replacement for the Flash line, but the positioning has shifted. The "Flash" tier used to mean small and fast, cheaper than Pro. Gemini 3.5 Flash is now closer to Anthropic's Sonnet — a mid-tier multimodal MoE that tops Artificial Analysis's APEX-Agents-AA benchmark at 47.1% accuracy on long-running agentic tasks, nearly 10 percentage points ahead of GPT-5.5 (37.7%). On MMMU-Pro visual reasoning, it scored 84%, the highest recorded.

The interesting number is the price: $1.50 per million input tokens, $0.15 cached, $9.00 per million output tokens. That is roughly 3x the price of Gemini 3 Flash. Add the new "thought preservation" feature (reasoning tokens persist across multi-turn context, similar to Kimi K2.6's preserved thinking) and the adjustable reasoning levels, and you get a model that thinks harder, runs faster (204 tokens/sec), and bills accordingly.
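To see what those rates do to a budget, here is a minimal sketch of a per-request cost estimate. The three rates come from the figures above. The function name, the example token counts, and the assumption that cached tokens are billed at the cached rate instead of the full input rate are illustrative, not a description of Google's billing logic.

```python
# Rates quoted above, in USD per million tokens.
INPUT_PER_M = 1.50
CACHED_PER_M = 0.15
OUTPUT_PER_M = 9.00


def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of input_tokens served from cache,
    billed at the cached rate instead of the full input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHED_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# Hypothetical agent step: 20k tokens of context (15k of it cached), 2k tokens out.
cost = request_cost(input_tokens=20_000, output_tokens=2_000, cached_tokens=15_000)
print(f"${cost:.5f} per step")
```

For that made-up step, the estimate comes to a little under 3 cents, and output tokens account for roughly two thirds of it. Under these assumptions, trimming output length moves the bill more than trimming context does.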

The "free" tier still exists via the Gemini app and AI Studio, but the API is the new reality for production work. Google also debuted Omni Flash, a video-generating multimodal model, and overhauled Antigravity as an agent-management coding tool with a CLI that replaces the open-source Gemini CLI. (Source)

The EU Softens the AI Act

The European Parliament and member states agreed to amend the AI Act on April 27, 2026, delaying the high-risk systems deadline from August 2026 to December 2027 and extending watermarking and transparency requirements to around December 2026. AI-driven machinery and toys get until August 2028.

The revisions aren't a giveaway. They tighten the law in two places. The amendments explicitly ban sexually explicit images of children and non-consensual nude images of real people. They also add a high-risk sandbox requirement: developers get until August 2027 to implement supervised isolation environments for new model testing.

The political backdrop matters. In 2023, 163 executives signed a letter calling the law "bureaucratic." In 2025, 110 companies (championed by an "AI Champions" coalition) urged a delay. Siemens and SAP lobbied publicly for revisions. Enrico Letta's April 2024 report on the EU single market and the September 2024 Draghi report on competitiveness set the intellectual frame: Europe's regulatory density was an "existential challenge" to scaling. The Commission withdrew the AI Liability Directive in February 2026, signaling that the direction of travel is simplification, not expansion. (Source)

AI Agents Now Drive 80x More Traffic Than Last Year

Human Security's 2026 State of AI Traffic and Cyberthreat Benchmark Report is built on over one quadrillion internet interactions observed across roughly 1,200 customers in 200+ countries. The headline numbers:

  • AI-driven traffic nearly tripled in 2025. Automated traffic (AI + bots) grew 23%. Human traffic grew 3%.
  • Training crawlers: 68% of AI-driven traffic, more than 2x the prior year's volume.
  • Immediate-use scrapers: 32% of AI-driven traffic, a 7x increase in volume.
  • Agents and agentic browsers: 1.7% of AI-driven traffic in December 2025, nearly 80x year-over-year growth.
  • 95% of AI-driven traffic fell into retail/e-commerce, streaming/media, or travel/hospitality.
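For capacity planning, it can help to translate the annual multiples above into a monthly rate. This is a rough sketch under the assumption, which the report does not make, that growth compounded steadily across the twelve months. The function name is illustrative.

```python
def monthly_growth_factor(annual_multiple: float, months: int = 12) -> float:
    """Constant month-over-month factor that compounds to annual_multiple."""
    if annual_multiple <= 0 or months <= 0:
        raise ValueError("annual_multiple and months must be positive")
    return annual_multiple ** (1.0 / months)


# Multiples quoted in the list above; smooth compounding is an assumption.
for label, multiple in [("agents", 80.0), ("scrapers", 7.0), ("training crawlers", 2.0)]:
    factor = monthly_growth_factor(multiple)
    print(f"{label}: about {factor - 1:.0%} per month")
```

Read this way, an 80x year works out to roughly 44% growth every month, which is a more useful number for a capacity forecast than the annual headline.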

The agent interaction breakdown is the one to actually act on: 77% on product and search pages, the rest on accounts, auth, and checkout. OpenAI produced 69% of automated traffic, Meta 16%, Anthropic 11%.

The security cost is concrete. Malicious scraping rose 47%. Account takeover attempts fell 30% in volume, but attacks that occur after login rose 4x. Compromised-card transaction volume blocked by issuers rose 20% — possibly because agents are now good enough to cycle card numbers at machine speed. (Source)

The "Forward Deployed Engineer" Is Real — But It's a Small Pond

Andrew Ng's letter this week is worth quoting directly. The AI FDE is a Palantir-era role that's been pulled back into the spotlight because OpenAI launched a deployment company and Anthropic partnered with Blackstone, Hellman & Friedman, and Goldman Sachs on an enterprise AI services firm. The role: embed inside a client, build and tune agentic workflows, push back on unrealistic asks, and translate between vendor and buyer.

Ng's argument is that the demand signal is being misread. "A company might accept a few FDEs to be embedded within its organization. But most companies will want far more of their own employees working on their projects." The optionality cost of a vendor FDE is real: they exist to deeply integrate one vendor's stack, and in a market where the leader rotates every six months, that is a liability.

The forward-looking roles Ng names — LLMOps Engineers, Evals Engineers, AI Data Engineers, Harness Engineers — are the ones to actually track if you're hiring this year. (Source)

The Take

Three things are true at the same time and they compound.

First, the price floor for "good enough" is rising. Gemini 3.5 Flash at 3x the price of its predecessor isn't a one-off — OpenAI, Anthropic, and Google have all moved reasoning-tier pricing up in 2026 because the only way to make reasoning models cheap is to make them worse. If you have an agentic product whose unit economics assumed Flash-class pricing forever, your COGS model is now wrong. Either pass the cost through, get smaller, or switch to a smaller open model on the cheap path and Flash on the expensive path. There is no "Flash but free" any more.
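The split-path option can be sized with a back-of-the-envelope blend. This is a minimal sketch that looks only at output-token cost. `CHEAP_OUTPUT_PER_M` is a placeholder value, not a quoted rate for any real model, and both function names are hypothetical.

```python
FLASH_OUTPUT_PER_M = 9.00  # USD per million output tokens, from the pricing above
CHEAP_OUTPUT_PER_M = 0.60  # hypothetical placeholder for a small open model


def blended_output_rate(flash_share: float) -> float:
    """Average USD per million output tokens when flash_share of tokens
    go to the expensive path and the rest go to the cheap path."""
    if not 0.0 <= flash_share <= 1.0:
        raise ValueError("flash_share must be between 0 and 1")
    return flash_share * FLASH_OUTPUT_PER_M + (1.0 - flash_share) * CHEAP_OUTPUT_PER_M


def max_flash_share(target_rate: float) -> float:
    """Largest flash_share that keeps the blended rate at or under target_rate."""
    if target_rate <= CHEAP_OUTPUT_PER_M:
        return 0.0
    if target_rate >= FLASH_OUTPUT_PER_M:
        return 1.0
    return (target_rate - CHEAP_OUTPUT_PER_M) / (FLASH_OUTPUT_PER_M - CHEAP_OUTPUT_PER_M)


for share in (0.10, 0.25, 0.50):
    rate = blended_output_rate(share)
    print(f"{share:.0%} on Flash -> ${rate:.2f} per million output tokens")
```

Calling `max_flash_share` with the output rate your COGS model was built on shows how much traffic the expensive path can take before the blend exceeds it. With the placeholder cheap rate, a target of one third of the current Flash output price leaves room for a bit under 30% of tokens on Flash.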

Second, the EU AI Act delay is good news for engineering and bad news for product strategy. You should be grateful for the runway, because building a high-risk system sandbox properly takes longer than 16 months. But if you were planning to treat early EU compliance as a regulatory moat, that moat just evaporated: your American and Chinese competitors now have the same window.

Third, agent traffic is past the "novel" stage. An 80x growth rate on a small base is still a small base today, but the direction matters more than the magnitude for capacity planning. If you're running a product page, you should assume the next 18 months will see scrapers and agents combined outnumber humans. Build for it the way you'd build for a hostile crawler today — structured data, rate limits that don't punish real users, anti-spoofing for the legitimate-agent use case so you can tell them apart from the malicious ones. Human Security's 47% growth in malicious scraping is the number that should keep you up.
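One way to read "rate limits that don't punish real users" is a token bucket per client, with a different budget per traffic class. The sketch below is a minimal in-memory version. The class names, capacities, and refill rates are made-up values, and deciding whether a request is human, a verified agent, or an unknown bot is assumed to happen elsewhere.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TokenBucket:
    capacity: float        # maximum burst size, in requests
    refill_per_sec: float  # sustained requests per second
    tokens: float = field(init=False)
    updated: Optional[float] = field(default=None, init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity  # start with a full bucket

    def allow(self, now: Optional[float] = None) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        now = time.monotonic() if now is None else now
        if self.updated is not None:
            elapsed = max(0.0, now - self.updated)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical budgets: (burst capacity, requests per second) per traffic class.
LIMITS = {
    "human": (60.0, 5.0),
    "verified_agent": (30.0, 2.0),
    "unknown_bot": (5.0, 0.2),
}

buckets: dict = {}  # (client_id, traffic_class) -> TokenBucket


def allow_request(client_id: str, traffic_class: str, now: Optional[float] = None) -> bool:
    """Look up (or create) the caller's bucket and try to spend a token."""
    key = (client_id, traffic_class)
    if key not in buckets:
        capacity, rate = LIMITS[traffic_class]
        buckets[key] = TokenBucket(capacity, rate)
    return buckets[key].allow(now)
```

A request handler would call `allow_request(client_id, traffic_class)` and return an HTTP 429 when it comes back `False`. Keeping the human budget generous and the unknown-bot budget tight is the design choice that matters here; the specific numbers would need tuning against real traffic.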

And the FDE thing? Ng is right. If you are a startup and your customer-success motion is "we send someone to embed with you," you are selling services disguised as software. The market for that exists, but the market for AI Engineers who can build with off-the-shelf components is far larger and growing faster. Hire accordingly.

Quick Summary

Gemini 3.5 Flash raised the bar and the bill (roughly 3x the per-token price of its predecessor), the EU gave high-risk AI Act compliance until December 2027, AI agent traffic grew 80x in a year per Human Security, and Andrew Ng argues AI Engineer hiring will dwarf FDE hiring for the rest of the decade.

