Qwen3.7-Plus on Bailian June 2: multimodal, 1M context, five-step agentic loop, $0.40/$1.60 per 1M token (60% cheaper than text-only Max). The story: Alibaba closed the open-weights moat and the price cut is the tell.

Alibaba Closed Qwen and the 60% Price Cut Is the Real Tell

Alibaba released Qwen3.7-Plus on June 2, 2026, on Bailian (international: Model Studio). The model takes text, image, and video input, runs a 1-million-token context window with up to 256K tokens reserved for internal chain-of-thought, ships an explicit five-step agentic loop, and costs $0.40 per million input tokens and $1.60 per million output tokens. That last number is the one you should be staring at. It is 60% cheaper than the text-only Qwen3.7-Max that Alibaba released 13 days earlier, despite Plus being multimodal. The price gap is not an accident. It is a strategy.

Let me break down what shipped, what it actually does, and why the API-only distribution is the bigger announcement than the model.

The Release In One Paragraph

Qwen3.7-Plus is the multimodal sibling of Qwen3.7-Max (released May 20). Where Max is text-only, Plus adds image and video input — but explicitly not generation. It reads frames, it does not paint them. Alibaba's image and video generation stack is a separate family. The architecture is what Alibaba calls "multimodal hybrid agent technology": a model that is built first as an agent, with vision added as an input modality, rather than a vision model with a tool-call layer bolted on top.

The five agentic skills Alibaba is leading with are: deep reasoning, self-programming, tool invocation, verification and testing, and autonomous iteration. That maps almost directly onto the steps a Claude Code or Codex agent executes on a long-horizon task — except Alibaba is framing the loop as a model property, not a harness property. The model itself loops, verifies, and iterates, rather than relying on a developer-built outer loop.

The Architecture Bet: `preserve_thinking`

The technical decision worth caring about is the preserve_thinking parameter exposed at the API level. According to the VentureBeat report, this parameter retains internal <think> blocks across continuous conversational turns, so the model does not drop its reasoning trajectory or recompute its cached history on each turn. Anthropic calls the same idea "Extended Thinking" for Claude Opus 4.8. The rest of the frontier is converging on it under different names.

Alibaba's choice to expose it as a standardized, model-agnostic parameter — present on both the open-weight Qwen3.6-27B and the proprietary Max / Plus — is the part of the release that actually matters. It is a bet that long-horizon agents will be built with reasoning continuity as a first-class API contract. If you are building on Qwen today, design your agent loop around preserve_thinking rather than treating each turn as a stateless completion. The 1M context and 256K CoT reservation are useless if your harness restarts the thinking block on every tool call.

The 60% cost difference between Plus ($0.40/$1.60) and Max ($2.50/$7.50) for more capability is a pricing signal that Alibaba is not optimizing Plus for raw capability. It is optimizing for volume. They want Plus to be the cheap default that lands on every Chinese enterprise invoice, while Max stays as the premium tier. This is the AWS-spot-instance move: flood the low end, defend the high end.

The Numbers, Honestly

The Vision Arena preview posted #16 globally and gave Alibaba the #5 vision-lab slot worldwide. That places Plus behind the top US frontier labs (Claude Opus 4.8, GPT-5.5, Gemini 3.1 Pro Preview) and behind the top Chinese open-weight multimodal (Qwen2.5-VL). For a flagship model, #16 is "competitive," not "winning."

The Qwen3.7-Max text-only sibling scored 56.6 on the Artificial Analysis Intelligence Index, which was the highest placement for a Chinese model at release. Max is also proprietary. So both top Qwen 3.7 models are now closed, and the open-weight slot at the top of the Qwen lineup is held by Qwen3.6-27B and the older Qwen3-235B-A22B. Read that sentence again if you have been recommending Qwen to enterprise teams on the basis of "and the weights are downloadable."

No parameter count, no MoE-vs-dense breakdown, and no training data specification have been published for Qwen3.7-Plus. The model card on Bailian lists modalities, context length, and pricing. The rest is treated as Alibaba's IP. Standard playbook for closed labs. New for Qwen.

The Bigger Story Is The Strategy Shift

The Alibaba Qwen team spent three years building the most credible open-weight counterweight to the US frontier. The 2024–2025 releases (Qwen2, Qwen2.5, Qwen3) all shipped with permissive licenses, downloadable weights, and detailed model cards. Airbnb, Dell, a meaningful slice of the open-source agent ecosystem — they all integrated Qwen because the weights were there. That was the moat. It was built on openness, and it gave Chinese frontier AI a distribution channel no closed lab could match.

Qwen3.7-Max and Qwen3.7-Plus are both API-only. No weights. No license terms published. The strategy is now "Bailian is the platform, Model Studio is the international front door, the model is the lock-in." That is the same move OpenAI made with GPT-4 in 2023. It is the move Microsoft made with MAI two days earlier at Build 2026. It is the move that closes the open-weights moat Alibaba spent three years building.

The honest read: Alibaba looked at inference margins, watched DeepSeek V4 commoditize the open-weight tier, watched Microsoft ship seven proprietary MAI models with 10x cost advantages from custom silicon, and decided the open-weights play is a feature, not a business. The 60% price cut on Plus is Alibaba buying the transition. Cheap tiers pull workloads off Meta's Llama 4 and DeepSeek V4. Bailian's Agentic RL feedback loop improves the model on real execution traces the open-weight labs cannot follow. The flywheel needs a closed engine.

What You Actually Get If You Build On It

If you are integrating Qwen3.7-Plus today: the OpenAI-compatible Bailian API is straightforward, the pricing is competitive, the multimodal input handling is clean, and preserve_thinking is worth more than any benchmark number on the card for long-running agents. If you picked Qwen because of the open license: start the conversation with Alibaba Cloud sales now, because the Qwen 3.7 generation is closed and there is no public signal about which future Qwen releases will be open.

If you are an open-weight loyalist: Qwen3.6-27B and the Qwen3 family on Hugging Face are still there and still good. Qwen3.7 is a different product line with a different business model. Do not assume the next Qwen release will be downloadable.

The Take

Qwen3.7-Plus is a competent multimodal model with a real architectural bet on reasoning continuity and a price that makes volume economics work. It is also the public confirmation that the open-weights strategy at Alibaba has hit its expiry date. The cheap tier is the entry point. The proprietary stack is the destination. The 60% price cut from Max to Plus is the line item that says it out loud.

Every team that built a Qwen integration in 2025 because the weights were free now has a strategic question to answer. The answer is not "stop using Qwen." The answer is "treat Qwen 3.7 as a closed-API vendor, and pin your open-weights roadmap to Qwen 3.6 and DeepSeek V4."

The open Qwen era ended quietly. The 60% price drop is the receipt.

*Qwen3.7-Plus released June 2, 2026 by Alibaba (Qwen team). 1M-token context, 256K CoT reservation, multimodal input (text + image + video, no generation). Five agentic skills: deep reasoning, self-programming, tool invocation, verification, autonomous iteration. Standardized preserve_thinking parameter. Pricing: $0.40/M input, $1.60/M output on Alibaba Cloud Bailian / Model Studio / OpenRouter. Vision Arena #16 preview, Alibaba #5 lab in vision. Sibling Qwen3.7-Max (text-only, 56.6 on AA-II, $2.50/$7.50). Distribution: API-only, no weights, no published parameter count. Bailian platform adds Agentic RL training on real execution feedback. Sources: VentureBeat pricing & spec report, MarkTechPost launch coverage, Qwen3.7 family announcement, LLM Reference entry.*

Alibaba Closed Qwen and the 60% Price Cut Is the Real Tell

Alibaba Closed Qwen and the 60% Price Cut Is the Real Tell

The Release In One Paragraph

The Architecture Bet: preserve_thinking

The Numbers, Honestly

The Bigger Story Is The Strategy Shift

What You Actually Get If You Build On It

The Take

The Architecture Bet: `preserve_thinking`