Skip to content
news13 min read

Google Just Dropped Gemini 3.5 Flash at I/O 2026 — And It Has Anthropic Procurement Worried

Google launched Gemini 3.5 Flash at I/O 2026: 76.2% Terminal-Bench, 1656 Elo GDPval-AA, 4x faster, less than half price. Pro lands June. Decoded.

Author
Anthony M.
13 min readVerified May 21, 2026Tested hands-on
Gemini 3.5 Flash launch — benchmark scores Terminal-Bench 2.1 76.2 percent, GDPval-AA 1656 Elo, MCP Atlas 83.6 percent, CharXiv 84.2 percent
Gemini 3.5 Flash benchmarks announced at Google I/O 2026 — Terminal-Bench 2.1 76.2%, GDPval-AA 1656 Elo, MCP Atlas 83.6%, CharXiv Reasoning 84.2%.

What is Gemini 3.5 Flash? Google's new frontier model announced at I/O 2026 on May 19, scoring 76.2% on Terminal-Bench 2.1, 1656 Elo on GDPval-AA, 83.6% on MCP Atlas, and 84.2% on CharXiv Reasoning. It is generally available today across the Gemini app, AI Mode in Google Search, Google Antigravity, the Gemini API, and Gemini Enterprise. A larger sibling, Gemini 3.5 Pro, is in internal use and will roll out next month.

What Google actually announced at I/O 2026

On May 19, 2026, at the Shoreline Amphitheatre in Mountain View, Sundar Pichai used the Google I/O keynote to introduce two new frontier models: Gemini 3.5 Flash, available the same day, and Gemini 3.5 Pro, described as "coming next month." Pichai framed Gemini 3.5 Flash as "our first in a series of models combining frontier intelligence with action." The DeepMind announcement was signed by Koray Kavukcuoglu (CTO, Google DeepMind), Jeff Dean (Chief Scientist), Oriol Vinyals (VP) and Noam Shazeer (VP).

The launch sits in a market where Anthropic's Claude Opus 4.7 and OpenAI's GPT-5.5 already occupy the frontier-coding seat. Gemini 3 Flash from earlier in 2026 carried the line forward but did not reset positioning. Gemini 3.5 Flash is Google's attempt to do that on three axes at once: scoring, speed, and agent surface area.

The numbers Google disclosed — and the ones it did not

Four benchmark scores appear in the DeepMind post for Gemini 3.5 Flash:

  • Terminal-Bench 2.1: 76.2% — multi-step terminal task completion, the benchmark Claude Code popularized as the agentic-coding stress test.
  • GDPval-AA: 1656 Elo — the GDP-value-weighted economic-task ladder.
  • MCP Atlas: 83.6% — coverage of the Model Context Protocol tool-use ladder, the agent-routing benchmark Anthropic has been pushing.
  • CharXiv Reasoning: 84.2% — multimodal chart and document reasoning.

On speed, Pichai's keynote line was specific: "When looking at output tokens per second, it is four times faster than other frontier models." Inside Google Antigravity, the company claims an additional optimization layer pushes that to "12x faster than other frontier models." On price, Pichai said Flash delivers "frontier-level capabilities at less than half the price of comparable frontier models" and floated a thesis figure: for top customers, shifting 80% of workloads to Flash could mean "over $1 billion annually" in budget savings.

What Google did not disclose is just as important. There is no published price per token for Gemini 3.5 Flash or Pro in the launch post — pricing is not publicly disclosed yet. Google has not disclosed a context window figure publicly either. Anyone publishing a "Gemini 3.5 Flash supports X tokens of context at $Y per million tokens" headline today is filling those blanks themselves; Google has not.

Gemini 3.5 Flash available May 19 2026 versus Gemini 3.5 Pro coming June 2026 — frontier intelligence with action
Gemini 3.5 Flash ships May 19, 2026. Gemini 3.5 Pro is positioned as the larger sibling and will roll out next month.

Where you can actually use Gemini 3.5 Flash today

The DeepMind post lists the surfaces lit up at launch: the Gemini app globally, AI Mode in Google Search, Google Antigravity (the agent-first development platform), the Gemini API through Google AI Studio and Android Studio, the Gemini Enterprise Agent Platform, and Gemini Enterprise itself. That is a wider day-one footprint than Gemini 3 Flash got at its own launch, and it is a deliberate contrast with Anthropic's frontier-first, surface-second cadence and OpenAI's ChatGPT-first cadence.

The Antigravity wiring is the part to read carefully. The same day Gemini 3.5 Flash launched, Google also announced the transition from Gemini CLI to Antigravity CLI — a brand-new terminal experience built in Go, with native support for asynchronous multi-agent workflows and a unified architecture shared with the Antigravity 2.0 desktop app. The transition deadline is June 18, 2026: that is when Gemini CLI stops serving requests for Google AI Pro and Ultra users, the free tier, Gemini Code Assist IDE extensions, and the GitHub integration. Enterprise customers with existing Gemini Code Assist Standard or Enterprise licenses keep their access unchanged.

Read together, the model release and the CLI transition are a single move: Google is consolidating its developer surface area on Antigravity and using Gemini 3.5 Flash to give that surface a "12x faster" performance claim to anchor the migration story. Developers on the old Gemini CLI now have a one-month window to switch.

"Frontier intelligence with action" — what the agentic positioning actually means

Pichai's phrasing is precise. "Frontier intelligence" is the table-stakes claim — that Flash is, in his words, "very capable, at the frontier and comparable to the best models, but it's still very fast." "With action" is the load-bearing half. The DeepMind post lists the agentic capabilities Google is putting weight on: multi-step workflow execution, subagent deployment and orchestration, long-horizon task completion, multimodal UI and graphics generation, and context retention across complex tasks.

That list maps cleanly onto the Terminal-Bench 2.1 and MCP Atlas scores. Terminal-Bench 2.1 76.2% is in the same neighborhood as where the leading agentic coding models have been landing in 2026, and MCP Atlas 83.6% says Google is no longer leaving the protocol-level connector layer to Anthropic. Anthropic published MCP in late 2024 and has spent 2026 turning the protocol into a strategic moat — most recently acquiring Stainless, the SDK generator behind OpenAI, Google and Cloudflare. Google scoring 83.6% on MCP Atlas is the answer: it will play in MCP, hard, on its own surfaces.

Gemini 3.5 Flash sits in the top-right quadrant of the Artificial Analysis Intelligence Index — high intelligence and 4x faster output speed
Google's positioning claim: Gemini 3.5 Flash in the top-right quadrant of the Artificial Analysis Intelligence Index, at 4x the output tokens per second of other frontier models.

The Flash pricing thesis and why it matters strategically

Pichai said Gemini 3.5 Flash delivers "frontier-level capabilities at less than half the price of comparable frontier models." He paired that with a strikingly concrete number: for a single top company shifting 80% of workloads to Flash, the annual savings could be "over $1 billion." That figure does not need to be true at the median to do its job. It is a thesis aimed at the procurement layer of Fortune 500 buyers who already negotiate seven-figure annual inference budgets.

The strategic context is that Google spent March 2026 lighting the cheap-model pricing war on fire with Gemini 3.1 Flash-Lite at $0.25 per million tokens. Gemini 3.5 Flash is the next step up that ladder: not the cheapest, but the model where Google is trying to take the "frontier capability per dollar" framing away from competitors. Anthropic has spent 2026 leaning into the opposite framing — quality, safety, and SAP-grade enterprise positioning. OpenAI has been moving the GPT-5.5 family toward the consumer and agentic-super-app layer. Google is staking the middle and the procurement floor.

How it stacks against Claude Opus 4.7 and GPT-5.5

Direct cross-vendor benchmark comparison from a single launch post is always partial. Three structural observations are safe:

  • Speed framing. "Four times faster than other frontier models" is the kind of claim that becomes the procurement headline. Whether it survives independent benchmarking on Artificial Analysis is the next 30-day question.
  • MCP positioning. 83.6% on MCP Atlas is a deliberate posture against Anthropic's protocol-control playbook. With Anthropic having just absorbed Stainless, Google needs in-house excellence on the connector layer, and the number reads as that statement.
  • Surface coverage. Gemini 3.5 Flash launching simultaneously into Gemini app, AI Mode, Antigravity, the API, and Gemini Enterprise is a wider day-one footprint than either Claude Opus 4.7 or GPT-5.5 Instant got at launch. Google is using its surface advantage and its existing enterprise sales floor — the same one that just landed the Amdocs Gemini Enterprise marketplace deal — to amortize the model release.

None of this resolves the qualitative coding-quality question for production teams. That is a 60-day question and requires real workload data, not benchmark snapshots.

Gemini 3.5 Flash powers Antigravity 2.0, Gemini Enterprise Agents, AI Mode and the Gemini app — 12x faster than other frontier models inside Antigravity
The Gemini 3.5 Flash agentic stack — model at the base, Antigravity 2.0 with its 12x optimization claim, Gemini Enterprise Agents, and consumer surfaces on top.

What this means for developers this week

For teams shipping on Gemini today, three concrete consequences:

  1. The CLI clock is ticking. Gemini CLI sunsets on June 18, 2026 for Google AI Pro and Ultra subscribers, free-tier users, and Gemini Code Assist IDE extensions and GitHub integration. Antigravity CLI is the migration target.
  2. The Flash default is shifting. If you are calling Gemini 3 Flash through the API, "Flash" no longer means the same model class it did last week. Pricing tier and capability tier have both moved.
  3. Pro is the second shoe. Gemini 3.5 Pro is rolling out next month. Anyone planning a Q3 production cutover should be reading the launch quietly while Pro lands, not pre-committing to architecture decisions on Flash benchmarks alone.

What Google has not confirmed — the open questions

Five things worth tracking that the launch post does not answer:

  • Per-token pricing. No public tier published. Pricing is not publicly disclosed yet.
  • Context window. Google has not disclosed a context window figure publicly for Gemini 3.5 Flash or 3.5 Pro in the launch materials. Anything circulating is speculation until the API documentation publishes the figure.
  • Exact rollout date for Gemini 3.5 Pro. "Coming next month" puts it in June 2026; Google has not narrowed that to a calendar day.
  • How Gemini 3.5 Pro changes the pricing thesis. The "less than half the price" framing applies to Flash. Whether Pro pricing follows the same logic is an open question.
  • Antigravity 12x methodology. The "12x faster than other frontier models" claim inside Antigravity has not been backed by an independent benchmark methodology in the launch materials.
Gemini 3.5 Flash less than half the price of comparable frontier models — up to one billion dollars in annual savings if eighty percent of workloads shift
Google's pricing thesis for Gemini 3.5 Flash — less than half the price of comparable frontier models, with up to $1B annual savings claimed for top customers shifting 80% of workloads.

The strategic read: Google is buying the procurement floor

Stack the day's signals: a frontier benchmark sheet released across five product surfaces simultaneously, a pricing claim sized for Fortune 500 procurement, a developer-tool migration scheduled to land one month later, and a Pro variant teased for the same window. This is not a model release in the Anthropic sense, where the model is the artifact and the surface area follows. This is a platform release where the model is the wedge for a procurement and developer-surface push.

Anthropic spent the spring of 2026 winning the SAP floor, the PwC certification floor, and the protocol-control floor with Stainless. OpenAI spent it winning the consumer surface with the GPT-5.5 family and Codex on mobile. Google's I/O 2026 answer is to take the procurement floor — the place where CIOs actually negotiate annual inference contracts — and use a fast, cheap, frontier-positioned Flash model plus a developer-surface lock-in to do it.

How that move lands depends on three things over the next 60 days: independent benchmarks confirming the 4x speed claim, the Gemini 3.5 Pro launch landing on time, and developers migrating cleanly to Antigravity CLI before the June 18 sunset. None of those are guaranteed. All three are inside Google's control.

Bottom line

Gemini 3.5 Flash is a credible frontier-tier release with a procurement-shaped pricing thesis and a wider day-one surface than its peers. The benchmark numbers are concrete and source-cited. The pricing per token, the context window, and the independent verification of the 4x and 12x speed claims are not yet on the table. For production teams: track Gemini 3.5 Pro in June, migrate to Antigravity CLI before June 18, and wait for independent benchmark verification before re-architecting around the speed claims.

Frequently asked questions

What is Gemini 3.5 Flash?

Gemini 3.5 Flash is Google DeepMind's new frontier model, announced at Google I/O 2026 on May 19, 2026 and generally available the same day. Sundar Pichai positioned it as "our first in a series of models combining frontier intelligence with action." The DeepMind announcement was signed by Koray Kavukcuoglu, Jeff Dean, Oriol Vinyals and Noam Shazeer.

What benchmark scores did Google publish for Gemini 3.5 Flash?

Four scores: Terminal-Bench 2.1 at 76.2%, GDPval-AA at 1656 Elo, MCP Atlas at 83.6%, and CharXiv Reasoning at 84.2%. All four are listed in the DeepMind launch post for Gemini 3.5 Flash.

When does Gemini 3.5 Pro launch?

Google said Gemini 3.5 Pro is currently in internal use and will roll out "next month" — meaning June 2026. No specific calendar date has been disclosed. The DeepMind post does not publish benchmark scores for the Pro variant yet.

How much does Gemini 3.5 Flash cost per token?

Google has not publicly disclosed per-token pricing for Gemini 3.5 Flash or Gemini 3.5 Pro in the launch materials. Sundar Pichai said Flash delivers "frontier-level capabilities at less than half the price of comparable frontier models," but no specific dollar figure per million tokens has been published.

What is the context window for Gemini 3.5 Flash?

Google has not disclosed a context window figure publicly for Gemini 3.5 Flash in the launch materials. Any specific token-count claim circulating today is not from the official DeepMind announcement and should be treated as unconfirmed until the API documentation publishes the figure.

How fast is Gemini 3.5 Flash compared to Claude Opus 4.7 and GPT-5.5?

Pichai claimed in his I/O 2026 keynote that Gemini 3.5 Flash is "four times faster than other frontier models" when measured by output tokens per second, and that the optimization inside Google Antigravity makes it "12x faster than other frontier models." Google did not name Claude Opus 4.7 or GPT-5.5 specifically, and independent third-party benchmarks confirming these speed claims have not yet been published.

Where can I use Gemini 3.5 Flash today?

The DeepMind launch post lists five day-one surfaces: the Gemini app globally, AI Mode in Google Search, Google Antigravity (the agent-first development platform), the Gemini API via Google AI Studio and Android Studio, the Gemini Enterprise Agent Platform, and Gemini Enterprise itself.

What is happening to Gemini CLI?

Google announced on May 19, 2026 that Gemini CLI is being transitioned to Antigravity CLI. On June 18, 2026, Gemini CLI stops serving requests for Google AI Pro and Ultra users, the free tier, Gemini Code Assist IDE extensions, and the GitHub integration. Enterprise customers with existing Gemini Code Assist Standard or Enterprise licenses keep their access unchanged. Antigravity CLI is the migration target.

How does Gemini 3.5 Flash relate to Google Antigravity?

Google Antigravity is the agent-first development platform Google has been positioning since April 2026. Gemini 3.5 Flash is the default model inside Antigravity 2.0, and Google claims an additional optimization layer inside Antigravity makes the model "12x faster than other frontier models" rather than the baseline 4x cited for the model on its own.

How does this compare to MCP and Anthropic's strategy?

Model Context Protocol (MCP) was introduced by Anthropic in late 2024 and turned into a strategic asset through 2026, most recently with Anthropic acquiring Stainless — the SDK generator that powers SDKs and MCP servers for OpenAI, Google, Cloudflare, Replicate and Runway. Gemini 3.5 Flash scoring 83.6% on MCP Atlas reads as Google's signal that it will compete on the connector-protocol layer on its own surfaces, rather than ceding it to Anthropic.

Should developers re-architect production systems on Gemini 3.5 Flash today?

The conservative path is to test Gemini 3.5 Flash on representative production workloads, wait for independent benchmarks confirming the 4x and 12x speed claims, monitor the Gemini 3.5 Pro rollout in June 2026, and only then make architecture commitments. The migration to Antigravity CLI is the one near-term action that is forced by the June 18, 2026 Gemini CLI sunset.

What did Google not confirm at the launch?

Five open items: per-token pricing for Flash and Pro, the context window size for both models, the exact calendar date for Gemini 3.5 Pro within June 2026, whether Pro pricing follows the same "less than half the price" framing as Flash, and the independent benchmark methodology behind the 12x speed claim inside Antigravity.

Sources: Google DeepMind blog — Gemini 3.5; Sundar Pichai I/O 2026 keynote post; Google Developers blog — Gemini CLI to Antigravity CLI transition.

Related Articles

Was this review helpful?
Anthony M. — Founder & Lead Reviewer
Anthony M.Verified Builder

We're developers and SaaS builders who use these tools daily in production. Every review comes from hands-on experience building real products — DealPropFirm, ThePlanetIndicator, PropFirmsCodes, and many more. We don't just review tools — we build and ship with them every day.

Written and tested by developers who build with these tools daily.