Apple's $1B Secret: Why Siri Runs on Google Now

On January 12, 2026, Apple and Google announced a multi-year deal, worth approximately $1 billion per year according to Bloomberg, to integrate Google's 1.2-trillion-parameter Gemini model into Siri via Apple's Private Cloud Compute infrastructure. The model carries the internal codename Apple Foundation Models v10, and the deal gives Apple full model access with distillation rights for on-device models. The rollout is phased rather than a single launch: iOS 26.4 already shipped with initial "more personalized" Siri improvements, and the iOS 26.5 beta, expected in spring 2026 (April/May), should bring the first deep Gemini-powered Siri features. The integration is white-labeled, so no Google branding appears anywhere in the user experience.

The Deal That Changes Everything for AI Assistants

Apple paying Google roughly $1 billion per year to power Siri with Gemini is one of the most consequential AI deals of 2026 — and possibly the most strategically revealing. It confirms what many in the industry suspected: Apple, despite its $3 trillion market cap and world-class silicon team, could not build a frontier-class large language model fast enough to keep Siri competitive.

The announcement on January 12, 2026 was carefully staged. Apple and Google issued a joint statement confirming a "multi-year partnership to bring advanced AI capabilities to Apple platforms." Bloomberg subsequently reported the financial terms: approximately $1 billion annually from Apple to Google, making it one of the largest B2B AI licensing agreements ever signed. It is still dwarfed by the long-running search default deal between the two companies, which flows in the opposite direction: Google pays Apple upwards of $20 billion per year to remain the default search engine.

9to5Mac's follow-up reporting on March 25 revealed the deal is "deeper than previously known," with Apple gaining not just API access but full model weights, the ability to fine-tune and distill smaller on-device variants, and joint development privileges on future Gemini iterations optimized for Apple's hardware.

Deal Structure at a Glance

Component	Details
Announced	January 12, 2026
Annual Value	~$1 billion/year (Bloomberg)
Duration	Multi-year (exact term undisclosed)
Model	Google Gemini — 1.2 trillion parameters
Codename	Apple Foundation Models v10 (AFM v10)
Branding	White-labeled — zero Google branding
Access Level	Full model weights + distillation rights
Infrastructure	Apple Private Cloud Compute (PCC)
First Consumer Release	iOS 26.5 beta — Spring 2026 (expected April/May)

The white-label nature of the deal is particularly significant. Unlike Apple's integration of ChatGPT in iOS 18 (where OpenAI's branding was visible), Gemini runs entirely behind the scenes. Users will never see a "Powered by Google" badge. From Apple's perspective, this is Siri intelligence — not Google intelligence. The distinction matters enormously for brand positioning.

Technical Architecture: Private Cloud Compute and Model Distillation

The technical architecture of this deal is where it gets genuinely interesting — and where Apple's engineering philosophy becomes apparent. Apple is not simply making API calls to Google's servers. The integration runs through Apple's Private Cloud Compute (PCC) infrastructure, which Apple introduced in 2024 as its answer to the privacy challenges of cloud-based AI.

[Figure: Apple Private Cloud Compute architecture and Gemini model distillation pipeline. Caption: How Apple's Private Cloud Compute processes Gemini through on-device distillation]

How Private Cloud Compute Works with Gemini

Private Cloud Compute is Apple's custom cloud infrastructure running on Apple Silicon servers. The key property: user data is processed in secure enclaves that Apple itself cannot access. Data is encrypted in transit, processed ephemerally (never stored after the request completes), and the entire system is auditable by independent security researchers.

With the Gemini integration, the flow works like this:

Step 1 — On-device triage: Siri's local model (running on the A-series or M-series chip) evaluates whether a request can be handled on-device. Simple queries stay local.
Step 2 — PCC routing: Complex requests are encrypted and sent to Apple's Private Cloud Compute servers, where the full Gemini 1.2T model runs on Apple Silicon hardware.
Step 3 — Ephemeral processing: The query is processed in a secure enclave. No data is logged. No data is sent to Google. The response is encrypted and returned to the device.
Step 4 — On-device post-processing: Siri's local model formats and personalizes the response based on user context stored exclusively on-device.
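The four steps above can be sketched as a small routing function. This is only an illustration of the control flow, not Apple's implementation: the `complexity` score, the `COMPLEXITY_THRESHOLD` value, and every function name are hypothetical, and `encrypt`/`decrypt` are stand-ins that merely encode bytes so the sketch stays runnable.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical cutoff: requests scoring above it are treated as "complex".
COMPLEXITY_THRESHOLD = 0.5


@dataclass
class Request:
    text: str
    complexity: float  # 0.0 (trivial) to 1.0 (hard); assumed to come from the local model


def encrypt(payload: str) -> bytes:
    """Stand-in for real encryption; it only encodes so the sketch runs."""
    return payload.encode("utf-8")


def decrypt(blob: bytes) -> str:
    return blob.decode("utf-8")


def local_infer(query: str) -> str:
    """Placeholder for the on-device model."""
    return f"local answer to: {query}"


def pcc_infer(encrypted_query: bytes) -> bytes:
    """Step 3: ephemeral processing. Nothing is logged or kept after returning."""
    query = decrypt(encrypted_query)
    return encrypt(f"cloud answer to: {query}")


def personalize(answer: str, user_name: str) -> str:
    """Step 4: post-processing with context that stays on the device."""
    return f"{user_name}, {answer}"


def handle(request: Request, user_name: str) -> Tuple[str, str]:
    """Return (route, response) for a single request."""
    # Step 1: on-device triage keeps simple queries local.
    if request.complexity <= COMPLEXITY_THRESHOLD:
        return "on-device", personalize(local_infer(request.text), user_name)
    # Step 2: complex requests are encrypted and routed to the cloud tier.
    encrypted_reply = pcc_infer(encrypt(request.text))
    return "pcc", personalize(decrypt(encrypted_reply), user_name)
```

Calling `handle(Request("set a timer", 0.1), "Sam")` takes the on-device route, while a request with a complexity of 0.9 goes through the cloud tier. In both cases personalization happens last and only on the device side.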

This architecture means Google never sees Apple user data. Google provided the model weights and training methodology, but the actual inference happens entirely on Apple-controlled infrastructure. For privacy-conscious users — and Apple's marketing department — this is a critical distinction.

Model Distillation: The Long Game

Perhaps the most strategically significant clause in the deal is Apple's distillation rights. Model distillation is the process of training a smaller, more efficient model to replicate the behavior of a larger model. Apple can take the full 1.2-trillion-parameter Gemini and create compressed versions — potentially in the 3B to 30B parameter range — that run directly on iPhone, iPad, and Mac hardware without any cloud connection.
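To make "replicating the behavior of a larger model" concrete, here is a minimal sketch of the classic soft-target distillation loss: the student is penalized by the KL divergence between the teacher's and the student's temperature-softened output distributions. The source does not say which distillation method Apple uses, so treat this as a textbook illustration. The logits and the temperature of 2.0 are made-up values.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Turn logits into probabilities; a higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: List[float],
    student_logits: List[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Illustrative logits over three candidate tokens.
teacher = [4.0, 1.0, 0.5]
student_near = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
student_far = [0.5, 1.0, 4.0]   # prefers a different token

near_loss = distillation_loss(teacher, student_near)
far_loss = distillation_loss(teacher, student_far)
```

The loss is zero when the student matches the teacher exactly and grows as the two distributions diverge, so `near_loss` comes out smaller than `far_loss`. Training a smaller model means minimizing this quantity over many examples, usually alongside an ordinary task loss.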

This is not a minor technical detail. It's the entire strategic rationale for the deal from Apple's perspective. By distilling Gemini into on-device models optimized for Apple Silicon, Apple gets:

Offline AI capabilities: Siri working intelligently without internet access
Lower-latency responses: no network round-trip to the cloud for common queries
Complete privacy: data that never leaves the device at all
Reduced infrastructure costs: fewer PCC server requests as on-device models improve

We see this as the core of Apple's playbook. Pay Google now for frontier-class intelligence. Learn from the model architecture. Distill it onto Apple Silicon. Build internal expertise. Eventually, reduce dependency on Google entirely. We've seen this movie before — and it has a name.

The New Siri: What Changes for 2 Billion Apple Users

The iOS 26.5 beta, expected in spring 2026 (April/May), will deliver the first deep Gemini-powered Siri features. Mark Gurman at Bloomberg reported that some capabilities initially planned for iOS 26.4 were pushed to 26.5, and the most ambitious Siri features may even slip to iOS 27 (announced at WWDC in June, public release in September). iOS 26.4 already shipped with initial "more personalized" Siri improvements as the first phase of the rollout. Based on what has been demonstrated, the new Siri represents the most significant upgrade to the assistant since its launch in 2011. Three capabilities stand out:

On-Screen Awareness

Siri can now see and understand what's on your screen. If you're reading an article and say "summarize this," Siri processes the visible content and generates a summary. If you're looking at a restaurant in Maps and say "book a table for two on Friday," Siri understands the context without you naming the restaurant. This requires the kind of multimodal reasoning that a 1.2-trillion-parameter model excels at — understanding text, images, UI elements, and user intent simultaneously.

Cross-App Actions

Siri can now chain actions across multiple apps in a single request. "Take the photo I just shot, remove the background, resize it to 1080x1080, and post it to Instagram with the caption 'Sunday vibes'" — this kind of multi-step, cross-app workflow was impossible with the old Siri. The Gemini backbone gives Siri the reasoning capability to decompose complex requests into sequential app actions, handle edge cases, and recover from errors mid-workflow.
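The decompose, execute, and recover loop described above can be sketched as a tiny plan runner. Everything here is hypothetical: the step names, the dictionary used as shared state, and the single-retry policy are assumptions for illustration, and none of it reflects a real Siri or App Intents API.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Dict], Dict]]


def run_plan(steps: List[Step], state: Dict, max_retries: int = 1) -> Tuple[Dict, List[str]]:
    """Run steps in order, retrying a failed step before giving up on the chain."""
    log: List[str] = []
    for name, action in steps:
        for attempt in range(max_retries + 1):
            try:
                state = action(dict(state))  # each step works on a copy of the state
                log.append(f"ok:{name}")
                break
            except RuntimeError:
                log.append(f"retry:{name}" if attempt < max_retries else f"failed:{name}")
        else:
            return state, log  # every attempt failed, so stop the chain here
    return state, log


def remove_background(state: Dict) -> Dict:
    state["background"] = "removed"
    return state


def make_flaky_resize() -> Callable[[Dict], Dict]:
    """Build a resize step that fails once, to exercise the retry path."""
    calls = {"n": 0}

    def resize(state: Dict) -> Dict:
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("transient failure")
        state["size"] = (1080, 1080)
        return state

    return resize


def post(state: Dict) -> Dict:
    state["posted_with_caption"] = "Sunday vibes"
    return state


plan = [
    ("remove_background", remove_background),
    ("resize", make_flaky_resize()),
    ("post", post),
]
final_state, log = run_plan(plan, {"photo": "latest"})
```

Here the resize step fails on its first attempt and succeeds on the retry, so `log` records one retry and the chain still reaches the posting step. A step that keeps failing stops the chain and leaves the state as it was after the last successful step.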

Conversation Memory

For the first time, Siri maintains conversation context across sessions. You can say "remind me about that restaurant we discussed yesterday" and Siri knows which restaurant you mean. Conversation history is stored encrypted on-device (not in the cloud), with the on-device distilled model handling memory retrieval and the PCC-hosted Gemini model handling complex reasoning about that context when needed.

These three capabilities — screen awareness, cross-app actions, and conversation memory — collectively bring Siri from "voice search with a personality" to something approaching a genuine AI assistant. It's the difference between a tool that answers questions and an agent that does work on your behalf.

The Privacy Architecture: Why Apple Didn't Just Use an API

A reasonable question: why didn't Apple just use Google's Gemini API, the way millions of developers already do? The answer is privacy — and it's the reason this deal costs $1 billion a year instead of a per-token API fee.

If Apple used Google's API, every Siri query would be processed on Google's servers. Google would see the queries. Google's terms of service and data handling policies would apply. For a company that has built its entire brand on privacy — "What happens on your iPhone stays on your iPhone" — this was a non-starter.

By licensing the full model weights and running them on Private Cloud Compute, Apple maintains its privacy architecture. The data flow looks like this:

Google provides: Model weights, training methodology, optimization support, future model updates
Google never receives: User queries, personal data, usage patterns, device information
Apple controls: All inference infrastructure, all user data, all privacy policies

This arrangement is unprecedented in the AI industry. No other company has licensed a frontier model at this scale while maintaining complete data sovereignty. It sets a template that other privacy-conscious companies — and potentially regulators — will study closely.

Why Apple Couldn't Do It Alone

Apple has approximately 2,000 machine learning engineers. It operates one of the largest private compute clusters in the world. It designed the Neural Engine — a dedicated AI accelerator built into every Apple chip since 2017. And yet, when it came time to build a frontier-class LLM to power Siri, Apple went to Google.

This is not a failure of engineering talent. It's a reflection of the brutal economics of frontier model training. Building a 1.2-trillion-parameter model from scratch requires:

Training data at scale: Google has 25+ years of web indexing data, YouTube transcripts, Google Books, Google Scholar, and proprietary datasets that no other company can replicate. Apple's data assets — while substantial — are primarily device-level user data that can't be used for training due to Apple's own privacy commitments.
Training compute: Gemini's training run reportedly consumed tens of thousands of TPU-years of compute. Google built custom TPU pods for this purpose. While Apple has significant compute resources, they're optimized for inference on Apple Silicon, not large-scale transformer training.
Institutional knowledge: Google's research teams, now consolidated as Google DeepMind, have been training large language models since Google researchers published the original Transformer paper in 2017. That's nearly a decade of accumulated expertise in training dynamics, data curation, alignment, and evaluation. You can't replicate institutional knowledge by hiring alone; it takes years of compounding experience.
Time: Even with unlimited resources, training a frontier model takes 12-18 months from architecture design to production readiness. Apple needed Siri to be competitive now, not in 2028.

The $1 billion annual price tag reflects all of this. Apple is not just buying a model — it's buying a decade of accumulated AI R&D, a training dataset that can't be replicated, and 18+ months of time-to-market advantage.

The Intel Playbook: Apple's Long-Term Strategy

Anyone who has followed Apple for the past two decades will recognize a pattern. It's the same playbook Apple used with Intel processors — and it tells us exactly where this Google partnership is heading.

[Figure: Apple's Intel-to-Apple-Silicon playbook applied to AI, from Google Gemini to Apple Foundation Models. Caption: Apple's proven playbook: partner, learn, build in-house, replace]

Here's the timeline that maps directly onto Apple's current AI strategy:

Intel Playbook (2005-2020)	AI Playbook (2024-2030s?)
2005: Switch from PowerPC to Intel	2024: First LLM partnerships (OpenAI)
2005-2010: Learn x86 architecture deeply	2026: Gemini deal — full model access + distillation rights
2010: Ship first Apple-designed ARM chip (A4)	2026-2028: Distill Gemini, build on-device models, learn
2020: Apple Silicon replaces Intel entirely	2030s: Apple Foundation Models replace external dependencies

The parallels are striking. In 2005, Apple couldn't build processors competitive with Intel — so it partnered with Intel, learned the architecture, and spent a decade building its own silicon team. By 2020, Apple Silicon wasn't just competitive with Intel — it was categorically superior. The M1 chip didn't just match Intel's performance; it redefined what was possible in terms of performance-per-watt.

We see the Gemini deal as phase two of the exact same playbook applied to AI. Phase one was the OpenAI partnership in iOS 18 (2024) — Apple's equivalent of dipping its toes in the water. Phase two is the Gemini deal — deep integration, full model access, distillation rights. This is Apple learning at the deepest technical level how frontier models are built and optimized.

Phase three — which we expect to unfold over the next 3-5 years — is Apple building its own frontier models trained on Apple Silicon, optimized for Apple's hardware-software ecosystem, and gradually reducing dependence on Google. The distillation rights in the current deal aren't just about making smaller models for on-device inference. They're about Apple's ML engineers learning the architecture, the training dynamics, and the optimization techniques that make a 1.2-trillion-parameter model work.

Apple has already begun hiring aggressively for its internal AI teams. Job postings for "Foundation Model Research" roles at Apple have increased 340% since 2024, according to LinkedIn data. The Gemini deal buys Apple time — competitive Siri features today — while the internal team ramps up to build what comes next.

Industry Impact: What This Deal Means for the AI Landscape

The Apple-Google Gemini deal sends shockwaves through multiple sectors of the tech industry:

For OpenAI and Microsoft

Apple's shift from OpenAI (iOS 18) to Google Gemini (iOS 26) is a significant competitive blow. OpenAI loses access to 2 billion Apple devices as a distribution channel. Microsoft, which has invested over $13 billion in OpenAI, sees its partner displaced from the world's most valuable consumer platform. The message is clear: Apple will work with whichever AI provider offers the best combination of model quality, licensing terms, and privacy compatibility. Loyalty to any single provider is not part of Apple's calculus.

For Google

Google gains a $1 billion annual revenue stream and validation that Gemini is the preferred choice for the world's most selective technology company. But Google also faces a strategic risk: by giving Apple full model weights and distillation rights, Google is effectively training its future competitor. If Apple's internal models eventually surpass Gemini — as Apple Silicon surpassed Intel — Google loses both the revenue and the competitive moat.

For Anthropic and Meta

Anthropic (Claude) and Meta (Llama) were reportedly in discussions with Apple during the bidding process. The fact that Google won suggests Gemini's 1.2-trillion-parameter scale, multimodal capabilities, and Google's willingness to grant full model access were decisive factors. For Anthropic, this underscores the importance of scaling its models and pursuing enterprise licensing deals. For Meta, it raises questions about whether open-source distribution alone is sufficient to compete for premium partnerships.

For Consumers

The immediate impact for Apple's 2 billion users is a dramatically better Siri. But the deeper implication is that AI assistant quality is now directly tied to billion-dollar licensing deals between the biggest companies in tech. The era of meaningful AI competition between Siri, Google Assistant, Alexa, and Cortana may be ending — replaced by an era where the same underlying models power different brand experiences, and the differentiation is in UX, privacy architecture, and on-device optimization rather than raw intelligence.

What We're Watching Next

iOS 26.5 beta and public release: The beta is expected April/May 2026, with public release in summer 2026. Some ambitious Siri features may slip to iOS 27 (September 2026). Mark Gurman (Bloomberg) reports the rollout is phased, not a single big-bang launch.
On-device model quality: How good are Apple's distilled Gemini variants? If a distilled on-device model (for example, at 7B parameters) can handle 80%+ of queries without cloud fallback, Apple's PCC costs drop dramatically.
Apple's internal model progress: Any sign of Apple publishing research papers, filing patents, or recruiting senior researchers from DeepMind/Anthropic/OpenAI signals acceleration of the in-house timeline.
Regulatory scrutiny: A $1B/year deal between the two largest tech companies will attract antitrust attention. The EU's Digital Markets Act may require disclosure of the partnership terms.
Competitive response: How do Samsung (with Google Assistant), Microsoft (with Copilot/OpenAI), and Amazon (with Alexa/Anthropic) respond to Siri suddenly becoming frontier-class?
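The on-device model quality item above implies a simple piece of arithmetic: if cloud cost scales roughly with the number of requests that fall back to PCC, cloud load is proportional to one minus the share handled on-device. The sketch below works through that. The 50% baseline is a hypothetical figure chosen only for illustration, and linear cost scaling is an assumption.

```python
def pcc_share(on_device_share: float) -> float:
    """Fraction of requests that still need the cloud tier."""
    if not 0.0 <= on_device_share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return 1.0 - on_device_share


def relative_pcc_load(baseline_on_device: float, improved_on_device: float) -> float:
    """Cloud load after the improvement, as a multiple of the baseline load."""
    baseline = pcc_share(baseline_on_device)
    if baseline == 0.0:
        raise ValueError("baseline already needs no cloud requests")
    return pcc_share(improved_on_device) / baseline


# Hypothetical baseline: half of all queries handled on-device today.
ratio = relative_pcc_load(baseline_on_device=0.5, improved_on_device=0.8)
print(f"cloud load falls to {ratio:.0%} of baseline")  # prints "cloud load falls to 40% of baseline"
```

Under these assumptions, moving from 50% to 80% on-device handling cuts cloud requests by more than half. That is why the quality of the distilled models matters as much for cost as for user experience.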

The Bottom Line

Apple paying Google $1 billion per year for Gemini is a deal that looks expensive in the short term and brilliant in the long term — if Apple executes its playbook. The company gets an immediately competitive Siri, full access to a frontier model architecture for learning purposes, distillation rights to build its own on-device intelligence, and a privacy architecture that no competitor can match.

For Google, it's a calculated gamble: $1 billion in annual revenue today against the risk of creating a future competitor who no longer needs you. For the broader AI industry, it's confirmation that the era of "build everything in-house" is over. Even Apple — the most vertically integrated technology company on Earth — needs external AI partners. The question isn't whether to partner. It's how to structure the partnership so that you learn enough to eventually stand alone.

That's the Intel playbook. And Apple has proven, once, that it works.

Frequently Asked Questions

How much is Apple paying Google for Gemini?

According to Bloomberg reporting, Apple is paying approximately $1 billion per year under a multi-year deal. The exact contract duration has not been publicly disclosed. This makes it one of the largest B2B AI licensing agreements ever, though it's far smaller than the estimated $20+ billion that Google pays Apple annually for search default placement.

Will users see Google branding in Siri?

No. The deal is white-labeled, meaning there is zero Google branding anywhere in the Siri experience. Apple refers to the technology under the internal codename Apple Foundation Models v10 (AFM v10). From the user's perspective, this is Apple-built intelligence — Google's involvement is entirely behind the scenes.

How does Apple protect user privacy if Siri uses Google's AI model?

Apple runs Gemini on its own Private Cloud Compute (PCC) infrastructure — Apple Silicon servers with secure enclaves. Google provided the model weights but never receives user data. All queries are encrypted, processed ephemerally (never stored), and handled on Apple-controlled hardware. Google has no access to any Siri interactions.

What happened to Apple's partnership with OpenAI?

Apple integrated ChatGPT into iOS 18 in 2024, but that was a more limited integration with visible OpenAI branding. The Gemini deal represents a deeper, more strategic partnership: full model access, distillation rights, and white-label integration. The shift suggests Apple found Gemini's 1.2-trillion-parameter scale and Google's licensing terms more favorable for its long-term AI strategy.

Will Apple eventually build its own AI model and stop using Gemini?

This is widely expected, based on Apple's historical pattern with Intel processors. Apple partnered with Intel from 2005-2020 while building its own Apple Silicon, which ultimately replaced Intel entirely. The Gemini deal includes distillation rights that allow Apple to learn from the model architecture. Job postings for Apple's "Foundation Model Research" roles have increased 340% since 2024, according to LinkedIn data. The timeline for full independence is likely 4-6 years, around 2030-2032.

How much does Apple pay Google for Gemini per year?

Apple pays Google approximately $1 billion per year under a multi-year deal announced January 12, 2026 (per Bloomberg). This makes it one of the largest B2B AI licensing agreements ever signed, though it remains far smaller than the search default deal, under which Google pays Apple upwards of $20 billion/year.

Is the new Gemini-powered Siri better than Google Assistant in 2026?

Gemini-powered Siri is expected to offer on-screen awareness, cross-app action chaining, and persistent conversation memory, capabilities that match or exceed Google Assistant's feature set. Unlike Google Assistant, however, the Gemini integration is entirely white-labeled inside Siri and runs through Apple's Private Cloud Compute, meaning Google never processes Apple user data. There is no 'Powered by Google' badge anywhere in the UX.

Is Gemini-powered Siri better than ChatGPT on iOS?

The two integrations differ fundamentally in depth. ChatGPT was integrated into iOS 18 with visible OpenAI branding as a supplementary layer. Gemini runs fully white-labeled inside Siri at the OS level, enabling cross-app workflows (e.g., edit a photo and post to Instagram in a single command), on-screen context understanding, and encrypted on-device conversation memory — capabilities ChatGPT on iOS does not match at the system level.

When does iOS 26.5 ship with deep Gemini-powered Siri features?

iOS 26.5 beta is expected spring 2026 (April/May), delivering the first deep Gemini-powered Siri capabilities. iOS 26.4 already shipped an initial phase of improvements marketed as 'more personalized' Siri. The most ambitious features — including full cross-app action chains and conversation memory — may slip to iOS 27, announced at WWDC in June 2026 with a public release in September.

Does the Gemini-powered Siri integrate with third-party apps like Instagram?

Yes. Cross-app action chaining is one of the headline features enabled by Gemini's 1.2-trillion-parameter reasoning backbone. One example request: 'Take the photo I just shot, remove the background, resize it to 1080x1080, and post it to Instagram with the caption Sunday vibes.' Siri decomposes the multi-step request, executes each step across apps sequentially, and handles errors mid-workflow, something the previous Siri architecture could not do.

What are the limitations of Apple's Gemini-powered Siri deal?

Key limitations include: (1) the most ambitious features are at risk of slipping from iOS 26.5 to iOS 27; (2) the arrangement looks transitional, since Apple's distillation rights point to a planned reduction of Google dependency over time; (3) users receive zero transparency about the underlying model, as there is no Google branding anywhere; (4) running inference through Apple's Private Cloud Compute rather than Google's own API may constrain some Gemini capabilities. Exact deal duration has not been disclosed.

Will Apple eventually replace Google Gemini in Siri with its own model?

Almost certainly. The deal's most strategic clause grants Apple full model weights and distillation rights, meaning Apple can compress the 1.2-trillion-parameter Gemini model into smaller on-device models (potentially 3B–30B parameters) optimized for Apple Silicon. This is the 'Intel playbook' described above: Apple adopted Intel chips, studied the architecture, built its own Apple Silicon, and eliminated Intel entirely. The Gemini deal follows the same pattern.

How does Apple's Private Cloud Compute keep Gemini requests private from Google?

Apple's Private Cloud Compute (PCC) runs Gemini inference on Apple Silicon servers inside secure enclaves. User data is encrypted in transit, processed ephemerally — meaning it is never stored after the request completes — and is never forwarded to Google. Google provided only the model weights; all live inference runs on Apple-controlled hardware. Independent security researchers can audit the entire PCC system. This architecture is Apple's primary differentiator versus a standard API integration.

