
Google’s $4 Billion AI Chip Strategy Misses The Real Bottleneck
Last week, The Information reported that Google is in late-stage talks with Marvell Technology to co-develop two new custom AI chips. One is a memory processing unit to pair with Google’s existing Tensor Processing Units. The other is an entirely new TPU designed specifically for AI inference, the kind of real-time work an AI model does every time it answers a question, not the massive training runs that built it in the first place.
If the Marvell deal closes, Google will have three external design partners (Broadcom, MediaTek, and Marvell) working alongside its own internal teams on its custom silicon supply chain. Google now expects TPU shipments to hit 4.3 million units in 2026, scaling to 35 million by 2028. TrendForce projects custom AI chip sales to grow 45% in 2026, compared to 16% for standard GPUs. Google is offering cloud providers financial guarantees to make its TPUs the default for production AI workloads. It is launching a new software layer called TorchTPU to help developers move off the Nvidia ecosystem.
Most coverage frames this as the moment Google decides to take Nvidia seriously. That framing is not wrong. It is just measuring the wrong battle.
The battle that actually matters is happening inside one company in Taiwan that neither Google nor Nvidia controls, and neither of them can replace.
What Inference Is, And Why It Matters Now
Training is the phase where a model learns. Massive clusters of GPUs chew through billions of data points over months. It is expensive, computationally intense, and it is where most of the AI spending went from 2022 to 2025.
Inference is what happens after the model is trained. Every time someone asks ChatGPT a question, every time Claude writes a line of code, every time a company’s internal AI agent looks up a document, that is inference. Each request is small. But if you multiply it by billions of users and hundreds of enterprises, inference starts to cost more than training ever did.
This is why the chip industry is pivoting. Training chips were built for raw power. Inference chips need to be built for cost, speed, and efficiency per query. The companies that win inference will not necessarily be the ones with the fastest chip. They will be the ones who can deliver inference at the lowest cost per million tokens, which is a different engineering problem.
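To make that metric concrete, here is a back-of-envelope sketch of inference economics. Every number in it is an illustrative assumption, not a figure from Google, Nvidia, or any vendor.

```python
# Back-of-envelope cost per million output tokens for an inference deployment.
# All numbers are illustrative assumptions, not vendor figures.

accelerator_cost_per_hour = 4.00  # assumed all-in hourly cost of one accelerator, dollars
tokens_per_second = 1_500         # assumed sustained output tokens/s across batched requests

tokens_per_hour = tokens_per_second * 3_600                          # 5.4M tokens/hour
cost_per_million = accelerator_cost_per_hour / tokens_per_hour * 1_000_000

print(f"~${cost_per_million:.2f} per million tokens")                # ~$0.74 here
```

Under these assumptions the figure comes out around $0.74 per million tokens, and it halves if you either halve the hardware cost or double the sustained throughput. That is why inference silicon is judged on cost and efficiency per query rather than on peak performance.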
Google sees this clearly. Its Marvell deal targets exactly this gap. The memory processing unit it is co-developing is designed to fix the memory bandwidth bottleneck that slows down inference on existing TPUs. The new inference-specific TPU is a ground-up redesign for this workload. MediaTek’s contribution, a chip codenamed Zebrafish, targets a 30% reduction in cost per inference operation. The whole strategy is coherent and the engineering is serious.
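The memory bottleneck is easy to see with arithmetic. During autoregressive decoding, generating each token requires streaming essentially all of the model’s weights through the chip, so per-request throughput is capped by memory bandwidth, not compute. A minimal sketch, with assumed numbers that are not TPU or Marvell specifications:

```python
# Why inference is memory-bound: each decoded token re-reads the model weights,
# so bandwidth, not FLOPS, sets the single-request throughput ceiling.
# Assumed numbers for illustration only.

hbm_bandwidth_gb_s = 3_000   # assumed accelerator memory bandwidth, GB/s
model_params = 70e9          # assumed 70B-parameter model
bytes_per_param = 2          # bf16/fp16 weights

weight_bytes = model_params * bytes_per_param                     # 140 GB per decode pass
ceiling_tokens_per_s = hbm_bandwidth_gb_s * 1e9 / weight_bytes    # ~21 tokens/s

print(f"Single-request decode ceiling: ~{ceiling_tokens_per_s:.0f} tokens/s")
```

Batching amortizes the weight reads across many concurrent requests, but that ceiling is why an inference-focused design pulls the bandwidth lever first. It is exactly the job a dedicated memory processing unit exists to do.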
And none of it matters if the chips cannot actually be fabricated.
The One Building That Makes All Of This Work
Every advanced AI chip on the planet, with a handful of irrelevant exceptions, is manufactured by Taiwan Semiconductor Manufacturing Company (TSMC). Nvidia’s Blackwells. Google’s TPUs. Amazon’s Trainium. Microsoft’s Maia. AMD’s MI series. If the chip is cutting-edge and designed for AI, it comes off a wafer in Hsinchu, Taiwan. TSMC controls over 90% of global advanced chip fabrication.
Here is what that means in practice for Google’s ambitious multi-partner strategy.
In January 2026, Nvidia surpassed Apple as TSMC’s largest customer, a position Apple had held for more than a decade. Creative Strategies analyst Ben Bajarin projects Nvidia will generate around $33 billion in TSMC revenue in 2026, representing roughly 22% of the foundry’s total revenue. Apple is now second, at $27 billion and 18%.
The more important number is in advanced packaging, the CoWoS (Chip-on-Wafer-on-Substrate) process used to stitch together logic and memory into a working AI chip. Nvidia has reserved over half of TSMC’s CoWoS capacity for 2026, with some reports citing over 60%, booking 800,000 to 850,000 wafers. TSMC itself is doubling its CoWoS capacity from roughly 35,000 wafers per month in late 2024 to a projected 130,000 per month by the end of 2026. Even with that expansion, Nvidia’s bookings extend into 2027.
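Those figures hang together. Assuming a roughly linear ramp between the reported late-2024 and end-2026 capacity points (the actual ramp profile is not public), Nvidia’s booking works out to roughly two-thirds of 2026 supply:

```python
# Sanity check on the CoWoS figures above, assuming a linear capacity ramp
# from late 2024 to the end of 2026. The real ramp profile is not public.

start_wpm = 35_000    # wafers/month, late 2024 (reported)
end_wpm = 130_000     # wafers/month, end of 2026 (projected)
ramp_months = 24      # late 2024 -> end of 2026

jan_2026 = start_wpm + (end_wpm - start_wpm) * 12 / ramp_months   # ~82,500 wpm
avg_2026 = (jan_2026 + end_wpm) / 2                               # ~106,250 wpm
supply_2026 = avg_2026 * 12                                       # ~1.28M wafers in 2026

nvidia_booking = 825_000  # midpoint of the reported 800,000-850,000 range
print(f"Nvidia share of 2026 CoWoS: ~{nvidia_booking / supply_2026:.0%}")  # ~65%
```

Roughly 65% under these assumptions, which squares with the “over 60%” reports and leaves every other designer, Google included, fighting over the remaining third.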
TSMC CEO C.C. Wei has been explicit about what this means structurally:
- Building a new fab takes approximately three years.
- Reaching full production capacity takes another two years.
- That is a five-year minimum between deciding to expand and serving new customers at scale.
Google cannot fab its way out of this. Neither can Amazon, Microsoft, or anyone else. Every AI chip designer on the planet is now queuing for the same machines, in the same buildings, operated by the same company, whose next round of capacity will not come online for years.
Why This Changes The Frame
Here is the trap most investors fall into with AI chip stories: they think the competition is between Nvidia and Google, or Nvidia and AMD, or Nvidia and Amazon’s Trainium. They read “Google develops custom TPU” as “Google is coming for Nvidia’s lunch.”
What is actually happening is different. Google, Nvidia, Amazon, Microsoft, and every other designer are competing for slots at TSMC. The designer who gets more slots ships more chips. The designer who gets fewer slots ships fewer. The winner of the design race is still constrained by the gatekeeper of the fabrication race. And that gatekeeper has made one thing obvious over the last two years: it favors its biggest customer, and its biggest customer is Nvidia.
Why does Nvidia get the best slots? Three reasons:
- It writes the biggest upfront checks. Multi-billion-dollar prepayments lock in production years in advance.
- Its chips are the most complex and highest-revenue-per-wafer. TSMC earns more per square millimeter on a Blackwell than on almost anything else it makes.
- Jensen Huang built the relationship with Morris Chang, TSMC’s founder, three decades ago. Huang told Chang at the time that Nvidia would one day be TSMC’s biggest customer. That prediction is now reality.
Google is now trying to do what Nvidia did, but on a compressed timeline. The Marvell talks, the MediaTek partnership, the extended Broadcom agreement, all of it is Google building the same kind of structural redundancy Nvidia spent years earning. It is the right move. It is also a multi-year project, during which Nvidia will keep getting the best capacity, the fastest nodes, and the earliest access to new processes.
The investor mistake is reading “Google challenges Nvidia” as “Nvidia’s moat is cracking.” The actual lesson is smaller and more precise: the moat has always been at the fab, not at the designer, and the fab has already told you who it favors.
What To Actually Watch
The people who understand this industry stopped watching chip benchmarks a year ago. They watch allocation reports.
The numbers that matter are not clock speeds or FLOPS. They are:
- TSMC’s quarterly capacity disclosures, which tell you how many wafers are actually rolling off the advanced nodes
- CoWoS allocation by customer, which tells you who is getting the packaging that turns silicon into a working AI accelerator
- TSMC’s capex guidance, which tells you how fast the capacity wall is moving
- The gap between ordered and delivered wafers, which is the real measure of supply tightness
Every AI chip story of 2026 will eventually get filtered through these numbers. Google’s TPU ambitions, Amazon’s Trainium scaling, Microsoft’s Maia roadmap, Anthropic’s rumored custom silicon, even Nvidia’s Rubin architecture. None of them exist without TSMC, and TSMC’s pace is set by physics, not marketing.
There is a reasonable argument to be made that the most valuable position in the entire AI stack is not the best chip designer or the best model or even the best cloud provider. It is the one company that controls the physical bottleneck everyone else has to pass through. TSMC does not need to win the AI race. It just needs to keep being the only place where the race can actually happen.
That is the real story behind Google’s multi-partner strategy. The design wars are real. The winners will make real money. But the fab wall is taller than any design advantage, and it is getting taller every year as the capacity expansion lags the demand curve. Google is not building an Nvidia killer. It is building insurance against the possibility that Nvidia takes an even bigger slice of a pie that neither of them controls.
The bottleneck is not in Mountain View or Santa Clara. It is in Hsinchu. Watch that building, not the benchmarks.