NVIDIA's $200 Billion Bet: Vera CPU and the Agentic AI Infrastructure Play

On Wednesday, NVIDIA reported quarterly revenue of $81.6 billion — an 85% year-over-year increase — and guided to $91 billion for the next quarter. Net income hit $58 billion. The numbers alone are staggering, but what stopped the room cold wasn't the GPU business. It was a CPU.

"We have a major new growth driver — Vera," CEO Jensen Huang told analysts on the earnings call. "Vera Rubin is going to be even more successful than Grace Blackwell."

That sentence, measured against NVIDIA's track record of backing extravagant claims with extravagant results, amounts to a declaration of war on the $200 billion server CPU market — a market NVIDIA has never touched. Intel and AMD should be paying close attention.

Key figures: $81.6B Q1 FY2027 revenue; $200B new CPU TAM; $20B year-one CPU revenue; 88 custom Olympus Arm cores.

The Thesis: Agents Don't Rent Compute. They Consume It.

To understand why NVIDIA is betting so heavily on CPUs, you need to understand how agentic AI differs from the generative AI workloads that fueled the company's GPU supercycle.

Training a large language model is a batch job: you rent thousands of GPUs, burn through petaflops of compute, and produce a model. Inference — the act of running that model against a prompt — is a real-time job that still leans on GPUs. But agentic AI introduces a third category: orchestration. An agent doesn't just generate tokens. It reasons over them, calls tools, queries databases, manages memory, updates state, and plans multi-step workflows. That orchestration loop runs on CPUs.
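The split between GPU-side token generation and CPU-side orchestration can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's framework: `fake_model`, `TOOLS`, and `run_agent` are hypothetical names, and the model call is stubbed so the example runs without a GPU.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry: plain Python functions that run on the CPU.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup": lambda key: {"vera_cores": "88"}.get(key, "unknown"),
    "upper": lambda text: text.upper(),
}


def fake_model(history: List[str]) -> str:
    """Stand-in for a GPU inference call (assumption: a real agent would
    send `history` to a model endpoint here)."""
    if len(history) == 1:
        return "CALL lookup vera_cores"
    return f"FINAL Vera has {history[-1]} cores"


def run_agent(goal: str, model: Callable[[List[str]], str], max_steps: int = 5) -> str:
    """One goal in, several intermediate steps, one result out."""
    history = [goal]                      # agent memory/state lives on the CPU
    for _ in range(max_steps):
        reply = model(history)            # inference step (GPU in production)
        if reply.startswith("FINAL "):
            return reply[len("FINAL "):]
        # Everything below is orchestration: parse, dispatch, update state.
        _, tool_name, arg = reply.split(" ", 2)
        result = TOOLS[tool_name](arg)
        history.append(result)
    raise RuntimeError("agent did not finish within max_steps")


answer = run_agent("How many cores does Vera have?", fake_model)
print(answer)
```

Only the `model(history)` line would touch a GPU in a real deployment; the parsing, tool dispatch, and state update around it are the CPU work the article is describing.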

"The economics of AI, of the future, is tokens per dollar. Agents don't rent capacity. They need a task completed as fast as possible at the lowest cost per completed action." — Jensen Huang, Q1 FY2027 Earnings Call

Chamath Palihapitiya (@chaaborysath) on X, May 22, 2026: "$81.6B quarter. $91B guidance. A new $200B CPU market. Forward P/E of 22x. This is the cheapest mega-cap growth stock on the planet relative to its trajectory. The market is sleepwalking."

Huang's formulation is precise. As AI shifts from "generate an answer" to "complete a task," the bottleneck moves from GPU throughput to CPU latency. Every tool call, memory lookup, and state update between model calls is a CPU operation. And when billions of agents run simultaneously, which is Huang's stated expectation, the aggregate CPU demand dwarfs anything the server market has seen.

Figure: The three-layer compute stack for agentic AI. Vera targets the middle layer, the orchestration work that agents generate at massive scale.

• • •

What Vera Actually Is

Vera isn't just "another Arm server chip." It's a ground-up design optimized for a specific workload profile that didn't exist three years ago. Here's the technical reality:

The chip packs 88 custom Olympus Arm cores with simultaneous multi-threading (SMT), offering two threads per core. It supports up to 1.5 TB of LPDDR5x SOCAMM memory at up to 1.2 TB/s bandwidth. For context, that's laptop-class memory technology deployed at server scale — chosen specifically for its power efficiency. NVIDIA claims 50% faster per-core performance under full load, 2x performance per watt, and 4x rack density compared to x86 alternatives.

But the real differentiation isn't the cores. It's the NVLink fabric integration. Vera was co-designed with the Rubin GPU, sharing the same interconnect, the same memory coherence model, and the same software stack. When an agent's reasoning loop kicks off a GPU inference call and then immediately needs the CPU to process the result, call a tool, and feed the output back — the data never leaves the NVLink domain. That round-trip latency is where traditional x86 + PCIe architectures bleed performance.

NVIDIA debuted Vera at GTC in March 2026 and began hand-delivering the first units in May to Anthropic, OpenAI, SpaceX, and Oracle. Shipments ramp broadly in Q3. The company expects $20 billion in standalone Vera CPU revenue this fiscal year — and crucially, this figure is not included in the $1 trillion Blackwell/Rubin framework NVIDIA announced earlier. It's additive.

"The company now expects $20B of FY27 standalone Vera CPU revenue that is not included in the $1T Blackwell/Rubin framework. Vera is a potential additive revenue layer rather than a replacement for GPU demand." — Benchmark Analysts, post-earnings note

The $200 Billion Market Nobody Saw Coming

NVIDIA has never addressed the server CPU market. That market has been Intel and AMD territory for decades. Huang is now claiming a $200 billion total addressable market (TAM) for CPUs in the agentic AI era — and he confirmed in Taipei this weekend, ahead of Computex, that the forecast includes China.

Where does the $200 billion come from? GF Securities analysts project that agentic AI will account for 30% of overall inference computing, with the CPU TAM reaching $211 billion by 2030. In unit terms, they estimate demand growing from 3.7 million units in 2026 to 16.3 million in 2028 — a 4.4x increase in two years.
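The unit forecast can be checked with quick arithmetic. The sketch below uses only the two figures quoted above; the compound annual growth rate is derived here and is not a number from the GF Securities note.

```python
units_2026 = 3.7e6   # GF Securities estimate quoted above
units_2028 = 16.3e6  # GF Securities estimate quoted above
years = 2028 - 2026

# Overall growth multiple and the annual rate that would produce it.
multiple = units_2028 / units_2026
cagr = multiple ** (1 / years) - 1

print(f"growth multiple: {multiple:.1f}x")
print(f"implied CAGR: {cagr:.0%}")
```

A 4.4x increase over two years works out to unit demand slightly more than doubling each year.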

The logic is straightforward. Today's inference is mostly "one prompt in, one answer out." Agentic inference is "one goal in, dozens of intermediate steps, tool calls, memory queries, state updates, and a final result out." Each step in that chain generates CPU work. Multiply by billions of agents, and the CPU demand curve goes parabolic.

Figure: NVIDIA's two-pillar revenue architecture. The $200B CPU opportunity sits entirely outside the existing $1T GPU framework.

Who's Already Bought In

The customer list reads like a who's who of AI infrastructure. NVIDIA hand-delivered the first Vera systems to Anthropic, OpenAI, SpaceX, and Oracle, and Meta has signed a separate multi-year deal:

Anthropic — Claude's inference costs are a well-known pressure point. Outside estimates suggest Anthropic spends roughly $5,000 in compute to serve each $200/month plan. Vera's token-per-dollar economics directly address that.

OpenAI — As the company scales toward its anticipated IPO, inference efficiency becomes a line item that determines profitability.

SpaceX — Perhaps the most unexpected customer. SpaceX's AI division is building autonomous mission-planning agents that need low-latency orchestration in edge-constrained environments. Vera's power efficiency (LPDDR5x draws significantly less than DDR5) makes it viable for non-datacenter deployments.

Oracle — Larry Ellison's aggressive cloud AI buildout now includes Vera-based racks, positioning Oracle Cloud as an agentic-native infrastructure provider.

Meta — A multi-year, multi-billion-dollar deal covers Grace CPUs, Blackwell GPUs, and Vera Rubin systems across US data centers, with $135 billion in total AI investments.

• • •

The China Variable

Landing in Taipei on Saturday ahead of Computex, Huang confirmed that his $200 billion CPU TAM forecast includes China. "I would think so," he told reporters at Songshan Airport.

The Chinese market remains complicated. NVIDIA has received US government licenses to sell H200 chips to China, but Chinese officials — fostering domestic suppliers like Huawei — have not approved purchases. Reuters reports that around 10 Chinese firms have been cleared to buy the H200, but not a single delivery has been made.

Meanwhile, DeepSeek made its 75% price cut on the V4-Pro model permanent this weekend, dropping API costs to as low as 0.025 yuan per million tokens. That price cut is enabled by Huawei's Ascend 950 chips. The dynamic is clear: US export controls are creating a parallel compute ecosystem in China, with Huawei as the silicon provider and companies like DeepSeek as the model layer. NVIDIA's Vera may never capture the Chinese market directly. But the global demand it generates — from companies competing with Chinese AI firms — may be just as valuable.

"H200 has been licensed to ship to China. It would be terrific to be able to serve that market. The Chinese market is very important. It's very large, of course." — Jensen Huang, Taipei, May 23, 2026

The Stock Paradox

Here's the strange part. NVIDIA just reported the strongest quarter in semiconductor history. Seventeen analysts raised their price targets. Revenue guidance beat expectations by $3 billion. A new $200 billion market was announced. And the stock... barely moved.

Huang himself called the underperformance "one of the mysteries of the universe" in a CNBC interview.

Analyst	Previous Price Target	New Price Target	Implied Upside
Baird	$300	$500	+132%
Evercore	$352	$413	+92%
Bank of America	$320	$350	+63%
Benchmark	$250	$335	+56%
Wedbush	$300	$330	+53%
Raymond James	$323	$330	+53%
KeyBanc	$300	$310	+44%
Jefferies	$275	$300	+39%
Goldman Sachs	$250	$285	+32%
JPMorgan	$265	$280	+30%
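The implied-upside column is the new target divided by the share price at the time, minus one, so each row can be inverted to recover the price the table assumes. A quick consistency check, using only figures from the table above (the variable names are illustrative):

```python
# (firm, new price target in USD, implied upside as a fraction), from the table above
rows = [
    ("Baird", 500, 1.32),
    ("Evercore", 413, 0.92),
    ("Bank of America", 350, 0.63),
    ("Benchmark", 335, 0.56),
    ("Wedbush", 330, 0.53),
    ("Raymond James", 330, 0.53),
    ("KeyBanc", 310, 0.44),
    ("Jefferies", 300, 0.39),
    ("Goldman Sachs", 285, 0.32),
    ("JPMorgan", 280, 0.30),
]

# upside = target / price - 1  =>  price = target / (1 + upside)
implied_prices = [target / (1 + upside) for _, target, upside in rows]

low, high = min(implied_prices), max(implied_prices)
print(f"implied share price range: ${low:.2f} to ${high:.2f}")
```

Every row lands within about a dollar of $215, so the upside column is internally consistent with a single reference price.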

Former Goldman Sachs executive Michael Parekh offered one explanation: the upcoming IPOs of SpaceX, OpenAI, and Anthropic are "stealing the thunder." Investors have limited capital for mega-cap tech, and three once-in-a-generation IPOs are pulling attention. Sentiment is not the problem: 59 of 62 analysts covering NVDA have Buy ratings, the average price target of $292 implies 33% upside, and retail sentiment on StockTwits has been "extremely bullish" for over a week.

Patrick Moorhead (@patmoorhead) on Threads, May 22, 2026: "Jensen just casually announced NVIDIA is entering a $200B CPU market it has never addressed before. With 88 custom Arm cores, NVLink integration, and every hyperscaler signed up. Intel and AMD should be very worried."

Lisa Su Fan Account (@LisaSuUpdates) on X, May 21, 2026: "NVIDIA Vera CPU first deliveries to Anthropic, OpenAI, SpaceX, Oracle. $20B in year-one revenue. Not included in the $1T Blackwell/Rubin framework. This is genuinely a new business line, not a rebrand."

At a forward P/E of 22.1x, NVIDIA is now the second-cheapest of the Magnificent Seven — cheaper than Apple (33.5x), Alphabet (30.8x), and Amazon (32.1x). For a company growing revenue 85% year-over-year and opening a new $200B market, that's either a screaming buy or a sign that the market doesn't believe the growth can continue. History suggests betting against Jensen is a losing proposition.

• • •

What This Means for Developers

If you're building agentic systems, the Vera announcement has three concrete implications:

1. Inference Gets Cheaper, Faster

NVIDIA says agent sandboxes on Vera run 50% faster than on general-purpose x86 CPUs and that enterprise data queries finish up to 3x quicker. As Vera racks deploy at hyperscalers, the cost per completed agent task should drop. This is the "tokens per dollar" metric Huang keeps emphasizing. For startups running multi-agent systems on cloud infrastructure, the unit economics of agentic AI stand to improve materially.
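Cost per completed task is easy to model once token cost is separated from orchestration cost. The sketch below is a back-of-the-envelope calculator; every input value is a hypothetical placeholder, not NVIDIA or cloud pricing, and `cost_per_task` is an illustrative helper rather than part of any SDK.

```python
def cost_per_task(
    steps: int,
    tokens_per_step: int,
    usd_per_million_tokens: float,
    cpu_seconds_per_step: float,
    usd_per_cpu_hour: float,
) -> float:
    """Return the USD cost of one completed agent task.

    Token cost covers model inference; CPU cost covers orchestration
    (tool calls, memory lookups, state updates) between model calls.
    """
    token_cost = steps * tokens_per_step * usd_per_million_tokens / 1_000_000
    cpu_cost = steps * cpu_seconds_per_step * usd_per_cpu_hour / 3600
    return token_cost + cpu_cost


# Hypothetical workload: 20 steps, 2,000 tokens and 0.5 CPU-seconds per step.
baseline = cost_per_task(20, 2_000, 2.00, 0.5, 0.10)
# Same workload if orchestration ran 50% faster (the figure NVIDIA claims),
# i.e. each step needs 1/1.5 of the CPU time.
faster = cost_per_task(20, 2_000, 2.00, 0.5 / 1.5, 0.10)

print(f"baseline: ${baseline:.6f}  faster CPU: ${faster:.6f}")
```

With these placeholder inputs the token term dominates the total. The function is meant for plugging in your own measured values to see which term matters for your workload.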

2. The NVIDIA Software Lock-In Deepens

Vera integrates with CUDA, NVLink, and the full NVIDIA AI Enterprise stack. If your agent framework is already built on NVIDIA's platform, Vera is a drop-in upgrade. If you've been building vendor-agnostic, the performance gap between NVIDIA's integrated stack and a heterogeneous CPU+GPU setup just widened. The moat is getting deeper.

3. CPU Selection Becomes an Agent Architecture Decision

Until now, the CPU under your cloud instance was an implementation detail. With purpose-built agentic CPUs, the choice of CPU directly affects agent latency, tool-call throughput, and memory bandwidth for RAG workloads. When your cloud provider offers Vera-backed instances, there will be a measurable performance tier difference for agent-heavy applications.
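One way to make that decision with data is to time the orchestration portion of an agent step separately from the model call, then run the same script on each instance type. A minimal sketch using only the Python standard library; `orchestration_step` is a hypothetical stand-in for your own parsing, tool dispatch, and state handling.

```python
import json
import statistics
import time


def orchestration_step(payload: str) -> dict:
    """Hypothetical CPU-side work: parse a tool result and update state."""
    data = json.loads(payload)
    data["total"] = sum(data["values"])
    return data


def time_step(fn, arg, repeats: int = 1000) -> dict:
    """Return median and 99th-percentile wall-clock latency in microseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        samples.append((time.perf_counter() - start) * 1e6)
    samples.sort()
    return {
        "median_us": statistics.median(samples),
        "p99_us": samples[int(0.99 * (len(samples) - 1))],
    }


payload = json.dumps({"values": list(range(1000))})
stats = time_step(orchestration_step, payload)
print(stats)
```

Reporting the 99th percentile alongside the median matters for agents, because a multi-step task is slowed by its worst steps rather than its typical ones.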

The Competitive Landscape

NVIDIA isn't entering an empty room. Intel's Xeon line and AMD's EPYC line are both pivoting toward agentic workloads. Amazon's Graviton chips power a vast portion of AWS inference. Google's Axion CPUs are designed for its own AI services. Apple's M-series silicon dominates edge inference.

But none of these chips were designed from scratch for agentic orchestration. They're general-purpose processors being repurposed. Vera's advantage is specificity: it optimizes for token throughput, not multi-tenant virtualization. For the same reason that GPUs displaced CPUs for training — purpose-built beats general-purpose — Vera aims to displace x86 for agent orchestration.

The risk, of course, is that the agentic workload profile evolves. If agents become more GPU-bound (running local models instead of calling APIs), the CPU orchestration layer shrinks. If cloud providers develop tightly integrated custom silicon (as AWS has with Graviton + Inferentia), the NVLink advantage diminishes. NVIDIA is betting that the orchestration layer grows, not shrinks. So far, the trajectory supports that bet.

The Huang Doctrine

Step back, and a pattern emerges. NVIDIA doesn't just sell chips. It identifies a computing paradigm shift, builds the complete stack for it, and arrives before the market fully understands the demand.

GPUs for gaming became GPUs for deep learning. GPUs for training became GPUs for inference. And now, the company that defined GPU computing is telling the world that the next wave runs on CPUs. Not instead of GPUs — alongside them.

"The world is rebuilding computing for agentic AI and robotic physical AI. Nvidia sits at the center of these transitions." — Jensen Huang, Q1 FY2027 Earnings Call

"The world has a billion users, human users," Huang told investors. "My sense is that the world is going to have billions of agents."

If he's right, then $200 billion is conservative. If he's wrong, NVIDIA still has an $81 billion quarter to fall back on. Either way, the chip industry just got its most consequential new product since Blackwell. And it's not even a GPU.
