Nvidia's headquarters in Santa Clara, Calif.

OK, I haven’t done this in a while; no excuse other than laziness. But here are ten concrete, defensible predictions for AI in 2026, with a bias toward things that materially matter for infra, enterprises, and policy.

1. Agentic AI moves from demos to staffed “digital teams”

We’ve all heard the cry: “The Agents are coming! The Agents are coming!” This will be the largest theme in AI next year. By late 2026, many large enterprises will have at least one production agentic workflow (not just a POC) handling end‑to‑end tasks in support, finance, or operations.

Analyst estimates put the autonomous AI agent market in the mid‑single‑digit billions of dollars in 2026, growing several‑fold by 2030, so this shift will be visible in both expense and capital budgets.

2. Inference dominates AI capex and reshapes hardware mix

OK, this one is a no-brainer; I’m trying to up my batting average here. Many estimates predict that by 2026 roughly two‑thirds of AI compute cycles will be spent on inference rather than training, and most of that will still sit in data centers and enterprise servers rather than at the edge. Consequently, specialized in‑house inference silicon will grow shipments much faster than GPUs, even if GPUs remain larger in absolute dollars.

Nvidia is not about to be slain by ASICs, however. The Rubin CPX demonstrates that Nvidia is ready for the memory‑hungry future of agentic AI inference. Google’s TPUv7 will also be a serious contender, as will the Qualcomm AI200/250, whose massive memory capacity and lower cost will be compelling. AMD will have to await Helios to play, and beyond price its differentiation remains unclear.

3. Domain‑specific and smaller models beat general‑purpose LLMs in the enterprise

This one is also a no-brainer. Almost everyone agrees with this.

More than half of serious enterprise deployments will standardize on domain‑specific or fine‑tuned models (finance, healthcare, legal, telco) rather than a single general LLM, driven by accuracy, governance, and especially ROI.

4. CSP and hyperscaler ASICs, especially Google’s, start to bite into Nvidia

The cloud‑native accelerators from hyperscalers (TPU, Trainium/Inferentia, MTIA, etc.), plus Chinese incumbents, will post materially (3X?) higher growth rates than merchant GPUs in 2026, driven by cost‑per‑token and ecosystem control. We certainly seem to be heading toward a structurally more competitive accelerator landscape. But I suspect Nvidia will remain supply‑constrained, not demand‑constrained, through at least 2026. And Rubin Ultra could extend this enviable state through at least 2027.

5. Data, especially synthetic data, becomes as important as model architecture in AI

With credible forecasts that high‑quality public web data for training will be largely tapped out around 2026, leading labs and enterprises will invest aggressively in synthetic data pipelines and private data curation.
Benchmarks and R&D narratives will increasingly differentiate on data generation, filtering, and feedback loops rather than just model size.

6. We will see the first AI “World Models” appear

“World models” here means models that build a structured internal representation of 3D/physical environments and their dynamics, often for simulation, robotics, or rich video, and that are trained on multi‑modal data rather than just internet text. World Labs, Google, and Meta will benefit. The Qualcomm AI200/250 could be a real surprise here as well.

7. Identity theft and AI deepfakes will scare the ^&*! out of a lot of us

High‑quality, real‑time deepfakes and AI‑assisted fraud (“CEO doppelgängers,” voice clones, synthetic documents) will force enterprises to treat identity and content authenticity as first‑class security domains. This will certainly become a major issue in the upcoming midterm elections.

8. Consequently, AI regulation shifts from principle to enforcement and liability

In 2026, the regulatory conversation will move from broad frameworks to concrete enforcement: incident reporting, fines, and mandatory controls for high‑risk AI uses in finance, healthcare, employment, and critical infra.

Procurement and compliance teams will start to demand detailed model documentation, data lineage, and risk controls, affecting which vendors can sell into large accounts.

9. AI hardware roadmaps double down on memory and interconnect, not just FLOPs

Next‑gen accelerators (GPUs, NPUs, CSP ASICs) launching in 2026 will emphasize HBM4, advanced packaging, and proprietary interconnect fabrics (NVLink‑class, vendor‑specific protocols), as the locus of competition moves from raw TOPS to system‑level throughput and scale‑out efficiency.

10. AI productivity gains show up unevenly, widening the “AI gap” between firms

Most organizations will report using generative AI somewhere, but only a small minority will have it fully scaled across workflows with measurable ROI, creating a widening performance gap between AI‑mature and AI‑experimental firms.

Any Black Swans?

Of course there will be surprises; artificial general intelligence (AGI) will not be one of them. But that is what keeps this market so interesting!

Note that Nvidia and Qualcomm are clients of Cambrian-AI.