The enterprise data stack wasn’t designed for continuous, autonomous agentic AI. For years, the challenge was storing and organizing information. Now the challenge is delivering that data — consistently, globally and in real time — to systems that reason and act without pause. Most infrastructure was built for batch analytics and discrete workflows, not always-on execution.

That mismatch is reshaping the stack. The constraint isn’t model quality or GPU supply. Rather, it’s the fragmented architecture underneath — metadata catalogs disconnected from execution platforms, streaming systems layered onto storage tiers and orchestration frameworks stretched beyond their original intent. For enterprises pushing AI into production, that fragmentation becomes the limiting factor, according to Rob Strechay, managing director and principal analyst at theCUBE Research.

“Pipelines can’t be fragile and glued together anymore; they have to be part of the platform itself,” he said. “The desired outcome is clear: faster time to value for developers, fewer operational seams for platform teams and data pipelines that can scale AI responsibly. Otherwise it’s more tools, more copies of data and more risk if it’s not addressed properly.”

These themes will be examined in depth on Feb. 25 at Vast Forward, where theCUBE, SiliconANGLE Media’s livestreaming studio, will explore how enterprises are collapsing fragmented data stacks into unified platforms built for continuous AI execution. Industry execs from Vast Data, Nvidia, Pixar, Solidigm and Cisco Systems will weigh in on the architectural shifts required to sustain always-on, production-grade AI at scale.

This feature is part of SiliconANGLE Media’s exploration of the architectural shifts powering continuous, production-grade AI. (* Disclosure below.)

When tools aren’t the answer

The problem most enterprises face with Kubernetes-based data pipelines isn’t a tool shortage — it’s a coordination failure hidden beneath the sprawl. Findings from theCUBE Research’s Future of Data Platforms Summit revealed that 51.8% of organizations are running four to six data platform vendors, with another 30.8% operating seven or more. Despite that density, more than a third still report persistent data silos — evidence that for many organizations, integration has remained conceptual rather than operational.

“Obviously, the logging and the telemetry, that’s important to understand whether you need to scale a pipeline up,” said Andy Pernsteiner, field chief technology officer at Vast, who recently spoke to theCUBE about data pipeline challenges. “But then on the metadata side, a lot of times the metadata about the data that’s being analyzed and having derivatives built upon, it’s almost just as important or more important than the data itself.”

That metadata gap is where pipelines most often fail to scale. Traditional architectures treat metadata as a descriptive afterthought — something crawled, cataloged and reconciled after the fact — rather than as an operational requirement baked into the platform itself. The Summit numbers bear this out: 51.6% of organizations flag data quality and metadata issues as persistent blockers, even as 88% claim strong governance practices.
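
To make the distinction concrete, here is a minimal sketch in Python of what “baked in” versus “after the fact” metadata looks like: one path records a catalog entry in the same operation as the write, while the other relies on a crawler that can lag behind the data. The function names and catalog structure are illustrative assumptions, not any vendor’s API.

```python
# Minimal sketch of the contrast drawn above, under assumed toy APIs:
# metadata captured inline on the write path versus reconciled by a
# crawler after the fact. Purely illustrative; not any vendor's API.
import hashlib
import time

catalog: dict[str, dict] = {}

def write_with_metadata(path: str, payload: bytes) -> None:
    # "Baked in": the metadata record is created in the same operation
    # as the write, so the catalog is never behind the data.
    catalog[path] = {
        "size": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "written_at": time.time(),
    }

def crawl_later(paths_and_payloads: dict[str, bytes]) -> None:
    # "Afterthought": a periodic crawler rebuilds the catalog, and any
    # pipeline that runs between write and crawl sees stale metadata.
    for path, payload in paths_and_payloads.items():
        catalog.setdefault(path, {})["size"] = len(payload)

write_with_metadata("datasets/run-1.parquet", b"\x00" * 1024)
print(catalog["datasets/run-1.parquet"]["size"])  # 1024, available immediately
```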

“Tying all these pieces together, if you think about telemetry, audit logs, all of those other bits and pieces, most of the time that’s up to the operations team to understand and manage it,” Pernsteiner said. “But once we released the data engine in our platform … end user developers … need access to that telemetry and logging as well because they’re iterating. The way that we’ve built the data engine … not only are the high-level metrics available to the administrators and the developers, but also the low-level tracing to understand what part of a pipeline is taking the longest, where it’s getting stuck.”
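
Pernsteiner’s point about low-level tracing can be sketched in a few lines. The snippet below is a toy illustration, not Vast’s data engine: it times each stage of a simple pipeline and reports which one is taking the longest, the kind of signal a developer needs while iterating. All names here are hypothetical.

```python
# Toy sketch of per-stage pipeline tracing: high-level metrics for
# administrators plus per-stage timings that tell a developer where a
# pipeline is stuck. Hypothetical code, not Vast's data engine API.
import time
from contextlib import contextmanager

timings: dict[str, float] = {}

@contextmanager
def trace(stage: str):
    """Record wall-clock duration for one pipeline stage."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start

def run_pipeline(records: list[str]) -> list[str]:
    with trace("extract"):
        raw = [r.strip() for r in records]
    with trace("transform"):
        out = [r.upper() for r in raw]
    with trace("load"):
        time.sleep(0.01)  # stand-in for a write to storage
    return out

if __name__ == "__main__":
    run_pipeline(["  a ", " b"])
    for stage, secs in timings.items():
        print(f"{stage}: {secs * 1000:.2f} ms")
    print("slowest stage:", max(timings, key=timings.get))
```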

The developer experience imperative runs through the data platform layer in ways that fragmented tooling hasn’t been able to address. The Summit survey found that 93% of organizations plan to increase investment in management tools, and 65.4% identify scaling AI as their top challenge, not model availability or frameworks. For development and platform engineering teams working inside Kubernetes environments, those numbers point to a common culprit: the operational overhead generated by systems that weren’t designed to work together, according to Strechay.

“The next phase of Kubernetes-native data platforms will not be defined by more services, but by fewer seams,” Strechay said.

One platform, no seams

Vast’s answer to pipeline fragmentation isn’t another layer of abstraction — it’s a collapse of the stack itself. Over the past several years, the company has built five software domains atop its disaggregated, shared-everything storage core. Together, they form what Vast calls an AI Operating System — a single software fabric spanning on-premises clusters, public cloud zones and edge footprints, according to Alon Horev, co-founder and chief technology officer of Vast.

“When people say cloud-native, it’s really the ability for infrastructure providers to build consistency across different venues and locations where processing takes place,” he recently told theCUBE. “What we’ve done over the last few years is to bring compute capabilities into that data.” 

Vast “has engineered its architecture for the agentic era by combining integrated vector indexing, real-time event triggers and a low-code agent engine directly on top of its disaggregated flash fabric,” according to Dave Vellante, co-CEO of SiliconANGLE Media and co-founder of theCUBE Research, in a recent Special Breaking Analysis. “Vast’s vision has resonated with customers, and I’m interested to see what they announce as the next chapter in their products and ecosystem relationships.”

Vast counts several Fortune 100 companies among its customers, including Nvidia, Google Cloud, Microsoft and Cisco. The company also recently signed a $1.17 billion commercial agreement with CoreWeave. For enterprises that want to move AI out of the lab and into continuous production, the breadth of that ecosystem signals that the architecture is being validated operationally, not just claimed, according to Jeff Denworth, co-founder of Vast.

“Now, data systems become an integral part of my compute framework,” he told theCUBE. “Our belief is that the world is about to embark on one of the largest technology refresh events in history now that people realize that they need to uplevel their data infrastructure to feed these new agentic systems.”

For developers and platform engineers living inside these agentic AI environments, the practical proof of that argument isn’t architectural elegance — it’s whether the system makes their jobs simpler when things go wrong. A platform that unifies storage, streaming, metadata and compute isn’t just easier to build on; it’s easier to debug, scale and recover when a pipeline stalls or a workload suddenly demands more than anticipated, according to Pernsteiner.

“Obviously, there’s complexity,” he said. “When you have lots of different pieces talking to each other, if anything is stuck or is broken, it’s going to impact the rest of the pipeline. Having a single way of unifying everything makes it simpler from a debuggability and then even from a scaling standpoint, because … if you have a platform that can hold everything and you can scale it as you need to, it’s not just that you have to scale one thing or another. You can scale everything in concert if you choose to do so.”

Flash for the agentic AI era

Vast’s architectural ambitions rest on a hardware foundation that makes the economics of all-flash for all data real. The company’s partnership with Solidigm — built around Solidigm’s high-density quad-level cell NVMe SSDs — was designed from the start to eliminate the artificial storage tiering that forced organizations to choose between performance and capacity. By building on Solidigm’s QLC technology, Vast created a platform where archive-grade data sits on the same flash fabric as production workloads, accessible at NVMe latency for both analytics and AI.

“Vast is attempting one of the bolder moves we have seen in AI infrastructure, moving from a high-performance AI storage vendor to an operating-system provider and positioning itself as the control plane for distributed, agentic computing,” Vellante said. “Vast management is executing. It’s a rare example of a startup that is profitable but also growing fast. The total addressable market that Vast is going after is much larger — perhaps 3-5 times — than the legacy storage business.”

The Vast-Solidigm partnership has since produced more than a hardware foundation. A joint whitepaper from the two companies found that all-flash architectures deliver a 58.9% lower total cost of ownership than traditional hard-disk-drive tiers, challenging long-held assumptions about the need for tiered storage.

That hardware foundation is no longer a differentiator in isolation. It’s table stakes for what agentic AI infrastructure now demands, according to Avi Shetty, senior director of AI enablement and partnerships at Solidigm.

“Over the last year, we’ve seen storage kind of put itself in its place where you’ve seen usages and certain solutions which have exposed the need for having high-performing, reliable, scalable, high-dense storage solutions,” he told theCUBE. “Two trends have emerged. When it comes to storage, you need low-latency, fast super-performing storage. On the other end, you need high-dense cost-effective storage solutions.”

As vector databases scale into the billions and trillions of vectors, Vast’s response is a design principle: Everything on the platform — vectors, metadata, storage — should be capable of scaling to the same order of magnitude as the system itself, with no refactoring required as pipelines grow, according to Pernsteiner.

“We look at the rest of our data and metadata structures to make sure they can scale as well,” he said. “Billions and trillions of vectors in a single store instead of having to split and shard it across many different table spaces is also a design that we’ve implemented … no matter what happens with the AI pipelines that are being developed, they have a place that they can grow, even exponentially, without having to go and refactor everything.” 
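
A toy sketch shows what that principle means for application code: the developer sees one logical vector store and never routes queries across shards or table spaces, so growth doesn’t force a refactor. The class and brute-force search below are illustrative assumptions standing in for a real billion-scale index.

```python
# Toy illustration of the design principle described above: the app
# sees one logical vector store and never shards queries itself.
# Hypothetical API; brute-force search stands in for a real index.
import numpy as np

class VectorStore:
    """One logical store; growth is the store's problem, not the app's."""
    def __init__(self, dim: int):
        self.dim = dim
        self.vectors = np.empty((0, dim), dtype=np.float32)

    def add(self, batch: np.ndarray) -> None:
        # The app just appends; no re-sharding as the store grows.
        self.vectors = np.vstack([self.vectors, batch.astype(np.float32)])

    def nearest(self, query: np.ndarray, k: int = 5) -> np.ndarray:
        # Exhaustive L2 search; a production index would approximate this.
        dists = np.linalg.norm(self.vectors - query, axis=1)
        return np.argsort(dists)[:k]

store = VectorStore(dim=128)
store.add(np.random.rand(10_000, 128))
print(store.nearest(np.random.rand(128), k=3))
```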

That architecture is now intersecting with the evolution of the graphics processing unit ecosystem. Nvidia’s Vera Rubin system, unveiled at CES 2026, signals a fundamental rethinking of where context lives. And Vast has been working directly with Nvidia to re-architect the key-value (KV) cache layer for a world of longer reasoning cycles and multi-turn inference, according to John Mao, vice president of global business development at Vast.

“KV cache used to be very local to the GPU and the high bandwidth memory,” he told theCUBE. “But, obviously, that’s not good enough if you’re trying to store very long conversations. If you’re trying to grow that context over time, you need a different method. A lot of that development that Vast has been doing with Nvidia is in how do we build and re-architect that part of the stack for these new systems that are going into deployment?”
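
The idea Mao describes can be approximated with a simple two-tier cache: KV blocks that no longer fit in GPU high-bandwidth memory spill to an external tier and are pulled back on demand rather than recomputed. The sketch below is a deliberately simplified toy under those assumptions, not the actual Vast/Nvidia implementation.

```python
# Highly simplified sketch of tiered KV caching: blocks evicted from
# the hot tier (a stand-in for GPU HBM) spill to an external tier
# (a stand-in for shared flash) and are restored on the next lookup.
# Hypothetical toy code, not the actual Vast/Nvidia implementation.
from collections import OrderedDict

class TieredKVCache:
    def __init__(self, hbm_capacity: int):
        self.hbm: OrderedDict[str, bytes] = OrderedDict()  # hot tier
        self.external: dict[str, bytes] = {}               # capacity tier
        self.capacity = hbm_capacity

    def put(self, block_id: str, kv_block: bytes) -> None:
        self.hbm[block_id] = kv_block
        self.hbm.move_to_end(block_id)
        while len(self.hbm) > self.capacity:
            # Evict the least-recently-used block to the external tier
            # instead of discarding it and recomputing it next turn.
            old_id, old_block = self.hbm.popitem(last=False)
            self.external[old_id] = old_block

    def get(self, block_id: str) -> bytes | None:
        if block_id in self.hbm:
            self.hbm.move_to_end(block_id)
            return self.hbm[block_id]
        if block_id in self.external:
            # Reloading cached context is cheaper than re-running prefill.
            block = self.external.pop(block_id)
            self.put(block_id, block)
            return block
        return None  # true miss: the model must recompute this context

cache = TieredKVCache(hbm_capacity=2)
for turn in range(4):
    cache.put(f"turn-{turn}", b"kv-bytes")
print(cache.get("turn-0") is not None)  # True: restored from external tier
```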

(* Disclosure: TheCUBE is a media partner for Vast Forward. Sponsors of theCUBE’s coverage do not have editorial control over content on theCUBE or SiliconANGLE.)

Image: ChatGPT/SiliconANGLE
