How Espresso AI Is Tackling One of Tech’s Biggest Hidden Bills

Matthew Kayser | Contributor

For many companies, the move to the cloud promised a future of flexibility and scalability. But as data needs have grown, so too have the bills. Snowflake, one of the most popular cloud data warehouse platforms, often ranks as a company’s second-largest IT expense after its primary cloud provider. For finance chiefs, the challenge isn’t whether to use data warehouses, but how to use them strategically: Are you optimizing for both performance and cost, or just convenience?

Ben Lerner knows this problem well. A veteran of Google Search and DeepMind, Lerner co-founded Espresso AI to address what he calls the “runaway growth” of cloud costs. “It’s very common for a team to think they’ve budgeted two years of runway with cloud data warehouse platforms, only to realize they’ve burned through the contract in 14 months,” he explains. “Costs can grow faster than the revenue of the business, and teams are often told to cut the bill in half without any extra headcount. That is an almost impossible ask.”

A Different Approach to FinOps

The field of cloud cost management, often called FinOps, has grown quickly as organizations look for ways to control spending. Traditional approaches focus either on financial engineering, such as negotiating discounts for long-term commitments, or on observability tools that track who is running what. Those methods provide visibility, but Lerner argues they leave most of the work to already-stretched engineers.

“Other startups in this space are essentially giving you dashboards and graphs,” he says. “That is valuable, but it also creates more work for the team. Our approach is different. We take the workloads you are already running, and Espresso’s software makes them more efficient. You get the same results, faster, on less hardware.”

Espresso’s software sits between a company’s analytics tools and Snowflake. Every query passes through its system, where it is evaluated, rewritten, and scheduled in real time. The company uses large language models combined with reinforcement learning and formal verification. In practice, that can mean running the same queries more efficiently in less time and with lower compute costs.

An Air Traffic Controller for Data

In addition to optimizing queries, Espresso routes them between different compute clusters to maximize efficiency. Lerner often reaches for analogies to explain what the software does, at one point comparing it to an air traffic controller: “Imagine an airport where every flight had its own dedicated runway all day. It would work, but it would be horribly inefficient. Instead, you have controllers directing planes to land and take off in sequence. That is what our software does with queries.”

Another metaphor is packing a moving truck. Instead of just throwing things into the truck without thinking about how it all fits – which leads to wasted space and more trips than necessary – it’s preferable to pack the truck in a thoughtful manner that maximizes efficiency. Similarly, by managing workloads intelligently, Espresso reduces the “dead space” in how data warehouses allocate compute resources.
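To make the moving-truck analogy concrete, here is a minimal sketch of a classic packing heuristic called first-fit decreasing. It is an illustration only, not Espresso's actual method, which the company has not described at this level of detail. The function name, the workload sizes, and the idea of measuring each query as a percentage of one cluster's capacity are all assumptions made for the example.

```python
from typing import List


def pack_first_fit_decreasing(loads: List[int], capacity: int) -> List[List[int]]:
    """Place each load on the first cluster with room, largest loads first.

    Hypothetical model: each load is a query's compute demand, expressed as
    a percentage of one cluster's capacity.
    """
    clusters: List[List[int]] = []
    for load in sorted(loads, reverse=True):
        if load > capacity:
            raise ValueError(f"load {load} exceeds cluster capacity {capacity}")
        for cluster in clusters:
            # Reuse an existing cluster if the load still fits.
            if sum(cluster) + load <= capacity:
                cluster.append(load)
                break
        else:
            # No existing cluster has room, so open a new one.
            clusters.append([load])
    return clusters


# Made-up query loads, each as a percent of one cluster's capacity.
loads = [50, 70, 30, 50, 20, 80]
packed = pack_first_fit_decreasing(loads, capacity=100)
print(f"dedicated clusters: {len(loads)}, packed clusters: {len(packed)}")
```

In this toy case, six queries that would each occupy a dedicated cluster fit onto three fully used ones. Real warehouse scheduling also has to account for timing, priorities, and changing demand, none of which this sketch models.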

The impact can be immediate. Lerner says some customers see savings of 30 to 50 percent in their first month. Others have cut bills by a factor of five. Espresso’s pricing model is tied directly to results. The company charges a percentage of the savings it delivers, which makes the return on investment transparent.
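The arithmetic behind a savings-based fee is simple enough to sketch. The example below assumes a 40 percent reduction, which falls inside the 30 to 50 percent range Lerner cites, and a purely hypothetical fee rate of 25 percent of savings. The article does not state Espresso's actual percentage, and the function name and dollar figures are invented for illustration.

```python
from typing import Tuple


def savings_breakdown(
    baseline_bill: float, optimized_bill: float, fee_rate: float
) -> Tuple[float, float, float]:
    """Return (gross savings, vendor fee, net savings kept by the customer).

    fee_rate is the share of savings paid to the vendor; the value used
    below is an assumption, not a published figure.
    """
    gross_savings = baseline_bill - optimized_bill
    fee = gross_savings * fee_rate
    return gross_savings, fee, gross_savings - fee


# Hypothetical monthly bill cut from $100,000 to $60,000 (a 40% reduction).
gross, fee, net = savings_breakdown(100_000, 60_000, fee_rate=0.25)
print(f"gross savings: {gross}, fee: {fee}, net savings: {net}")
```

Because the fee is a fraction of the savings, the customer's net savings stay positive whenever the bill actually goes down, which is what makes the return on investment easy to read.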

From Google Lessons to Broader Horizons

Lerner credits his years at Google for shaping the company’s strategy. At Google scale, efficiency was essential. “The lesson is that speed and cost are the same problem,” he says. “If something runs faster, it almost always runs cheaper. That is the mindset we have brought into Espresso.”

The company is starting where the customer pain is sharpest, but Lerner sees a much larger opportunity. He describes Espresso’s long-term goal as building the “brain that runs cloud compute.” Beyond data warehouses, he points to areas like large-scale batch processing, Kubernetes workload scheduling, and GPU inference as natural next steps. All share the same problem of fluctuating demand and expensive capacity.

A Future with Smarter Compute

Cloud computing costs will continue to rise for businesses as even non-engineers use generative AI tools to write code in greater volume than ever before. Lerner believes that without smarter management, enterprises will face spiraling bills and strained infrastructure.

“Humans cannot manually retune cloud systems every second,” he says. “Machines are well-suited to understand how other machines work. That is why this has to be automated.”

For now, Espresso is helping companies get their cloud data warehouse bills under control. But Lerner is clear that the vision goes beyond one platform. “What we are really building is infrastructure that can make compute itself more efficient,” he says. “If we can do that, we do not just save companies money. We make it possible to run things that would otherwise be too costly or too slow.”

In an era where efficiency is as critical as innovation, Espresso’s approach offers a compelling complement to the cloud platforms it helps optimize.
