DeepSeek, a Chinese startup, announced the release of its AI reasoning model, DeepSeek R1, in January. R1 appeared to perform at a level comparable to OpenAI’s o1 reasoning model at a fraction of the cost. Within days, DeepSeek’s chatbot app, built on R1 and the company’s non-reasoning V3 model, became the most downloaded app in the United States and around the world. In response, Nvidia, the manufacturer of the most advanced AI chips, lost nearly $600 billion in market capitalization. U.S. utilities and policymakers began to question whether the projected tsunami-scale increase in energy generation and transmission capacity to fuel massive data centers would still be needed.
Less than six months later, the world of AI has returned to full speed ahead. Meta, Microsoft, Alphabet/Google, and OpenAI have recommitted to major investments in new data centers and entered into partnerships with nuclear and other energy companies to power them. Nvidia’s stock has recovered its losses, and in July the company became the first ever to reach a $4 trillion market value. The U.S. Department of Energy announced a new initiative to locate data centers on public lands, and President Donald Trump, accompanied by a delegation of tech CEOs, recently returned from a tour of three Arab Gulf states, where he announced billions of dollars of new sales by U.S. chipmakers and of foreign investments in U.S. data centers.
For the energy world, the expected surge in new electric power generation required to support the rapid expansion of U.S. data centers over the balance of the decade was back on the table. In April, Bloomberg Intelligence forecast that energy demand for AI would keep growing, by as much as 4x by 2032, even if “efficiency gains like those reported by Ant Group and DeepSeek materialize.” The International Energy Agency (IEA) issued a report in April 2025 estimating that U.S. data center demand for energy will increase by 130% by 2030 compared to the 2024 level. A March 2025 forecast by AI lab Anthropic estimated that 50 gigawatts of new power capacity will be needed by 2027 just to train the latest AI models, and substantially more to operate them. Former Google CEO Eric Schmidt testified before Congress that data centers will require an additional 29 gigawatts of power by 2027 and 67 more gigawatts by 2030.
This enormous projected growth in AI energy demand presents a serious challenge to the financial and technical capacity of the U.S. utility system. It brings into sharp focus the question of who should bear the risk for the energy projects needed to power the expected massive surge in AI over the next decade and beyond. How much of the responsibility should rest on local and regional electric utility systems, which, pursuant to their legal “obligation to serve,” are required to meet the power demand of their current and expected customers? Is it realistic to expect these companies to construct sufficient new capacity to meet that demand, and do they have the financial means to do so? To what extent should some or all of the financial risk be reallocated to the companies building new data centers? And if those companies build new power plants “behind the meter,” should they be obligated to interconnect with, and potentially support, the larger grid?
In a future post, we will examine how to allocate the responsibility and risk of meeting the surge in electricity demand that might be needed to support the growth of mega data centers across the U.S. In this post, we aim to explain why DeepSeek caused such a significant disruption in the AI and energy marketplaces, how these markets so quickly adjusted, and the potential for similar disruptions in the future.
What DeepSeek accomplished
The Chinese AI start-up’s December 2024 technical report showed that its V3 model had capabilities comparable to major U.S. models, but consumed, in its final training run, only a fraction of the graphics processing unit (GPU) hours used to train its American counterparts. The company’s follow-up technical report, released on January 22, 2025, showed that its family of R1 reasoning models derived from V3 matched the capabilities of OpenAI’s o1 reasoning model, unveiled in September 2024.
One way to understand DeepSeek’s training efficiencies is to compare its V3 model, which was released in December 2024, to Meta’s Llama 3.1 model, which was released in July 2024. Both models score similarly on standard benchmarks, but DeepSeek used only 2,048 Nvidia H800 GPUs to train V3—just 13% of the 16,000 Nvidia H100 GPUs Meta used to train Llama 3.1. DeepSeek also trained its V3 model for only 2.788 million GPU hours, less than 10% of the 30.8 million GPU hours that Meta used to train its Llama 3.1 model.
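As a quick check on those ratios, the publicly reported figures from the two technical reports can be compared directly; the short Python snippet below simply reproduces the arithmetic behind the percentages cited above.

```python
# Reproducing the ratios above from the publicly reported training figures.
deepseek_gpus, meta_gpus = 2_048, 16_000              # H800s vs. H100s
deepseek_gpu_hours, meta_gpu_hours = 2.788e6, 30.8e6  # final training runs

print(f"GPU count ratio: {deepseek_gpus / meta_gpus:.0%}")            # ~13%
print(f"GPU-hour ratio:  {deepseek_gpu_hours / meta_gpu_hours:.0%}")  # ~9%
```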
Differences in the size of the model and training data set cannot explain these efficiencies. DeepSeek’s V3 is somewhat larger, with 671 billion parameters compared to Llama 3.1’s 405 billion. Meta used 15 trillion tokens (words or parts of words) to train Llama 3.1, and DeepSeek used 14.8 trillion tokens for V3.
Nor did DeepSeek use more efficient chips. In fact, using H800 GPUs instead of H100s should have required more compute time, not less: the communication pathways between GPUs on the H800 are substantially restricted compared to the H100, a limitation Nvidia deliberately imposed to slow the chips down enough to comply with U.S. export controls.
DeepSeek achieved its efficiencies through several technical improvements described more fully by Epoch AI and Interconnects:
Using a very efficient mixture-of-experts (MoE) design, which allowed it to activate only 37 billion of its 671 billion parameters for each token during training (a minimal illustration of this routing appears after this list);
Calculating model weights in FP8 format (an 8-bit floating-point representation) instead of the more precise 16-bit formats, such as BF16, typically used in training AI models;
Using multi-head latent attention (MLA) in place of the traditional multi-head attention (MHA) architecture, which reduced the memory needed for attention and thereby sped up inference; and
Predicting two tokens at a time rather than the traditional one, thereby saving compute at inference time.
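To make the mixture-of-experts idea concrete, here is a minimal, hypothetical sketch in PyTorch. It is not DeepSeek’s architecture or code, and the dimensions and expert counts are arbitrary toy values; it only illustrates the routing principle: a small learned “gate” sends each token to just k of n expert feed-forward blocks, so most of the layer’s parameters sit idle on any given forward pass.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Toy top-k mixture-of-experts layer (illustrative only, not DeepSeek's code):
    each token is routed to only k of n_experts feed-forward 'experts',
    so most parameters are untouched for any given token."""
    def __init__(self, d_model=64, d_ff=128, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts, bias=False)   # learned router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                  # x: (n_tokens, d_model)
        weights, idx = self.gate(x).topk(self.k, dim=-1)   # choose k experts per token
        weights = F.softmax(weights, dim=-1)               # normalize the chosen scores
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.k):
                mask = idx[:, slot] == e                   # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 64)        # 16 dummy token embeddings
print(TinyMoE()(tokens).shape)      # torch.Size([16, 64]); only 2 of 8 experts ran per token
```

DeepSeek’s production design is far more elaborate than this toy version, but the reason only 37 billion of 671 billion parameters are active per token is this kind of sparse routing.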
While these efficiencies were achieved in training, they also allow DeepSeek’s models to operate more efficiently at inference time, when users provide input and the models generate output. For instance, the mixture-of-experts design means that instead of activating all 671 billion parameters to answer a query or prompt, DeepSeek models need activate only 37 billion.
These efficiencies are reflected in the prices DeepSeek offers users. Its cheapest input price is $0.07 per million tokens for its V3 chat model and $0.27 for its R1 reasoning model. Its output price is $1.10 per million tokens for V3 and $2.19 per million tokens for R1. It also offers a 50% price reduction for use at off-peak times. These prices are much lower than OpenAI’s current prices. For instance, GPT-4.1’s lowest input price is $0.50 per million tokens, roughly seven times the $0.07 DeepSeek charges for its comparable V3 model, and its output price is $8.00 per million tokens, over seven times the output price of DeepSeek’s V3.
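To see what those list prices mean in practice, the sketch below prices a hypothetical workload. The token counts are illustrative assumptions, and the per-million-token rates are simply the figures quoted above.

```python
# Illustrative cost comparison; prices are USD per million tokens, as quoted above.
prices = {
    "DeepSeek V3": {"input": 0.07, "output": 1.10},
    "DeepSeek R1": {"input": 0.27, "output": 2.19},
    "GPT-4.1":     {"input": 0.50, "output": 8.00},
}

input_m, output_m = 2.0, 0.5   # hypothetical workload: 2M input tokens, 0.5M output tokens
for model, p in prices.items():
    cost = input_m * p["input"] + output_m * p["output"]
    print(f"{model}: ${cost:.2f}")
# e.g., DeepSeek V3 comes to about $0.69 vs. roughly $5.00 for GPT-4.1 on the same workload
```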
DeepSeek’s innovations will inevitably translate into significant energy savings per token generated at both training and inference time. Although the energy needed to train and use a particular AI model is hard to calculate precisely from the outside, energy use is generally a function of the compute hours spent on training or use, the number of processors used, and the average power drawn per processor. Thus, the dramatically lower GPU-hour and chip counts used to train the DeepSeek models are also likely to be rough relative measures of the electric energy required.
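For a sense of scale, a back-of-envelope estimate along those lines can be sketched from DeepSeek’s reported GPU-hours. The average power draw per GPU and the facility overhead (PUE) below are assumptions for illustration, not reported figures.

```python
# Rough training-energy estimate: GPU-hours x average power per GPU x facility overhead.
gpu_hours = 2.788e6   # DeepSeek V3's reported final training run (H800 GPU-hours)
avg_gpu_kw = 0.35     # assumed average draw per H800 (~350 W); actual utilization varies
pue = 1.2             # assumed power usage effectiveness (cooling, networking, etc.)

energy_mwh = gpu_hours * avg_gpu_kw * pue / 1_000
print(f"~{energy_mwh:,.0f} MWh")   # roughly 1,170 MWh for the final run under these assumptions
```

An estimate like this covers only the final training run; it says nothing about experiments, failed runs, or inference at scale, which is one reason outside energy estimates vary so widely.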
The reaction to DeepSeek’s innovations
Despite some initial attempts to dismiss DeepSeek’s technical achievements as illusory or fabricated, members of the AI industry rapidly recognized them as real. But this did not change their determined rush to develop additional sources of energy for AI training and use.
One reason for this stay-the-course reaction is that efficiencies such as those achieved by DeepSeek have been a common feature of AI development over the last decade or so, in a manner reminiscent of Moore’s Law. One estimate suggested that inference-time compute efficiency, for instance, increased by three orders of magnitude in the two years leading up to July 2024. One study of computer vision estimated that “compute-augmenting innovations halve compute requirements every nine months.” Another study of large language models (LLMs) from 2013-2023 found that “the compute required to reach a set performance threshold has halved approximately every 8 months.” If the trend in AI is toward regular decreases in unit computing costs, then, as Anthropic CEO Dario Amodei put it, DeepSeek’s improvements are “on trend” at best.
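The cumulative effect of those halving rates is easy to understate. The snippet below simply compounds the cited eight-month halving estimate over a few arbitrary time horizons.

```python
# Fraction of the original compute needed to reach a fixed capability level,
# if requirements halve every `halving_months` (the LLM estimate cited above: ~8 months).
def compute_fraction(months: float, halving_months: float = 8.0) -> float:
    return 0.5 ** (months / halving_months)

for years in (1, 2, 5):
    print(f"{years} yr: {compute_fraction(12 * years):.3f}x of the original compute")
# 1 yr: ~0.354x   2 yr: ~0.125x   5 yr: ~0.006x (roughly a 180-fold reduction)
```

Against a curve like that, a single lab reporting an unusually efficient training run looks less like a break in the trend than a point on it.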
The AI industry has always used these computational efficiency gains to fuel improvements in the performance of AI models. Given the well-known imperfections of today’s AI models, this drive to improve will necessarily continue. Hallucinations are only one manifestation of the unreliability that makes today’s LLMs of limited utility for high-stakes applications in medicine, law, warfare, and other domains. To increase the usage of AI-based services among professionals and enterprises, the industry must improve its models. That means deploying additional compute resources above today’s level, which in turn means the AI industry’s demand for energy will continue to grow.
Another reason for the industry’s unwavering commitment to more energy is that the lower prices DeepSeek is forcing on the industry will stimulate demand for AI models. A lower price for today’s admittedly imperfect AI services is highly likely to draw in so many additional customers that overall AI compute time and energy usage will increase. Even critics of the AI industry’s “irrational exuberance” about energy demand for AI acknowledge that “in the long run, making AI cheaper will likely spur wider adoption and use, and hence increase power requirements.”
A little skepticism
The key to the demand for AI energy is the demand for AI services. But what if AI service quality does not increase enough to generate the demand to cover development and operating costs?
One of the big stories in AI from 2024 was that scaling up parameters, training data, and compute for model training seemed to have hit a wall: increasing training-time compute no longer significantly improves the accuracy, reliability, or capabilities of the latest AI models. The AI industry thought it had found a new scaling law positing that increasing post-training compute could improve models more substantially. But these improvements in chain-of-thought reasoning models appear to deliver better performance only in limited areas such as coding, math, and logic, where exact answers can be known in advance. In August 2025, after two years of development, OpenAI released its GPT-5 model, but early users reported only “modest” and “evolutionary” improvements compared to its earlier models and those available from its nearest frontier AI lab rivals.
Given these difficulties in achieving substantial AI product improvement, the AI industry has a long way to go to create a revenue stream that would support its outsized development costs. In June 2024, the venture capital firm Sequoia Capital estimated that the industry would need $600 billion in annual AI revenue just to justify its spending on Nvidia chips and the data centers built around them. It is hard to imagine that coming from individual consumers: would a billion consumers pay $600 each, or 100 million consumers pay $6,000 a year? Today, most of ChatGPT’s 400 million weekly active users around the world pay nothing, as do its 67.5 million U.S. users. Professionals and enterprises will have to provide the demand, and they will do so only if they can find $600 billion in productivity gains from using these tools.
By June 2025, revenue at the big AI labs had grown significantly, but it was still not commensurate with their capital expenditures. Richard Waters, the tech industry editor of The Financial Times, noted that “the chasm between capital spending and revenue has shown little sign of narrowing.” In the second quarter of 2025, the tech titans posted stellar earnings reports, which pushed Microsoft’s market value above $4 trillion; however, as Waters pointed out, “not much [of that performance] can be directly attributed to generative AI. Nor is it entirely clear how they will ultimately justify the massive increase in capital spending to build and deliver the technology.” Even after the impressive earnings reports, The Economist observed that the tech giants “are spending big, but many other companies are growing frustrated” and welcomed the industry to the “AI trough of disillusionment.”
A little skepticism about market demand for AI products and services seems to be in order. As a result, the consensus projections of energy demand for AI deserve similar skepticism.
The regulatory risks
The AI industry will continue to approach utilities and policymakers with requests for substantial increases in energy production. Utilities will have to be prepared to respond, which means contemplating billions of dollars of new investment in power plants of enormous capacity, measured in gigawatts. Absent policies and tariff structures that isolate the costs of data center energy development, this new demand is likely, all by itself, to force up rates for ordinary consumers. The regulatory problem is worse than that, however. Utilities also have to be prepared for the risk that today’s demand projections from AI companies are vastly overblown. If that turns out to be true, utilities could be saddled with billions of dollars of stranded investment that can be recovered only from ordinary ratepayers. The urgent challenge is to devise a regulatory strategy that can accommodate the energy wishes of the AI industry without putting the costs and risks on other energy customers.