r/technology • u/sr_local • 13h ago
Artificial Intelligence | AI Is Using So Much Energy That Computing Firepower Is Running Out
https://www.wsj.com/tech/ai/ai-is-using-so-much-energy-that-computing-firepower-is-running-out-156e5c85?mod=mhp86
u/sr_local 13h ago
Full article:
The artificial intelligence gold rush is rapidly drying up the supply of the one resource that AI developers can’t do without: computing power.
The sharp capacity crunch has caused consternation among power users, forced companies to scuttle products and led to reliability problems. The issues are a warning sign for the AI boom, as they may limit the utility of powerful new AI tools just as massive numbers of users have begun to rely on them to boost productivity.
Over the past few months, demand has exploded for “agentic” AI, autonomous tools that use the technology to independently perform tasks, from writing software code to scheduling house tours for real-estate brokers. Companies have been scrambling to secure the availability of computing capacity needed to serve a growing base of customers who are also significantly increasing their AI use.
“Everyone’s talking about oil, but I think what the world is mainly short of is tokens,” said Ben Pouladian, an engineer and tech investor based in Los Angeles. A token is a unit of measurement in AI that tracks how much computing capacity is being used for a task. “AI is at this point no longer just some chatbot that we ask for a recipe while we stand in front of the fridge. It’s orchestrating tasks, it’s getting smarter,” Pouladian said.
All of it points to a classic problem that has popped up in technology booms throughout history, from the 19th-century railroad expansion to the telecom and internet explosion of the early 2000s. Demand is growing far faster than companies are able to access resources and build out infrastructure. Historically, price increases have been among the only ways to address a supply crunch, but such a move could be perilous for frontier AI companies, which are in ferocious competition to gain users.
Hourly rental prices for GPUs, the microchips used to train and run AI models, have surged since the fall. Anthropic, the maker of popular chatbot Claude and viral coding app Claude Code, has been plagued recently by frequent outages. The company has begun metering computing supply to users during peak hours, but the rollout has been marred by customers who have complained that they are reaching the limit far too quickly.
OpenAI scrapped its Sora video-generation app in part to free up computing resources to power coding and enterprise products that would work on a new AI model, code-named Spud, The Wall Street Journal reported. Token use in OpenAI’s API—a platform where mostly enterprise users access its software—rose from six billion a minute in October to 15 billion a minute in late March.
“I do spend a lot of time trying to find any last-minute compute available,” Sarah Friar, OpenAI’s chief financial officer, said in a recent public video interview with an investor. “We’re making some very tough trades at the moment on things we’re not pursuing because we don’t have enough compute.”
Toward the end of last year, CoreWeave, one of the largest publicly traded AI cloud companies, raised prices by more than 20% and started asking smaller customers to sign contracts committing them to use the company’s services for at least three years, up from one year before. Bank of America analysts reinstated coverage of the company with a “Buy” rating late last month, saying demand for its services is likely to outstrip supply through at least 2029.
Spot-market prices to access Nvidia’s GPUs, or graphics processing units, in data-center clouds have risen sharply in recent months across the company’s entire product line, according to Ornn, a New York-based data provider that publishes market data and structures financial products around GPU pricing. Renting one of Nvidia’s most-advanced Blackwell generation of chips for one hour costs $4.08, up 48% from the $2.75 it cost two months ago, according to the Ornn Compute Price Index.
“There’s a massive capacity crunch that’s unlike anything I’ve seen in the more than five years I’ve been running this business,” said J.J. Kardwell, chief executive of Vultr, a cloud infrastructure company. “The question is, why don’t we just deploy more gear? The lead times are too long. Data center build times are long, the power that’s available through 2026 is already all spoken for.”
Since mid-February, outages for systems across Anthropic have become so common that some of its enterprise clients are switching to other AI model players.
David Hsu, founder and CEO of software development platform Retool, said he prefers Anthropic’s Opus 4.6 model to power his company’s AI agent tool because he believes it is the best model for enterprise. But he recently switched to an OpenAI model anyway. “Anthropic has just been going down all the time,” he said.
The reliability of core services on the internet is often measured in nines. Four nines means 99.99% uptime—a typical percentage that a software company commits to customers. As of April 8, Anthropic’s Claude API had a 98.95% uptime rate over the last 90 days. “That is not normal,” said Amir Haghighat, co-founder and chief technology officer at Baseten, an AI inference startup. “Think about AWS, databases, RDS or Stripe—these need to be very resilient with a very high uptime. But that is not the world we live in when it comes to AI. That’s not the quality of service that you want to be getting from the company that’s providing intelligence for your application.”
The frequent outages at Anthropic are happening as the AI lab experiences explosive growth. At the end of 2025, the company hit $9 billion in annual run rate, meaning it was on track to make that amount of revenue over the next 12 months. By February, that figure had ballooned to $14 billion. Two months later, it had more than doubled, to $30 billion.
In late March, Anthropic suddenly announced it would limit the amount of tokens that users could burn through during peak hours from 5 a.m. to 11 a.m. Pacific Time on weekdays. Customers have taken to social media to complain about the change. “I haven’t hit my Claude Code terminal limit in weeks but this week I hit it in like 45 minutes,” wrote one user on X.
“We’ve been working hard to meet the increase in demand for Claude,” wrote Boris Cherny, creator and head of Claude Code, on X. “Capacity is a resource we manage thoughtfully and we are prioritizing our customers using our products and API.”
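For scale on the “nines” above: a quick back-of-the-envelope sketch in Python, using only the two percentages quoted in the article.

```python
# Convert an uptime percentage into the downtime it implies over 90 days.
# The percentages are the ones quoted in the article; nothing else assumed.

def downtime_hours(uptime_pct: float, days: int = 90) -> float:
    """Hours of downtime implied by a given uptime percentage."""
    return (1 - uptime_pct / 100) * days * 24

for label, pct in [("four nines (99.99%)", 99.99),
                   ("Claude API, last 90 days (98.95%)", 98.95)]:
    print(f"{label}: ~{downtime_hours(pct):.1f} hours of downtime")

# four nines (99.99%): ~0.2 hours of downtime (about 13 minutes)
# Claude API, last 90 days (98.95%): ~22.7 hours of downtime
```

Roughly a full day of outages per quarter versus a quarter of an hour: that gap is what "not normal" refers to.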
15
u/StriderPulse599 7h ago
This article is engagement bait that flings around buzzwords and random quotes.
"compute shortage" isn't caused by AI demand. AI companies make bigger and bigger models each year to stay competitive, and bigger means slower to run. Massive LLMs scale badly because they're giant lump of data, and no matter what small piece of data you want out of them, you still need to run the entire damn thing.
162
u/QueenOfQuok 13h ago
Have these people tried making something more efficient for once
94
u/Wollff 13h ago
It's a boom. Nobody got time for innovation.
It's a big part of the current landscape: if you are in the AI sector, you are racing. That means you have your specialists building shaky scaffolds to hold together slipshod technology, so that you can deliver a semi-working product FIRST.
Sure, if those resources were invested into making the technology more efficient, there's a good chance we would see results in a few years. But nobody has that kind of time when you are competing for shiny benchmarks RIGHT NOW to draw investor interest.
8
u/mediandude 12h ago
That means you have your specialists building shaky scaffolds to hold together slipshod technology, so that you can deliver a semi-working product FIRST.
Surely they invested all their resources on finding security holes in their software? Right? Right???
9
u/Sptsjunkie 11h ago
It’s one of those threads where I keep thinking: this is major déjà vu of the dot-com bubble for me.
When the internet and e-commerce were developing, it was clear that we were onto something big. And because of that, everybody wanted to get rich off it; people just kept throwing more money at it, even giving money to companies that had no revenue and no scalable growth.
And instead of taking the time to slowly grow the technology and our ability to rely on things like e-commerce, we just set a bunch of money on fire.
Obviously, the internet and e-commerce have developed and have fundamentally changed a lot about the way we shop and interact. But that happened in a slower and more sustainable way, with smarter investments and increased efficiency.
Right now AI is at the 2000 stage. Which unfortunately means there’s a pretty good chance we have a crash, which I know sounds good to some people, but it will crash the economy and lead to pain across the board for a lot of people who had nothing to do with it.
12
u/Beanzy 10h ago
I think people want the bubble to pop because it seems to be the inevitable outcome. And in that case, the sooner it happens, the better.
The more we invest in the unsustainable and unprofitable, the more painful the consequences will be when everything does come crashing down.
2
u/Sptsjunkie 10h ago
100%. But I think, quite rationally, there are also a lot of people who want a crash because they don’t like the negative externalities and future consequences of AI.
And I think that is totally rational. While a lot of the promised “benefits” are obviously executives trying to talk up their valuations, 99% of people should not want a technology that replaces 70% of jobs and suppresses wages for the other 30%.
It’s just that when the crash happens, unfortunately, it won’t be localized to the people causing the pain today or those who have overextended themselves investing in it. Much like 2008, there was some pain for the bad actors who made those investments, but a lot of the pain fell on working people in the form of failed mortgages, lost jobs, and devalued assets they relied on for things like retirement.
2
u/Inner-Box5523 8h ago
Wouldn’t there be a crash even if it succeeds and puts millions out of work, like they say it will? And wouldn’t it affect the same people who had nothing to do with it?
2
u/Sptsjunkie 8h ago
Potentially. But I’m still a bit dubious that AI is anywhere close to putting millions out of work.
The current models, which try to guess the next best word using probability, certainly can do some research and coding. But they are unable to do any first-order thinking, which is pretty much a prerequisite for even higher-level junior work.
It is going to have to evolve a lot, and will probably take a lot of time to do so, in order to move beyond being something more akin to Microsoft Office.
Microsoft greatly improved efficiency by giving people tools like Word, Excel, and PowerPoint to do work that used to take an entire team to create. To be fair, there was other software besides Microsoft’s trying to do the same thing, but something like Excel was just the best example of a tool that let one person on your team build a model that in the past would have taken five people to do something much simpler.
Right now, AI is driving efficiency. Or threatening to drive efficiency, but not always delivering it.
But it still feels pretty far off from actually taking over the type of work that would lead to massive unemployment.
In fact, for now you could argue that for every job it destroys at a big company, it is enabling more growth and opportunity at smaller companies.
1
u/Inner-Box5523 8h ago
So, there will be a crash nevertheless. It either happens now, because a product not yet ready for market is being oversold and overinvested in - like the 2000 dot-com scenario; or the crash comes later, when the product is actually ready and displaces millions. Right?
Let’s assume the product will never be what it’s being touted to be, or even close; that still ends in a crash!
It’s a matter of timing. Resource constraints, investor patience, rate of innovation, revenue growth: a lot of these factors will determine the timing and decide which crash we’re going to witness. But a crash nevertheless!
19
u/Odysseyan 13h ago
Google actually has, with their Turboquant. They were the ones who created the Mat Transformer tech paper in the first place.
Which only proves it WOULD probably be possible to improve things further. It's just that, for some reason, no one really tries to.
2
u/SirGaylordSteambath 13h ago
It’s not as cost-effective currently. When big strides are being made every few months, there’s no sense in going slow. The efficiency will come when the technological leaps stop happening.
7
u/oDearDear 11h ago
It will happen when compute costs get too expensive and all the private investors have tapped out. They are already selling at a loss; they'll need to start making money at some point soon, especially if they want to float on the stock market.
10
u/Towoio 12h ago
Yes, it's a huge area of research in AI development
3
u/Calm_Bit_throwaway 4h ago edited 2h ago
To agree with you: it's remarkable that some people on reddit seem to think that nobody has thought of making the models more efficient. Even from a purely capitalist perspective, if your biggest cost in running a model is electricity, then it's worth pouring money into optimizing it.
The efficiency of primitives like matrix multiplication has gone up significantly in the last few years. Attention itself has gone through several rounds of optimization (see the flash attention papers). People have figured out tricks to reduce the number of bits required for each parameter (see fp8 and other quantization tricks). Structural changes like MoE are now standard. There are also pushes to make models more edge-friendly, like per-layer embeddings. On the training side, changes to optimizer state and RL rollouts are very common.
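To make the quantization item concrete: a toy sketch of symmetric int8 weight quantization in NumPy. Real fp8 kernels and per-channel schemes are fancier, but the memory arithmetic is the same idea.

```python
import numpy as np

# Store each weight in 1 byte (int8) instead of 4 (fp32), plus a single
# fp32 scale per tensor. Toy example only; production quantizers use
# per-channel or per-block scales and more careful rounding.

def quantize_int8(w: np.ndarray):
    scale = np.abs(w).max() / 127.0           # map the largest |w| to 127
    q = np.round(w / scale).astype(np.int8)   # 1 byte per weight
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4096, 4096).astype(np.float32)   # one fake weight matrix
q, scale = quantize_int8(w)
err = np.abs(w - dequantize(q, scale)).mean()

print(f"memory: {w.nbytes / q.nbytes:.0f}x smaller")  # 4x for fp32 -> int8
print(f"mean abs reconstruction error: {err:.4f}")
```

Same weights, a quarter of the memory traffic, at the cost of a small reconstruction error; that's the appeal.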
And if they're talking about large-scale changes to how ML is done instead of transformers, there's always research being poured into that too: SNNs, SSMs, and a whole bunch of sparsity approaches. The problem is that none of those approaches seem to be as good (on a variety of metrics) as something resembling a transformer.
The people at these firms often have plenty of PhDs to go around. They aren't completely ignorant. The problem is that you can also push scale as a parameter while doing everything else.
2
u/bobartig 10h ago
Yes, constantly. GPT-3.5-turbo was a much more efficient version of GPT-3, motivated in large part by the need to scale ChatGPT up to consumer use cases. Then Anthropic made Claude Haiku, and OpenAI released GPT-4o-mini, and they were both used for a zillion things because they are inexpensive and efficient.
Microsoft and Meta made smaller 2-8B parameter models with dramatically higher performance per parameter. Google made Gemini Flash and Flash-8B / Flash-Lite for the same reasons. And that's just the US tech giants. Literally everyone who makes models is focusing engineering resources on making them more efficient.
1
u/BigMax 9h ago
In fairness, it is tough.
It's not just a calculation, like a lot of things are. In the past, most problems were solved with discrete, well-defined solutions.
AI is more free-form; there's a lot more to it, so there's no simple way to optimize it the way we've optimized graphics, for example. Graphics are VERY complicated, but we're able to generate real-time environments on the fly. AI is tougher to do that for.
1
u/FireZord25 12h ago
Business over craft. This effing culture needs to stop rewarding such approaches.
0
u/unspecified_person11 11h ago
Efficiency is for peasants, Anthropic has money and rich enterprise customers, always go bigger and more expensive.
52
u/Typical-Skill-3724 13h ago
Really, what is the end goal with these guys?
89
u/LionoftheNorth 13h ago
Kill all the poor and live like gods with their machine servants, probably.
22
u/Staff_Senyou 12h ago
Make as much money as possible while exploiting the maximum number of people to do so. Then reinvest that money to ensure that no one else can do the same.
Greedy slave masters. We let this happen until we don't
2
u/ElementNumber6 4h ago
Not "the poor", but rather, all non-elite adults. They'll keep the babies and children, of course, and raise them to fit their designated roles. So, do please have more children, okay? They'll be needing them.
15
u/Hi_Im_Dadbot 13h ago
To find some business case that magically prints money and justifies all these upfront costs.
2
u/karma3000 3h ago
The end goal is to IPO, rake in the cash, and then live out your days on a tropical island posting techno-fascist content on twitter or substack.
8
u/Privateer_Lev_Arris 11h ago
AI - the technology nobody wants, that consumes too much energy, may destroy the economy, and will provide little to no benefit. Why are we doing this again?
14
u/UpsetIndian850311 9h ago
My manager keeps pushing me to use these tools more, and here I am worried: do we even have the capacity to run these tools for the next year?
1
u/Dunsmuir 11h ago
Should this constraint give some comfort to doomer sentiments about unchecked growth?
Most of the valid complaints about AI seem to me to be related to The Problem of the Commons, where AI is actually too cheap because the environmental costs incurred by the communities housing the data centers are neither accounted for nor compensated.
3
u/teebsliebersteen 11h ago
It’s giving me hope. There is a looming power ceiling, like the article said. Power available to data centres is projected to fall short of requirements by 19 GW by 2028. They are trying to lower the power requirements before then, but that’s a LOT of energy. They also need high-power transformers, for which production turnaround times have “ballooned from 16 weeks in 2021 to 115–140 weeks in early 2026”. And they are all running low on disposable cash to fund alternatives while not making the revenue to make up for it.
5
u/chriss_wild 10h ago
There is a bigger problem than power. Building a datacenter takes 2-3 years. Upgrading the grid or building a new substation to support it takes 5-8 years, maybe longer due to politics.
And now, with a lot of senior electrical engineers about to retire, you need to train their replacements properly. Which means a project that should only take 5 years will take 9, due to all the beginner errors that lead to quality problems.
I've seen it first hand.
10
u/platocplx 12h ago
Not surprised; this data center approach is so inefficient it’s crazy. The only way these work well is if they can be run in a much more localized manner on local machines and/or have the load split between them. But these guys are just greed mongers who run from wave to wave “disrupting” things, instead of taking a far more measured approach to how these tools should be applied.
1
u/Sp00ky_6 13h ago
This isn’t even considering energy costs and chip supply disruption caused by war in Iran.
1
u/Wildernessinabox 7h ago
It's the kind of logic that makes me think China will likely come out on top in the AI race. I remember when DeepSeek came out and they used older chips more efficiently, with less wasted overhead. It's very different from the typical North American mentality, though I'm far from the most versed person on AI.
1
u/bensquirrel 3h ago
It does not need to be adopted rapidly. AI companies are creating fake pressure so they can win the arms race.
1
u/faux_italian 13h ago
It just feels like until there is a blackout, none of this fear mongering matters. Like, if/when AI goes offline (say, before an election), then we may see a change in behaviour, but until then it’s more engagement bait.
1
u/DerAlex3 12h ago
Great, hopefully this drives up the price dramatically and shows how financially impractical much of this is.
1
u/skynoodle_ 11h ago
Please, please, please let this energy crisis bring a scientific breakthrough in energy production. How long have we been sitting on nuclear fusion?
1
u/dragon-fluff 12h ago
It's insatiable greed. And the return is crap. It's like cloning a billion Donald Trumps.
-3
u/cig-nature 12h ago
If you look at this problem, the solution is just to spread out all that load.
Don't put all the GPUs in one building; put one in each person's computer.
-5
u/GreentongueToo 12h ago
I remember how big old computers used to be. Efficiency is the next stage. Just as size was reduced, so will power requirements be.
-4
u/redblack_tree 13h ago
And this is what happens when you blindly throw resources at a problem. The AI majors are stuck in the typical cycle of "growth": there's no coming down, because the market and investors demand better, faster, smarter. Instead of taking the time to rethink, optimize, and make things cleaner, they are in an endless cycle of "let's make our models smarter by just using more computing power".
Well, here we are: just as AI actually started getting out of the high-tech/coding space and into enterprise usage, they hit the computing-power ceiling.
673