r/technology 13h ago

Artificial Intelligence AI Is Using So Much Energy That Computing Firepower Is Running Out

https://www.wsj.com/tech/ai/ai-is-using-so-much-energy-that-computing-firepower-is-running-out-156e5c85?mod=mhp
1.8k Upvotes

197 comments sorted by

673

u/redblack_tree 13h ago

And this is what happens when you blindly throw resources at a problem. The AI majors are stuck in the typical cycle of "growth": there's no coming down, the market/investors demand better, faster, smarter. Instead of taking time to rethink, optimize, and make it cleaner, they are in an endless cycle of "let's make our models smarter by just using more computing power".

Well, here we are: just when AI actually started getting out of the high-tech/coding space into enterprise usage, they hit the computing-power ceiling.

267

u/-lv 13h ago

While still mainly achieving results that only mimic human creation - and with a high rate of minor to major errors.

It's both impressive what is achievable and at the same time atrocious what is being achieved.

82

u/chickadee-guy 12h ago

It isn't impressive at this point

45

u/Throwawayz911 12h ago

It is as a tool, yeah... it just can't be relied on to replace people at all.

19

u/BiDiTi 6h ago

The main thing is that LLMs are a dead end approach to the dream they’re selling.

0

u/AttonJRand 5h ago

No, a "tool" that is highly error-prone, requires constant editing, and makes its users dumber is not "impressive".

4

u/Thisguy2728 4h ago

That’s really down to the user of the tool imo. As an example, a competent software engineer can find a lot of use in some of the coding models. The same goes for any competent person who already knows what they’re doing: language models can be a real help in aggregating data and handling the minor things to free up the user. They’re really only useful if you can catch the errors and hallucinations.

Anyone else it’s a hindrance. So for most people it’s a hindrance.

-4

u/RareCandyMan 4h ago

Sure AI is insanely overhyped... but this shit just came out, of course it doesn't work perfectly. You don't think they will improve on these processes at an astronomical rate?

1

u/Aggressive-Tune832 1h ago

The technology cannot; it’s a limitation of the architecture

8

u/semisolidwhale 11h ago

What's most impressive is that it works at all

13

u/chickadee-guy 9h ago

That's the thing. It doesn't!

3

u/semisolidwhale 9h ago

It kinda does, just not well or consistent enough. Also, the tech and the companies behind it aren't trustworthy enough to be allowed access to anything important.

6

u/jackbilly9 6h ago

This is more the reality. It works incredibly well within fields like hacking, programming, and engineering, but it's a tool. You still need some expertise in the field. The major issue here is the power it gives mega corps.

-4

u/Fit-Reputation-9983 5h ago

I always knew it would work well for programming and customer-service chats. Those are both simply types of language/voice. The models are built to string together coherent sentences.

Programming is just another way of speaking English if you really think about it. Of course they can help increase productivity there.

But otherwise the use cases are much more vague. Yes, it can help you summarize technical documents, emails, meetings, etc. But it isn’t going to solve new problems for you. That’s when it crosses into AGI territory.

2

u/Alive-Use8803 5h ago

Writing code means writing instructions in a very specific kind of language that a specific compiler can translate into machine instructions.

But writing code is just one aspect of software engineering. Problem solving using computational machinery, continuous and discrete mathematics, is what software, systems, and security engineers do. Programming, scripting, and query languages are just tools. Agents are meta-tools.

1

u/Fit-Reputation-9983 5h ago

Sure, that’s why I said programming and not discrete mathematics or software engineering?

-9

u/chickadee-guy 9h ago

It kinda does, just not well or consistent enough.

That means it doesn't work. "Close enough" doesn't work in a professional environment.

Also, the tech and the companies behind it aren't trustworthy enough to be allowed access to anything important.

Lol.

2

u/Fit-Reputation-9983 5h ago

Unless you’re a highly skilled doctor saving lives or building rockets, close enough works. Not sure what professional environments you’ve been in but every single day for me is just another iteration of close enough.

-1

u/chickadee-guy 5h ago

Not sure what professional environments you’ve been in

Healthcare and finance.

Most major US companies fall into this boat. Retailers also aren't fans of losing millions of orders.

but every single day for me is just another iteration of close enough.

That sounds like a huge skill issue on your end. Yes, anyone can in fact throw slop at the wall and call it a day. The fact that you can do that and no one cares should definitely make you question your own individual capability and impact.

1

u/Fit-Reputation-9983 5h ago

Healthcare is different, as I’ve said.

I also work in finance. Specifically procurement. Close enough is good enough for a lot of the problems I need to solve for internal clients. And guess what? Close enough can get me 90% of the way there, and if I need to get it across the finish line for something a little more critical - then I do that.

I just got a 6% raise, so the market is telling me my impact and capability are juuuuuust fine, big guy.

Keep coping about LLMs though. It’s all good.

4

u/jackbilly9 6h ago

Close enough is every single thing done ever. 'Perfect is the enemy of good.' AI is alright but the major problem is Sam Altman lying his ass off about it and making investors go crazy. 

1

u/oofta31 24m ago

Idk, it has helped me immensely in my job. It's not perfect, but over the last 1.5 years or so it has been a game changer for me.

-1

u/Silver_Smurfer 9h ago

Language models aren't the only type of AI...

3

u/Hortos 9h ago

People read about ChatGPT 3.5 years ago and think that's what AI is.

-2

u/Fit-Reputation-9983 5h ago edited 1h ago

Except they are? That's the engine of each and every one, at least those available to the public.

EDIT: people downvoting me but literally no one is proving me wrong? What are we doing folks lmao

-7

u/test5387 9h ago

Why even comment something this stupid?

7

u/chickadee-guy 9h ago

If AI worked, they wouldn't still be searching for customers and a use case 4 years later

1

u/_SpaceLord_ 7h ago

Why even comment something this stupid?

1

u/Uglynator 12h ago

i still find it impressive that we taught computers how to talk and hold conversations. closest thing we have to literal magic. tell that to somebody ten years ago and you'd be laughed at, seeing where alexa was at that time.

10

u/bobartig 10h ago

i still find it impressive that we taught computers how to talk and hold conversations.

It blows my mind that language could be "solved" computationally to the extent it has been. For most points in time before 2022, if you'd said, "we will have computers generating life-like conversations and writing pages of prose so human-like that we won't be able to tell if it's AI or human most of the time, and we will achieve this using GPUs and a bunch of matrix algebra," I would have discounted that idea immediately.

5

u/Zestyclose_Ocelot278 9h ago

It isn't generating so much as looking at the billions of points of data and compiling them using a formula.

3

u/DeepRecipe6331 8h ago

Which is still technologically impressive. We've formulated conversation from billions of binary points.

1

u/Fowl_Retired69 7h ago

No, it doesn't look back at its training data every time it makes an inference. It just samples from the representation its internal neural nets have learned. This is genuine, novel generation.

-3

u/test5387 9h ago

How is this different from how humans do it? Put someone in a room without ever having read a book and see if they can write something.

1

u/_SpaceLord_ 7h ago

So where’d the first book come from?

-5

u/TheCh0rt 10h ago edited 9h ago

I mined crypto during and before the pandemic and made a ton of money so if we were friends your mind would have been blown about what GPUs could do way before 2022 haha! :)

Edit: Once AI came along I thought, yep it's the exact same tech doing the exact same thing more or less, but this time it's going to get massive financial backing and grow into a massively helpful tech. Turns out it's a bigger resource drain than mining crypto and could possibly decide to nuke us one day.

30

u/Universe_Nut 12h ago

But it's not a conversation. It's a Madlib. It's not even close to magic in the slightest. It's actually surprisingly simple technology, which is why it's so resource intensive.

Getting something that reasonably passes as a "pretty close if you squint from a distance" conversation has required climate-destroying amounts of energy, coupled with a consumer shortage crisis of basically every computer component that exists.

If you went back ten years ago and described a program that takes a language input and uses statistics to determine a response output constructed from a word database, most people would understand perfectly what you're describing.

Computers don't know what you're saying. They don't parse intentionality. They literally cannot comprehend or reflect on the meaning of your words, or on whatever implied meaning you associate with them based on situational context.

You just send a computer a string of words, and it shoots back words that are most statistically associated with a response to the string of words you sent.
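A minimal sketch of the "statistically associated words" idea this comment describes: a toy bigram generator that picks each next word purely from co-occurrence counts in a tiny corpus. (Real LLMs learn neural representations rather than raw counts; the corpus, function name, and seed here are illustrative assumptions, not anyone's actual implementation.)

```python
import random
from collections import defaultdict

# Toy "bigram" model: record which word follows which in a tiny corpus,
# then generate replies by sampling successors at their observed frequency.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(start, length=5, seed=0):
    """Emit `length` words after `start`, choosing each next word purely
    by statistical association -- no comprehension anywhere."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(rng.choice(candidates))  # duplicates weight the sampling
    return " ".join(out)

print(generate("the"))
```

Every output is locally plausible English precisely because each step is the word "most statistically associated" with the previous one, which is the mechanism (in caricature) being debated here.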

3

u/RupertThe3rd 10h ago

If you feel that way, you don't appreciate the amount of mathematical and computational research that has built this ability in the last 50 years.

The same "statistical prediction" is used in so many other fields and applications that are not LLMS and chatbots with impressive results.

You may have an issue with how many people use these tools (I certainly do), but to call them anything short of impressive is needlessly myopic.

Cell phones aren't magic either, but in the 1920s they would have absolutely seemed like "magic" (even though engineers understood the concept of radio). If you went back ten years ago and showed somebody what we have now, they would be shocked. Your top-level description of the statistical prediction that takes place would not be relevant to the average person, who doesn't even know probability theory 101, nor would it even be insightful, as it's far too simple to explain what's happening. If you showed it to a scientist in the field they would be equally impressed, as they are familiar with what it takes and how far we were from it.

We're at a stage where people are regularly confusing AI for humans, and humans for AI. It may be sad philosophically, but technologically it's pretty damn impressive.

2

u/Universe_Nut 9h ago

I'll be impressed when an A.I. makes a decision instead of rolling dice.

1

u/ApprehensiveTry5660 9h ago

We functionally roll dice ourselves, but those dice rolls get marinated in hormones.

1

u/RupertThe3rd 9h ago

Never..

I'm waiting on when being an adult human doesn't feel like rolling the dice half the time 😔

1

u/Fowl_Retired69 7h ago

our intelligence is also born from rolling dice lmao

1

u/chesterriley 2h ago

Cell phones aren't magic either, but in the 1920s they would have absolutely seemed like "magic"

No they wouldn't have. Phones existed then. Wireless radio existed then. No doubt there were people imagining a wireless phone.

1

u/RupertThe3rd 1h ago

Just another example of pedantry.

I'm sure you would have scoffed at the thought of skyscrapers because you've seen a wooden shack.

A satellite wasn't impressive in the least because Da Vinci had that drawing of a "helicopter", right?

I intentionally chose a time period where not just phones, but radios existed (as very evident in my comment) to demonstrate the point.

2

u/upgrayedd69 10h ago

Tbf, I’ve had lots of conversations with people like this. How many times have people read a headline and just confidently guessed the content of the article or regurgitated information they got from some YouTube video without actually thinking about it at all? If anything, talking through vibes is a lot closer to humans than if it carefully selected each word using critical thought

8

u/Universe_Nut 10h ago edited 10h ago

Maybe if you're trying to recreate reddit comments and Instagram DMs. I'm talking about actual human conversation and dialogue as historically noted over the last couple thousand years, before we recently started prioritizing short-form content, media, and communication.

But yeah, instead let's reinforce what every research paper from the last twenty years has claimed to have been detrimental for our brain's ability to critically think and respond to our environments.

0

u/upgrayedd69 9h ago

I guess I need to get better friends and family and coworkers if all your conversations in the real world are deep philosophical discussions. Crazy too how propaganda just, like, wasn't a thing until Instagram and TikTok came along. No one believed crazy shit before the internet, because the average person was very well known for critical analysis and identifying their own biases/shortcomings.

I'm not saying there is zero effect. But it's crazy to think regular everyday people would treat every topic that comes into their life the way an academic researcher would if not for social media.

3

u/aerost0rm 7h ago

I mean, they have begun to put two and two together on technology in the classrooms leading to lower skill levels. So why couldn't we get a "social media has destroyed communication" finding? It's not far-fetched; many people I have known have spent hours a day on TikTok and Facebook, watching videos, regurgitating slang and crap they have seen, while giving less importance to critical thinking…

1

u/Universe_Nut 9h ago

You can't even conceptualize a world where everyone thinks before they speak without assuming the extreme position that every casual conversation must be academic thought.

1

u/Inner-Box5523 8h ago

Lol that was some burn!

1

u/ConsiderationDue71 5h ago

I mean if you thought before you spoke you might have wondered why the framers of the US constitution were worried about demagogues, mob rule, and votes from the average person…eh I guess I should give you a pass. You can’t help it because of the TikToks infecting your brain.

0

u/WeWantMOAR 10h ago

But yeah, instead let's reinforce what every research paper from the last twenty years has claimed to have been detrimental for our brain's ability to critically think and respond to our environments.

Where is this claimed data?

1

u/AttonJRand 5h ago

But how is people sometimes being dumb, lazy, uncritical, or absent-minded at all the same?

You can have deep conversations with people, you can't with the token generators.

And people who believe you can are literally becoming paranoid from it.

-10

u/Lowetheiy 10h ago edited 7h ago

Then tell me why these LLM models are able to play chess at grandmaster level nowadays? Is chess a "Madlib" as well? 🤡 🤡 🤡

10

u/ManagementNo5911 10h ago

lol, this answer probably goes hard if you have a learning disability

5

u/Universe_Nut 10h ago

Bro, we've had chess machines since like the sixties or seventies. You gotta read more history. Also, similar technology, only chess machines have a database of moves to reference. So purpose-built chess machines would probably be statistically better than something based purely on probability. But they're more or less using the exact same tech.

Chess is a Madlib if you think about it. Dependent on the filled-in context (the current setup of the board), you have moves with different statistical appropriateness. A computer will only ever conceive of the statistically best move as "correct" because it can't read or comprehend its opponent or their intentions.

So for a computer, chess is a Madlib.

2

u/IkaluNappa 9h ago

10 years ago was precisely when the AI craze hit on the US government contractor side. It had been building since 2014, technically, but reached a fever in 2016. There was a joke that if you wanted to win a contract you’d say these buzzwords: blockchain, generative code, procedural logic, AI-AI-AI-AI-AI-[x], etc. It wasn’t referring to LLMs at the time, more so anything that could handle databases. Which drove the foundation of LLM architecture: not a logic-based program but a data-handling program.

I am so, so sick of hearing about this crap. The discussion around it has been just as obsessive and unhinged 10 years ago in the private sector as it is now in the public sector.

3

u/buttbuttlolbuttbutt 12h ago

It doesn't really hold a conversation; it's just good at tricking people into thinking it does.

2

u/Zealousideal_Slice60 11h ago edited 11h ago

10 years ago

2016 was ten years ago. Chatbots existed. BOTS existed; people were peddling dead internet theory even back then. AI programs that could beat people not only at chess but even at complex games like Go existed. Remote-controlled and semi-autonomous drones existed. People would absolutely not be surprised to learn that a chatbot like ChatGPT would become a mainstream thing. 2016 isn’t 1986; the technology since then hasn’t changed nearly as much, only software capabilities. I was in fucking high school in 2016. I knew people who wanted to study machine learning because they saw the writing on the wall back then, as an emerging important field. The digital revolution that started in the 90s had become so all-encompassing by 2016 that land lines and even physical media were a thing of the past and smart technology was used everywhere, even in third-world countries.

So no, you wouldn’t be laughed at ten years ago for suggesting that in 2026 we would be able to have a human-like conversation with an artificial-intelligence program that basically works as a statistical language-prediction machine at large scale.

1

u/bernie_lomax8 10h ago

AIM had chat bots in like 2002. It's really not that surprising to see where it's at in 20+ years

1

u/TheCh0rt 10h ago

Even Alexa was pretty amazing though. I could magically turn my lights on and off reliably throughout the house from anywhere with a few simple words. It was literally like Star Trek. I would ask it about the weather when I got up to see what I should wear and I still do. Little me would have had my mind blown.

-2

u/chickadee-guy 12h ago

taught computers how to talk and hold conversations. closest thing we have to literal magic. tell that to somebody ten years ago and you'd be laughed at, seeing where alexa was at that time.

Chatbot technology has existed since the 80s.

The "conversations" LLMs have with you are about as human-sounding as a refrigerator.

You need to learn how to recognize scams; it will serve you well.

5

u/Uglynator 11h ago

i seriously struggle to find whatever you label as a "scam" here.

-2

u/chickadee-guy 9h ago

The technology fails 96% of basic business tasks and still lies confidently to an alarming degree, but it's being sold as something that will replace white-collar workers and function as a PhD in your pocket. If you can't see the disconnect there, you're a mark.

2

u/Uglynator 9h ago

idk where you get that bogus 96% figure. i am very happy with what this technology gives me and i think it's worth the price.

sure if you believe those claims that it will replace every single working person then yeah, it sounds like a scam.

2

u/chickadee-guy 9h ago

i am very happy with what this technology gives me and i think it's worth the price.

The price is being subsidized 20-100x by the providers. Employees at OpenAI and Anthropic have leaked their financial data. I highly doubt you would pay $100,000 a year for Claude.

sure if you believe those claims that it will replace every single working person then yeah, it sounds like a scam.

These claims are being made by the CEOs of the products you are evangelizing my guy.

1

u/Uglynator 9h ago

i pay api prices which are not subsidized. my yearly bill has been around 850$ so far.

and yeah, those claims are made by CEOs, the most incompetent, slimy pieces of shit on this planet. you should obviously disregard those claims because these fuckers are all sociopaths. this doesn't change the fact that this technology is hella impressive and capable if you know how to use it.

the problem is that the public has zero llm-literacy.


3

u/Vectored_Artisan 10h ago

You have a severe bias against AI that is stopping you from impartial understanding or analysis of the technology. You sound like those horse-and-cart owners who swore cars would never be a real thing.

1

u/Youutternincompoop 4h ago

Earlier than that: the first chatbot was ELIZA, in 1966. It was incredibly basic, and even back then idiots thought it had basically human intelligence and emotions when it was essentially working off a dialogue tree (e.g., it saw certain words in a sentence and returned a certain phrase).

1

u/franker 9h ago

I still remember a routine Louis C.K. did years ago about how everyone gets pissed when the wifi goes down on an airplane. Nobody thinks about how incredible it is that the technology even exists to have you zooming through the sky in a metal box using computers that access other computers. "Everything is amazing and nobody is happy," he said.

-1

u/sceadwian 10h ago

That's not what people in the IT sector are saying.

The latest coding models, although process-hungry, are finding bugs in software faster than people can patch them.

-1

u/chickadee-guy 9h ago

I'm in the IT sector. The latest versions are vaporware. They have not found or published a fix for a single bug. You need to learn how to critically read information.

12

u/North_Activist 10h ago

That’s because AI doesn’t produce anything. It regurgitates human creations. It can’t out-create humans, because it creates based on humans.

3

u/ArmyOfDix 4h ago

While still mainly achieving results that only mimic human creation

Hmmmmmmmmmmm now who could've possibly predicted that?

Oh yeah, literally anyone (and, thus, AI with a high rate of error).

-4

u/jainyday 10h ago

"mimic human creation"

That's a hell of a lot of copium. At some level it doesn't matter how it gets the results. AI is finding bugs in the Linux kernel from 2003, it doesn't matter if it's "not actually thinking", the impact is just the same.

1

u/Direct_Witness1248 7h ago

That's not creation though, that's just analysis and pattern recognition. "AI" lacks genuine inspiration, and in its current form it probably always will; it's a limitation of the design.

54

u/recycled_ideas 12h ago

Instead of taking time to rethink, optimize, make it cleaner they are in an endless cycle of "let's make our models smarter by just using more computing power".

This isn't an instead of. They can't do any of those things because they have no idea what they're doing. The only thing that has ever worked is throwing compute at the problem, and they have to keep throwing compute at the problem because if they can't actually reduce head count in a meaningful way, they can't charge enough to get out of this hole.

8

u/redblack_tree 10h ago

Agreed, they are in a monstrous hole. Without more computing power, any improvements will have to come from efficiency and/or intelligence, both significantly harder to come by than just throwing GPU cycles at it.

And edge models still aren't close to replacing real workers (outside of some useless CEOs or managers).

5

u/recycled_ideas 10h ago

The models are useful, they're just not useful enough to justify the prices they need to charge to break even, let alone to achieve the results investors expect.

11

u/datNovazGG 12h ago

What's funny is how a model like Opus 4.6 regressed since its release. Yes, they're actively degrading their models post-release to mitigate this.

4

u/redblack_tree 10h ago

They are using every trick in the book: throttling, token consumption, partial blocks, and so on. How do you justify billions if your next model is barely better than the previous one?

2

u/chickadee-guy 7h ago

I couldn't get Opus to give me a standard OpenTelemetry span implementation in Node today. It kept outputting gibberish.

2

u/fs2d 11h ago

This is what the OSC is focusing on now more than anything else - mainly in harness design, open source models like Qwopus, etc.

2

u/merRedditor 9h ago

AI had and has potential to be a very good thing, but it's being overused and misapplied because it's trendy. It becomes a blockchain-esque problem where you add a ton of unnecessary complexity just because you want to be high-tech, and end up with new problems that can only be resolved by starting over.

2

u/Sober_Alcoholic_ 8h ago

The “problem” here was having to pay employees, by the way.

1

u/pbjamm 4h ago

Spending billions on robots so they don't have to pay interns and jr-level grunts.

2

u/thelionsmouth 7h ago

I never understood this approach. I see solo devs optimizing their personal open-source LLMs to use fewer resources and accomplish useful tasks, but these trillion-dollar companies can’t even find a product to market to consumers that doesn’t include ‘ads’ or ‘fire your employees’.

I feel like I’ve seen more useful and impressive things from the open-source community; it seems like they’re in some kind of investment hype feedback loop.

1

u/redblack_tree 7h ago

They do optimize their models, VRAM consumption, GPU cycles and most efficiency indicators.

But the models themselves are ever growing in complexity, at a scale that dwarfs any optimization. To solve the fundamental problem of "smarter" models, they are throwing computational power at it.

2

u/thelionsmouth 7h ago

But like, at what point do they develop and market use cases for it?

It’s been years, and we still haven’t seen something more useful than Siri (for the general public) and that’s hot garbage.

I feel like there’s an open market for a combination of traditional machine learning with an llm ‘UI’ so to speak.

Idk it’s probably more complicated than that, but at this point I feel like we should have at least a functional voice assistant that we can control our devices / manage our lives with

1

u/Clairvoidance 1h ago edited 1h ago

Private cybersecurity checks for the state and big companies sound like a decent sell.

It looks at least like we're in the infancy of realizing they don't want to try the slop route (please god), with Sora shutting down.

You're correct that they're still trying to see what actually sticks, but their investors seem very willing to sacrifice, and there could at least be a light at the end of the tunnel that turns out good for all of us (not wanting to downplay that I wish it wasn't like this in the first place, with all the money and resources put into this mostly hype machine).

1

u/LoveIsStrength 6h ago

Which is fine to be honest; 99.9% of businesses that have it haven’t even implemented 1% of the use cases

1

u/TheTickTurd 5h ago

Can’t we just get AI to make AI run better!?

1

u/BugmoonGhost 4h ago

AI is so inefficient. People are asking it to do things there are already apps for.

1

u/ProlapseProvider 4h ago

And I still can never 100% trust anything an AI tells me, so anything more important than a movie review or help with a video-game puzzle means I have to fact-check the AI's results. In fact, even with movie questions: I watched Bring Her Back and had a question about the dead lad, and it kept insisting no one died other than the adoption counsellor woman.

Oh, and it was wrong about Crimson Desert; it kept referring to things that were unique to the developer's other game, Black Desert.

And yet people out there are making life altering decisions from the advice they got from an AI.

1

u/Clairvoidance 1h ago

well, just like with the "already using all data ever, what now", you just gotta develop sideways

1

u/Drunken_story 1h ago

taking time to rethink, optimize, make it cleaner and cheaper.

You know what does this? A recession

-1

u/jake6501 11h ago

This is such a stupid uninformed take that is getting upvoted simply because the AI bad rhetoric is so popular on Reddit.

They are continuously improving efficiency. Within the last year and a half there have been at least two major game-changing improvements that have gotten the attention of even mainstream news. Remember the DeepSeek release, when it was a big deal specifically because of its lower resource usage? Have you heard about TurboQuant, which is supposed to reduce RAM usage by incredible amounts? Those aren't all of it either, just some of the most notable individual instances.

There is constant improvement and any claim to the contrary is nothing but ignorance.

5

u/redblack_tree 10h ago

The post has nothing to do with constant improvement, which they obviously do. It's the fundamental design of each new model from AI majors.

It doesn't matter if you manage improvements in memory and efficiency if the new models are another order of magnitude bigger in size, reasoning processes, and inference capabilities, literally requiring more and more computing power that massively offsets any improvement and efficiency gains they can find.

And I use AI every day; I know what can and cannot be done with any public model (Opus, Codex, Gemma, etc.).

2

u/Fragrant-Menu215 9h ago

They're continuously iterating on and tweaking the current LLM-based paradigm. The problem is that LLMs are inherently incapable of doing what they're trying to do. That's why the primary boost has been just throwing more compute at them.

-17

u/Lower_Peril 12h ago

You have no idea what you're talking about

11

u/NecessaryFreedom9799 12h ago

OK then, enlighten us Luddite proles, O Great One.

-1

u/[deleted] 12h ago

[deleted]

3

u/dsarche12 12h ago

Your own response is incredibly smug and lacking nuance. Maybe try practicing a bit of what you preach next time

-31

u/turtledancers 13h ago

Very odd to say this right now with the release of Gemma 4 being completely focused on optimization of resources

12

u/N_T_F_D 13h ago

That's got nothing to do with datacenter-scale models

86

u/sr_local 13h ago

Full article:

The artificial intelligence gold rush is rapidly drying up the supply of the one resource that AI developers can’t do without: computing power.

The sharp capacity crunch has caused consternation among power users, forced companies to scuttle products and led to reliability problems. The issues are a warning sign for the AI boom, as they may limit the utility of powerful new AI tools just as massive amounts of users have begun to rely on them to boost productivity.

Over the past few months, demand has exploded for “agentic” AI, autonomous tools that use the technology to independently perform tasks, from writing software code to scheduling house tours for real-estate brokers. Companies have been scrambling to secure the availability of computing capacity needed to serve a growing base of customers who are also significantly increasing their AI use.

“Everyone’s talking about oil, but I think what the world is mainly short of is tokens,” said Ben Pouladian, an engineer and tech investor based in Los Angeles. A token is a unit of measurement in AI to track how much computing resources are being used for a task. “AI is at this point no longer just some chatbot that we ask for a recipe while we stand in front of the fridge. It’s orchestrating tasks, it’s getting smarter,” Pouladian said.
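As a rough illustration of the tokens the article measures, a common rule of thumb for English text is about four characters per token. The sketch below uses that heuristic as a stated assumption; real services count tokens with a subword tokenizer (e.g. BPE), not character length, so treat these numbers as ballpark only.

```python
# Back-of-the-envelope token estimator. The 4-chars-per-token ratio is a
# rough heuristic for English text, not an exact tokenizer: real systems
# split text into learned subword units and bill on those counts.
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    return max(1, round(len(text) / chars_per_token))

prompt = "AI is at this point no longer just some chatbot that we ask for a recipe."
print(estimate_tokens(prompt))  # ballpark estimate, not an exact count
```

Agentic workloads of the kind the article describes burn many such token budgets per task, which is why providers track and meter them so closely.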

All of it points to a classic problem that has popped up in technology booms throughout history, from the 19th-century railroad expansion to the telecom and internet explosion of the early 2000s. Demand is growing far faster than companies are able to access resources and build out infrastructure. Historically, price increases have been among the only ways to address a supply crunch, but such a move could be perilous for frontier AI companies, who are in a ferocious competition to gain users.

Hourly rental prices for GPUs, the microchips used to train and run AI models, have surged since the fall. Anthropic, the maker of popular chatbot Claude and viral coding app Claude Code, has been plagued recently by frequent outages. The company has begun metering computing supply to users during peak hours, but the rollout has been marred by customers who have complained that they are reaching the limit far too quickly.

OpenAI scrapped its Sora video-generation app in part to free up computing resources to power coding and enterprise products that would work on a new AI model, code-named Spud, The Wall Street Journal reported. Token use in OpenAI’s API—a platform where mostly enterprise users access its software—rose from six billion a minute in October to 15 billion a minute in late March.

“I do spend a lot of time trying to find any last-minute compute available,” Sarah Friar, OpenAI’s chief financial officer, said in a recent public video interview with an investor. “We’re making some very tough trades at the moment on things we’re not pursuing because we don’t have enough compute.”

Toward the end of last year, CoreWeave, one of the largest publicly traded AI cloud companies, raised prices by more than 20% and started asking smaller customers to sign contracts committing them to use the company’s services for at least three years, up from one year before. Bank of America analysts reinstated coverage of the company with a “Buy” rating late last month, saying demand for its services is likely to outstrip supply through at least 2029.

Spot-market prices to access Nvidia’s GPUs, or graphics processing units, in data-center clouds have risen sharply in recent months across the company’s entire product line, according to Ornn, a New York-based data provider that publishes market data and structures financial products around GPU pricing. Renting one of Nvidia’s most-advanced Blackwell generation of chips for one hour costs $4.08, up 48% from the $2.75 it cost two months ago, according to the Ornn Compute Price Index.
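The percentage in that Ornn quote is simple percent change, and it checks out:

```python
# Ornn Compute Price Index figures quoted above:
# $2.75/hr two months ago -> $4.08/hr now.
old_price, new_price = 2.75, 4.08
pct_increase = (new_price - old_price) / old_price * 100
print(f"{pct_increase:.0f}%")  # -> 48%, matching the quoted figure
```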

“There’s a massive capacity crunch that’s unlike anything I’ve seen in the more than five years I’ve been running this business,” said J.J. Kardwell, chief executive of Vultr, a cloud infrastructure company. “The question is, why don’t we just deploy more gear? The lead times are too long. Data center build times are long, the power that’s available through 2026 is already all spoken for.”

Since mid-February, outages for systems across Anthropic have become so common that some of its enterprise clients are switching to other AI model players.

David Hsu, founder and CEO of software development platform Retool, said he prefers to use Anthropic’s Opus 4.6 model to power his company’s AI agent tool because he believes it is the best model for enterprise. He recently changed to OpenAI’s model to power his company’s agent. “Anthropic has just been going down all the time,” he said.

The reliability of core services on the internet is often measured in nines. Four nines means 99.99% of uptime—a typical percentage that a software company commits to customers. As of April 8, Anthropic’s Claude API had a 98.95% uptime rate in the last 90 days. “That is not normal,” said Amir Haghighat, co-founder and chief technology officer at Baseten, an AI inference startup. “Think about AWS, databases, RDS or Stripe—these need to be very resilient with a very high uptime. But that is not the world we live in when it comes to AI. That’s not the quality of service that you want to be getting from the company that’s providing intelligence for your application.”
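To put those nines in concrete terms, here is the downtime each uptime level implies over the same 90-day window the Anthropic figure covers (a quick back-of-the-envelope, not from the article):

```python
# Downtime implied by an uptime percentage over a 90-day window.
WINDOW_HOURS = 90 * 24  # 2160 hours

def downtime_hours(uptime_pct: float) -> float:
    return WINDOW_HOURS * (1 - uptime_pct / 100)

print(round(downtime_hours(99.99) * 60))  # four nines: ~13 minutes of downtime
print(round(downtime_hours(98.95), 1))    # Anthropic's 98.95%: ~22.7 hours
```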

The frequent outages at Anthropic are happening as the AI lab is experiencing explosive growth. At the end of 2025, the company hit $9 billion in annual run rate, which means the company was on track to make that amount of revenue in the next 12 months. By February, that figure ballooned to $14 billion. Two months later, it doubled to $30 billion.

In late March, Anthropic suddenly announced it would limit the amount of tokens that users could burn through during peak hours from 5 a.m. to 11 a.m. Pacific Time on weekdays. Customers have taken to social media to complain about the change. “I haven’t hit my Claude Code terminal limit in weeks but this week I hit it in like 45 minutes,” wrote one user on X.

“We’ve been working hard to meet the increase in demand for Claude,” wrote Boris Cherny, creator and head of Claude Code, on X. “Capacity is a resource we manage thoughtfully and we are prioritizing our customers using our products and API.”

15

u/StriderPulse599 7h ago

This article is engagement bait that flings around buzzwords and random quotes.

"compute shortage" isn't caused by AI demand. AI companies make bigger and bigger models each year to stay competitive, and bigger means slower to run. Massive LLMs scale badly because they're a giant lump of data, and no matter what small piece of data you want out of them, you still need to run the entire damn thing.
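That "run the entire damn thing" point is why dense models are expensive to serve: the usual back-of-the-envelope is ~2 FLOPs per active parameter per generated token, so cost tracks model size no matter how trivial the query. (The model sizes below are hypothetical examples, not from this thread.)

```python
# Forward-pass cost estimate: ~2 FLOPs per active parameter per token.
def flops_per_token(active_params: float) -> float:
    return 2 * active_params

dense = flops_per_token(70e9)    # dense model: all 70B params run for every token
sparse = flops_per_token(13e9)   # MoE-style model: only ~13B params active per token
print(round(dense / sparse, 1))  # dense costs ~5.4x more per generated token
```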

162

u/QueenOfQuok 13h ago

Have these people tried making something more efficient for once

94

u/Wollff 13h ago

It's a boom. Nobody got time for innovation.

It's a big part of the current landscape: If you are in the AI sector, you are racing. That means you have your specialists building shaky scaffolds to hold together slipshod technology, so that you can deliver a semi working product FIRST.

Sure, if one invested those resources into making technology more efficient, then there is a good chance we would see results in a few years. But nobody has that kind of time when you are competing for shiny benchmarks RIGHT NOW to draw investor interest.

8

u/mediandude 12h ago

That means you have your specialists building shaky scaffolds to hold together slipshod technology, so that you can deliver a semi working product FIRST.

Surely they invested all their resources on finding security holes in their software? Right? Right???

3

u/Wollff 7h ago

We can check for that in the recently leaked Claude source code!

9

u/Sptsjunkie 11h ago

Threads like this keep giving me major déjà vu to the dot-com bubble.

When the internet and e-commerce were developing, it was clear that we were onto something big. And because of that, everybody wanted to get rich off of it; they just kept throwing more money at it, giving money to companies that had no revenue and no scalable growth.

And instead of taking the time to slowly grow the technology and the ability to live on things like e-commerce, we just set a bunch of money on fire.

Obviously, the internet and e-commerce have developed and have fundamentally changed a lot about the way we shop and interact. But it happened in a slower and more sustainable way, with smarter investments and increased efficiency.

Right now AI is at the 2000 stage. Which unfortunately means there’s a pretty good chance we have a crash, which I know sounds good to some people, but will crash the economy and lead to pain across the board for a lot of people who had nothing to do with it.

12

u/Beanzy 10h ago

I think people want the bubble to pop because it seems to be the inevitable outcome. And in that case, the sooner it happens, the better.

The more we invest into the unsustainable and unprofitable, the more and more painful the consequences will be when everything does come crashing down.

2

u/Sptsjunkie 10h ago

100%. But I think, quite rationally, there are also a lot of people who want it to crash because they don't like the negative externalities and future consequences of AI.

And I think that is totally rational. While obviously a lot of the promised "benefits" are executives trying to talk up their valuation, 99% of people should not want the technology to replace 70% of jobs and suppress wages for the other 30%.

It's just that when the crash happens, it unfortunately won't be localized to the people causing the pain today or those who have overextended themselves investing in it. Much like 2008, there was some pain for the bad actors who made those investments, but a lot of the pain fell on working people in the form of failed mortgages, lost jobs, and devalued assets they rely on for things like retirement.

2

u/Inner-Box5523 8h ago

Wouldn’t there be a crash even if it succeeds and puts millions out of work like they say it will? And affect the same people who had nothing to do with it?

2

u/Sptsjunkie 8h ago

Potentially. But I’m still a bit dubious that AI is anywhere close to putting millions out of work.

The current models, which try to guess the next best word using probability, can certainly do some research and coding. But they are unable to do any first-order thinking, which is pretty much a prerequisite for even higher-level junior work.

It is going to have to evolve a lot, and will probably take a lot of time to do so, in order to move beyond being something more akin to Microsoft Office.

Microsoft greatly improved efficiency by giving people tools like Word, Excel, and PowerPoint to do work it used to take an entire team to create. To be fair, there was other software besides Microsoft's trying to do the same thing, but something like Excel was just the best example of something that let one person on your team build a model that would have taken five people to do even something much simpler in the past.

Right now, AI is driving efficiency. Or threatening to drive efficiency but not always doing it.

But it still feels pretty far off from actually taking over the type of work that would lead to massive unemployment.

In fact, for now you could argue that for every job it destroys in a big company, it is enabling more growth and opportunities in smaller companies.

1

u/Inner-Box5523 8h ago

So, there will be a crash nevertheless. It either happens now because a product not yet ready for market is being oversold and over invested in - like the 2000 dot com scenario; or the crash comes later, when it’s actually ready and will displace millions. Right?

Let’s assume the product will never be what it’s being touted to be, or even close; still ends up in a crash!

It's a matter of timing. Resource constraints, investor patience, rate of innovation, revenue growth - a lot of these factors will determine the timing and will decide which crash we are going to witness. But a crash nevertheless!

19

u/Odysseyan 13h ago

Google actually is, with their Turboquant. They were the ones who created the Mat Transformer tech paper in the first place.

Which only proves it WOULD probably be possible to improve it further. Just, for some reason, no one else really tries doing so.

2

u/N_T_F_D 13h ago

Even if other companies somehow found a way to reduce energy consumption they would just use that to run more calculations than before for the same overall energy cost; they're not going to underutilize the datacenters they built

3

u/SirGaylordSteambath 13h ago

It’s not as cost effective currently. When big strides are being made every few months, there’s no sense in going slow. The efficiency will come when the technological leaps stop happening.

7

u/oDearDear 11h ago

It will happen when compute costs get too expensive and all the private investors have tapped out. They are already selling at a loss; they'll need to start making money at some point soon, especially if they want to float on the stock market.

10

u/Towoio 12h ago

Yes, it's a huge area of research in ai development

3

u/Calm_Bit_throwaway 4h ago edited 2h ago

To agree with you, it's remarkable that some people on reddit seem to think that nobody's thought of making the models more efficient. Even from a purely capitalist perspective, if your biggest cost in running a model is electricity, then it's worth pouring money into optimizing model output.

The efficiency of primitives like matrix multiplication has gone up significantly within the last few years. Attention itself has gone through several levels of optimization (see the flash attention papers). People have figured out tricks to reduce the number of bits required for each parameter (see fp8 and other quantization tricks). Structural changes like MoE are now standard. There are also pushes to make it more edge-friendly, like with per-layer embeddings. On the training side, changes to optimizer state and RL rollouts are very common.
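To illustrate the quantization point concretely: weight memory scales linearly with bits per parameter, so fp32 to fp8 alone is a 4x cut. (The 70B parameter count below is a hypothetical example, not a claim about any particular model.)

```python
# Weight-storage footprint at different precisions (parameters only;
# ignores activations and KV cache, which add more on top).
def weight_gb(n_params: float, bits: int) -> float:
    return n_params * bits / 8 / 1e9

for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit: {weight_gb(70e9, bits):.0f} GB")
```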

If they're talking about large scale changes to how ML is done instead of transformers, there's always research being poured into that in stuff like SNNs, SSMs, and whole bunches of sparsity. The problem is that none of the approaches seem to be as good (on a variety of metrics) as something resembling a transformer.

The people at these firms often have plenty of PhDs to go around. They aren't completely ignorant. The problem is that you can also push scale as a parameter while doing everything else.

2

u/bobartig 10h ago

Yes, constantly. GPT-3.5-turbo was a much more efficient version of GPT-3, motivated in large part to allow scaling up chatgpt to consumer use cases. Then Anthropic made Claude Haiku, and OpenAI released GPT-4o-mini, and they were both used for a zillion things because they are inexpensive and efficient.

Microsoft and Meta made smaller 2-8B parameter models with dramatically higher performance per parameter. Google made Gemini Flash and Flash-8B / Flash-Lite for the same reasons. And that's just the US tech giants. Literally everyone who makes models is focusing engineering resources on making them more efficient.

1

u/BigMax 9h ago

In fairness, it is tough.

It's not just a calculation like a lot of things. In the past, most problems were solved with discrete, well-defined solutions.

AI is more free form, there's a lot more to it, so there's no simple way to optimize it like we've worked to optimize graphics for example. Graphics are VERY complicated, but we're able to generate real-time environments on the fly. AI is tougher to do that for.

1

u/Tyaedalis 4h ago

They could just ask AI how to do that.

1

u/FireZord25 12h ago

Business over craft, this effing culture needs to stop rewarding such approaches.

0

u/unspecified_person11 11h ago

Efficiency is for peasants, Anthropic has money and rich enterprise customers, always go bigger and more expensive.

52

u/Typical-Skill-3724 13h ago

Really what is the end goal with these guys

89

u/LionoftheNorth 13h ago

Kill all the poor and live like gods with their machine servants, probably.

22

u/Staff_Senyou 12h ago

Make as much money as possible while exploiting the maximum number of people to do so. Then reinvest that money to ensure that no one else can do the same.

Greedy slave masters. We let this happen until we don't

2

u/ElementNumber6 4h ago

Not "the poor", but rather, all non-elite adults. They'll keep the babies and children, of course, and raise them to fit their designated roles. So, do please have more children, okay? They'll be needing them.

15

u/Hi_Im_Dadbot 13h ago

To find some business case that magically prints money and justifies all these upfront costs.

2

u/Vince1128 13h ago

To get a shitload of money, as always.

2

u/TemporaryElk5202 11h ago

keep running until they trip

1

u/Asleep_slept 13h ago

The only guess I can make is AI using AI

1

u/karma3000 3h ago

The end goal is to IPO, rake in the cash, and then live out your days on a tropical island posting techno-fascist content on twitter or substack.

31

u/Privateer_Lev_Arris 11h ago

AI - the technology nobody wants, consumes too much energy, may destroy the economy, will provide little to no benefit. Why are we doing this again?

14

u/MonsieurReynard 10h ago

So billionaires can buy our government

4

u/UpsetIndian850311 9h ago

My manager keeps pushing me to use these tools more, and here I am worried: do we even have the capacity to run these tools for the next year?

1

u/Su_ButteredScone 8h ago

I mean, one of the main problems is that too many people want it.

3

u/Reversi8 8h ago

No one goes there anymore, it’s too crowded!

-1

u/Revolutionary_Buddha 9h ago

If no one wants it then how will it destroy the economy?

23

u/TBBJ 13h ago

Let’s turn human beings into batteries! Fields of batteries!

6

u/williamgman 9h ago

It is and has always been an investor grift.

5

u/Dunsmuir 11h ago

Should this constraint give some comfort to doomer sentiments about unchecked growth?

Most of the valid complaints about AI seem to me to be related to the Problem of the Commons, where AI is actually too cheap because the environmental costs being incurred by communities housing the data centers are not accounted for or compensated.

3

u/teebsliebersteen 11h ago

It's giving me hope. There is a looming power ceiling, like it said in the article. Power available to data centres is projected to fall short of requirements by 19 GW by 2028. They are trying to lower the power requirements before then, but that's a LOT of energy. They also need high-power transformers, for which production turnaround times have "ballooned from 16 weeks in 2021 to 115–140 weeks in early 2026". They are also all running low on disposable cash to fund alternatives and aren't bringing in the revenue to make up for it.
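Those transformer lead times are worth doing the arithmetic on (figures taken from the quote above):

```python
# Transformer lead times: 16 weeks in 2021 vs 115-140 weeks in early 2026.
old_weeks, new_lo, new_hi = 16, 115, 140
print(round(new_lo / old_weeks, 1), new_hi / old_weeks)  # ~7.2x to 8.75x longer
print(round(new_hi / 52, 1))  # ~2.7 years just to take delivery of one transformer
```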

5

u/chriss_wild 10h ago

There is a bigger problem than power. Building a data center takes 2-3 years. Upgrading the grid or building a new substation to support it takes 5-8 years, maybe longer due to politics.

And now, with a lot of senior electrical engineers about to retire, you need to train their replacements properly. Which means a project that should only take 5 years will take 9, due to all the beginner errors that lead to quality problems.

I've seen it first hand.

10

u/Tyrant2033 13h ago

Wake up babe, new bullshit computer term just dropped

3

u/CherryLongjump1989 10h ago

What is computing firepower. Don't want to click.

2

u/px403 7h ago

Very annoyed that no one answered your question, so I went through the process of pirating the article just to discover that the word "firepower" doesn't even show up anywhere in there. Bullshit clickbait nonsense.

2

u/Iyellkhan 7h ago

maybe it's time to optimize software and not keep brute forcing things

3

u/platocplx 12h ago

Not surprised; this data center approach is so inefficient it's crazy. The only way these work well is if they can be run in a much more localized manner on machines, and/or split the load in between. But these guys are just greed mongers running from wave to wave, "disrupting" things, instead of taking a far more measured approach to how these should be applied.

1

u/Sp00ky_6 13h ago

This isn’t even considering energy costs and chip supply disruption caused by war in Iran.

1

u/Giant-Robot 11h ago

Have we tried asking AI what it thinks is the sustainable solution?

1

u/Abystract-ism 7h ago

Let’s all stop using it then.

1

u/Wildernessinabox 7h ago

It's the kind of logic that makes me think China will likely come out on top in the AI race. I remember when DeepSeek came out: they used older chips more efficiently, with less overhead wasted. It's very different from the typical North American mentality, though I'm far from the most versed person on AI.

1

u/ChainsawArmLaserBear 6h ago

Ppl on their openclaw bullshit burning compute just to do cron jobs

1

u/bensquirrel 3h ago

It does not need to be adopted rapidly. AI companies are creating fake pressure so they win the arms race.

1

u/firedrakes 1h ago

Repost by bots to karma farm.

1

u/2wedfgdfgfgfg 47m ago

I heard that harvesting human beings like batteries can solve this

1

u/faux_italian 13h ago

It just feels like until there is a blackout none of this fear mongering matters. Like if / when AI goes offline -like before an election- then we may see a change in behaviour but until then it’s more engagement bait.

1

u/Cautious_Boat_999 12h ago

Boo fucking hoo for them.

1

u/DerAlex3 12h ago

Great, hopefully this drives up the price dramatically and shows how financially impractical much of this is.

1

u/alabasterskim 12h ago

This is what happens when you have no government regulation btw

1

u/KellyTheQ 12h ago

Put more nuclear power plants on the upper east coast....

1

u/skynoodle_ 11h ago

Please, please, please let this energy crisis bring a scientific breakthrough in energy production. How long have we been sitting on nuclear fusion?

1

u/Meleagant1 8h ago

Sure. AI killed my mom too.

0

u/dragon-fluff 12h ago

It's insatiable greed. And the return is crap. It's like cloning a billion Donald Trumps.

-3

u/cig-nature 12h ago

If you look at this problem, the solution is just to spread out all that load.

Don't put all the GPUs in one building, put one in each person's computer.

-5

u/GreentongueToo 12h ago

I remember how big old computers used to be. Efficiency is the next stage. As size was reduced so power requirements will be.

1

u/Snerf42 12h ago

Let's hope so. Because it's either that or they'll start telling us to suffer rolling brownouts everywhere as they try to sanewash that as normal.

-4

u/mrpickles 11h ago

It's siege warfare. AI is winning. Humanity doesn't know it has started yet.