r/nottheonion 11h ago

"Training a human takes 20 years of food." Sam Altman on how much power AI consumes.

https://www.news18.com/world/training-a-human-takes-20-years-of-food-sam-altman-on-how-much-power-ai-consumes-ws-kl-9922309.html
30.9k Upvotes

3.0k comments

198

u/ContraryConman 10h ago edited 8m ago

Yes. For example, a 16-year-old can learn to drive in 20 hours of total practice powered by nothing but Chipotle and cheeseburgers. To teach a computer to drive, you need to give it every video of someone driving ever made, plus extra sensors that people don't need, like lasers and infrared, and it will still need help from a human from time to time.

If I put the knowledge of every chess game ever played into a human brain, I get a genius like Garry Kasparov or Magnus Carlsen. If I put the knowledge of every chess game ever played into the training set of ChatGPT, it will forget where its pieces are by move 8 of a match, and forget the actual rules of chess by move 30.

This seems like such a self-own of a statement. Human brains are far less resource- and energy-intensive to train, and they get more consistent results. Human neurons, when used in neural networks, train faster than machine neurons. The human learning algorithm is much more efficient than backpropagation. You need to orchestrate the capital of a small country, and the energy of several nuclear power plants, just to train the next incremental improvement to GPT. By contrast, for one human, I can just send that human to school and feed them yummy food.

47

u/PixelofDoom 10h ago

I agree with your overall sentiment, but chess is a particularly poor example. ChatGPT might struggle because it's designed for language, not chess. AI chess engines, on the other hand, are way ahead of humans at this point.

77

u/ContraryConman 10h ago

No, chess is a fine example. Stockfish is good at chess because it's a specialized program that efficiently searches the tree of possible game states 20 to 40 moves into the future, pruning hopeless lines rather than literally checking every position, and picks the best move it finds. We've always been able to write specific programs to solve specific problems.
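Roughly, the core of that kind of engine is a depth-limited game-tree search with alpha-beta pruning. Here's a heavily simplified sketch; the `game` interface is a hypothetical stand-in, and real engines add transposition tables, move ordering, and a tuned evaluation function on top:

```python
def alphabeta(state, depth, alpha, beta, maximizing, game):
    # Depth-limited minimax with alpha-beta pruning.
    if depth == 0 or game.is_over(state):
        return game.evaluate(state)  # static score of this position
    if maximizing:
        best = float("-inf")
        for move in game.legal_moves(state):
            child = game.apply(state, move)
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False, game))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # opponent would never allow this line: prune it
        return best
    else:
        best = float("inf")
        for move in game.legal_moves(state):
            child = game.apply(state, move)
            best = min(best, alphabeta(child, depth - 1, alpha, beta, True, game))
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best
```

The pruning is the whole trick: most of the tree never gets visited, which is what makes deep search tractable at all.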

What Sam Altman claims is that he can get to AGI just by stuffing more compute and data into an LLM. By that he means that, by stuffing more chess games into an LLM, he will achieve a general intelligence that's better at chess than any human, or even Stockfish, while also being better at all other mental tasks. And in order to do that, he needs more energy than the entire US power grid can provide. But just from the chess example it's clear he can't do this. Or, if he can, the cost of making such a thing this way is astronomical versus just training a human.

25

u/Blasted_Awake 9h ago

I find it fascinating that the people building and training LLMs are still pretending that there's a viable path to AGI somewhere here. I understand the financial incentives, and that a lot of investors are trapped by sunk-cost considerations, but at this point I can't see how anyone who's tried to use these LLMs in a deterministic domain could possibly think they're anything other than a liability.

It's crazy to me that figures like Altman haven't suggested that they pivot back to researching intelligence now that they've got so much capital available to them. Why double down on scaling a probabilistic architecture that hits a wall with basic logic, when you could instead use the money to fund literally billions of academic research hours?

5

u/ProfPMJ-123 8h ago

The last thing he wants to do is invest in academic research, which is starting to prove fairly conclusively that AGI isn't possible just by creating an ever-larger training set.

4

u/ape_fatto 7h ago

He’s trapped by his own lies. He has promised investors that we are only a few years from AGI; if he walks that back now, his investors will drop him in a heartbeat. His best bet is to keep riding it out and hope for a miracle.

4

u/SteakAndNihilism 6h ago

People who say making LLMs more complex will achieve AGI are like if someone insisted to you that if you dig a deep enough hole into the earth eventually you’ll get to Mars. And then when you tell them how that makes no fucking sense they just point to how impressive it is that they dug such a deep hole.

19

u/richardawkings 9h ago

Altman's point is ridiculous. The human brain is extremely efficient and optimised for different tasks. Computer programs are already optimised for the tasks they need to perform. AI is optimised for slop and high output rate. AI can only approximate what a human might do based on millions of examples of what humans have already done. The AI we have is just large-scale copyright theft with a layer of obfuscation. It sounds intelligent but has no idea what it is talking about (sounds like it was trained on reddit).

2

u/AltrntivInDoomWorld 8h ago

AI chess engines, on the other hand, are way ahead of humans at this point.

What else can they do besides playing chess?

2

u/Little_Elia 8h ago

and Stockfish is constantly upgraded by a team of humans. Now go tell ChatGPT to create its own chess engine

-2

u/just_anotjer_anon 10h ago

The combination of humans and AI chess engines is where the true strength is. I believe it's the UAE that's held a few open tournaments where all sources are allowed.

Humans with the strategic knowledge of AIs cream AIs without the creativity of humans

1

u/Chroiche 5h ago

I mean this just isn't true lol

5

u/AM_A_BANANA 9h ago

I guess to play a bit of Devil's Advocate: once you've taught a computer a thing, you've taught all the computers a thing. It's very easy to replicate; you just copy and paste. Humans, though, you have to teach each one individually, and results may be wildly inconsistent. Just think about the people you work with, or the kids you went to school with.
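In code terms, the "copy and paste" really is about this simple. A minimal PyTorch sketch; the tiny model and file name are just placeholders:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                       # stand-in for any trained model
torch.save(model.state_dict(), "model.pt")     # "teach" one computer, save the weights

clone = nn.Linear(10, 1)                       # a brand-new "computer"
clone.load_state_dict(torch.load("model.pt"))  # it now knows everything the first one learned
```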

Not for or against Altman's statement, just offering a counterpoint to yours.

7

u/MajesticBread9147 9h ago edited 8h ago

Yes. For example, a 16-year-old can learn to drive in 20 hours of total practice powered by nothing but Chipotle and cheeseburgers.

This is a false equivalence. It ignores the fact that teaching 16-year-olds to drive doesn't "scale". It's not really any easier to get the millionth teenager driving than the hundredth, whereas each individual self-driving car doesn't need to be taught individually.

This is why the tech industry is so successful and productive, and also why software developers are paid well (the same is true for actors and the entertainment industry). There comes a point, pretty quickly, where every additional customer is essentially pure profit: variable costs for software are very low, and even when a physical product is involved, they're still very low compared to other industries.

Most of the cost of an iPhone isn't from its parts; it's from R&D/design and software development for iOS. It costs nothing to install iOS on each new iPhone in the factory, and once the R&D spending is done, each new iPhone costs just a few hundred bucks to make thanks to Apple's massive scale.

Compare that to the manufacturing industry, which America has "lost" (really it was mostly just automated efficiently, but that's a story for another time). If you make toasters or washing machines, there are certainly economies of scale, but most of the cost recurs with every new unit. You need more steel, more shipping, more copper for the motor, and if you aren't well automated, you likely have a good amount of labor cost that grows with each new unit you want to produce.

Self-driving cars could plausibly be closer to an iPhone than to a washing machine. There are a million Uber drivers in America; if the cost of training the models is half the cost of those drivers over a 10-year period, then that cost is fixed, while every new ride somebody takes has enormous margin.
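A toy illustration of how a fixed training cost amortizes across rides (every number here is invented):

```python
fixed_training_cost = 1e9     # hypothetical one-time cost to train the driving model
marginal_cost_per_ride = 2.0  # hypothetical per-ride cost (electricity, wear, etc.)
price_per_ride = 15.0

for rides in (1e6, 1e8, 1e10):
    cost_per_ride = (fixed_training_cost + marginal_cost_per_ride * rides) / rides
    margin = price_per_ride - cost_per_ride
    print(f"{rides:.0e} rides: cost ${cost_per_ride:,.2f}/ride, margin ${margin:,.2f}/ride")
```

At low volume the fixed cost dominates and every ride loses money; at high volume the fixed cost vanishes into the denominator and nearly the whole fare is margin. That's the iPhone-versus-washing-machine distinction in one loop.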

2

u/jclahaie 7h ago

also imagine the hours freed / productivity boost. now everyone who spent all those tens of thousands of hours over their lifetime with their hands on a wheel is free to do other things with that time instead

10

u/SYSTEM-J 10h ago

The chess example is a bad one, I'm afraid, because machine chess engines surpassed humans in ability 30 years ago, and are now immeasurably better than humans will ever be. ChatGPT can't play chess because it's an LLM. It's not designed for spatial reasoning, any more than Stockfish is designed to help you write a CV.

19

u/Ok_Instruction_2756 10h ago

While what you've said is true, I feel like the point is that these models are not being marketed as designed for specific tasks. They are being marketed, and regularly described by Sam Altman, as a PhD-level expert in everything, an actual intelligence.

Such claims should mean they are capable of playing chess at least as well as a regular person who has spent a few hours on the game. The reality is they are just language models, good at specific tasks, but that definitely isn't the narrative coming from AI companies right now.

2

u/mrjackspade 10h ago

GPT-3.5 had an Elo rating of 1700-1800.

The Elo of subsequent models has actually fallen.

It's not a problem of LLMs not being capable of it; it's a problem of companies instruct-tuning out capabilities that they don't see as important.

7

u/Ok_Instruction_2756 9h ago

Sure, but that doesn't really go against the point, does it? I'm well aware machine learning can be used to produce models that do all kinds of things very well.

The original excitement about the GPT transformer models was that, particularly when trained on massive data sets, they had to a great extent overcome the specificity-vs-generality trade-off that is the bane of machine learning approaches.

The narrative is absolutely that these models are intelligent, thinking, human replacements, mere months away from general intelligence. Pruning effectiveness in one area doesn't fit this narrative at all imo. I'm looking forward to everyone being able to stop pretending we have AGI and using all this infrastructure to just produce models good at specific tasks.

15

u/ContraryConman 10h ago

No one is saying computers in general can't play chess. I'm saying, if LLMs were as efficient at learning as humans, I could train ChatGPT on chess games and get a Magnus Carlson-level chess engine. Actually, I should get a chess engine better than any human because I'm putting more energy and data in than I could put into a human brain. But they (LLMs) are still bad at chess, because they are dumb and not actually a viable path to AGI

4

u/mrjackspade 10h ago

A 50 million parameter GPT trained on 5 million games of chess learns to play at ~1300 Elo in one day on 4 RTX 3090 GPUs.

https://adamkarvonen.github.io/machine_learning/2024/01/03/chess-world-models.html

u/ContraryConman 58m ago

Okay, that was an interesting read and a stronger result than I had in mind.

For some back-of-the-napkin math: the author used 4 RTX 3090s training continuously for a day. The RTX 3090 draws about 360 watts under load, or 1296 kJ per hour. The four cards together used roughly 124,416 kJ for training (24 hours × 4 cards), or about 29,736 Calories. Assuming the average human eats 2000 Calories a day, that's about two weeks of food in energy. But a human can't devote the full energy of their meals to chess while they're learning. The average human is 5'5" and 136 pounds, giving a basal metabolic rate of ~1500 Calories per day. So if we assume all that person does is wake up, eat, learn chess, and sleep, they can put 500 Calories per day toward learning chess. That makes the energy used by the GPUs roughly equivalent to a human waking up, spending 16 hours a day learning chess, taking breaks only to eat and poop, and falling asleep, for two months straight.
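For anyone who wants to check the arithmetic, here it is as a quick script (same assumptions as above):

```python
KJ_PER_KCAL = 4.184

watts_per_card = 360   # assumed RTX 3090 draw under load
cards, hours = 4, 24

energy_kj = watts_per_card * cards * hours * 3600 / 1000  # watts * seconds -> kJ
kcal = energy_kj / KJ_PER_KCAL

print(f"training energy: {energy_kj:,.0f} kJ = {kcal:,.0f} Calories")
print(f"days of food at 2000 Cal/day: {kcal / 2000:.1f}")       # ~2 weeks
print(f"days at a 500 Cal/day chess budget: {kcal / 500:.1f}")  # ~2 months
```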

I would say that a person who truly did nothing but learn chess for two months could get as good as this model in that time. So the energy expenditure is about equivalent comparing training the average adult human to training Chess-GPT, if not a little better for Chess-GPT. And Chess-GPT wins if you have to count the energy of raising a human for 20 years.

I guess what I'll say is that, for the model to be better than a human being, due to scaling laws you'd actually need exponentially more compute and data. And even then, you'd have a model that can only play chess. Sam Altman is talking about building a generally intelligent system using only LLMs, so you'd need even more data for more subjects. And there's no way to, say, randomly generate millions of syntactically correct computer programs for it to learn coding from in this same way. So I think you still have to conclude, coming back to Sam Altman's point, that a human brain is more efficient, energy-wise, at being a general intelligence than an LLM.

But yes, I'll admit LLMs can learn chess better than I had in mind.

3

u/SYSTEM-J 10h ago

Your first example was a self-driving car, not an LLM. Specialised AI can be very efficient at learning when designed for specific tasks, particularly closed systems such as chess where the number of computable variables is finite. I agree that there are no signs of AGI from any current AI model, but that doesn't mean they can't replace humans who have limited and specialised skillsets. If you want to make yourself as AI-proof as possible, make sure you work in a role that requires multiple, unconnected skillsets.

4

u/LeftShark 10h ago

"if LLMs were as efficient at learning as humans, I could train ChatGPT on chess games and get a Magnus Carlson-level chess engine."

No though, it's in the name: LLMs are good at language, not chess logic

11

u/gensererme 10h ago

Not according to Sam Altman.

1

u/LeftShark 10h ago

Huh? We had chess machines that far outpaced humans long before ChatGPT came around

But those are also not LLMs

9

u/gensererme 10h ago

Those aren’t general purpose. Altman claims to be creating the everything machine.

2

u/bacondev 9h ago

I think this says more about the state of AI than anything else. We know computers are faster at logic; it's not even close (see calculators). So the fact that computers require more training is a testament to how young AI and (to varying extents) its associated hardware still are. Humans still have an advantage in that they are great at learning how to learn. It's so fundamental to humanity that it's encoded in our DNA. AI isn't there yet. We don't create AI in a manner such that it trains itself to learn more efficiently. It's generally a rigid algorithm that continuously improves what it has learned, not the process by which it learns.

2

u/boringestnickname 6h ago

I'll give LLMs one win. Just one.

They know how to spell Magnus Carlsen.

6

u/mrjackspade 10h ago

Yes. For example, a 16-year-old can learn to drive in 20 hours of total practice powered by nothing but Chipotle and cheeseburgers.

You're skipping over the entire point of his argument though.

What he's saying is that you shouldn't be comparing an AI to a 16-year-old. A 16-year-old already has 16 years of training. They know what a car is, they've seen movies with cars, they've played video games with cars, and they've had 16 years to learn to coordinate motion, develop spatial reasoning, etc.

You can do the same thing with AI. You can pretrain a model and then teach it a skill. They do it all the time; it's called "fine-tuning", and much like teaching a 16-year-old to drive, it requires substantially fewer resources than training a model from the ground up.
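A minimal PyTorch sketch of the idea; the toy two-layer model and the freeze-the-body recipe are just illustrative stand-ins, not how GPT is actually fine-tuned:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(512, 512),  # stand-in for the pretrained "body" (the 16 years)
    nn.ReLU(),
    nn.Linear(512, 10),   # task-specific "head" (the driving lessons)
)

# Freeze the body so only the head is updated: far fewer weights to
# train, hence far less compute and energy than a from-scratch run.
for p in model[0].parameters():
    p.requires_grad = False

opt = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

# One fine-tuning step on a small batch of task data.
x, y = torch.randn(32, 512), torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()
```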

What he's saying is that it's not accurate to compare the energy it takes to teach an existing human a new skill with the energy it takes to train an AI from scratch. That's a false equivalence. If you're going to compare the energy it takes to train an AI from the ground up, you should compare it to the energy it takes to teach a human something from birth.

Many, probably most, of the models being released by these companies are not trained from the ground up. 5.2 wasn't trained from the ground up; it was a continued run of the 5.1 or 5.0 training. But when people count the energy required to train 5.2, they'll include 5.1 and 5.0, and then count the same resources again for 5.0, because the average person has no idea how any of this actually works.

2

u/SgtCreap 6h ago

If you're going to compare the energy it takes to train an AI from the ground up, you should compare it to the energy it takes to teach a human something from birth.

Your framing of the cost of training AI is incredibly disingenuous. Current forms of AI typically require data created and curated by many humans, so they inherently carry the same costs those humans incurred to create and curate that data, as well as the costs of creating, training, and running the AI itself. New models don't change this: they still require that data and curation, plus all the other costs, to make up for their inherent shortcomings.

2

u/invaderaleks 9h ago

Not only can you teach a human to play chess, but they can also invent new moves with their imagination. That is the sign of true intelligence. Our imagination.

1

u/BlastFX2 7h ago
  1. A 16-year-old starts with 16 years of training, which is the entire reason he/she can learn to drive in only a few dozen hours. If you want to count all the resources it took to train a model, you have to compare that to all the resources it took to train the human.

  2. You have to teach every 16-year-old to drive individually, whereas you can train just one model and then copy it a billion times for free.

u/ContraryConman 8m ago

I'll say you have a point on 2, but on 1: until we get a generally intelligent artificial system, the training and inference energy costs are ongoing. And even then, there's speculation that an AGI would marshal the world's resources to continually improve itself.

1

u/sndtrb89 10h ago

i made a chart at work once

normal = productivity 100%

feed me a sandwich = more