The latest AI models that work through problems step-by-step are powerful—and thirsty. Researchers at Hugging Face and Salesforce have quantified just how much: reasoning models use 30 times more energy on average than traditional AI, with some spiking to 700 times higher when their reasoning mode activates.
The reason is straightforward. These models don't just spit out answers. They generate text that mimics thinking aloud, working through a problem the way you might sketch notes on paper. That inner monologue, the reasoning chain itself, gets output as tokens, each one requiring computational energy. A model might generate hundreds of times more tokens to arrive at an answer, and energy use rises roughly in proportion.
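To make that arithmetic concrete, here is a minimal sketch of the proportional relationship in Python. The per-token energy figure and token counts are hypothetical placeholders rather than measurements; the point is only that generating a few hundred times more tokens multiplies energy use by roughly the same factor.

```python
# Toy first-order model: inference energy scales with the output tokens generated.
# The per-token cost below is a hypothetical placeholder, not a measured value.
ENERGY_PER_OUTPUT_TOKEN_WH = 0.0002  # assumed watt-hours per generated token

def estimated_energy_wh(output_tokens: int) -> float:
    """Rough energy estimate under the assumption cost ~ tokens generated."""
    return output_tokens * ENERGY_PER_OUTPUT_TOKEN_WH

concise = estimated_energy_wh(50)          # short, direct answer
reasoning = estimated_energy_wh(50 * 300)  # long reasoning chain before the answer

print(f"concise:   {concise:.3f} Wh")
print(f"reasoning: {reasoning:.3f} Wh")
print(f"ratio:     {reasoning / concise:.0f}x")
```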
This matters because the energy cost of AI is no longer theoretical. A June study found that reasoning models produce up to 50 times more CO₂ than models that answer concisely. As AI systems scale globally, that difference compounds.
Measuring what we can't ignore
Hugging Face launched the AI Energy Score project to create a standardized way to measure this. They run each model through 10 tasks on the latest GPUs, tracking watt-hours consumed per 1,000 queries. It's the kind of transparency that forces a reckoning: you can't optimize what you don't measure.
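To give a sense of what that kind of measurement involves, here is a rough sketch, not the AI Energy Score harness itself, of one way to estimate watt-hours per 1,000 queries on an NVIDIA GPU. It assumes the pynvml bindings (the nvidia-ml-py package) are installed, takes one power reading per query, and uses run_query as a hypothetical stand-in for whatever model call is being benchmarked.

```python
# Illustrative sketch only, not the AI Energy Score implementation: read GPU power
# with pynvml, treat each reading as the average draw over one query, integrate to
# watt-hours, and normalize to a per-1,000-query figure.
import time
import pynvml

def wh_per_1000_queries(run_query, queries):
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # first GPU only

    energy_wh = 0.0
    for q in queries:
        start = time.monotonic()
        run_query(q)                                 # the workload under test
        elapsed_s = time.monotonic() - start
        power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # mW -> W
        energy_wh += power_w * elapsed_s / 3600.0    # coarse: one sample per query

    pynvml.nvmlShutdown()
    return energy_wh / len(queries) * 1000           # Wh per 1,000 queries
```

The real project controls for hardware and repeats standardized tasks; this sketch just shows why the metric is framed as energy per fixed batch of queries rather than per model.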
"We should be smarter about the way that we use AI," said Hugging Face research scientist Sasha Luccioni. "Choosing the right model for the right task is important." That's the pragmatic insight. Not all problems need a reasoning model. Some don't warrant the energy cost.
But here's the catch. The models that stayed below 500 grams of CO₂ equivalent per 1,000 questions couldn't crack 80 percent accuracy. The powerful reasoning models? They got the hard questions right. Maximilian Dauner, a researcher at Hochschule München University of Applied Sciences, framed the tension clearly: better performance and lower emissions are currently pulling in opposite directions.
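Those two yardsticks, watt-hours per 1,000 queries and grams of CO₂ equivalent per 1,000 questions, are linked by the carbon intensity of the electricity involved. A back-of-envelope conversion, using an illustrative grid intensity near the recent global average and hypothetical energy figures, looks like this:

```python
# Back-of-envelope conversion from energy to emissions. The grid intensity is an
# illustrative figure (roughly the recent global average); the per-1,000-query
# energy numbers are hypothetical, chosen only to show the arithmetic.
GRID_G_CO2E_PER_KWH = 480.0  # assumed grams CO2-equivalent per kilowatt-hour

def g_co2e(energy_wh: float) -> float:
    """Grams of CO2-equivalent for a given energy use in watt-hours."""
    return energy_wh / 1000.0 * GRID_G_CO2E_PER_KWH

per_1000_queries_wh = {"concise model": 100.0, "reasoning model": 3000.0}  # hypothetical
for name, wh in per_1000_queries_wh.items():
    print(f"{name}: {g_co2e(wh):.0f} gCO2e per 1,000 queries")
```

On a cleaner grid the conversion factor shrinks, but the ratio between the two models stays the same.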
So we're at an inflection point. The research is making the trade-offs visible. Whether that visibility translates into different choices—using lighter models where they suffice, reasoning models only when necessary—depends on what we decide matters more.









