Training AI to be really smart poses risks to climate

The computing power needed can use a lot of energy and spew lots of carbon dioxide


AI research teams rely on data centers — buildings filled with powerful computers — to develop their models. These models are getting ever larger and more power-hungry, leading some experts to worry about their impact on Earth’s climate.

Vladimir_Timofeev/iStock/Getty Images Plus

Artificial intelligence — or AI — is the computer code that allows a machine to do something that normally requires a human brain. On TikTok, for instance, AI sorts the posts so that the first ones you see are likely to be those you’d prefer. AI serves up the useful results of every Google search. When you ask Siri to play Taylor Swift, AI turns your speech into a command to start her songs. But before an AI can do any of that, developers must train it. And that training devours energy. A lot of it. In fact, that training’s appetite for energy could soon become a huge problem, researchers now worry.

The energy to develop AI comes out of the electrical grid. And in most parts of the world, making electricity spews carbon dioxide (CO2) and other greenhouse gases into the air.

To compare how different activities affect the climate, researchers often combine the impacts of all greenhouse gases into what they call CO2 equivalents. In 2019, researchers at the University of Massachusetts Amherst calculated the impact of developing an AI model named Transformer. It released a whopping 626,000 pounds of CO2 equivalents. That’s equal to the greenhouse gases that would be spewed by five American cars from when they were made to when they were junked.
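As a rough check on that comparison, the arithmetic can be sketched in a few lines of Python. The only numbers taken from the text are the 626,000 pounds and the five cars. The pound-to-kilogram factor is the standard definition, and the function names are made up for illustration.

```python
# Rough arithmetic behind the "five American cars" comparison.
# Figures from the article: 626,000 pounds of CO2 equivalents, five cars.

LB_TO_KG = 0.45359237  # exact definition of the international pound


def pounds_to_metric_tons(pounds: float) -> float:
    """Convert pounds to metric tons (1 metric ton = 1,000 kg)."""
    return pounds * LB_TO_KG / 1000.0


def per_car_share(total_pounds: float, cars: int) -> float:
    """Split a total evenly across a number of cars."""
    return total_pounds / cars


training_lb = 626_000
cars = 5

# Implied lifetime emissions of one car, in pounds of CO2 equivalents
print(per_car_share(training_lb, cars))           # 125200.0
# The training total expressed in metric tons, rounded
print(round(pounds_to_metric_tons(training_lb)))  # 284
```

In other words, the comparison implies about 125,000 pounds of CO2 equivalents over each car's lifetime.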

Only the largest, most complex models use that much energy. But AI models are rapidly growing ever larger and more power-hungry. Some AI experts have sounded an alarm about the threat these energy hogs pose.

Deep learning

Transformer can analyze text, then translate or summarize it. This AI model uses a type of machine learning that has skyrocketed in popularity. Called deep learning, it produces AI that excels at finding and matching patterns. But first, the system has to practice, a process known as training.

To translate between English and Chinese, for example, an AI model may churn through millions or even billions of translated books and articles. In this way, it learns which words and phrases match. Later, when given new text, it can suggest its own translation.

Language-processing AI systems learn by devouring texts in a particular language. This might include “reading” everything ever published online in some language, including libraries full of old books that have been digitized. Such data-intensive training uses a lot of energy.

benstevens/iStock/Getty Images Plus

Thanks to deep learning, computers can sift through mountains of data to make quick, useful, smart decisions. Engineers have built AI that can direct self-driving cars or recognize emotions in human faces. Other models find cancer in medical images or help researchers discover new drugs. This technology is changing the world.

It comes at a cost, however.

The best deep-learning models are the behemoths of the AI world. Training them requires huge amounts of computer processing. They train on a type of computer hardware called graphics processing units (GPUs). They’re the same things that run the graphics for a realistic video game.

It may take hundreds of GPUs running for weeks or months to train one AI model one time, explains Lasse F. Wolff Anthony. He’s a student in Switzerland at ETH Zurich, a technical university. “The longer [the GPUs] run,” he adds, “the more energy they use.”

Today, most AI development happens at data centers. These computer-filled buildings account for only some 2 percent of U.S. electricity use and 1 percent of global energy use. And AI development takes up only a tiny share of any data center’s workload.

But AI’s energy impact already is “big enough that it’s worth stopping and thinking about it,” argues Emily M. Bender. She’s a computational linguist. She works at the University of Washington in Seattle.

One common measure of the size of a deep-learning model is how many parameters (Puh-RAM-ih-turz) it has. These are what get tweaked during training. Those parameters allow a model to recognize patterns. Models that find patterns in language, such as Transformer, tend to have the most.

Transformer contains 213 million parameters. One of the world’s biggest language models of 2019, GPT-2, has 1.5 billion parameters. The 2020 version, GPT-3, contains 175 billion parameters. Language models also train on huge amounts of data, such as all the books and articles and web pages written in English on the internet. And, keep in mind, those data available for training grow month by month, year by year.
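To get a feel for how quickly these models have grown, here is a small Python sketch that compares the parameter counts quoted above. The counts come from the text. The dictionary and function names are only illustrative.

```python
# Parameter counts as quoted in the article.
PARAMETERS = {
    "Transformer": 213e6,  # 213 million
    "GPT-2": 1.5e9,        # 1.5 billion
    "GPT-3": 175e9,        # 175 billion
}


def growth_factor(smaller: str, larger: str) -> float:
    """How many times more parameters `larger` has than `smaller`."""
    return PARAMETERS[larger] / PARAMETERS[smaller]


print(round(growth_factor("Transformer", "GPT-2"), 1))  # about 7 times
print(round(growth_factor("GPT-2", "GPT-3")))           # about 117 times
print(round(growth_factor("Transformer", "GPT-3")))     # about 822 times
```

By these numbers, GPT-3 has more than 100 times as many parameters as GPT-2, which came out just a year earlier.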

Bigger models and larger sets of training data usually make a model better at recognizing patterns. But there’s a downside. As models and datasets grow, they tend to need more GPUs or longer training times. So they also devour more electricity.
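The link between GPUs, training time and electricity can be sketched as a simple multiplication. This is a back-of-the-envelope estimate, not how any particular tool measures energy. The GPU count, the 300-watt power draw and the two-week run are made-up inputs, and the sketch ignores cooling and other data-center overhead.

```python
def training_energy_kwh(num_gpus: int, watts_per_gpu: float, hours: float) -> float:
    """Estimate electricity use as GPUs x power draw x time, in kilowatt-hours."""
    return num_gpus * watts_per_gpu * hours / 1000.0


# Hypothetical run: 100 GPUs drawing 300 watts each for two weeks (336 hours).
baseline = training_energy_kwh(num_gpus=100, watts_per_gpu=300, hours=336)
print(baseline)  # 10080.0 kWh

# Doubling either the number of GPUs or the training time doubles the energy.
print(training_energy_kwh(200, 300, 336) / baseline)  # 2.0
print(training_energy_kwh(100, 300, 672) / baseline)  # 2.0
```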

Large language models perform impressive feats. DALL-E, a model introduced in 2021, contains 12 billion parameters. It trained on a dataset of images linked to some text. DALL-E can combine ideas to draw its own creative images. The prompt here was to draw “a penguin made of carrot.”

OpenAI

Sounding the alarm

Bender had been watching this trend with concern. Eventually, she got together with a group of experts from Google to say something about it.

This team wrote a March 2021 paper that argues AI language models are getting too big. Instead of creating ever larger models, the paper says researchers should ask themselves: Is this necessary? If it is, could we make it more efficient?

The paper also pointed out that rich groups benefit the most from AI language models. In contrast, people living in poverty suffer most of the harm from climate-change-related disasters. Many of these people speak languages other than English, and there may be no large AI models focusing on their languages. “Is this fair?” asks Bender.

Even before it was published, her group’s new paper sparked a controversy.

Google asked its employees to remove their names from it. One of those people, Timnit Gebru, co-led Google’s AI ethics team. Ethics is the study of what is right or wrong. When she wouldn’t take her name off, Google fired her, she reported on Twitter.

Meanwhile, the company kept at its work on the biggest language model yet. In January 2021, it announced this model had a whopping 1.6 trillion parameters.

Leaner and greener

The new paper by Bender and Gebru’s team raises “a very important discussion,” says Roy Schwartz. He’s a computer scientist at The Hebrew University in Jerusalem, Israel. The climate impact of AI training is not huge. At least not yet. But, he adds, “I’m seeing a troubling trend.” Emissions from the training and use of AI models will grow ever larger — and soon, he suspects.

Sasha Luccioni agrees. This researcher at MILA, an AI institute in Montreal, Canada, also finds the rapid growth of these models “worrying.”

Usually, Schwartz says, AI developers report only how well their models work. They compete on their accuracy in completing tasks. How much energy they use is all but ignored. Schwartz calls this Red AI.

In contrast, green AI focuses on boosting a model’s efficiency, he explains. That means getting the same or better results using less computing power or energy. You don’t necessarily have to shrink your model to do this. Since computer processing is complex, engineers can find ways to use less computing power without cutting the number of parameters. And some types of computer hardware can provide that power while sipping much less electricity than others. 

Right now, few developers share data on their models’ efficiency or energy use. Schwartz has called on AI developers to disclose them.

And he’s not alone in asking for this. A new annual workshop for AI developers convened for the first time in 2020. Its goal: to encourage simpler, more efficient AI language models.

Wolff Anthony teamed up with Benjamin Kanding, a student at the University of Copenhagen in Denmark, to create one new tool. It helps AI developers estimate the environmental impacts of a model, such as its energy use or CO2 emissions, before they train it. Luccioni created a different tool. It tracks the CO2 emissions as a model goes through training.

Another way to make models greener is to carefully select the data center where a model trains. “If you train in Sweden,” says Kanding, “most of the energy comes from sustainable sources.” By that, he means wind, solar or wood-burning. Timing matters, too. At night, more electricity is available as most human users sleep. Some utilities charge less for that off-peak energy, too, or can use cleaner sources to produce it.
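The point about location can be sketched as one more multiplication: emissions equal the electricity used times the grid's carbon intensity, measured in grams of CO2 equivalents per kilowatt-hour. The intensity values below are illustrative placeholders, not measured figures for Sweden or any other real grid. The function name and the 10,000 kilowatt-hour training run are likewise hypothetical.

```python
def emissions_kg(energy_kwh: float, grams_co2e_per_kwh: float) -> float:
    """Emissions in kilograms of CO2 equivalents for a given energy use."""
    return energy_kwh * grams_co2e_per_kwh / 1000.0


# Placeholder carbon intensities (grams of CO2e per kWh), for illustration only.
HYPOTHETICAL_GRIDS = {
    "mostly_renewable": 50.0,
    "mostly_fossil": 500.0,
}

run_energy_kwh = 10_000.0  # hypothetical training run

for name, intensity in HYPOTHETICAL_GRIDS.items():
    print(name, emissions_kg(run_energy_kwh, intensity), "kg CO2e")
```

With these made-up numbers, the same training run produces ten times the emissions on the dirtier grid. The same reasoning applies to timing, if a grid's electricity is cleaner at certain hours.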

Deep learning is an incredible and powerful technology. But it will offer the most benefits when used wisely, fairly and efficiently.

Kathryn Hulick is a freelance science writer and the author of Strange But True: 10 of the World's Greatest Mysteries Explained, a book about the science of ghosts, aliens and more. She loves hiking, gardening and robots.
