Would lip-synching make androids seem less creepy?

Lifelike lip motion might overcome the ‘Uncanny Valley’ effect triggered by some robots


This new robot can synch its lip motions with its speech pretty convincingly. Would that make you more comfortable interacting with it?

Jane Nisselson/Columbia Engineering

I smiled when I first saw the robot Pepper. Its small mouth and big black eyes gave it a bit of that cute puppy look. But the more humanlike androids Sophia and Ameca struck me quite differently. Though impressively realistic, they creeped me out. This kind of unease is a common reaction to robots that look too close to human. But one team of engineers may have found a way to overcome it: Make humanoids that lip-synch.

The discomfort triggered by some humanlike robots is called the “Uncanny Valley.” The name refers to the fact that we tend to become more comfortable with robots as they become more humanlike — but only up to a point. Get too close to the real thing, and our comfort with androids sharply drops off. We may suddenly feel unsettled or even repulsed by them instead.

For instance, Pepper, which looks like a cross between a Powerpuff Girl and a marshmallow, may be charming. But Sophia and Ameca, designed to resemble real people, look just a bit off — putting them in the Uncanny Valley.

Roboticist Masahiro Mori first proposed the Uncanny Valley back in 1970. This effect doesn’t only show up with robots. It also happens with other artificial depictions of people, such as AI-generated art or some animations (as in the movie The Polar Express).

Alexander Diel is a neuroscientist at LVR University Hospital Essen in Germany. He wasn’t involved in the new work. But he does study social robots. “Research suggests that the Uncanny Valley is associated with ‘prediction error’ responses,” he says. It’s the “brain’s reaction to a stimulus that does not fit an expected pattern.” Humans have evolved to be very sensitive to facial expressions and cues. When we see a face that doesn’t fit this expected pattern, we are clued in that something is “wrong.”           

Many aspects of a robot can contribute to the Uncanny Valley, but some more than others. Hod Lipson has given a lot of thought to what makes this effect show up the most. A roboticist, he works at Columbia University in New York City. “Is it the skin? Is it the eyes?” he wonders. In the end, he concludes: “It’s the lips.”


About those lips

When you see someone talking, about half the time you watch their lips. Focusing on the mouth helps you understand what’s being said, even amid noise. “When a robot does not move their lips, or moves them in a kind of Muppet way, it is very noticeable,” Lipson says.

Giving a robot realistic lip motion is hard. People use a lot of facial muscles to express themselves. In contrast, humanoid robots typically only have a few stiff, mechanized motors to move their faces.

Even making a robot smile is quite challenging. It may seem like just a matter of “lift the two tips of the lips,” Lipson says, “and there you go. You smile.” In reality, if only the corners of a robot’s mouth move, he says, “it looks really weird and off.”

Lipson and his group realized that their robot’s lip movements needed to be delicate or subtle to look realistic. “There’s the arts and crafts of it,” he says. So the team created a robot with 26 motors to drive its facial movements. Its soft, silicone lips can move in 10 different ways, like a rubber band distorting into various shapes. Those motor-driven movements can now faithfully recreate the shapes our lips make as we talk.
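
To picture how a small set of motors could add up to lifelike lip shapes, here is a minimal sketch. Only the counts (26 face motors, 10 lip motions) come from the article; the calibration table, the weights and the function names are invented for illustration.

```python
# Minimal sketch: blending a few basic lip motions into motor commands.
# The 26-motor / 10-motion counts come from the article; the calibration
# table and the example weights are made up for illustration only.
import numpy as np

NUM_FACE_MOTORS = 26   # motors driving the robot's face
NUM_LIP_MOTIONS = 10   # independent ways the silicone lips can deform

def blend_lip_motions(weights, basis):
    """Mix the basic lip motions (weights between 0 and 1) into one position
    per motor, using a stored table of motor positions for each pure motion."""
    weights = np.clip(np.asarray(weights, dtype=float), 0.0, 1.0)
    return weights @ basis          # result: one command per motor

# Stand-in calibration table: one row of motor positions per basic lip motion.
basis = np.random.default_rng(1).uniform(-1, 1, (NUM_LIP_MOTIONS, NUM_FACE_MOTORS))

# A "smile" that also stretches the lips a little, so more than just the
# corners move -- the point Lipson makes about avoiding a weird-looking grin.
weights = np.zeros(NUM_LIP_MOTIONS)
weights[0], weights[3] = 0.6, 0.2   # hypothetical "corner lift" and "stretch"
print(blend_lip_motions(weights, basis))
```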

Binge-watching YouTube

The team designed the lips to move in multiple ways, but they didn’t tell the robot how to move its lips. Instead, the robot learned by watching videos of people talking. In other words, it watched YouTube. Lots and lots and lots of YouTube.

As it watched, the robot used machine learning to study the shapes lips make as we speak. Later, it practiced in front of a mirror, matching the facial expressions its motors created to the human expressions linked with the voices it had heard in videos. 
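
A rough sketch of that two-part learning idea is below. The names and the simple linear math are illustrative stand-ins under the article's description, not the Columbia team's actual code or data.

```python
# Hypothetical sketch of the learning loop described above.
# Step 1 ("watching videos"): collect target lip shapes from human speakers.
# Step 2 ("mirror practice"): learn which motor commands produce which lip
# shapes on the robot's own face, then pick commands that match the targets.
import numpy as np

rng = np.random.default_rng(0)
NUM_MOTORS, NUM_LANDMARKS = 26, 20   # face motors; tracked lip points

# Step 2a: "babble" in front of a mirror, recording command -> lip-shape pairs,
# and fit a simple self-model (here, just a linear least-squares fit).
commands = rng.uniform(-1, 1, size=(500, NUM_MOTORS))
observed = commands @ rng.normal(size=(NUM_MOTORS, NUM_LANDMARKS))  # stand-in data
self_model, *_ = np.linalg.lstsq(commands, observed, rcond=None)

def imitate(human_lip_shape):
    """Step 2b: choose motor commands whose predicted lip shape is closest
    to a lip shape extracted from a video of a person talking."""
    cmd, *_ = np.linalg.lstsq(self_model.T, human_lip_shape, rcond=None)
    return np.clip(cmd, -1.0, 1.0)

# Step 1 would supply real targets; here we use a random stand-in frame.
target = rng.normal(size=NUM_LANDMARKS)
print(imitate(target))
```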

“People, for decades now, have been trying to program robots to do things,” says Lipson. “We underestimate how complicated it is.” Machine learning is more robust, he says. It allows robots to adapt to the many complexities of talking. For example, the robot only watched YouTube videos in English. Yet it was able to adapt to move its lips in line with French, Japanese, Korean, Spanish, Italian, German, Russian, Chinese, Hebrew and Arabic voices.          

The Columbia team’s robot, named Emo, lip-synchs to a song from its own AI-generated album “Hello World.” Yuhang Hu/Creative Machines Lab

Lipson’s team described its new android on January 14 in Science Robotics.

The results are very convincing. The robot’s head and eye movements are still a bit jerky, but the lips move in very lifelike ways, synching to its voice. When I watched the team’s robot sing, it didn’t trigger that creepy feeling.

“The Columbia Engineering team’s work on robotic lip motion is very impressive,” says Martina Mara. At Johannes Kepler University Linz in Austria, she studies how people feel about humanlike robots.

Still, she adds, while lips are a huge source of social cues, the whole face is important. When listening to a voice, “people usually do not respond to one isolated feature, but to the overall consistency of the face.”  


Made in our image

Why make a humanlike robot in the first place?  

“If you needed to ‘talk’ to a computer in the old days, there were punch cards,” says Lipson. Then there were screens that displayed numbers and letters — and nothing else. Later, graphics could create almost photo-quality images. After that, screens emerged that respond to touch. As computers have changed over time, so have the ways we interact with them.

“A very humanlike robot can make interaction more intuitive, even for non-technical users,” says Mara.

But making robots more humanlike could go too far, Mara adds. “A highly humanlike design may fuel people’s tendency to anthropomorphize the machine,” she says. This could lead people to form emotional attachments with a device or to “even begin to care about the robot’s ‘well being.’” This would blur the line between robots and humans, she says. As a society, we have to consider carefully if that’s what we want.

“People have emotions towards their puppies or their teddy bears, but at least they don’t mistake it to be a human,” says Lipson. We don’t want to cross the line and mistake a machine for another human. Doing so, he says, risks people choosing to connect emotionally with a machine instead of with other people.

For now, Lipson’s team gives its robot slightly blue-tinged skin so it can’t be mistaken for a person.

There’s another option, points out Diel. Robots don’t have to look precisely human. “Rather than recreating the mechanics of human facial motion, simple facial expressions can be expressed as smileys on a screen, for instance. This makes the robot likable … without risking an Uncanny Valley.”

Keeping those challenges in mind, Lipson believes in a future full of robotics. “For kids in high school, it means that most of their life — certainly their career — is going to be dominated by coexisting with intelligent machines that are human-shaped,” he says. “This is real. And it’s going to come very fast.”