New Code To Enable Robots To Understand Body Language

by CXOtoday News Desk    Jul 10, 2017


Researchers at Carnegie Mellon University in Pittsburgh, Pennsylvania, US have developed computer code that lets machines understand the body poses and movements of the people around them.

“We communicate almost as much with the movement of our bodies as we do with our voice,” said Yaser Sheikh, Associate Professor of Robotics at Carnegie Mellon University. “But computers are more or less blind to it.”

The method was developed with the help of the Panoptic Studio, a two-story dome embedded with 500 video cameras.

The insights gained from experiments in that facility now make it possible to detect the pose of a group of people using a single camera and a laptop computer, researchers said.

These methods for tracking 2-D human form and motion open up new ways for people and machines to interact with each other, and for people to use machines to better understand the world around them. The ability to recognize hand poses, for instance, will make it possible for people to interact with computers in new and more natural ways, such as communicating with computers simply by pointing at things, Sheikh said.

To encourage more research and applications, the researchers have released their computer code for both multi-person and hand pose estimation.

Sheikh and his colleagues will present reports on their multi-person and hand pose detection methods at CVPR 2017, the Computer Vision and Pattern Recognition Conference, to be held July 21-26 in Honolulu.

Tracking multiple people in real time, particularly in social situations where they may be in contact with each other, presents a number of challenges. Simply using programs that track the pose of an individual does not work well when applied to each individual in a group, particularly when that group gets large.

Sheikh and his colleagues took a “bottom-up” approach, which first localizes all the body parts in a scene — arms, legs, faces, etc. — and then associates those parts with particular individuals.
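The two-stage idea can be illustrated with a minimal Python sketch. It is not the researchers' released code: the article does not say how parts are scored against individuals, so this example assumes the part detections are already available and substitutes plain Euclidean distance to a "neck" point as the association cue. The function name `group_parts`, the part names and the coordinates are all hypothetical.

```python
from math import hypot


def group_parts(detections, anchor="neck"):
    """Assign detected parts to people, at most one part of each type per person.

    detections: list of (part_type, x, y) tuples for the whole scene.
    Returns a list of dicts, one per person, mapping part_type -> (x, y).
    Assumption: distance to the anchor part stands in for a real
    association score, which the article does not describe.
    """
    # Every anchor detection seeds one person.
    anchors = [(x, y) for t, x, y in detections if t == anchor]
    people = [{anchor: pos} for pos in anchors]

    # Non-anchor part types present in the scene.
    part_types = sorted({t for t, _, _ in detections if t != anchor})

    for part_type in part_types:
        candidates = [(x, y) for t, x, y in detections if t == part_type]
        # Score every (person, candidate) pair by Euclidean distance.
        pairs = sorted(
            (hypot(px - ax, py - ay), i, j)
            for i, (ax, ay) in enumerate(anchors)
            for j, (px, py) in enumerate(candidates)
        )
        used_people, used_parts = set(), set()
        # Accept the closest pairs first, keeping the matching one-to-one.
        for _, i, j in pairs:
            if i in used_people or j in used_parts:
                continue
            people[i][part_type] = candidates[j]
            used_people.add(i)
            used_parts.add(j)
    return people


# Hypothetical detections for two people standing side by side.
scene = [
    ("neck", 100, 100), ("neck", 300, 100),
    ("left_wrist", 80, 180), ("left_wrist", 280, 190),
    ("right_wrist", 130, 185), ("right_wrist", 320, 180),
]
people = group_parts(scene)
```

A distance-only cue like this would break down when people overlap or touch, which is the situation the article singles out as difficult. A practical system would need a stronger signal for deciding which parts belong together.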

The challenges for hand detection are greater. As people use their hands to hold objects and make gestures, a camera is unlikely to see all parts of the hand at the same time. And unlike for the face and body, there are no large datasets of hand images annotated with labels of parts and positions.

The researchers said the technology could help in several applications. A self-driving car, for example, could get an early warning that a pedestrian is about to step into the street by monitoring body language. Enabling machines to understand human behavior could also open up new approaches to behavioral diagnosis and rehabilitation for conditions such as autism, dyslexia and depression.