Will we soon be able to control machines with simple gestures?

The “Silense” European project, launched in May 2017, aims to redefine the way we interact with machines. Using ultrasound technology similar to sonar, the researchers and industrial partners in this collaboration have chosen to focus on 3D motion sensing. This technology could allow us to control our smartphones or homes with simple gestures, without any physical contact with a touch surface.

Lower the volume on your TV from your couch just by lowering your hand. Close the blinds in your bedroom by simply pinching your fingers together. Show your car’s GPS the right direction to take by raising your thumb. These may sound like scenes from a science fiction movie, yet they are among the real-life objectives of the European H2020 project “Silense”, which stands for (Ultra)Sound Interfaces and Low Energy iNtegrated SEnsors. Over a three-year period, the project brings together 42 academic and industrial partners from eight countries across the continent. This consortium, which is particularly large even for an H2020 project, will work from 2017 to 2020 to develop new human-machine interfaces based on ultrasound.

“What we want to do is replace touch commands with commands the users can make from a distance, by moving their hands, arms or body,” explains Marius Preda, a researcher at Télécom SudParis, one of the project’s partners. To accomplish this, the scientists will develop technology similar to sonar. An audio source emits an inaudible sound that fills the air. When the sound wave hits an obstacle, it bounces back towards the source. Receivers placed at the same level as the transmitter record the wave’s travel time and determine the distance between the source and the obstacle. From these distances, a 3D map of the environment can be built. “It’s the same principle as an ultrasound scan,” the researcher explains.
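For readers who want to see the principle in numbers, here is a minimal sketch of time-of-flight ranging in Python (an illustration of the general technique, not the project’s code): the distance to an obstacle follows directly from the speed of sound and the echo’s round-trip time.

```python
# Minimal time-of-flight sketch (illustrative only, not the Silense code).

SPEED_OF_SOUND = 343.0  # metres per second in air at about 20 °C


def distance_from_echo(round_trip_time_s: float) -> float:
    """Distance to an obstacle, given the echo's round-trip time in seconds.

    The wave travels to the obstacle and back, hence the division by two.
    """
    return SPEED_OF_SOUND * round_trip_time_s / 2.0


# An echo arriving 5.8 ms after emission places the obstacle about 1 m away.
print(f"{distance_from_echo(0.0058):.2f} m")  # -> 0.99 m
```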

In the Silense project, the source will be made up of several transmitters, and there will be far more receivers than in a conventional sonar. The goal is to improve the perception of obstacles and thus the resolution of the resulting 3D image. This should make it possible to detect smaller variations in shape, and therefore to recognize more complex gestures than is currently possible. “Today we can see whether a hand is open or closed, but we cannot distinguish one raised finger from two raised fingers squeezed together,” Marius Preda explains.
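Why do extra receivers help? Each one contributes an independent distance measurement, and the position of the reflecting point can then be recovered by least squares: the more equations, the more stable the solution. The sketch below illustrates this multilateration idea with an invented array geometry and noise level; the project’s actual layout is not public.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical receiver positions in metres; the real array layout is unknown.
receivers = np.array([[0.0, 0.0, 0.0], [0.2, 0.0, 0.0], [0.0, 0.2, 0.0],
                      [0.2, 0.2, 0.0], [0.1, 0.1, 0.05]])

target = np.array([0.5, 0.3, 0.8])              # ground truth for the demo
ranges = np.linalg.norm(receivers - target, axis=1)
ranges += rng.normal(0.0, 0.002, ranges.shape)  # about 2 mm measurement noise

# Subtracting the first range equation from the others cancels the |x|^2 term,
# leaving a linear system A @ x = b. Every extra receiver adds one more row,
# over-determining the system and stabilizing the least-squares solution.
A = 2.0 * (receivers[1:] - receivers[0])
b = (np.sum(receivers[1:] ** 2, axis=1) - np.sum(receivers[0] ** 2)
     + ranges[0] ** 2 - ranges[1:] ** 2)
estimate, *_ = np.linalg.lstsq(A, b, rcond=None)
print(estimate)  # approximately [0.5, 0.3, 0.8]
```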

Télécom SudParis is leading the software side of the project. Its researchers’ mission is to develop image processing algorithms that recognize the gestures users make. Using deep learning with neural networks, the scientists want to build a dictionary of clearly distinguishable gestures. These gestures will need to be recognizable by the ultrasound sensors regardless of the position of the hand or arm relative to the sensor.
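The article does not specify the network architecture, so the following is only a generic sketch of the idea, written with PyTorch: a small convolutional classifier that maps a single (hypothetical) 32×32 ultrasound depth frame to one entry of a ten-gesture dictionary. The input size, layer sizes and number of gestures are all assumptions.

```python
import torch
import torch.nn as nn

NUM_GESTURES = 10  # assumed size of the gesture dictionary


class GestureNet(nn.Module):
    """Small convolutional classifier: one depth frame -> one gesture label."""

    def __init__(self) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, NUM_GESTURES)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, 1, 32, 32): one-channel depth frames.
        return self.classifier(self.features(x).flatten(1))


model = GestureNet()
frame = torch.randn(1, 1, 32, 32)  # stand-in for a real sensor frame
print(model(frame).argmax(dim=1))  # index of the predicted gesture
```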

This is no easy task: the first step is to identify differentiating gestures, ones the algorithms cannot confuse with one another. The next steps involve reducing noise to improve the detection of shapes, sometimes in ways specific to the use case: a sensor in the wall of a house will not have the same shortcomings as one in a car door. Finally, the researchers will also have to take the uniqueness of each user into account: two different people will not make a given sign the same way or at the same speed.
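As a concrete example of the noise-reduction step (the project’s actual filtering is not described), a simple median filter can remove isolated bad echoes from a depth map before shape detection; the map, the “hand” region and the outlier rate below are all invented for illustration.

```python
import numpy as np
from scipy.ndimage import median_filter

rng = np.random.default_rng(0)

# Hypothetical noisy depth map (values in metres): a flat background at 1 m
# with a closer "hand" region, corrupted by isolated bad echoes.
depth = np.full((32, 32), 1.0)
depth[10:22, 12:20] = 0.6
outliers = rng.random(depth.shape) < 0.05  # about 5% of pixels are bad echoes
depth[outliers] = rng.uniform(0.0, 2.0, outliers.sum())

# A 3x3 median filter suppresses isolated outliers while keeping the hand's
# edges sharp, which helps the downstream shape-detection algorithms.
clean = median_filter(depth, size=3)
```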

“Our primary challenge is to develop software that can detect the beginning and end of a movement for any user,” explains Marius Preda, emphasizing how difficult this task is given the fluid nature of human gestures: “We do not announce when we are going to start or end a gesture. We must therefore succeed in perfectly segmenting the user’s actions into a chain of gestures.”
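One simple way to attack this segmentation problem (a hedged sketch, not the project’s method) is to track the frame-to-frame motion energy: a gesture is assumed to start when the energy rises above a threshold and to end when it falls back below it. The threshold, the energy measure and the minimum segment length below are illustrative choices.

```python
import numpy as np


def segment_gestures(frames, threshold=0.05, min_len=5):
    """Split a stream of depth frames into (start, end) gesture segments."""
    # Motion energy between consecutive frames: mean squared difference.
    energy = [np.mean((frames[i] - frames[i - 1]) ** 2)
              for i in range(1, len(frames))]
    segments, start = [], None
    for i, e in enumerate(energy):
        if e > threshold and start is None:
            start = i                    # motion begins: open a segment
        elif e <= threshold and start is not None:
            if i - start >= min_len:     # ignore very short blips
                segments.append((start, i))
            start = None
    if start is not None:                # gesture still running at the end
        segments.append((start, len(energy)))
    return segments
```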

Moving towards the human-machine interaction of tomorrow

To meet this challenge, researchers at Télécom SudParis are working closely with the partners in charge of the hardware side. Over the project’s three years, the consortium hopes to develop new, smaller generations of sensors. This would make it possible to fit more transmitters and receivers into a given surface area, thereby improving image resolution. Combined with the new image processing algorithms, this innovation should significantly expand the catalogue of shapes recognizable by ultrasound.
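A back-of-the-envelope calculation shows what is at stake. Assuming a typical 40 kHz ultrasonic transducer (the project’s operating frequency is not stated), the wavelength in air is about 8.6 mm; the angular resolution of an array scales roughly with wavelength over aperture, and packing more elements into the same aperture lets the array sample the sound field densely enough to actually reach that limit without imaging artefacts.

```python
# Back-of-the-envelope figures; frequency and aperture are assumptions.
SPEED_OF_SOUND = 343.0  # m/s in air
FREQUENCY = 40_000.0    # Hz, a common ultrasonic transducer frequency
APERTURE = 0.10         # m, hypothetical width of the sensor array

wavelength = SPEED_OF_SOUND / FREQUENCY     # ~0.0086 m, i.e. ~8.6 mm
angular_resolution = wavelength / APERTURE  # ~0.086 rad, roughly 5 degrees
print(f"wavelength = {wavelength * 1000:.1f} mm, "
      f"angular resolution = {angular_resolution:.3f} rad")
```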

The Silense project is being followed very closely by car makers and connected-object manufacturers. A human-machine interface based on ultrasound offers several advantages. Compared with the current standard interface, touch, it improves vehicle safety by reducing the attention a driver must devote to pressing a button or a touchscreen. For smartphones and smart houses, it promises greater convenience for consumers.

The proposed ultrasound interface must also be compared with its main competitor: interaction through visual recognition, such as Kinect cameras. According to Marius Preda, using ultrasound removes the lighting problems video encounters in situations of overexposure (bright light in a car, for example) or underexposure (inside a house at night). In addition, segmenting shapes such as hands is easier with 3D acoustic imaging. “If your hand is the same color as the wall behind you, it will be difficult for the camera to recognize your gesture,” the researcher explains.

Silense therefore has high hopes of creating a new way to interact with the machines in our daily lives. By the end of the project, the consortium aims to deliver three demonstrators: one for a smart house, one integrated into a car, and one in a screen like that of a smartphone. If these first proofs of concept prove conclusive, don’t be surprised to see drivers making big gestures in their cars someday!