When talking with people face-to-face, we may experience the "Cocktail Party Effect": even in a crowded, noisy room, we can use our binaural hearing to focus our listening on a single person speaking. With current telepresence technologies, however, we lose this important ability. Thankfully, researchers are already building tools that bring this ability to remote conversations.

Kyoto University's Professor Hiroshi Okuno and Assistant Professor Toru Takahashi, Honda Research Institute-Japan's Dr. Kazuhiro Nakadai, and four graduate students from Kyoto University and Tokyo Institute of Technology spent a week at Willow Garage, integrating HARK with a Texai telepresence robot. HARK stands for Honda Research Institute-Japan (HRI-JP) Audition for Robots with Kyoto University. This robot audition system provides sound source localization, sound source separation, acoustic feature extraction, and automatic speech recognition.

The HARK system integrated well with ROS and our Texai. The Texai was outfitted with a green salad-bowl helmet embedded with eight microphones, and there is now a hark package for ROS. Using this setup, the team put together three demos showing off the potential for telepresence technologies.

In the first demo, four people, including one present through a second Texai, talk over each other while the HARK-Texai separates out each voice. The second demo shows sound source localization, with each sound's direction and power displayed in a radar chart. The final demo combines the first two into a powerful new interface for the remote operator: the Texai pilot can see where various sounds and voices are coming from and select which one to focus on. The HARK system then provides the pilot with the desired audio, cutting out any background noise or additional voices. Even in a crowded room, you can have a one-on-one conversation.
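
The selection step in the final demo can be pictured with a small sketch. This is not HARK's API or the code used on the Texai: the `Source` structure, the `select_source` function, and the 20-degree tolerance are all hypothetical, chosen only to illustrate how a pilot's chosen direction might be matched against a list of localized sources.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """One localized sound source (hypothetical structure, not a HARK type)."""
    source_id: int
    azimuth_deg: float  # direction relative to the robot's front, in degrees
    power_db: float     # estimated sound power


def angular_distance(a_deg: float, b_deg: float) -> float:
    """Smallest absolute angle between two directions, in degrees (0 to 180)."""
    diff = (a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def select_source(sources, target_deg, tolerance_deg=20.0):
    """Return the source closest to target_deg, or None if none is within tolerance."""
    candidates = [
        s for s in sources
        if angular_distance(s.azimuth_deg, target_deg) <= tolerance_deg
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda s: angular_distance(s.azimuth_deg, target_deg))


# Made-up example values: three talkers around the robot.
sources = [
    Source(0, -60.0, 31.5),
    Source(1, 10.0, 28.0),
    Source(2, 175.0, 33.2),
]

# The pilot clicks near the back of the robot; angles wrap around at 180 degrees.
chosen = select_source(sources, target_deg=-170.0)
```

The wrap-around in `angular_distance` matters because a source at 175 degrees and a click at -170 degrees are only 15 degrees apart, even though their raw difference is 345. A real system would then route the separated audio stream for the chosen source to the pilot.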

HARK came out of close collaboration between HRI-JP and Kyoto University, and out of Professor Okuno's passion for making computer and robot audition helpful to the hearing impaired. HARK is provided free and open source for research purposes and can be licensed for commercial applications.


more info on hardware please

Could you provide more information about the hardware used? What types of microphones were used, and how were they mixed and connected to the Texai?

Here's some more info

We used a reference platform that was purchased from the developers of HARK. It uses MEMS microphones that connect to a black box, which converts the audio to digital and sends it over Gigabit Ethernet to our computer.