Humans Helping Robots See

Alex Sorokin of the University of Illinois at Urbana-Champaign has been hacking on many projects this summer. Alex is the author of the CV Web Annotation Toolkit, an open source tool that helps researchers in computer perception collect and classify data with the help of workers on Amazon's Mechanical Turk. If you're a researcher collecting a data set, you can use this toolkit to submit images to Mechanical Turk and pay people to label them for you. For example, you can instruct Mechanical Turk users to draw boxes around all of the people in an image, or draw polygons around people's hands. These manually labeled data sets allow researchers to test their own algorithms and compare them against what a person sees. If you're worried about the quality of results that random Turkers might give you, you can have an additional set of users grade the results that you get back, which increases the likelihood of accurate responses.
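
To make the "compare against what a person sees" idea concrete, here is a minimal sketch of one common way to score a detector against human-drawn boxes: intersection-over-union (IoU) with greedy matching. This is not code from the CV Web Annotation Toolkit; the function names, the (x_min, y_min, x_max, y_max) box format, and the 0.5 threshold are all assumptions made for illustration.

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap, clamped to zero when the boxes are disjoint.
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def count_matches(human_boxes, detected_boxes, threshold=0.5):
    """Count detections that overlap a distinct human-labeled box well enough.

    Hypothetical helper: each human box can be matched at most once (greedy).
    """
    unmatched = list(human_boxes)
    matches = 0
    for detection in detected_boxes:
        # Pick the remaining human box that overlaps this detection the most.
        best = max(unmatched, key=lambda h: box_iou(h, detection), default=None)
        if best is not None and box_iou(best, detection) >= threshold:
            unmatched.remove(best)
            matches += 1
    return matches


if __name__ == "__main__":
    labeled = [(0, 0, 10, 10), (20, 20, 30, 30)]    # boxes drawn by Turkers
    detected = [(1, 1, 10, 10), (50, 50, 60, 60)]   # boxes from an algorithm
    print(count_matches(labeled, detected), "of", len(labeled), "labeled boxes found")
```

The same overlap score could also be used to check two Turkers' boxes against each other, although the grading step described above relies on human graders rather than an automatic measure.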

Besides helping our researchers collect data sets, this toolkit is very useful for our robots. For just a couple of dollars, you can submit images to Mechanical Turk and have users teach the robot where your refrigerator and dishwasher are, what your cups look like, or where your power outlets are. We're a long way from having robots that can identify objects in an environment as well as humans can, but toolkits like Alex's help us bridge that gap by allowing humans to lend their abilities to robots.

In addition to his work with Mechanical Turk, Alex came up with many great extensions to ROS. These include bagserver, which enables random access into ROS bag files, and bag_image_view, which lets you visually scan through images in a bag file and play them back.

Below is Alex's end-of-summer presentation, which discusses his various contributions in greater detail.

Alex Sorokin: CV/Mechanical Turk Presentation (Download PDF from ros.org)