Identifying and Mapping Common Objects with Semantic World Modeling

This past summer at Willow Garage, Julian “Mac” Mason, a Ph.D. student from Duke University, worked on semantic world modeling using the PR2 with an attached Microsoft Kinect. Without any preexisting object models in a database, the PR2 navigated through hallways and other indoor areas, identifying and mapping objects located on flat surfaces such as tables and countertops.

The PR2 identified common objects like cups, books, printers, robot parts, and houseplants. The method requires neither a database of preexisting object models nor the close-range, high-resolution data traditionally used in object recognition and tabletop segmentation. The PR2 stores the Kinect RGB-D point clouds (and tf frames), then processes the data by segmenting out horizontal planes and the objects on those planes. This enables several applications:

  1. Rescanning an area, using an existing database to provide object locations to the navigation system so that the robot explicitly drives to, and directly observes, the location of each potential object. This allows the appearance and disappearance of individual objects to be tracked over time.
  2. Querying the databases generated by two or more runs to see whether a particular object has moved.
  3. Relying on the perceptual data associated with each object to answer queries like "show me small, curved, white objects in the cafeteria" (a good substitute for "coffee cups").
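
To make the second and third applications concrete, here is a minimal Python sketch of what such queries might look like. It is not the semanticmodel package's API: the `ObjectRecord` fields, the thresholds, and the `find_objects` and `has_moved` helpers are all hypothetical, and it assumes that objects have already been matched across runs.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectRecord:
    # Hypothetical schema; the real database layout may differ.
    object_id: int      # assumed to be matched across runs already
    room: str           # e.g. "cafeteria"
    position: tuple     # (x, y, z) in metres, in the map frame
    size_m: float       # largest bounding-box dimension, in metres
    mean_rgb: tuple     # average colour, 0-255 per channel
    curvature: float    # 0 = flat; larger values = more curved


def is_white(rgb, threshold=200):
    """Treat a colour as white when every channel is bright."""
    return all(channel >= threshold for channel in rgb)


def find_objects(records, room, max_size_m, min_curvature):
    """Return small, curved, white objects observed in the given room."""
    return [
        r for r in records
        if r.room == room
        and r.size_m <= max_size_m
        and r.curvature >= min_curvature
        and is_white(r.mean_rgb)
    ]


def has_moved(before, after, tolerance_m=0.10):
    """Compare one object's position in two runs, allowing for noise."""
    return math.dist(before.position, after.position) > tolerance_m


# Illustrative data only, not results from the PR2.
run1 = [
    ObjectRecord(1, "cafeteria", (1.0, 2.0, 0.8), 0.10, (240, 240, 235), 0.6),
    ObjectRecord(2, "cafeteria", (1.5, 2.0, 0.8), 0.25, (120, 40, 30), 0.05),
]
run2 = [
    ObjectRecord(1, "cafeteria", (3.0, 2.0, 0.8), 0.10, (240, 240, 235), 0.6),
]

cup_like = find_objects(run1, "cafeteria", max_size_m=0.15, min_curvature=0.3)
moved = has_moved(run1[0], run2[0])
```

With this made-up data, `cup_like` contains only the first record, and `moved` is true because that object sits two metres away in the second run.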

The related software is available in the semanticmodel package on ROS.org. While this software was built and tested on a PR2, it requires only a localized mobile base and a Microsoft Kinect. For more technical information, please see these presentation slides from IROS 2011.
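
For readers who want a feel for the processing step described above (segmenting out horizontal planes and the objects on them), here is a deliberately simplified Python sketch. It is not the method used in the semanticmodel package. It swaps in a plain height histogram as a stand-in for real plane fitting, and it assumes the points are already expressed in a gravity-aligned frame with z pointing up. The function name `segment_tabletop` and all parameter values are hypothetical.

```python
from collections import Counter


def segment_tabletop(points, bin_size=0.02, plane_tol=0.02,
                     max_object_height=0.5):
    """Split (x, y, z) points into one horizontal plane and objects on it.

    Assumes metres and a gravity-aligned frame with z up.
    Returns (plane_z, plane_points, object_points).
    """
    points = list(points)
    if not points:
        return None, [], []

    # A large horizontal surface shows up as the fullest height bin.
    bins = Counter(round(z / bin_size) for _, _, z in points)
    best_bin, _ = bins.most_common(1)[0]
    plane_z = best_bin * bin_size

    plane = [p for p in points if abs(p[2] - plane_z) <= plane_tol]
    if not plane:  # possible if plane_tol is smaller than half a bin
        return plane_z, [], []

    # Use the plane's bounding box as a rough footprint of the surface.
    x_min = min(p[0] for p in plane)
    x_max = max(p[0] for p in plane)
    y_min = min(p[1] for p in plane)
    y_max = max(p[1] for p in plane)

    # Object candidates sit just above the plane, inside its footprint.
    objects = [
        p for p in points
        if plane_tol < p[2] - plane_z <= max_object_height
        and x_min <= p[0] <= x_max
        and y_min <= p[1] <= y_max
    ]
    return plane_z, plane, objects


# Synthetic scene: a 10 x 10 table top at z = 0.80 m, a few points from
# an object on it, some floor points, and one stray point off the table.
table = [(i * 0.1, j * 0.1, 0.80) for i in range(10) for j in range(10)]
on_table = [(0.5, 0.5, 0.85), (0.5, 0.5, 0.90), (0.4, 0.5, 0.85)]
floor = [(i * 0.1, 2.0, 0.0) for i in range(20)]
stray = [(5.0, 5.0, 0.90)]

plane_z, plane_pts, object_pts = segment_tabletop(
    table + on_table + floor + stray
)
```

A real pipeline would more likely fit planes robustly (for example with RANSAC) and cluster the remaining points into separate objects. The sketch only shows the overall shape of the computation: find a horizontal surface, then keep what sits on top of it.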