Scalable Object Recognition

Marius Muja from the University of British Columbia returned to Willow Garage this summer to continue his work on object recognition. In addition to working on an object detector that can scale to a large number of objects, he has also been designing a general object recognition infrastructure.

Many object detectors get slower as they learn new objects. Ideally, we want a robot that can go into an environment, collect data, and learn new objects by itself, without getting progressively slower as it does so.

Marius worked on an object detector called Binarized Gradient Grid Pyramid (BiGGPy), which uses the gradient information from an image to match it to a set of learned object templates. The templates are organized into a template pyramid: a tree structure with low-resolution templates at the root and higher-resolution templates at each lower level. During detection, only a fraction of this tree must be explored. This results in big speedups and allows the detector to scale to a large number of objects.
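To make the idea of exploring only part of a template tree concrete, here is a minimal Python sketch. It is not the bigg_detector code: the two-level pyramid, the 2x2 OR downsampling, the cell-agreement score, the 0.9 threshold, and all names (`downsample`, `similarity`, `Node`, `build_pyramid`, `detect`) are assumptions chosen for illustration. Binarized gradient grids are modeled simply as tuples of 0/1 cells.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Illustrative stand-in for a binarized gradient grid: rows of 0/1 cells.
Grid = Tuple[Tuple[int, ...], ...]


def downsample(grid: Grid) -> Grid:
    """Halve the resolution by OR-ing each 2x2 block (assumes even dimensions)."""
    rows, cols = len(grid), len(grid[0])
    return tuple(
        tuple(
            grid[r][c] | grid[r][c + 1] | grid[r + 1][c] | grid[r + 1][c + 1]
            for c in range(0, cols, 2)
        )
        for r in range(0, rows, 2)
    )


def similarity(a: Grid, b: Grid) -> float:
    """Fraction of cells on which two same-sized grids agree."""
    cells = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(x == y for x, y in cells) / len(cells)


@dataclass
class Node:
    template: Grid
    label: Optional[str] = None  # set only on full-resolution leaves
    children: List["Node"] = field(default_factory=list)


def build_pyramid(objects: Dict[str, Grid]) -> List[Node]:
    """Group full-resolution templates under shared low-resolution roots."""
    roots: Dict[Grid, Node] = {}
    for label, grid in objects.items():
        coarse = downsample(grid)
        root = roots.setdefault(coarse, Node(coarse))
        root.children.append(Node(grid, label))
    return list(roots.values())


def detect(image: Grid, roots: List[Node], threshold: float = 0.9):
    """Return (matching labels, number of template comparisons performed)."""
    coarse_image = downsample(image)
    matches, comparisons = [], 0
    for root in roots:
        comparisons += 1
        if similarity(coarse_image, root.template) < threshold:
            continue  # prune: this root's children are never compared
        for leaf in root.children:
            comparisons += 1
            if similarity(image, leaf.template) >= threshold:
                matches.append(leaf.label)
    return matches, comparisons
```

In this sketch, a root whose low-resolution template does not match the image is discarded together with all of its higher-resolution children. The number of comparisons therefore depends on how many branches survive the coarse check rather than on the total number of learned templates.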

Marius also worked on a recognition infrastructure that allows object detection algorithms to be dynamically loaded as C++ plugins. These algorithms can easily be combined or swapped for one another. The infrastructure also allows very efficient data passing between the different algorithms by using shared memory instead of copying data. This recognition infrastructure is useful for both users and researchers, who can experiment with different algorithms and combine them in a system without having to write additional code.
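The plugin idea can be sketched in a few lines of Python. This is only an analogy, not the recognition_pipeline API, which loads C++ plugins: the registry, the `Stage` interface, the two example stages, and the `build_pipeline` and `run` helpers are all hypothetical names. Stages are looked up by name, so they can be combined or swapped by editing a list, and every stage receives a `memoryview` of the same buffer, standing in here for shared memory, so no image data is copied between them.

```python
from typing import Dict, List, Type


class Stage:
    """Hypothetical plugin interface: each stage reads or edits a shared frame."""

    def process(self, frame: memoryview, results: dict) -> None:
        raise NotImplementedError


# Hypothetical registry mapping a plugin name to its class.
_REGISTRY: Dict[str, Type[Stage]] = {}


def register(name: str):
    """Class decorator that makes a stage available under a string name."""
    def decorator(cls: Type[Stage]) -> Type[Stage]:
        _REGISTRY[name] = cls
        return cls
    return decorator


@register("threshold")
class Threshold(Stage):
    """Binarizes the frame in place (1 where the value is at least 128)."""

    def process(self, frame: memoryview, results: dict) -> None:
        for i, value in enumerate(frame):
            frame[i] = 1 if value >= 128 else 0


@register("count")
class CountForeground(Stage):
    """Sums the cells of the frame and stores the total in the results."""

    def process(self, frame: memoryview, results: dict) -> None:
        results["foreground"] = sum(frame)


def build_pipeline(names: List[str]) -> List[Stage]:
    """Instantiate stages by name, so a pipeline is just a list of strings."""
    return [_REGISTRY[name]() for name in names]


def run(pipeline: List[Stage], buffer: bytearray) -> dict:
    """Pass one view of the buffer through every stage without copying it."""
    frame = memoryview(buffer)  # a view onto the buffer, not a copy
    results: dict = {}
    for stage in pipeline:
        stage.process(frame, results)
    return results
```

For example, `run(build_pipeline(["threshold", "count"]), bytearray([10, 200, 130, 5]))` thresholds the buffer in place and then counts the foreground cells. Changing the list of names changes the pipeline without touching any stage code.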

The code and documentation for the Binarized Gradient Grid Pyramid object detector (bigg_detector) and the recognition infrastructure (recognition_pipeline) are available on ROS.org. Slides from Marius' end-of-summer presentation can be viewed below or downloaded from ROS.org.