Object Detection in Crowded Workspaces

Steven Gould from Stanford University interned this summer at Willow Garage. While some companies have their interns spend time cleaning up after their co-workers, Steven spent his time developing computer vision models for detecting objects in cluttered scenes -- helping robots learn how to clean up.

One of the challenges for computer vision is identifying objects that are partially obscured by other objects, such as the items hidden behind your laptop. Reality can be messy, but if we expect robots to work with everyday objects in real environments, it's a challenge that must be met.

Standard parts-based object detection models directly assign a cost to matching each object part to the corresponding pixels in an image. This is problematic when an object is partially occluded, because the image pixels at a part's expected location then belong to another object (i.e., the occluding object).

Steven worked on building an occlusion-aware parts-based model that reduces the penalty for mismatched parts if those parts are obscured. This was accomplished by associating an occlusion variable with each part that indicates whether the part is visible or not. This variable can depend on other objects in the image, so that once an object is found at a particular location, the model marks the pixels associated with that object as possibly occluding pixels for other objects. In this way, mismatched object parts are not heavily penalized as they are in the standard approach.
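To make the idea concrete, here is a minimal sketch of how a per-part occlusion variable can soften the matching cost. It is an illustration under stated assumptions, not Steven's actual model: the function names, the fixed `occlusion_penalty`, and the boolean `occluder_mask` (pixels claimed by objects that were already detected) are all hypothetical.

```python
def part_cost(match_cost, can_be_occluded, occlusion_penalty=0.5):
    """Return (cost, occluded) for a single part.

    Assumption: an occluded part pays a fixed penalty instead of its
    appearance mismatch cost, and a part may only be labeled occluded
    if another detected object covers its location.
    """
    if can_be_occluded and occlusion_penalty < match_cost:
        return occlusion_penalty, True
    return match_cost, False


def object_cost(match_costs, part_locations, occluder_mask,
                occlusion_penalty=0.5):
    """Sum the per-part costs, choosing the occlusion variable for each part.

    match_costs    -- appearance mismatch cost of each part (lower is better)
    part_locations -- (row, col) pixel location of each part
    occluder_mask  -- 2D list of booleans; True where an already-detected
                      object could be hiding whatever lies behind it
    """
    total = 0.0
    occluded_flags = []
    for cost, (row, col) in zip(match_costs, part_locations):
        cost_used, occluded = part_cost(
            cost, occluder_mask[row][col], occlusion_penalty)
        total += cost_used
        occluded_flags.append(occluded)
    return total, occluded_flags


# Hypothetical example: the second part lands on pixels owned by another object.
match_costs = [0.1, 0.9, 0.8]
part_locations = [(0, 0), (0, 1), (1, 1)]
occluder_mask = [[False, True],
                 [False, False]]
total, flags = object_cost(match_costs, part_locations, occluder_mask)
```

In this sketch the second part matches poorly, but because it sits on pixels claimed by another object it pays the smaller occlusion penalty rather than its full mismatch cost. The third part also matches poorly, yet nothing covers it, so it is penalized in full, just as in the standard approach.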

The same reasoning can be applied to truncated objects -- objects that are cut off by the edge of the image -- since the parts outside the camera view can simply be treated as occluded.
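A small sketch of how truncation can be folded into the same occlusion test: a part whose location falls outside the image is treated exactly like a part hidden behind another object. The helper name `may_be_hidden` and the boolean `occluder_mask` are hypothetical, chosen only for illustration.

```python
def may_be_hidden(row, col, occluder_mask):
    """Return True if a part at (row, col) is allowed to be labeled occluded.

    Assumption: occluder_mask is a 2D list of booleans with the same size as
    the image, True where an already-detected object covers the pixel.
    """
    height = len(occluder_mask)
    width = len(occluder_mask[0]) if height else 0
    inside_image = 0 <= row < height and 0 <= col < width
    if not inside_image:
        # Truncated: the part lies beyond the image edge, so treat it as occluded.
        return True
    return bool(occluder_mask[row][col])


# Hypothetical 2x2 image in which only the top-right pixel is covered.
occluder_mask = [[False, True],
                 [False, False]]
truncated = may_be_hidden(-1, 0, occluder_mask)   # above the top edge
covered = may_be_hidden(0, 1, occluder_mask)      # behind another object
visible = may_be_hidden(1, 1, occluder_mask)      # in view and uncovered
```

Checking the image bounds first also avoids indexing the mask with an out-of-range location.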

Steven's work makes progress towards the goal of finding objects in everyday, cluttered environments. For more information, please see his slides below (download as PDF), or read his research on this topic. Code for this work will be integrated into future versions of the open-source OpenCV library.