Willow Garage Blog
It probably happens hundreds if not thousands of times every day; two college roommates get together to make some pancakes for dinner. The only difference this time is that the parties involved were robots. In this case, two robots from the Munich-based cluster of excellence CoTeSys (Cognition for Technical Systems) starred as chefs, demonstrating their capabilities in the world of robotics -- and in the kitchen.
In the video below, you can see James, a PR2 Beta Program robot, opening and closing cupboards and drawers, removing the pancake mix from the refrigerator, and handing it over to Rosie. Rosie is another robot at TUM that is deployed in the Assistive Kitchen environment of the Intelligent Autonomous Systems Group. Rosie cooks and flips the pancakes, and then delivers them back to James.
Behind this domestic tableau is a demonstration of the capabilities of service bots. This includes characteristics such as learning, probabilistic inference, and action planning.
James uses the Web for problem solving, just like we would. To retrieve the correct bottle of pancake mix from the fridge, it looks up a picture on the Web and then goes online to find the cooking instructions.
Rosie makes use of gravity compensation when pouring the batter: the angle and duration of the pour are adjusted depending on the weight of the mix. The spatula comes into play when Rosie's initially inaccurate depth estimate is corrected by sensors detecting contact with the pancake maker.
In the course of the demonstration, both robots cope with mechanical inaccuracies, shifting furniture, obstacles, and execution errors. Failures in the task execution are dealt with using learned knowledge about the processes involved, the tools and their usage. If the spatula has not been grasped correctly, for example, the robot will notice this and adjust its grasp.
Rosie and James are not ready yet for haute cuisine, but the video provides an excellent sense of the future capabilities of service robots.
Much of the software used in this demonstration within the CoTeSys cluster will soon be made available to robot researchers in the TUM ROS repository.
As a reminder, it's just a few more days until the first CoTeSys-ROS Fall School on Cognition-Enabled Mobile Manipulation. It's being held from November 1-6, 2010 at Technische Universität München (TUM) in Munich, Germany.
One of the challenges that robots like the PR2 face is knowing how to grasp an object. We have years of experience to help us determine what objects are and how to grasp them. We can tell the difference between a mug, a wine glass, and a bowl, and know that each should be handled in a different way. For robots, the world is not as certain, but there are approaches they can take that let them interact in an uncertain world.
This summer, Peter Brook from the University of Washington wrote a grasp planning system which lets robots successfully pick up objects, even in cases where they make incorrect guesses about what the object is. This planner uses a probabilistic approach, where the robot uses potentially incomplete or noisy information from its sensors to make multiple guesses about the identity of the object it is looking at. Based on how confident the robot is in each of the possible explanations for the perceived data, it can select the grasps that are most likely to work on the underlying object.
First, the planner builds up a set of representations for the sensed data; some are based on the best guesses provided by ROS recognition algorithms, and some use the raw segmented 3D data. For each representation, it uses a grasp-planning algorithm to generate a list of possible grasps. It then combines the information from all these sources, sorting grasps based on their estimated probability of success across all the representations. For grasp planners running on known object models, it can also use pre-computed grasps that speed up execution time.
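The ranking step described above amounts to an expected-value computation over the competing object hypotheses. The sketch below is illustrative only; the function names and data shapes are invented here, not the actual probabilistic_grasp_planner API.

```python
# Hypothetical sketch of ranking grasps across object hypotheses; names
# are illustrative, not the actual probabilistic_grasp_planner API.

def rank_grasps(hypotheses, grasps_for, success_prob):
    """hypotheses: list of (hypothesis, probability) pairs.
    grasps_for: hypothesis -> candidate grasps for that interpretation.
    success_prob: (grasp, hypothesis) -> estimated success probability.
    Returns grasps sorted by expected success across all hypotheses."""
    candidates = {g for h, _ in hypotheses for g in grasps_for(h)}
    expected = {g: sum(p * success_prob(g, h) for h, p in hypotheses)
                for g in candidates}
    return sorted(candidates, key=expected.get, reverse=True)

# Two competing interpretations of the sensed data:
hyps = [("mug", 0.7), ("bowl", 0.3)]
grasps_for = lambda h: {"mug": ["top", "side"], "bowl": ["side"]}[h]
success = lambda g, h: {("top", "mug"): 0.9, ("side", "mug"): 0.5,
                        ("top", "bowl"): 0.0, ("side", "bowl"): 0.8}[(g, h)]
ranked = rank_grasps(hyps, grasps_for, success)
```

A grasp that works moderately well under every hypothesis can outrank one that is excellent under the most likely hypothesis but fails under the others, which is exactly what makes the planner robust to misidentification.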
This probabilistic planner allows the PR2 robot to cope with uncertainty and reliably grasp a wider range of objects in unstructured environments. It is also integrated into the ROS object manipulation pipeline so that others can experiment and improve upon it. For more information, please see Peter's slides below (download PDF), or check out the source code in the probabilistic_grasp_planner package on ROS.org.
Robot motion planning has traditionally been used to avoid collisions when moving a robot arm. Avoiding collisions is important, but many other desirable criteria are often ignored. For example, motions that minimize energy will let the robot extend its battery life. Smoother trajectories may cause less wear on motors and can be more aesthetically appealing. There may be even more useful criteria, like keeping a glass of water upright when moving it around.
This summer, Mrinal Kalakrishnan from the Computational Learning and Motor Control Lab at USC worked on a new motion planner called STOMP, which stands for "Stochastic Trajectory Optimization for Motion Planning". This planner can plan paths for high-dimensional robotic systems that are collision-free, smooth, and can simultaneously satisfy task constraints, minimize energy consumption, or optimize other arbitrary criteria. STOMP is derived from gradient-free optimization and path integral reinforcement learning techniques (Policy Improvement with Path Integrals, Theodorou et al, 2010).
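The core loop of a STOMP-style optimizer can be sketched in a few lines: sample noisy variations of the current trajectory, weight them by exponentiated cost, and average. This 1-D toy keeps only that idea and omits the smoothness-preserving noise covariance and other details of the real planner, so treat it as a sketch rather than the actual stomp_motion_planner implementation.

```python
import numpy as np

# Toy 1-D stochastic trajectory optimization in the spirit of STOMP.
def stomp_step(theta, cost_fn, n_samples=20, noise=0.1, h=10.0, rng=None):
    """One update: sample noisy trajectories, weight by exp(-h * cost),
    and move theta toward the cost-weighted average perturbation."""
    rng = np.random.default_rng(0) if rng is None else rng
    eps = rng.normal(0.0, noise, size=(n_samples,) + theta.shape)
    costs = np.array([cost_fn(theta + e) for e in eps])
    w = np.exp(-h * (costs - costs.min()))       # gradient-free weighting
    w /= w.sum()
    return theta + (w[:, None] * eps).sum(axis=0)

# Toy cost: reach a goal path while keeping the trajectory smooth.
goal = np.linspace(0.0, 1.0, 10)
cost = lambda th: np.sum((th - goal) ** 2) + 0.1 * np.sum(np.diff(th) ** 2)

rng = np.random.default_rng(0)
theta = np.zeros(10)
for _ in range(100):
    theta = stomp_step(theta, cost, rng=rng)
```

Because the update only ever evaluates the cost function, any criterion you can score (energy, upright constraints, clearance) can be folded into `cost_fn` without needing gradients.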
The accompanying video shows the STOMP planner being used to plan motions for the PR2 arm in simulation and a real-world setup. It shows the ability to plan motions in real-world environments, while optimizing constraints like holding the cans upright at all times. Ultimately, the utility of this motion planner is limited only by the creativity of the system designer, since it can plan trajectories that optimize any arbitrary criteria that may be important to achieve a given task.
For more information, please see Mrinal's slides below (download PDF), or check out the code in the stomp_motion_planner package on ROS.org. This package builds on the various packages in the policy_learning stack, which was written in collaboration with Peter Pastor. You can also check out Mrinal's work from last summer on the CHOMP motion planner.
Nobody likes to wait, even for a robot. So when a personal robot searches for an object to deliver, it should do so in a timely manner. To accomplish this, Feng Wu from the University of Science and Technology of China spent his summer internship developing new techniques to help the PR2 select which sensors to use in order to more quickly find objects in indoor environments.
Robots like PR2 have several sensors, but they can be generally categorized into two types: wide sensing and narrow sensing. Wide sensing covers larger areas and greater distances than narrow sensing, but the data may be less accurate. On the other hand, narrow sensing can use more power and take more time to collect and analyze. Feng worked on planning techniques to balance the tradeoffs between these two types of sensing actions, gathering more information while minimizing the cost.
The techniques involved the use of the Partially Observable Markov Decision Process (POMDP), which provides an ideal mathematical framework for modeling wide and narrow sensing. The sensing abilities and the uncertainty of the sensed data are modeled in an observation function, while the cost of sensing actions is defined in a reward function. The solution balances the costs of the sensing actions against the rewards (i.e., the amount of information gathered).
For example, one part of the search tree might represent the following: "If I move to the middle of the room, look around and observe clutter to my left, then I will be very sure that there is a bottle to the left, moderately sure that there is nothing in front of me, and completely unsure about the rest of the room." When actually performing tasks in the world, the robot will receive observations from its sensors. These observations will be used to update its current beliefs. Then the robot can plan once again, using the updated beliefs. Planning in belief space allows making tradeoffs such as: "I'm currently uncertain about where things are, so it's worth taking some time to move to the center of the room to do a wide scan. Then I'll have enough information to choose a good location toward which to navigate."
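The belief update driving this kind of reasoning is a standard Bayes filter over states. The states, actions, and model numbers in this sketch are invented for illustration; they are not Feng's actual model.

```python
# Toy Bayesian belief update, the core computation behind POMDP planning.
def update_belief(belief, action, observation, trans, obs):
    """belief: dict state -> probability.
    trans: (state, action) -> dict of next-state probabilities.
    obs: (observation, state) -> likelihood of that observation there.
    Returns the posterior belief after acting and then observing."""
    new = {}
    for s2 in belief:
        predicted = sum(trans(s, action).get(s2, 0.0) * p
                        for s, p in belief.items())
        new[s2] = obs(observation, s2) * predicted
    total = sum(new.values())
    return {s: p / total for s, p in new.items()}

# The robot is unsure which side the bottle is on; a wide scan reporting
# "clutter on the left" shifts belief leftward without fully resolving it.
belief = {"left": 0.5, "right": 0.5}
trans = lambda s, a: {s: 1.0}                    # sensing doesn't move objects
obs = lambda o, s: {("clutter_left", "left"): 0.8,
                    ("clutter_left", "right"): 0.3}[(o, s)]
belief = update_belief(belief, "wide_scan", "clutter_left", trans, obs)
```

Planning then searches over sequences of such updates: a cheap, noisy wide scan may be worth its cost precisely because of how much it is expected to sharpen the belief.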
If it's true that a picture is worth a thousand words, then the pictures from today's CNN interview go a long way to explaining the power and potential of the PR2 robot. A PR2 was awarded to Professor Charles Kemp and his team at Georgia Tech's School of Interactive Computing. Their focus at the Healthcare Robotics Lab is to develop robots that improve the quality and efficiency of healthcare.
See for yourself what they've been able to accomplish in just a few short months.
Simple trial and error is one the most common ways that we learn how to perform a task. While we can learn a great deal by watching and imitating others, it is often through our own repeated experimentation that we learn how -- or how not -- to perform a given task. Peter Pastor, a PhD student from the Computational Learning and Motor Control Lab at the University of Southern California (USC), has spent his last two internships here working on helping the PR2 to learn new tasks by imitating humans, and then improving that skill through trial and error.
Last summer, Peter's work focused on imitation learning. The PR2 robot learned to grasp, pour, and place beverage containers by watching a person perform the task a single time. While it could perform certain tasks well with this technique, many tasks require learning about information that cannot necessarily be seen. For example, when you open a door, the robot cannot tell how hard to push or pull. This summer, Peter extended his work to use reinforcement learning algorithms that enable the PR2 to improve its skills over time.
Peter focused on teaching the PR2 two tasks with this technique: making a pool shot and flipping a box upright using chopsticks. For both tasks, the PR2 robot first learned the task from a demonstration by a person. With the pool shot, the robot was able to learn a more accurate and powerful shot after 20 minutes of experimentation. With the chopstick task, the robot was able to improve its success rate from 3% to 86%. To illustrate the difficulty of the task, Peter conducted an informal user study in which 10 participants each performed 20 attempts at flipping the box by guiding the robot's gripper. Their success rate was only 15%.
Peter used Dynamic Movement Primitives (DMPs) to compactly encode movement plans. The parameters of these DMPs can be learned efficiently from a single demonstration by guiding the robot's arms. These parameters then become the initialization of the reinforcement learning algorithm that updates the parameters until the system has minimized the task-specific cost function and satisfied the performance goals. This state-of-the-art reinforcement learning algorithm is called Policy Improvement using Path Integrals (PI^2). It can handle high dimensional reinforcement learning problems and can deal with almost arbitrary cost functions.
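A minimal 1-D DMP shows how a demonstrated movement is encoded: a spring-damper system pulls toward the goal while a learned forcing term, keyed to a decaying phase variable, shapes the path in between. The forcing term's parameters are what PI^2 would tune; the constants below are illustrative, not the values used on the PR2.

```python
import math

def dmp_rollout(x0, goal, forcing, tau=1.0, dt=0.01, k=100.0, alpha=4.0):
    """Integrate a 1-D DMP: a spring-damper pulls x toward `goal` while
    `forcing(s)` shapes the path as the phase s decays from 1 toward 0."""
    d = 2.0 * math.sqrt(k)              # critical damping
    x, v, s = x0, 0.0, 1.0
    path = [x]
    for _ in range(int(tau / dt)):
        f = forcing(s) * (goal - x0)    # forcing scaled by movement amplitude
        a = k * (goal - x) - d * v + f
        v += a * dt / tau
        x += v * dt / tau
        s += -alpha * s * dt / tau
        path.append(x)
    return path

# With no forcing term, the DMP is a plain critically damped reach to goal.
path = dmp_rollout(0.0, 1.0, forcing=lambda s: 0.0)
```

Because the goal-reaching dynamics and the shape of the motion are separated, reinforcement learning can adjust the forcing parameters without ever breaking convergence to the goal.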
Programming robots by hand is challenging, even for experts. Peter's work aims to facilitate the encoding of new motor skills in a robot by guiding the robot's arms, enabling it to improve its skill over time through trial and error. For more information about Peter's research with DMPs, see "Learning and Generalization of Motor Skills by Learning from Demonstration" from ICRA 2009. The work that Peter did this summer has been submitted to ICRA 2011 and is currently under review. Check out his presentation slides below (download PDF). The open source code has been written in collaboration with Mrinal Kalakrishnan and is available in the ROS policy_learning stack.
This summer, Hae Jong Seo, a PhD student from the Multidimensional Signal Processing Research Group at UC Santa Cruz, worked with us on object and action recognition using low-cost web cameras. In order for personal robots to interact with people, it is useful for robots to know where to look, locate and identify objects, and locate and identify human actions. To address these challenges, Hae Jong implemented a fast and robust object and action detection system using features called locally adaptive regression kernels (LARK).
LARK features have many applications, such as saliency detection, which determines which parts of an image are most significant, such as regions containing objects or people. Object detection can then focus on the salient regions of the image in order to run more quickly. Saliency detection can also be extended to "space-time" for use with video streams.
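A rough sketch of self-resemblance saliency in this spirit: a pixel is salient when its local descriptor is unlike those elsewhere in the image. Real LARK descriptors are built from local gradient covariances; here raw image patches stand in as descriptors to keep the sketch short, so this is an illustration of the idea rather than Hae Jong's implementation.

```python
import numpy as np

def saliency_map(image, patch=3, sigma=0.2):
    """Each pixel's descriptor is its normalized local patch; saliency is
    the inverse of its summed similarity to every other descriptor."""
    h, w = image.shape
    r = patch // 2
    pad = np.pad(image, r, mode="edge")
    desc = np.stack([pad[i:i + patch, j:j + patch].ravel()
                     for i in range(h) for j in range(w)])
    desc /= np.linalg.norm(desc, axis=1, keepdims=True) + 1e-9
    sal = np.empty(h * w)
    for i in range(h * w):
        rho = desc @ desc[i]                 # cosine similarity to all pixels
        sal[i] = 1.0 / np.exp((rho - 1.0) / sigma ** 2).sum()
    return sal.reshape(h, w)

# A bright spot on a uniform background should stand out as salient.
img = np.full((7, 7), 0.5)
img[3, 3] = 1.0
sal = saliency_map(img)
```

The space-time extension replaces 2-D patches with small video cubes, so the same resemblance computation flags unusual motion as well as unusual appearance.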
LARK features can also be used for generic object and action detection. As you can see in the video, objects such as door knobs, the PR2 robot, and human faces can be detected using LARK. Space-time LARK can also detect human actions, such as waving, sitting down, and getting closer to the camera.
We're looking forward to seeing you in Taiwan at IROS 2010! If you're interested in checking out what Willow Garage has been up to lately, come check out our keynote presentation, workshops, and research talks. We'll be hosting a booth in the hall from Tuesday through Friday so you can come check out PR2 and chat with the Willow Garage team there, too.
Wednesday, October 20, 2010, in the Plenary Hall
The Status of ROS and the PR2 Beta Program
Steve Cousins, CEO
Tuesday-Friday, October 19-22, 2010
Taipei World Trade Centre Hall I, Booth #604
Come visit the Willow Garage booth to meet us and check out PR2!
Monday, October 18, 2010, 9:00-17:30, in Room 201E
Defining and Solving Realistic Perception Problems in Personal Robotics
Rusu, Pantofaru, Bradski, Konolige
As personal robotics platforms, such as the Willow Garage PR2, become increasingly available, there will be an emphasis in the research community on creating algorithms that are successful in the real world. Many robotics problems, such as planning or grasping, are currently addressed in simulation, where the state of the world is known. However, in the physical world, the assumption of a known world model falls apart and perception becomes a serious bottleneck. In this workshop, we will explore perception problems that are sufficiently well-defined and constrained to have proven solutions, and that enable interesting robotics research and behaviors.
Friday, October 22, 2010, 8:20-17:20, in Room 203A
Semantic mapping and autonomous knowledge acquisition
Holtz, Duckett, Rusu
As robots and autonomous systems move away from laboratory setups towards complex real-world scenarios, both the perception capabilities of these systems and their abilities to acquire and model semantic information must become more powerful. For example, a general-purpose service robot collaborating with a human user needs to know human spatial concepts and have an understanding of three-dimensional objects, their use and functional relationships between them. More generally, semantic perception and mapping must become a resource for the robot, which links sensory information to the robot's knowledge base and high-level deliberative components. A key issue is the robot's ability to autonomously acquire and model the perceived semantic information, as well as (exploration) strategies for deciding where and how to acquire such information.
Robot Industry Forum Presentation and Panel
Wednesday, October 20, 2010
13:40-14:15 in Room 201, Session: Robot Industry Forum
Open Source Robotics
Steve Cousins, CEO
16:00-17:30 in Room 201, Session: Robot Industry Forum
Research Paper Presentations
Monday, October 18, 2010
Afternoon in Room 202B
Workshop Paper - RoboEarth: Towards a World Wide Web for Robots
Ciocarlie, Bradski, Hsiao, Brook
Tuesday, October 19, 2010
11:20-11:40 in Room 101A, Session: Mapping I
Sparse pose adjustment for 2d mapping
Konolige, Grisetti, Limketkai, Burgard, Kuemmerle, Vincent
Pose graphs have become a popular representation for solving the simultaneous localization and mapping (SLAM) problem. A pose graph is a set of robot poses connected by nonlinear constraints obtained from observations of features common to nearby poses. Optimizing large pose graphs has been a bottleneck for mobile robots, since the computation time can grow cubically with the size of the graph. In this paper, we propose an efficient approach for optimizing 2D pose graphs that takes advantage of their sparse structure by efficiently constructing and solving a sparse linear sub-problem. Our method, called Sparse Pose Adjustment (SPA), outperforms all existing approaches in terms of convergence speed and accuracy. We demonstrate its effectiveness on a large set of indoor real-world maps, and a very large simulated dataset. Open-source implementations in C++ and MATLAB/Octave, along with the datasets, are publicly available.
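The sparse structure the abstract refers to can be seen in a toy 1-D pose graph: each constraint touches only two poses, so the normal-equation matrix is mostly zeros. This is not SPA itself (which iterates Gauss-Newton over 2D poses with a sparse Cholesky solver), just a sketch of the structure it exploits.

```python
import numpy as np

def optimize_poses(n, constraints, anchor=0.0):
    """constraints: list of (i, j, z) meaning 'pose j - pose i was measured
    as z'. Pose 0 is pinned at `anchor` to remove the global gauge freedom."""
    H = np.zeros((n, n))           # Laplacian-like, sparse for large graphs
    b = np.zeros(n)
    for i, j, z in constraints:
        # each constraint contributes only to rows/columns i and j
        H[i, i] += 1.0; H[j, j] += 1.0
        H[i, j] -= 1.0; H[j, i] -= 1.0
        b[i] -= z; b[j] += z
    H[0, 0] += 1e6                 # strong prior pinning pose 0
    b[0] += 1e6 * anchor
    return np.linalg.solve(H, b)

# Odometry says each step is 1.0; a loop-closure measurement says the total
# is 2.1, so the least-squares solution spreads the error over the chain.
poses = optimize_poses(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.1)])
```

In a real map with thousands of poses, exploiting that sparsity (rather than calling a dense solver as here) is what keeps optimization fast.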
15:20-16:40 in 1F Corridor, Session: Interactive Session IA
Contact-Reactive Grasping of Objects with Partial Shape Information
Hsiao, Chitta, Ciocarlie, Jones
Robotic grasping in unstructured environments requires the ability to select grasps for unknown objects and execute them while dealing with uncertainty due to sensor noise or calibration errors. In this work, we propose a simple but robust approach to grasp selection for unknown objects, and a reactive adjustment approach to deal with uncertainty in object location and shape. The grasp selection method uses 3D sensor data directly to determine a ranked set of grasps for objects in a scene, using heuristics based on both the overall shape of the object and its local features. The reactive grasping approach uses tactile feedback from fingertip sensors to execute a compliant robust grasp. We present experimental results to validate our approach by grasping a wide range of unknown objects. Our results show that reactive grasping can correct for a fair amount of uncertainty in the measured position or shape of the objects, and that our grasp selection approach is successful in grasping objects with a variety of shapes.
Wednesday, October 20, 2010
10:50-11:10 in Room 101C, Session: Range Sensing I
Fast 3D Recognition and Pose Using the Viewpoint Feature Histogram
Rusu, Bradski, Hsu
We present the Viewpoint Feature Histogram (VFH), a descriptor for 3D point cloud data that encodes geometry and viewpoint. We demonstrate experimentally on a set of 60 objects captured with stereo cameras that VFH can be used as a distinctive signature, allowing simultaneous recognition of the object and its pose. The pose is accurate enough for robot manipulation, and the computational cost is low enough for real time operation. VFH was designed to be robust to large surface noise and missing depth information in order to work reliably on stereo data.
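The viewpoint component of a VFH-style descriptor can be sketched as a histogram of angles between the surface normals and the viewing direction. The full descriptor also encodes the relative orientations of the normals themselves; this simplified version is for illustration only, not the PCL implementation.

```python
import numpy as np

def viewpoint_histogram(points, normals, viewpoint, bins=32):
    """Histogram of cosines between each (unit) surface normal and the
    direction from the cloud centroid to the viewpoint."""
    view_dir = viewpoint - points.mean(axis=0)
    view_dir /= np.linalg.norm(view_dir)
    cos_angles = normals @ view_dir
    hist, _ = np.histogram(cos_angles, bins=bins, range=(-1.0, 1.0))
    return hist / hist.sum()       # normalize so the descriptor is scale-free

# A flat patch facing the camera: every normal aligns with the view direction.
pts = np.array([[1.0, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0]])
nrm = np.tile(np.array([0.0, 0, 1.0]), (4, 1))
h = viewpoint_histogram(pts, nrm, np.array([0.0, 0, 10.0]))
```

Because the histogram depends on the viewpoint, the same object seen from different sides produces different signatures, which is what lets a single descriptor encode both identity and pose.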
Thursday, October 21, 2010
9:30-9:50 in Room 201E, Session: Haptics and Haptic Interfaces I
Multi-DOF Equalization of Haptic Devices for Accurate Rendering at High Frequencies
Wilson, Chan, Salisbury, Niemeyer
Previous work has shown that high frequency content is important for realistic haptic feedback, while stability considerations limit the ability of closed-loop control to effectively generate high frequencies. Open-loop playback of high frequencies offers a promising way to generate rich contact transients and textures, but complex high frequency dynamics cause distortion. This paper explores the equalization and dynamic decoupling of multi-DOF haptic devices for accurate open-loop playback. Toward this end, a user study is performed to determine the frequency limit of human force direction sensitivity at 35Hz. This information together with experimental system identification techniques is used to develop a strategy for equalization in different frequency bands. Finally, MIMO equalization is accomplished through online simulation of the system model under the control of an LQR tracking controller.
12:00-12:20 in Room 101B, Session: Force Control II
Examining the Benefits of Variable Impedance Actuation
Variable impedance actuators provide the ability to robustly alter interaction impedances mechanically, without bandwidth, power, and stability limitations. They can achieve the physical benefits of an elastic transmission and also recover characteristics of traditionally controlled, inelastic motors. We review previously explored benefits of variable impedance actuators for energetic tasks and impact safety. We then focus on benefits in low frequency force interactions. We examine impedance and force dynamic ranges and illustrate how they are significantly increased by physical impedance variation. Theoretical analysis is confirmed by experiments on a 1-DOF testbed with three impedance settings.
15:40-16:00 in Room 101C, Session: Telerobotics I
Multi-DOF Model-Reference Force Control for Telerobotic Applications
For a wide range of telerobotic applications, the slave device needs to be a large, powerful, industrial-type robot in order to achieve the desired tasks. Due to the large frictional forces within the gearing of such robots, a force-feedback controller is necessary to precisely control the forces the robot applies when manipulating its environment. This paper proves passivity, and therefore guarantees stability, of a model-based force controller in one degree of freedom (DOF) when subject to viscous and Coulomb friction. The controller is then expanded to multi-DOF systems. In addition to maintaining the robustness of the 1-DOF controller, the multi-DOF controller provides additional freedom to design the closed-loop dynamics of the robot. This freedom allows the control designer the ability to shape and optimize how the system feels from a user's perspective. The robustness of the controller is experimentally validated and the freedom to modify the closed-loop dynamics is explored using a 2-DOF device.
Bastian Steder, a PhD student from the Autonomous Intelligent Systems Group at the University of Freiburg, Germany, spent the summer at Willow Garage implementing an object recognition system using 3D point cloud data. With 3D sensors becoming cheaper and more widely available, they are a valuable tool for robot perception. 3D data provides extra information to a robot, such as distance and shape, that enables different approaches to identifying objects in the world. Bastian's work focused on using databases of 3D models to identify objects in this 3D sensor data.
The main focus of Bastian's work was the feature-extraction process for 3D data. One of his contributions was a novel interest point extraction method that operates on range images generated from arbitrary 3D point clouds. This method explicitly considers the borders of objects, identified by transitions from foreground to background. Bastian also developed a new feature descriptor type, called NARF (Normal Aligned Radial Features), that takes the same information into account. Based on matches of these features, Bastian then worked on a process to create a set of potential object poses, adding spatial verification steps to ensure these hypotheses fit the sensor data.
The full system can efficiently identify the existence and poses of arbitrary objects for which we have a point cloud model, using only the geometric information provided by the 3D sensor. Code for Bastian's work, including object recognition and feature extraction, has been integrated with PCL, a general library for 3D geometry processing in development at Willow Garage. To find out more, check out the point_cloud_perception stack on ROS.org. For detailed technical information, you can check out Bastian's presentation slides below (download PDF).
Adam Harmat from McGill University worked on three projects this summer to make the PR2 more dexterous when manipulating objects: a monitoring system for arm movement, a persistent 3D collision map, and a multi-table manipulation application. All of these projects demonstrated how increased knowledge of its environment is necessary for improving PR2's mobile manipulation capabilities.
The arm-monitoring system uses head-mounted stereo cameras to detect new obstacles. While the arm moves, the PR2 looks at locations that are a few seconds ahead of the arm's current position. Any detected obstacles are added to a collision map, and, if a future collision is anticipated, the arm stops and waits. If the new obstacle doesn't move, the PR2 will attempt to move around it.
The collision map was improved to store information about everything the robot has previously seen. This allows the PR2 to perform tasks that require it to relocate as it maintains knowledge about places it currently cannot see. This new collision map is based on Octomap, an open source package from the University of Freiburg. The octree structure of Octomap is more compact and also enables the storing of probabilistic values.
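The probabilistic values stored by Octomap are occupancy log-odds, updated with each sensor reading. The sketch below shows that update in its simplest form; the constants are illustrative, and the real library additionally clamps the values and stores them per octree node.

```python
import math

# Measurement model (illustrative): P(occupied | hit) = 0.7, P(occupied | miss) = 0.4
HIT = math.log(0.7 / 0.3)
MISS = math.log(0.4 / 0.6)

def update(logodds, hit):
    """Fuse one sensor reading into a cell's occupancy log-odds."""
    return logodds + (HIT if hit else MISS)

def probability(logodds):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(logodds))

cell = 0.0                         # prior: P(occupied) = 0.5
for reading in [True, True, False, True]:
    cell = update(cell, reading)
```

Working in log-odds turns Bayesian fusion into simple addition, which is why a persistent map can cheaply absorb every scan the robot takes while it moves around.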
No one wants a clumsy robot. As a result of these projects, the PR2 is able to maintain more knowledge about its local environment, and is able to keep its arms from bumping into objects. Adam developed a demo application to demonstrate these new capabilities.
In his multi-table manipulation demo, the PR2 continuously finds and moves objects between separate tables. This application is integrated with the ROS navigation stack to determine pickup locations and navigate between tables. Adam's multi-table application demonstrates how planning with the persistent collision map can be integrated with base movement and local task-execution into a complete system.