Seth Hutchinson

University of Illinois at Urbana-Champaign

Planning Under Uncertainty


A robot's knowledge about the state of the world (including the robot itself) derives from two sources: the actions that the robot performs, and the data it obtains using its sensors. The robot's actions, which are typically formalized in a process model, are subject to a variety of uncertainties, both intrinsic (e.g., inaccurate actuation of joints) and extrinsic (e.g., inaccurate models of friction for mobile robots). Sensing, which is typically formalized in an observation model, is also an inaccurate enterprise. Here uncertainties can be due to physical limitations of the sensor (e.g., depth of focus, limited resolution), or to noise in the sensing process.

Of course it is possible to ameliorate the effects of uncertainty at run time by designing better control systems or better sensing systems, or even by combining the two (e.g., as with visual servo control), but it is also possible to consider uncertainty during the planning process. With the latter approach, rather than designing algorithms to reduce uncertainty during execution, one designs plans that are robust to uncertainty during plan execution. This general approach is sometimes referred to as planning under uncertainty.

Over the years, my group has taken a variety of approaches to this problem. In the earliest work, uncertainty was modeled using bounded sets, and symbolic constraint manipulation techniques were used to propagate uncertainty through the steps in a plan. Later, we investigated Bayesian methods and formalized the concept of a performance preimage, which is essentially the set of configurations from which plan execution is guaranteed to exceed certain performance criteria, even in the presence of uncertainty. In our most recent work, we have adopted the formalism of partially observable Markov decision processes (POMDPs) and focused on efficient algorithms for building approximately optimal policies.

An overview of the main projects is given below.

Sampling in the Hyperbelief Space

James Davidson, Salvatore Candido and Seth Hutchinson

Typically, the location of a robot is described by a configuration. If there is uncertainty in the configuration, one can lift the configuration space to the belief space, which is essentially the set of possible probability functions over the robot's configuration (a belief is merely an a posteriori pdf on the robot's configuration). If a robot system plans into the future, future beliefs will depend on future sensor measurements, which cannot be known at planning time. To deal with this, we can lift the belief space to the hyperbelief space, which is merely the set of all possible pdfs on the belief space. In the hyperbelief space, policies have deterministic effects (i.e., for a specific control policy, the transition from one stage of execution to the next is deterministic), so in this higher-dimensional space one need not explicitly worry about the representation of uncertainty. This makes the hyperbelief space a convenient setting in which to model partially observable Markov decision processes (POMDPs).
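
To make the two liftings concrete, here is a minimal Python sketch (a toy two-state model with made-up transition and observation probabilities, not drawn from our work): a Bayes filter carries a belief forward given a received observation, while the hyperbelief step branches each belief over every possible observation, weighted by its likelihood, so that for a fixed action the map from one hyperbelief to the next contains no randomness.

    import numpy as np

    # Toy model (hypothetical numbers, for illustration only).
    # T[a][i, j] = P(next state j | current state i, action a)
    # Z[k, j]    = P(observation k | state j)
    T = {0: np.array([[0.9, 0.1],
                      [0.2, 0.8]])}
    Z = np.array([[0.7, 0.3],
                  [0.3, 0.7]])

    def bayes_update(belief, a, obs):
        """One Bayes-filter step: predict through the process model,
        then condition on the received observation."""
        predicted = belief @ T[a]            # prediction step
        posterior = Z[obs] * predicted       # measurement update
        return posterior / posterior.sum()

    def hyperbelief_step(hyperbelief, a):
        """Propagate a hyperbelief (here, a finite pdf over beliefs).
        Future observations are unknown, so each belief branches over
        all observations, weighted by their likelihoods; the overall
        map is deterministic."""
        out = []
        for belief, w in hyperbelief:
            predicted = belief @ T[a]
            for obs in range(Z.shape[0]):
                p_obs = Z[obs] @ predicted   # P(obs | belief, a)
                if p_obs > 0:
                    out.append((bayes_update(belief, a, obs), w * p_obs))
        return out

    # A hyperbelief concentrated on a single belief:
    h = [(np.array([0.5, 0.5]), 1.0)]
    h = hyperbelief_step(h, a=0)
    for b, w in h:
        print(w, b)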

We have developed an anytime, sampling-based algorithm for determining nearly optimal policies for total-cost, finite-horizon POMDPs. The technique, the sampling hyperbelief optimization technique (SHOT), builds a roadmap in the hyperbelief space. The method exploits the observation that small changes in a policy have little impact on the quality of the solution except at a small set of critical points. The result is a representation of the POMDP that is independent of the initial conditions and the particular cost function, so that both may vary without redoing the majority of the computation.
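
The following is not SHOT itself, but a minimal Python sketch of the idea that makes such a roadmap reusable: edges cache statistics computed offline (e.g., by simulating local policies between sampled hyperbeliefs), and the cost function is applied only at query time, so the expensive construction is shared across queries with different costs. All node names, edge statistics, and the cost-function signature here are hypothetical.

    import heapq

    def query(roadmap, start, goal, cost_fn):
        """Dijkstra over a precomputed roadmap.  roadmap[u] is a list
        of (v, stats) pairs; cost_fn maps cached edge statistics to a
        nonnegative edge cost, so it may change between queries."""
        dist, frontier, parent = {start: 0.0}, [(0.0, start)], {}
        while frontier:
            d, u = heapq.heappop(frontier)
            if u == goal:
                break
            if d > dist[u]:
                continue
            for v, stats in roadmap.get(u, []):
                nd = d + cost_fn(stats)
                if nd < dist.get(v, float("inf")):
                    dist[v], parent[v] = nd, u
                    heapq.heappush(frontier, (nd, v))
        if goal not in dist:
            return None, float("inf")
        path, node = [goal], goal        # reconstruct the path
        while node != start:
            node = parent[node]
            path.append(node)
        return path[::-1], dist[goal]

    # Hypothetical roadmap: nodes stand for sampled hyperbeliefs; each
    # edge caches, say, the expected time and residual uncertainty of a
    # local policy that was simulated offline.
    roadmap = {
        "h0": [("h1", {"time": 2.0, "uncertainty": 0.3})],
        "h1": [("h2", {"time": 1.0, "uncertainty": 0.1})],
    }
    path, c = query(roadmap, "h0", "h2",
                    cost_fn=lambda s: s["time"] + 5.0 * s["uncertainty"])
    print(path, c)
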
The basic ideas, along with some results, are given in a WAFR paper.

Minimum Uncertainty Robot Path Planning using a POMDP Approach

Sal Candido, James Davidson and Seth Hutchinson

We have developed a minimum uncertainty planning technique for mobile robots that localize using beacons. We model the system as a partially observable Markov decision process and use a sampling-based method in the belief space (the space of posterior probability density functions over the state space) to find a belief-feedback policy. This approach allows us to analyze the evolution of the belief more accurately, which can result in improved policies when common approximations do not model the true behavior of the system. Our experimental evaluations demonstrate that our method performs comparably to, and in certain cases better than, current methods in the literature.
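
The method itself maintains full posterior pdfs and samples in the belief space; purely as an illustration of why beacon proximity matters, the Python sketch below substitutes a Gaussian (EKF) belief with one range beacon and scores candidate paths by the trace of the final covariance, so the path passing near the beacon is preferred. The beacon location, noise parameters, and candidate paths are all invented for the example.

    import numpy as np

    BEACON = np.array([5.0, 0.0])
    Q = 0.05 * np.eye(2)     # process noise covariance (assumed)
    R = 0.1                  # range-measurement noise variance (assumed)

    def ekf_step(mean, cov, u):
        """Move by u, then fuse a range measurement to the beacon.  At
        planning time the actual reading is unknown, so only the
        covariance is updated (linearized about the predicted mean)."""
        mean = mean + u                    # prediction
        cov = cov + Q
        diff = mean - BEACON               # measurement update
        rng = np.linalg.norm(diff)
        H = (diff / rng).reshape(1, 2)     # Jacobian of the range model
        S = H @ cov @ H.T + R
        K = cov @ H.T / S
        cov = (np.eye(2) - K @ H) @ cov
        return mean, cov

    def path_uncertainty(mean, cov, controls):
        for u in controls:
            mean, cov = ekf_step(mean, cov, u)
        return np.trace(cov)

    mean0, cov0 = np.zeros(2), 0.5 * np.eye(2)
    # Two candidate paths to (4, 4): one passes near the beacon.
    near = [np.array([1.0, 0.0])] * 4 + [np.array([0.0, 1.0])] * 4
    far  = [np.array([0.0, 1.0])] * 4 + [np.array([1.0, 0.0])] * 4
    print("near beacon:", path_uncertainty(mean0, cov0, near))
    print("far beacon: ", path_uncertainty(mean0, cov0, far))
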
Preliminary results are given in a forthcoming IROS paper (not yet available).

Exploiting Domain Knowledge when Learning Policies for POMDPs

Sal Candido, James Davidson and Seth Hutchinson

We have developed a planning algorithm that allows user-supplied domain knowledge to be exploited in the synthesis of information feedback policies for systems modeled as partially observable Markov decision processes (POMDPs). POMDP models, which are increasingly popular in the robotics literature, permit a planner to consider future uncertainty in both the application of actions and the sensing of observations. With our approach, domain experts can inject specialized knowledge into the planning process by providing a set of local policies that are used as primitives by the planner. If the local policies are chosen appropriately, the planner can evaluate further into the future, even for large problems, which can lead to better overall policies at decreased computational cost. We use a structured approach to encode the provided domain knowledge into the value function approximation.
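
As a toy illustration of the primitive idea (an entirely hypothetical domain, and simple Monte Carlo rollouts rather than the paper's structured value-function encoding), the Python sketch below treats two expert-supplied local policies as the planner's action set and chooses between them by simulated evaluation from the current belief.

    import random

    random.seed(0)

    # Toy domain (hypothetical): a robot at an integer position must
    # reach 0; two expert-supplied local policies are the primitives.
    def policy_fast(state):       # large, noisy steps
        return state - random.choice([1, 2, 3])

    def policy_careful(state):    # small, reliable steps
        return state - 1

    def simulate(state, policy, horizon):
        """Run one local policy from a sampled state; cost counts
        steps, with penalties for overshooting or failing to finish."""
        cost = 0
        for _ in range(horizon):
            if state == 0:
                return cost
            state = policy(state)
            cost += 1
            if state < 0:
                return cost + 10      # overshot the goal
        return cost + abs(state)      # ran out of time

    def rollout_cost(belief_samples, policy, horizon, trials=200):
        """Monte Carlo estimate of a local policy's expected cost from
        the current belief (represented here by samples)."""
        total = 0.0
        for _ in range(trials):
            total += simulate(random.choice(belief_samples), policy, horizon)
        return total / trials

    belief = [3, 4, 5]   # samples from the current belief over position
    for name, policy in [("fast", policy_fast), ("careful", policy_careful)]:
        print(name, rollout_cost(belief, policy, horizon=20))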

We have applied our approach to a multi-robot fire fighting problem, in which a team of robots cooperates to extinguish a spreading fire, modeled as a stochastic process. The state space for this problem is significantly larger than is typical in the POMDP literature, and the geometry of the problem allows for the application of an intuitive set of local policies, thus demonstrating the effectiveness of our approach.
The early results for this project are given in an ICRA paper.

An Objective-Based Framework for Motion Planning Under Sensing and Control Uncertainties

Steve LaValle and Seth Hutchinson

This work was initially inspired by the preimage planning methods of Lozano-Pérez, Mason, and Taylor, of Erdmann, and of Donald. Preimage planning involves worst-case analysis. Our method provides a general framework for determining feedback strategies by blending ideas from stochastic optimal control and dynamic game theory with traditional preimage planning concepts. This generalizes classical preimages to performance preimages, and classical preimage planning to the design of motion strategies with information feedback. For a given problem, one defines a performance criterion that evaluates any executed trajectory of the robot. Our method then selects a motion strategy that optimizes this criterion under either nondeterministic uncertainty (resulting in worst-case analysis) or probabilistic uncertainty (resulting in expected-case analysis). We implemented the method using dynamic programming algorithms that numerically compute performance preimages and optimal strategies.
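
A minimal sketch of that dynamic programming computation, on a toy 1-D grid with made-up outcome probabilities: backups combine the outcomes of each action either by a worst-case maximum (nondeterministic uncertainty) or by an expectation (probabilistic uncertainty), and the performance preimage is read off as the set of states whose optimal cost-to-go satisfies a chosen performance bound.

    import numpy as np

    # Toy 1-D grid; each action has uncertain outcomes (hypothetical
    # probabilities).  The goal is state 0 and every step costs 1.
    N, GOAL, BOUND = 20, 0, 12.0
    ACTIONS = {"left":  [(-2, 0.2), (-1, 0.8)],
               "right": [(0, 0.2), (+1, 0.8)]}

    def backup(V, worst_case):
        Vn = V.copy()
        for s in range(N):
            if s == GOAL:
                continue
            qs = []
            for outcomes in ACTIONS.values():
                succ = [(V[min(max(s + d, 0), N - 1)], p)
                        for d, p in outcomes]
                if worst_case:          # nondeterministic uncertainty
                    qs.append(1 + max(v for v, _ in succ))
                else:                   # probabilistic uncertainty
                    qs.append(1 + sum(p * v for v, p in succ))
            Vn[s] = min(qs)
        return Vn

    for worst in (False, True):
        V = np.full(N, np.inf)
        V[GOAL] = 0.0
        for _ in range(200):            # value iteration to convergence
            V = backup(V, worst)
        preimage = [s for s in range(N) if V[s] <= BOUND]
        print("worst-case:" if worst else "expected:", preimage)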

The image above shows, from left to right, computed optimal strategies, performance preimages and several sample paths.
The primary results for this project are given in Steve's IJRR paper.

Optimizing Robot Motion Strategies for Assembly with Stochastic Models of the Assembly Process

Rajeev Sharma, Steve LaValle and Seth Hutchinson

Gross-motion planning for assembly is commonly treated as a distinct, isolated step between task sequencing/scheduling and fine-motion planning. In this work, we formulate the problem of delivering parts for assembly in a manner that integrates it with both the manufacturing process and the fine motions involved in the final assembly stages. One distinct characteristic of gross-motion planning for assembly is the prevalence of uncertainty involving time (in part arrivals, in request arrivals, etc.). Our approach uses a stochastic representation of the assembly process and produces a state-feedback controller that minimizes the expected time that parts wait to be delivered. This leads to increased performance and a greater likelihood of stability in the manufacturing process. We demonstrated the method on six specific instances of the general framework, each of which was solved to yield optimal motion strategies for different robots operating under different assembly situations.
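
As a toy instance of the framework (all numbers hypothetical, and far simpler than the problems treated in the paper), the Python sketch below solves for such a state-feedback policy by value iteration: two stations sit at the ends of a corridor, parts arrive at random, and cost accrues for every step a part waits; the resulting policy positions the idle robot nearer the busier station in anticipation of arrivals.

    from itertools import product

    L = 4                                  # stations at positions 0 and L
    P_ARRIVE = {0: 0.3, L: 0.1}            # per-step arrival probabilities
    GAMMA = 0.95
    STATES = list(product(range(L + 1), [None, 0, L]))
    MOVES = (-1, 0, +1)

    def step(state, a):
        """Return [(next_state, prob, cost), ...] for one robot move."""
        r, w = state
        r2 = min(max(r + a, 0), L)
        if w is None:                      # no part yet: one may arrive
            p0, pL = P_ARRIVE[0], P_ARRIVE[L]
            return [((r2, 0), p0, 0.0), ((r2, L), pL, 0.0),
                    ((r2, None), 1.0 - p0 - pL, 0.0)]
        if r2 == w:                        # robot reaches the part: served
            return [((r2, None), 1.0, 0.0)]
        return [((r2, w), 1.0, 1.0)]       # part waits one more step

    def q(V, s, a):
        return sum(p * (c + GAMMA * V[t]) for t, p, c in step(s, a))

    V = {s: 0.0 for s in STATES}
    for _ in range(500):                   # value iteration
        V = {s: min(q(V, s, a) for a in MOVES) for s in STATES}

    policy = {s: min(MOVES, key=lambda a: q(V, s, a)) for s in STATES}
    # With these numbers the robot idles nearer station 0, the busier one.
    print({s: policy[s] for s in STATES if s[1] is None})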

The image to the left shows a stylized representation of an assembly process in which parts arrive at random times from multiple sources and must be assembled in the robotic work cell.
The main results are given in an associated paper.

SPAR: A Planner that Satisfies Operational and Geometric Goals in Uncertain Environments

Avi Kak and Seth Hutchinson

SPAR (simultaneous planner for assembly robots) is a planner that reasons about high-level operational goals, geometric goals, and uncertainty-reduction goals to create task plans for an assembly robot. These plans contain manipulations to achieve the assembly goals and sensory operations to cope with uncertainties in the robot's environment. High-level goals (which we refer to as operational goals) are satisfied by adding operations to the plan using a nonlinear, constraint-posting method. Geometric goals are satisfied by placing constraints on the execution of these operations; if the geometric configuration of the world prevents this, SPAR adds new operations to the plan, along with the necessary constraints on their execution. When the uncertainty in the world description exceeds the bounds specified by the uncertainty-reduction goals, SPAR introduces either sensing operations or manipulations to reduce this uncertainty to acceptable levels. If SPAR cannot find a way to reduce uncertainties sufficiently, it augments the plan with sensing operations that verify the execution of the action and, when possible, posts error-recovery plans.
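
The uncertainty-reduction logic can be caricatured in a few lines of Python (every operation name, uncertainty value, and sensor model here is hypothetical, and SPAR's actual reasoning is symbolic and far richer): each manipulation carries a bound on acceptable uncertainty, a sensing operation is spliced in when the bound is violated, and verification is the fallback when no sensor suffices.

    # Hypothetical plan and sensor models, for illustration only.
    def augment_plan(plan, uncertainty, sensors):
        """plan: list of (operation, max acceptable uncertainty);
        sensors: list of (sensing op, uncertainty it achieves)."""
        out = []
        for op, required in plan:
            if uncertainty > required:
                usable = [(s, u) for s, u in sensors if u <= required]
                if usable:
                    sense_op, u = min(usable, key=lambda su: su[1])
                    out.append(sense_op)        # reduce, then act
                    uncertainty = u
                    out.append(op)
                else:
                    out.append(op)              # cannot reduce enough:
                    out.append("verify:" + op)  # verify execution instead
            else:
                out.append(op)
            uncertainty += 0.01                 # acting adds uncertainty
        return out

    plan = [("grasp_peg", 0.02), ("insert_peg", 0.005)]
    sensors = [("camera_scan", 0.01), ("touch_probe", 0.004)]
    print(augment_plan(plan, uncertainty=0.05, sensors=sensors))
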
The main ideas are given in an associated paper.