Publications

* denotes equal contribution and joint lead authorship.

2022

arXiv

Learning Perceptual Concepts by Bootstrapping from Human Queries

Andreea Bobu, Chris Paxton, Wei Yang, Balakumar Sundaralingam, Yu-Wei Chao, Maya Cakmak, and Dieter Fox.

ArXiv Preprint, 2022

Abstract PDF Project Website

Robots need to be able to learn concepts from their users in order to adapt their capabilities to each user’s unique task. But when the robot operates on high-dimensional inputs, like images or point clouds, this is impractical: the robot needs an unrealistic amount of human effort to learn the new concept. To address this challenge, we propose a new approach whereby the robot learns a low-dimensional variant of the concept and uses it to generate a larger data set for learning the concept in the high-dimensional space. This lets it take advantage of semantically meaningful privileged information only accessible at training time, like object poses and bounding boxes, that allows for richer human interaction to speed up learning. We evaluate our approach by learning prepositional concepts that describe object state or multi-object relationships, like above, near, or aligned, which are key to user specification of task goals and execution constraints for robots. Using a simulated human, we show that our approach improves sample complexity when compared to learning concepts directly in the high-dimensional space. We also demonstrate the utility of the learned concepts in motion planning tasks on a 7-DoF Franka robot.
IJRR

Inducing Structure in Reward Learning by Learning Features

Andreea Bobu, Marius Wiggert, Claire J. Tomlin, and Anca D. Dragan.

The International Journal of Robotics Research

Abstract PDF

Reward learning enables robots to learn adaptable behaviors from human input. Traditional methods model the reward as a linear function of hand-crafted features, but that requires specifying all the relevant features a priori, which is impossible for real-world tasks. To get around this issue, recent deep Inverse Reinforcement Learning (IRL) methods learn rewards directly from the raw state but this is challenging because the robot has to implicitly learn the features that are important and how to combine them, simultaneously. Instead, we propose a divide and conquer approach: focus human input specifically on learning the features separately, and only then learn how to combine them into a reward. We introduce a novel type of human input for teaching features and an algorithm that utilizes it to learn complex features from the raw state space. The robot can then learn how to combine them into a reward using demonstrations, corrections, or other reward learning frameworks. We demonstrate our method in settings where all features have to be learned from scratch, as well as where some of the features are known. By first focusing human input specifically on the feature(s), our method decreases sample complexity and improves generalization of the learned reward over a deepIRL baseline. We show this in experiments with a physical 7DOF robot manipulator, as well as in a user study conducted in a simulated environment.

2021

ICRA

Situational Confidence Assistance for Lifelong Shared Autonomy

Matthew Zurek*, Andreea Bobu*, Daniel S. Brown, and Anca D. Dragan.

International Conference on Robot Learning (ICRA), 2021

Abstract PDF Talk

Shared autonomy enables robots to infer user intent and assist in accomplishing it. But when the user wants to do a new task that the robot does not know about, shared autonomy will hinder their performance by attempting to assist them with something that is not their intent. Our key idea is that the robot can detect when its repertoire of intents is insufficient to explain the user's input, and give them back control. This then enables the robot to observe unhindered task execution, learn the new intent behind it, and add it to this repertoire. We demonstrate with both a case study and a user study that our proposed method maintains good performance when the human's intent is in the robot's repertoire, outperforms prior shared autonomy approaches when it isn't, and successfully learns new skills, enabling efficient lifelong learning for confidence-based shared autonomy.
ICRA

Dynamically Switching Human Prediction Models for Efficient Planning

Arjun Sripathy*, Andreea Bobu*, Daniel S. Brown, and Anca D. Dragan.

International Conference on Robot Learning (ICRA), 2021

Abstract PDF Project Website Code Talk

As environments involving both robots and humans become increasingly common, so does the need to account for people during planning. To plan effectively, robots must be able to respond to and sometimes influence what humans do. This requires a human model which predicts future human actions. A simple model may assume the human will continue what they did previously; a more complex one might predict that the human will act optimally, disregarding the robot; whereas an even more complex one might capture the robot's ability to influence the human. These models make different trade-offs between computational time and performance of the resulting robot plan. Using only one model of the human either wastes computational resources or is unable to handle critical situations. In this work, we give the robot access to a suite of human models and enable it to assess the performance-computation trade-off online. By estimating how an alternate model could improve human prediction and how that may translate to performance gain, the robot can dynamically switch human models whenever the additional computation is justified. Our experiments in a driving simulator showcase how the robot can achieve performance comparable to always using the best human model, but with greatly reduced computation.
HRI

Feature Expansive Reward Learning: Rethinking Human Input

Andreea Bobu*, Marius Wiggert*, Claire J. Tomlin, and Anca D. Dragan.

ACM/IEEE International Conference on Human Robot Interaction (HRI), 2021
Best Paper Award Finalist

Abstract PDF Project Website Code Talk

When a person is not satisfied with how a robot performs a task, they can intervene to correct it. Reward learning methods enable the robot to adapt its reward function online based on such human input, but they rely on handcrafted features. When the correction cannot be explained by these features, recent work in deep Inverse Reinforcement Learning (IRL) suggests that the robot could ask for task demonstrations and recover a reward defined over the raw state space. Our insight is that rather than implicitly learning about the missing feature(s) from demonstrations, the robot should instead ask for data that explicitly teaches it about what it is missing. We introduce a new type of human input in which the person guides the robot from states where the feature being taught is highly expressed to states where it is not. We propose an algorithm for learning the feature from the raw state space and integrating it into the reward function. By focusing the human input on the missing feature, our method decreases sample complexity and improves generalization of the learned reward over the above deep IRL baseline. We show this in experiments with a physical 7DOF robot manipulator, as well as in a user study conducted in a simulated environment.

2020

HRI

LESS is More: Rethinking Probabilistic Models of Human Behavior

Andreea Bobu*, Dexter R. R. Scobee*, Jaime F. Fisac, S. Shankar Sastry, and Anca D. Dragan.

ACM/IEEE International Conference on Human Robot Interaction (HRI), 2020
Best Paper Award Winner

Abstract PDF Project Website Talk

Robots need models of human behavior for both inferring human goals and preferences, and predicting what people will do. A common model is the Boltzmann noisily-rational decision model, which assumes people approximately optimize a reward function and choose trajectories in proportion to their exponentiated reward. While this model has been successful in a variety of robotics domains, its roots lie in econometrics, and in modeling decisions among different discrete options, each with its own utility or reward. In contrast, human trajectories lie in a continuous space, with continuous-valued features that influence the reward function. We propose that it is time to rethink the Boltzmann model, and design it from the ground up to operate over such trajectory spaces. We introduce a model that explicitly accounts for distances between trajectories, rather than only their rewards. Rather than each trajectory affecting the decision independently, similar trajectories now affect the decision together. We start by showing that our model better explains human behavior in a user study. We then analyze the implications this has for robot inference, first in toy environments where we have ground truth and find more accurate inference, and finally for a 7DOF robot arm learning from user demonstrations.
HRI

Detecting Hypothesis Space Misspecification in Robot Learning from Human Input

Andreea Bobu

HRI Pioneers Companion of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2020.

Abstract PDF

Learning from human input has enabled autonomous agents to perform increasingly more complex tasks that are otherwise difficult to carry out automatically. To this end, recent works have studied how robots can incorporate such input - like demonstrations or corrections - into objective functions describing the desired behaviors. While these methods have shown progress in a variety of settings, from semi-autonomous driving, to household robotics, to automated airplane control, they all suffer from the same crucial drawback: they implicitly assume that the person's intentions can always be captured by the robot's hypothesis space. We call attention to the fact that this assumption is often unrealistic, as no model can completely account for every single possible situation ahead of time. When the robot's hypothesis space is misspecified, human input can be unhelpful - or even detrimental - to the way the robot is performing its tasks. Our work tackles this issue by proposing that the robot should first explicitly reason about how well its hypothesis space can explain human inputs, then use that situational confidence to inform how it should incorporate them.

2019

T-RO

Quantifying Hypothesis Space Misspecification in Learning from Human-Robot Demonstrations and Physical Corrections

Andreea Bobu, Andrea Bajcsy, Jaime F. Fisac, Sampada Deglurkar, and Anca D. Dragan.

IEEE Transactions on Robotics (T-RO)
Best Paper Award Honorable Mention

Abstract PDF Code Poster

Human input has enabled autonomous systems to improve their capabilities and achieve complex behaviors that are otherwise challenging to generate automatically. Recent work focuses on how robots can use such input - like demonstrations or corrections - to learn intended objectives. These techniques assume that the human’s desired objective already exists within the robot’s hypothesis space. In reality, this assumption is often inaccurate: there will always be situations where the person might care about aspects of the task that the robot does not know about. Without this knowledge, the robot cannot infer the correct objective. Hence, when the robot’s hypothesis space is misspecified, even methods that keep track of uncertainty over the objective fail because they reason about which hypothesis might be correct, and not whether any of the hypotheses are correct. In this paper, we posit that the robot should reason explicitly about how well it can explain human inputs given its hypothesis space and use that situational confidence to inform how it should incorporate human input. We demonstrate our method on a 7 degree-of-freedom robot manipulator in learning from two important types of human input: demonstrations of manipulation tasks, and physical corrections during the robot’s task execution.

2018

CoRL

Learning under Misspecified Objective Spaces

Andreea Bobu, Andrea Bajcsy, Jaime F. Fisac, and Anca D. Dragan.

Conference on Robot Learning (CoRL), 2018
Invited to Special Issue

Abstract PDF Code Video Poster Talk

Learning robot objective functions from human input has become increasingly important, but state-of-the-art techniques assume that the human's desired objective lies within the robot's hypothesis space. When this is not true, even methods that keep track of uncertainty over the objective fail because they reason about which hypothesis might be correct, and not whether any of the hypotheses are correct. We focus specifically on learning from physical human corrections during the robot's task execution, where not having a rich enough hypothesis space leads to the robot updating its objective in ways that the person did not actually intend. We observe that such corrections appear irrelevant to the robot, because they are not the best way of achieving any of the candidate objectives. Instead of naively trusting and learning from every human interaction, we propose robots learn conservatively by reasoning in real time about how relevant the human's correction is for the robot's hypothesis space. We test our inference method in an experiment with human interaction data, and demonstrate that this alleviates unintended learning in an in-person user study with a 7DoF robot manipulator.
ICLR

Adapting to Continuously Shifting Domains

Andreea Bobu, Eric Tzeng, Judy Hoffman, and Trevor Darrell.

International Conference on Learning Representations (ICLR) Workshop, 2018

Abstract PDF Poster

Domain adaptation typically focuses on adapting a model from a single source domain to a target domain. However, in practice, this paradigm of adapting from one source to one target is limiting, as different aspects of the real world such as illumination and weather conditions vary continuously and cannot be effectively captured by two static domains. Approaches that attempt to tackle this problem by adapting from a single source to many different target domains simultaneously are consistently unable to learn across all domain shifts. Instead, we propose an adaptation method that exploits the continuity between gradually varying domains by adapting in sequence from the source to the most similar target domain. By incrementally adapting while simultaneously efficiently regularizing against prior examples, we obtain a single strong model capable of recognition within all observed domains. Our method is applicable on a wide variety of learning settings, including visual classification and reinforcement learning in a video game domain.

2016

MICCAI

Patch-Based Discrete Registration of Clinical Brain Images

Adrian Dalca, Andreea Bobu, Natalia S. Rost, and Polina Golland.

Patch-based Techniques in Medical Imaging at the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI-PATCHMI), 2016
Best Paper Award Winner

Abstract PDF Code Poster

We introduce a method for registration of brain images acquired in clinical settings. The algorithm relies on three-dimensional patches in a discrete registration framework to estimate correspondences. Clinical images present significant challenges for computational analysis. Fast acquisition often results in images with sparse slices, severe artifacts, and variable fields of view. Yet, large clinical datasets hold a wealth of clinically relevant information. Despite significant progress in image registration, most algorithms make strong assumptions about the continuity of image data, failing when presented with clinical images that violate these assumptions. In this paper, we demonstrate a non-rigid registration method for aligning such images. The method explicitly models the sparsely available image information to achieve robust registration. We demonstrate the algorithm on clinical images of stroke patients. The proposed method outperforms state of the art registration algorithms and avoids catastrophic failures often caused by these images. We provide a freely available open source implementation of the algorithm.

Publications

* denotes equal contribution and joint lead authorship.

2022

Learning Perceptual Concepts by Bootstrapping from Human Queries

Inducing Structure in Reward Learning by Learning Features

2021

Situational Confidence Assistance for Lifelong Shared Autonomy

Dynamically Switching Human Prediction Models for Efficient Planning

Feature Expansive Reward Learning: Rethinking Human Input

2020

LESS is More: Rethinking Probabilistic Models of Human Behavior

Detecting Hypothesis Space Misspecification in Robot Learning from Human Input

2019

Quantifying Hypothesis Space Misspecification in Learning from Human-Robot Demonstrations and Physical Corrections

2018

Learning under Misspecified Objective Spaces

Adapting to Continuously Shifting Domains

2016

Patch-Based Discrete Registration of Clinical Brain Images