The Intervention-based Imitation Learning (IIL) Family

From DAgger, to HG-DAgger, and more recent advances. Dataset Aggregation (DAgger) is an imitation learning algorithm proposed in the AISTATS 2011 paper A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning by Stéphane Ross, Geoffrey J. Gordon, and J. Andrew Bagnell. It is a simple yet effective algorithm that has been widely used in imitation learning, and as you can tell from the title, it is not related to human-in-the-loop RL....

October 21, 2023 · Dibbla
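
As a quick orientation, here is a minimal sketch of the DAgger loop in Python. The `env`, `expert`, and `policy` interfaces are hypothetical placeholders, not the paper's pseudocode: at each iteration the current policy is rolled out, every visited state is labeled with the expert's action, and the policy is retrained on the aggregated dataset.

```python
# Minimal DAgger loop sketch (hypothetical env/expert/policy interfaces):
# roll out a mixture policy, relabel visited states with expert actions,
# aggregate into one dataset, and retrain with supervised learning.
import numpy as np

def dagger(env, expert, policy, n_iters=10, horizon=200, beta0=1.0):
    states, actions = [], []                      # aggregated dataset D
    for i in range(n_iters):
        beta = beta0 * (0.5 ** i)                 # decaying expert mixing rate
        s = env.reset()
        for _ in range(horizon):
            # execute a mixture of expert and learner, as in DAgger's beta schedule
            a_exec = expert.act(s) if np.random.rand() < beta else policy.act(s)
            states.append(s)
            actions.append(expert.act(s))         # always label with the expert
            s, done = env.step(a_exec)            # assumed (next_state, done) return
            if done:
                break
        policy.fit(states, actions)               # supervised learning on all of D
    return policy
```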

Notes on Generalization/Cross-Embodiment Experiments

In the paper Generalizable Imitation Learning from Observation via Inferring Goal Proximity, the idea of task structure/task information is introduced without further citation or reference: this high-level task structure generalizes to new situations and thus helps an agent quickly learn the task there. On current AIRL methods, the paper notes: "However, such learned reward functions often overfit to the expert demonstrations by learning spurious correlations between task-irrelevant features and expert/agent labels" (CoRL21), and thus they fail to generalize to initial and goal configurations that differ slightly from the ones seen in the demonstrations (e....

October 25, 2022 · Dibbla
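
To make the quoted failure mode concrete, here is a simplified, GAIL-style sketch of a discriminator-based learned reward in PyTorch (hypothetical code, not the exact AIRL parameterization): since the discriminator only has to separate expert from agent observations, any task-irrelevant feature that differs between the two datasets suffices, which is exactly the spurious correlation described above.

```python
# Simplified discriminator-reward sketch (GAIL-style, not AIRL's exact
# f = g(s) + gamma*h(s') - h(s) structure). The discriminator logit is
# reused as the reward; if task-irrelevant features (e.g. background
# pixels) differ between expert and agent data, the discriminator can
# separate the two from those features alone and the reward overfits.
import torch
import torch.nn as nn

class LearnedReward(nn.Module):
    def __init__(self, obs_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, obs):
        return self.net(obs).squeeze(-1)          # logit f(s), used as reward

def discriminator_loss(f, expert_obs, agent_obs):
    # binary cross-entropy with expert transitions labeled 1, agent 0
    bce = nn.BCEWithLogitsLoss()
    loss_e = bce(f(expert_obs), torch.ones(len(expert_obs)))
    loss_a = bce(f(agent_obs), torch.zeros(len(agent_obs)))
    return loss_e + loss_a
```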

RL generalization: Generalizable LfO via Inferring Goal Proximity

Paper Here; Official Blog Here. Generalizable Imitation Learning from Observation via Inferring Goal Proximity is a NeurIPS 2021 paper that focuses on the generalization problem of Learning from Observation (LfO). The idea of the paper is quite straightforward and needs little mathematical machinery. In this post I will present the high-level idea and the experimental setup of the paper. Preliminaries: LfO and the "goal" idea. LfO is an imitation learning setting where we cannot access the action information of the experts' demonstrations....

October 22, 2022 · Dibbla
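
A rough sketch of the goal-proximity idea, under assumed interfaces rather than the authors' released code: a proximity network is regressed toward discounted targets that peak at the goal along expert state trajectories, and its output can then serve as a dense reward. Only states are needed, which matches the LfO setting.

```python
# Goal-proximity sketch (hypothetical interfaces, not the official release):
# learn a scalar proximity in [0, 1] that grows toward the goal along
# expert trajectories; the trained network then provides a dense,
# action-free reward signal for policy learning.
import torch
import torch.nn as nn

class Proximity(nn.Module):
    def __init__(self, obs_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid()    # proximity in (0, 1)
        )

    def forward(self, obs):
        return self.net(obs).squeeze(-1)

def proximity_targets(traj_len, discount=0.95):
    # discounted target discount**(T - 1 - t): 1 at the goal state,
    # decaying for states further from it along the expert trajectory
    t = torch.arange(traj_len, dtype=torch.float32)
    return discount ** (traj_len - 1 - t)
```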