Generalization & Imitation Learning: IRL Identifiability Part1

Paper reference Paper1: Towards Resolving Unidentifiability in Inverse Reinforcement Learning HERE Paper2: Identifiability in inverse reinforcement learning HERE Paper3: Identifiability and generalizability from multiple experts in Inverse Reinforcement Learning HERE This papers are quite theoretical and not so easy to read. But they, at least for me, reveals something to do with generalization. Preliminaries: IRL & Identifiability IRL, as a subset of Imitation Learning, aims to recover the reward function of certain MDP, given the reward-free environment $E$ and an optimal agent policy $\pi$....

September 30, 2022 · Dibbla

Pre-training with RL: APT

Behave From the Void: Unsupervised Active Pre-training paper The paper, Behave From the Void: Unsupervised Active Pre-training, proposed a new method for pretraining RL agents, APT , which is claimed to beat all baselines on DMControl Suite. As the abstract pointed out: the key novel idea is to explore the environment by maximizing a non-parametric entropy computed in a abstract representation space. This blog will take a look at the motivation, method and explanation of the paper, as well as compare it with the other AAAI paper....

September 24, 2022 · Dibbla

Pre-training with RL: APT

Behave From the Void: Unsupervised Active Pre-training paper The paper, Behave From the Void: Unsupervised Active Pre-training, proposed a new method for pretraining RL agents, APT , which is claimed to beat all baselines on DMControl Suite. As the abstract pointed out: the key novel idea is to explore the environment by maximizing a non-parametric entropy computed in a abstract representation space. This blog will take a look at the motivation, method and explanation of the paper, as well as compare it with the other AAAI paper....

September 24, 2022 · Dibbla

RL generalization: 2 Evaluations

It is obvious that to propose a problem better, one has to illustrate the problem well. RL generalization, as the survey indicated, is a class of problems. And here, we show two benchmark environments and their common experiment settings. Procgen Following Coinrun, OpenAI’s team proposed a new testing environment called procgen. Consisting of 16 games, the Procgen provides a convenient way to generate environments procedurally that share the same underlying logic and reward but are different in layout and rendering....

September 24, 2022 · Dibbla

RL generalization: 2 Evaluations

It is obvious that to propose a problem better, one has to illustrate the problem well. RL generalization, as the survey indicated, is a class of problems. And here, we show two benchmark environments and their common experiment settings. Procgen Following Coinrun, OpenAI’s team proposed a new testing environment called procgen. Consisting of 16 games, the Procgen provides a convenient way to generate environments procedurally that share the same underlying logic and reward but are different in layout and rendering....

September 24, 2022 · Dibbla