Several thoughts on automated science

Thu, 21 May 2026 00:19:20 -0700

When I first wrote down the hook for this post last month, I was thinking about something narrow: AI models have become very good at coding, and this is changing how machine learning research gets done. My thought was that the field might be shifting from a workflow problem to an evaluation problem, where the hard part is no longer doing the work but judging which results advance human understanding.

The intervention-based imitation learning (IIL) family

Sat, 21 Oct 2023 11:37:48 +0800

Update Nov 2025: I was surprised to see PI integrate an IIL method into the $\pi^{*}_{0.6}$ model, and I firmly believe that humans and end users will be integrated into the post-post-training of robotic foundation models in some way.

In this post, we discuss online imitation learning with a human in the loop, from DAgger to HG-DAgger and more recent advances.

DAgger

Dataset Aggregation (DAgger) is an imitation learning algorithm proposed in the AISTATS 2011 paper A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning by Stéphane Ross, Geoffrey J. Gordon and J. Andrew Bagnell. It is a simple yet effective algorithm that has been widely used in imitation learning and, as you can tell from the title, it is not related to human-in-the-loop RL.
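To make the idea concrete, here is a minimal sketch of the DAgger loop on a toy problem. Everything in it is an assumption made for illustration: the 1-D dynamics, the proportional-controller `expert_action`, the linear learner, and the helper names are hypothetical and not taken from the paper. It also simplifies the paper's mixture policy by using the expert only in the first iteration and the learner afterwards.

```python
import numpy as np


def expert_action(state):
    # Hypothetical expert: a proportional controller that pushes the state toward 0.
    return -0.5 * state


def rollout(policy, rng, horizon=20):
    # Run `policy` in a toy 1-D system x_{t+1} = x_t + a_t + noise
    # and return the states it visited.
    state = rng.uniform(-1.0, 1.0)
    states = []
    for _ in range(horizon):
        states.append(state)
        state = state + policy(state) + 0.01 * rng.normal()
    return np.array(states)


def fit_linear_policy(states, actions):
    # Supervised learning step: least-squares fit of a = w * s.
    w = float(states @ actions / (states @ states))
    return (lambda s: w * s), w


def dagger(n_iters=5, seed=0):
    rng = np.random.default_rng(seed)

    # Iteration 0: collect states by running the expert itself.
    data_s = rollout(expert_action, rng)
    data_a = expert_action(data_s)
    policy, w = fit_linear_policy(data_s, data_a)

    for _ in range(n_iters):
        # Visit states under the *learner's* policy ...
        new_s = rollout(policy, rng)
        # ... but label them with the expert's actions.
        new_a = expert_action(new_s)
        # Aggregate with all earlier data and retrain on the union.
        data_s = np.concatenate([data_s, new_s])
        data_a = np.concatenate([data_a, new_a])
        policy, w = fit_linear_policy(data_s, data_a)

    return w, len(data_s)
```

The point to notice is the split of roles: the learner decides which states get visited, while the expert only supplies labels for those states. That is what separates DAgger from plain behavior cloning, which trains only on states the expert visited.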

Many Matrices on dibbla.space
