Online World Modeling Enables Real-World Inverse Reinforcement Learning from Observation

meaning... NO
  • rewards
  • action supervision
  • pre-training
  • play data
  • failure examples
  • prior models
  • interventions
  • simulation

Only 15 observation-only demonstrations and < 40 minutes of real-world training from scratch

Citation

If you find this work useful, please cite:

@article{han2026mpail2,
  title   = {Online World Modeling Enables Real-World Inverse Reinforcement Learning from Observation},
  author  = {Han, Tyler and Nemekhbold, Bat and Shen, Siyang and Baijal, Rohan and
             Ebock, Richard and Ravichandiran, Harine and Jung, Sanghun and
             Huang, Kevin and Boots, Byron},
  year    = {2026},
}

Authors