Optimal Model Design for Reinforcement Learning
omd
JAX code for the paper “Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation”.

Summary
Model-based reinforcement learning typically trains the dynamics and reward functions by minimizing prediction error. This error is only a proxy for maximizing the sum of rewards, the agent's ultimate goal, which leads to an objective mismatch. We propose an end-to-end algorithm called Optimal Model Design (OMD) that optimizes the returns directly for model learning. OMD leverages the implicit function theorem to optimize the model parameters […]
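
The implicit function theorem step mentioned in the summary can be illustrated on a toy problem. The sketch below is not the OMD implementation; it only shows the general idea of differentiating an outer objective through the solution of an inner optimality condition. All names (`inner_loss`, `outer_loss`, `solve_inner`, `implicit_grad`) and the quadratic objectives are hypothetical stand-ins chosen so the result can be checked by hand.

```python
import jax
import jax.numpy as jnp


def inner_loss(w, theta):
    # Hypothetical inner objective: w plays the role of the quantity solved
    # for given the model, theta plays the role of the model parameters.
    return 0.5 * jnp.sum((w - theta) ** 2) + 0.5 * jnp.sum(w ** 2)


def outer_loss(w):
    # Hypothetical outer objective evaluated at the inner solution.
    target = jnp.array([1.0, -2.0])
    return jnp.sum((w - target) ** 2)


def solve_inner(theta, steps=200, lr=0.1):
    # Plain gradient descent on the inner objective. Any solver would do,
    # since the implicit gradient only uses the converged solution.
    grad_w = jax.grad(inner_loss, argnums=0)
    w = jnp.zeros_like(theta)
    for _ in range(steps):
        w = w - lr * grad_w(w, theta)
    return w


def implicit_grad(theta):
    w_star = solve_inner(theta)
    # Optimality condition F(w, theta) = d inner_loss / dw = 0 at w_star.
    F = jax.grad(inner_loss, argnums=0)
    dF_dw = jax.jacobian(F, argnums=0)(w_star, theta)
    dF_dtheta = jax.jacobian(F, argnums=1)(w_star, theta)
    # Implicit function theorem: dw*/dtheta = -(dF/dw)^{-1} dF/dtheta.
    dw_dtheta = -jnp.linalg.solve(dF_dw, dF_dtheta)
    # Chain rule through the inner solution, without unrolling the solver.
    return dw_dtheta.T @ jax.grad(outer_loss)(w_star)


theta = jnp.array([2.0, 4.0])
grad_theta = implicit_grad(theta)
```

For this toy problem the inner solution is `theta / 2`, so the gradient can be verified analytically as `theta / 2 - target`. The solver is never differentiated, which is the practical appeal of the implicit approach when the inner problem takes many iterations to converge.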