Gaussian Process Bandit Optimization of the Thermodynamic Variational Objective
 
Achieving the full promise of the Thermodynamic Variational Objective (TVO), a recently proposed variational lower bound on the log evidence involving a one-dimensional Riemann integral approximation, requires choosing a “schedule” of sorted discretization points. This paper introduces a bespoke Gaussian process bandit optimization method for automatically choosing these points...
Our approach not only automates their one-time selection, but also dynamically adapts their positions over the course of optimization, leading to improved model learning and inference. We provide theoretical guarantees that our bandit optimization converges to the regret-minimizing choice of integration points. Empirical validation of our algorithm is provided in terms of