Neural Comput. 2022 Oct 14:1-38. doi: 10.1162/neco_a_01547. Online ahead of print.
This review examines gradient-based techniques for solving bilevel optimization problems. Bilevel optimization extends the loss minimization framework underlying statistical learning to systems that are implicitly defined through a quantity they minimize. This characterization can be applied to neural networks, optimizers, algorithmic solvers, and even physical systems, and it allows for greater modeling flexibility than the usual explicit definition of such systems. We focus on solving learning problems of this kind through gradient descent, leveraging the toolbox of implicit differentiation and the equilibrium propagation theorem, which is applied to this setting for the first time. We present the mathematical foundations behind such methods, introduce the gradient estimation algorithms in detail, and compare the relative advantages of the different approaches.
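
As a minimal illustration of the implicit differentiation idea mentioned above, the following sketch computes the gradient of an outer loss through the minimizer of an inner loss. It is not taken from the review. The scalar toy problem, the constants `a` and `b`, and all function names are assumptions chosen so the result can be checked against a finite-difference estimate.

```python
# Toy bilevel problem (hypothetical, for illustration only):
#   inner: w*(lam) = argmin_w 0.5*(w - a)**2 + 0.5*lam*w**2
#   outer: L(lam)  = 0.5*(w*(lam) - b)**2

def inner_grad(w, lam, a):
    # Gradient of the inner objective with respect to w.
    return (w - a) + lam * w

def solve_inner(lam, a, lr=0.5, steps=200):
    # Plain gradient descent; stands in for any inner solver.
    w = 0.0
    for _ in range(steps):
        w -= lr * inner_grad(w, lam, a)
    return w

def outer_loss(lam, a, b):
    w = solve_inner(lam, a)
    return 0.5 * (w - b) ** 2

def implicit_hypergradient(lam, a, b):
    # At the inner optimum, g(w, lam) = inner_grad(w, lam, a) = 0.
    # Implicit function theorem: dw/dlam = -(dg/dw)^{-1} * dg/dlam.
    w = solve_inner(lam, a)
    dg_dw = 1.0 + lam       # second derivative of the inner objective in w
    dg_dlam = w             # derivative of inner_grad with respect to lam
    dw_dlam = -dg_dlam / dg_dw
    dL_dw = w - b           # gradient of the outer loss with respect to w
    return dL_dw * dw_dlam

a, b, lam = 2.0, 1.0, 0.5
hypergrad = implicit_hypergradient(lam, a, b)

# Central finite difference on the outer loss, as an independent check.
eps = 1e-6
fd_grad = (outer_loss(lam + eps, a, b) - outer_loss(lam - eps, a, b)) / (2 * eps)
print(hypergrad, fd_grad)
```

The implicit gradient uses only the converged inner solution and local derivatives at that point, so it does not need to differentiate through the 200 inner descent steps.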