Neural Netw. 2021 Mar 26;141:11-29. doi: 10.1016/j.neunet.2021.03.025. Online ahead of print.
ABSTRACT
In deep learning tasks, the update step size, which is determined by the learning rate at each iteration, plays a critical role in gradient-based optimization. However, choosing an appropriate learning rate in practice typically relies on subjective judgment. In this work, we propose a novel optimization method based on local quadratic approximation (LQA). In each update step, we locally approximate the loss function along the gradient direction by a standard quadratic function of the learning rate. We then propose an approximation step to obtain a nearly optimal learning rate in a computationally efficient manner. The proposed LQA method has three important features. First, the learning rate is determined automatically in each update step. Second, it is dynamically adjusted according to the current loss value and parameter estimates. Third, with the gradient direction fixed, the proposed method attains a nearly maximal reduction in the loss function. Extensive experiments were conducted to demonstrate the effectiveness of the proposed LQA method.
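The abstract's core idea can be illustrated with a minimal sketch: along the negative gradient direction, treat the loss as a quadratic function of the learning rate, estimate that quadratic from a few trial evaluations, and step with the learning rate that minimizes it. The trial step sizes, the least-squares fit via polyfit, and the fallback rule below are illustrative assumptions, not the paper's exact approximation step.

```python
import numpy as np

def lqa_step(loss_fn, theta, grad, trial_etas=(0.0, 0.05, 0.1)):
    """One LQA-style update (sketch): fit L(theta - eta * grad) with a
    quadratic in eta at a few trial step sizes, then move using the eta
    that minimizes the fitted quadratic."""
    # Evaluate the loss at a few points along the negative gradient direction.
    etas = np.asarray(trial_etas, dtype=float)
    losses = np.array([loss_fn(theta - eta * grad) for eta in etas])

    # Fit L(eta) ~ a*eta^2 + b*eta + c by least squares.
    a, b, c = np.polyfit(etas, losses, deg=2)

    # Minimizer of the quadratic; fall back to the best trial eta if the
    # fitted curve is not convex (a <= 0).
    eta_star = -b / (2.0 * a) if a > 0 else etas[np.argmin(losses)]

    return theta - eta_star * grad, eta_star


# Toy usage: minimize a two-dimensional quadratic bowl.
if __name__ == "__main__":
    A = np.diag([1.0, 10.0])
    loss = lambda x: 0.5 * x @ A @ x
    grad = lambda x: A @ x

    x = np.array([3.0, -2.0])
    for _ in range(5):
        x, eta = lqa_step(loss, x, grad(x))
        print(f"eta={eta:.4f}, loss={loss(x):.6f}")
```

In this sketch the learning rate is recomputed at every step from the current loss values and parameters, which mirrors the automatic, dynamically adjusted step size described in the abstract.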
PMID:33845311 | DOI:10.1016/j.neunet.2021.03.025