Abstract: Widely used first-order deep learning optimizers fall into two groups: non-adaptive-learning-rate methods such as SGDM and adaptive-learning-rate methods such as Adam. Both estimate the overall gradient via an exponential moving average (EMA), but this estimate is biased and hysteretic, lagging behind the current gradient. In this study, we propose RSGDM, a difference-based rectified SGDM algorithm. Our contributions are as follows: 1) We analyze the bias and hysteresis introduced by the exponential moving average in the SGDM algorithm. 2) We use a difference estimation term to correct this bias and hysteresis, yielding the RSGDM algorithm. 3) Experiments on the CIFAR-10 and CIFAR-100 datasets show that RSGDM achieves higher convergence accuracy than SGDM.
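To make the idea concrete, the sketch below illustrates one plausible reading of the difference-based rectification in PyTorch: the EMA momentum m_t lags behind the current gradient, so the most recent change d_t = m_t − m_{t−1} is added back as a correction. This is a minimal sketch, not the paper's verified equations; the class name RSGDMSketch and the correction coefficient beta/(1−beta) are assumptions for illustration only.

```python
import torch

class RSGDMSketch(torch.optim.Optimizer):
    """Hypothetical difference-rectified SGDM (illustrative sketch).

    SGDM's EMA momentum lags behind the current gradient; here the
    difference d_t = m_t - m_{t-1} is added back, scaled by an assumed
    coefficient beta / (1 - beta), to offset that lag.
    """

    def __init__(self, params, lr=0.1, beta=0.9):
        super().__init__(params, dict(lr=lr, beta=beta))

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            lr, beta = group["lr"], group["beta"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                state = self.state[p]
                if "m" not in state:
                    state["m"] = torch.zeros_like(p)
                m_prev = state["m"].clone()
                # Standard EMA momentum: m_t = beta * m_{t-1} + (1 - beta) * g_t
                state["m"].mul_(beta).add_(p.grad, alpha=1 - beta)
                # Assumed difference rectification:
                # m_hat = m_t + beta / (1 - beta) * (m_t - m_{t-1})
                d = state["m"] - m_prev
                m_hat = state["m"] + (beta / (1 - beta)) * d
                # Parameter update with the rectified momentum
                p.add_(m_hat, alpha=-lr)
```

Usage mirrors any PyTorch optimizer, e.g. `opt = RSGDMSketch(model.parameters(), lr=0.1, beta=0.9)` followed by the usual `opt.step()` after backpropagation.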