Abstract: Second-order optimization can accelerate the training of deep neural networks, but its high computational cost hinders its practical application. Consequently, many algorithms have been proposed in recent studies to approximate second-order optimization. The K-FAC algorithm approximates the natural gradient; building on it, an improved K-FAC algorithm is proposed that follows the quasi-Newton approach. The original K-FAC update is applied only during the first few iterations. Afterwards, a rank-one matrix is constructed and its inverse is computed with the Sherman-Morrison formula, greatly reducing the computational complexity. The experimental results show that the improved K-FAC algorithm achieves performance similar to, or even better than, the original K-FAC while requiring much less training time. It also retains an advantage over first-order optimization in terms of training time.
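As context for the rank-one step mentioned above, the identity below is the standard Sherman-Morrison formula, which lets the inverse of a rank-one update of an invertible matrix be obtained from the existing inverse at roughly quadratic rather than cubic cost. The notation is illustrative only: the abstract does not specify which curvature matrix $A$ is updated or how the vectors $u$ and $v$ are chosen, so those symbols are assumptions here.

\[
  (A + u v^{\top})^{-1}
  \;=\;
  A^{-1} \;-\; \frac{A^{-1} u \, v^{\top} A^{-1}}{1 + v^{\top} A^{-1} u},
  \qquad 1 + v^{\top} A^{-1} u \neq 0 .
\]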