Abstract: As an important direction in the development of artificial intelligence, spiking neural networks have received extensive attention in neuromorphic engineering and brain-inspired computing. To address the poor generalization and the high memory and time costs of spiking neural networks, this study proposes a spatio-temporal interactive image classification method based on spiking neural networks. Specifically, the temporal efficient training algorithm is introduced to compensate for the loss of momentum during gradient descent. Then, the spatial learning through time algorithm is integrated so that the network processes information more efficiently. Finally, a spatial attention mechanism is added so that the network better captures important features in the spatial dimension. Experimental results show that, on the CIFAR10, DVS Gesture, and CIFAR10-DVS datasets, training memory usage is reduced by 46.68%, 48.52%, and 10.46%, respectively, and training speed is increased by 2.80, 1.31, and 2.76 times, respectively. These results indicate that the proposed method effectively improves network performance while maintaining accuracy.
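
As a rough illustration of the temporal efficient training idea mentioned above, the sketch below contrasts a loss computed on the time-averaged network output with a loss averaged over individual time steps. It assumes the commonly published form of temporal efficient training, in which the cross-entropy is applied to the output at every time step and then averaged; the abstract does not state the exact variant used in this study, so the function names (`standard_loss`, `tet_loss`) and the toy outputs are hypothetical.

```python
import math

def cross_entropy(logits, label):
    """Cross-entropy of one logit vector against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def standard_loss(outputs, label):
    """Loss on the time-averaged output: CE(mean_t O(t), y)."""
    num_steps = len(outputs)
    num_classes = len(outputs[0])
    mean_logits = [
        sum(step[k] for step in outputs) / num_steps for k in range(num_classes)
    ]
    return cross_entropy(mean_logits, label)

def tet_loss(outputs, label):
    """Per-time-step loss (assumed form): (1/T) * sum_t CE(O(t), y)."""
    return sum(cross_entropy(step, label) for step in outputs) / len(outputs)

# Hypothetical network outputs (logits) over T = 3 time steps, 3 classes.
outputs = [[2.0, 0.5, -1.0], [0.1, 0.3, 0.0], [1.5, -0.2, 0.4]]
label = 0

print("loss on averaged output:", standard_loss(outputs, label))
print("per-time-step loss:     ", tet_loss(outputs, label))
```

Because cross-entropy is convex in the logits, the per-time-step loss is never smaller than the loss on the averaged output, so every time step receives its own training signal rather than only the average.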