Abstract: Football matches are long, and much of the footage is of little interest to the audience. Football video scene classification has therefore become an important research topic in recent decades, and many machine learning methods have been applied to it. This study proposes a football (soccer) video scene classification algorithm based on a 3D (three-dimensional) convolutional neural network. 3D convolution is applied to football video, and the feasibility of the algorithm is verified experimentally. The experimental procedure is as follows. First, scene changes in the football video are detected using the frame difference method and a logo detection method, which yields the shot segmentation. The semantic features of the segmented shots are then extracted and labeled, and the shots are classified by C3D. Football video shots are divided into seven categories: long shot, medium shot, close-up shot, playback shot, audience shot, opening shot, and VAR (Video Assistant Referee) shot. The experimental results show that the model achieves a classification accuracy of 96% on football video datasets.
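As an illustration of the shot segmentation step, the frame difference idea can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in the study: the function names `frame_difference_cuts` and `split_into_shots`, the mean absolute difference measure, and the threshold value of 30 are hypothetical choices, and the logo detection step is not included.

```python
import numpy as np


def frame_difference_cuts(frames, threshold=30.0):
    """Return indices i where a cut is assumed between frame i-1 and frame i.

    frames: array-like of shape (num_frames, height, width[, channels]).
    threshold: hypothetical cutoff on the mean absolute pixel difference.
    """
    # Convert to float so that uint8 subtraction cannot wrap around.
    frames = np.asarray(frames, dtype=np.float32)
    if frames.shape[0] < 2:
        return []
    # Mean absolute pixel difference between each pair of consecutive frames.
    diffs = np.abs(frames[1:] - frames[:-1])
    diffs = diffs.reshape(frames.shape[0] - 1, -1).mean(axis=1)
    # diffs[k] compares frame k with frame k + 1, so the new shot starts at k + 1.
    return [int(k) + 1 for k in np.flatnonzero(diffs > threshold)]


def split_into_shots(num_frames, cuts):
    """Turn cut positions into (start, end) frame ranges, end exclusive."""
    bounds = [0] + list(cuts) + [num_frames]
    return [(s, e) for s, e in zip(bounds[:-1], bounds[1:]) if e > s]


if __name__ == "__main__":
    # Synthetic example: five dark frames followed by five bright frames.
    dark = np.full((5, 4, 4), 10, dtype=np.uint8)
    bright = np.full((5, 4, 4), 200, dtype=np.uint8)
    video = np.concatenate([dark, bright])
    cuts = frame_difference_cuts(video)
    print(cuts, split_into_shots(len(video), cuts))
```

Each resulting frame range could then be labeled and passed to a 3D convolutional classifier. A fixed threshold like this is sensitive to gradual transitions and camera motion, which is presumably one reason the frame difference method is paired with logo detection.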