Abstract: Human behavior recognition is strongly correlated with human body pose, but many open datasets for behavior recognition do not provide pose data. As a result, few recognition methods train on pose data and fuse it with other modalities. Current mainstream behavior recognition methods based on deep learning fuse RGB images with optical flow. This study proposes a behavior recognition algorithm based on a multi-stream convolutional neural network that integrates human body poses. First, a pose estimation algorithm generates human body key points from static images containing people, and poses are constructed by connecting the key points. Second, RGB, optical flow, and pose data are trained separately on the multi-stream convolutional neural network, and the resulting scores are fused. Finally, ablation and recognition accuracy experiments are conducted on the UCF101 and HMDB51 datasets. The results show that integrating pose images into the multi-stream convolutional neural network increases recognition accuracy by 2.3% and 3.1% on the UCF101 and HMDB51 datasets, respectively, demonstrating the effectiveness of the proposed algorithm.
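The score fusion step can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, so the code below assumes a common late-fusion scheme: each stream's class scores are passed through a softmax and combined by a weighted average. The function names (`softmax`, `fuse_scores`, `predict`), the stream names, the weights, and the example scores are all hypothetical and are not taken from the study.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(stream_logits, weights=None):
    """Weighted average of per-stream softmax scores (assumed fusion rule).

    stream_logits: dict mapping a stream name to its list of class scores.
    weights: optional dict mapping a stream name to a non-negative weight;
             equal weights are used when omitted.
    """
    names = list(stream_logits)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    if total_weight <= 0:
        raise ValueError("at least one stream needs a positive weight")

    num_classes = len(stream_logits[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        if len(stream_logits[name]) != num_classes:
            raise ValueError("all streams must score the same classes")
        probs = softmax(stream_logits[name])
        for i, p in enumerate(probs):
            fused[i] += weights[name] / total_weight * p
    return fused


def predict(stream_logits, weights=None):
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(stream_logits, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical scores for three classes from the three streams.
example = {
    "rgb": [2.0, 1.0, 0.1],
    "flow": [0.5, 2.5, 0.3],
    "pose": [0.2, 2.0, 0.1],
}
fused_example = fuse_scores(example)
predicted_class = predict(example)
```

In this sketch the RGB stream alone favors the first class, while the optical flow and pose streams favor the second, so the fused prediction follows the two agreeing streams. Stream weights would in practice be tuned on validation data, and the study may use a different fusion rule.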