Abstract: Pedestrian detection is an important application of computer vision. However, most existing approaches rely on low-level features. Deep learning combines the low-level features of pedestrians into more abstract high-level representations, which makes detection more robust. In this study, we propose a pedestrian detection method based on the Faster Region-based Convolutional Neural Network (Faster RCNN) that jointly considers semantic attributes. First, we modify and fine-tune Faster RCNN to fit the pedestrian dataset and to improve its ability to detect small objects. Second, we link each pedestrian to its semantic attributes through their spatial relationship, fuse the pedestrian with those attributes, and adaptively adjust the confidence of the target pedestrian. This adaptive adjustment strategy, which is driven by the links between the pedestrian and its semantic attributes, fuses the information belonging to each individual. Extensive experiments and comparisons show that the proposed approach achieves high accuracy at an acceptable speed and is of practical value. In addition, the semantic attributes can be used to count people or to analyze pedestrian behavior.
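The abstract does not state the exact fusion rule, so the following is only a minimal sketch of how a spatial link between a pedestrian and its semantic attributes could drive a confidence adjustment. Everything in it is an assumption made for illustration: the `Detection` type, the "attribute center lies inside the pedestrian box" test, the `boost` and `min_attr_score` parameters, and the update rule s' = 1 - (1 - s)(1 - boost)^k, where k is the number of linked attributes. It is not the authors' method.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


@dataclass
class Detection:
    label: str    # e.g. "pedestrian", or an attribute label such as "head" or "bag"
    box: Box
    score: float  # detector confidence in [0, 1]


def center_inside(inner: Box, outer: Box) -> bool:
    """Assumed spatial relation: the center of `inner` falls within `outer`."""
    cx = (inner[0] + inner[2]) / 2.0
    cy = (inner[1] + inner[3]) / 2.0
    return outer[0] <= cx <= outer[2] and outer[1] <= cy <= outer[3]


def fuse(
    pedestrians: List[Detection],
    attributes: List[Detection],
    boost: float = 0.1,           # hypothetical per-attribute boost
    min_attr_score: float = 0.5,  # hypothetical attribute score cutoff
) -> List[Tuple[Detection, List[Detection]]]:
    """Attach attributes to pedestrians and raise pedestrian confidence.

    Each linked attribute shrinks the remaining uncertainty (1 - score)
    by a factor (1 - boost), so the adjusted score never exceeds 1.
    """
    fused = []
    for ped in pedestrians:
        linked = [
            a for a in attributes
            if a.score >= min_attr_score and center_inside(a.box, ped.box)
        ]
        new_score = 1.0 - (1.0 - ped.score) * (1.0 - boost) ** len(linked)
        fused.append((Detection(ped.label, ped.box, new_score), linked))
    return fused
```

Under these assumptions, a pedestrian with no linked attributes keeps its original score, while each supporting attribute raises the score by a diminishing amount. The list of linked attributes returned per pedestrian is the kind of per-individual information that could then be used for counting or behavior analysis.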