Abstract: Object detection is widely used in surveillance systems for tasks such as pedestrian detection and face recognition, and it is an active research topic in deep learning. Supervised approaches train pedestrian detectors for specific scenes on large, manually annotated datasets, but manual labeling is time-consuming and laborious. This work addresses that shortcoming by proposing a semi-automatic method for labeling pedestrians. For surveillance video captured by a stationary monocular camera, optical flow provides an initial foreground likelihood, which is then iteratively updated using visual similarity across time to segment the moving pedestrians. Pedestrian labels are then generated from the segmented foreground. Experimental results show that the proposed method can supply large amounts of labeled data for pedestrian detection systems and is markedly more efficient than traditional manual annotation.
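The abstract does not state how labels are derived from the segmented foreground. As a minimal sketch of that last step, assume the segmentation yields a binary mask per frame and that each label is an axis-aligned bounding box around one connected foreground region. The function name `boxes_from_mask` and the `min_area` noise filter are hypothetical and are not taken from the paper.

```python
from collections import deque


def boxes_from_mask(mask, min_area=1):
    """Return one (x_min, y_min, x_max, y_max) box per 4-connected foreground blob.

    `mask` is a list of rows of 0/1 values (assumed output of the segmentation).
    Blobs with fewer than `min_area` pixels are dropped as noise (an assumption).
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one connected foreground region.
            queue = deque([(y, x)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            area = 0
            while queue:
                cy, cx = queue.popleft()
                area += 1
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((x_min, y_min, x_max, y_max))
    return boxes
```

In a semi-automatic workflow, boxes proposed this way would still be reviewed by a human annotator, who accepts, adjusts, or rejects each one rather than drawing it from scratch.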