Computer Systems & Applications  2018, Vol. 27 Issue (11): 78-83

Fast Abnormal Pedestrians Detection Based on Multi-Task CNN in Surveillance Video
LI Jun-Jie, LIU Cheng-Lin, ZHU Ming
School of Information Science and Technology, University of Science and Technology of China, Hefei 230027, China
Foundation item: National Science and Technology Major Project of China (2017ZX03001019)
Abstract: Public safety has attracted widespread public concern in recent years, and using surveillance video to detect abnormal pedestrians and prevent dangerous events has become an active research topic. Abnormal pedestrians are those whose appearance differs distinctly from that of ordinary pedestrians, for example by wearing a helmet that covers the face or by ducking away from the camera. Because the distinguishing characteristics of abnormal pedestrians are concentrated mainly in the head and face, this study proposes a fast detection method for abnormal pedestrians based on head-facial features, combining a multi-task Convolutional Neural Network (CNN) with a one-class Support Vector Machine (SVM). The method first detects head-facial regions in the surveillance video, then extracts features from these regions with the multi-task CNN, and finally uses the one-class SVM to judge whether each pedestrian is normal. In addition, this study designs a convolution kernel splitting method for the CNN to speed up feature extraction. Experiments show that the proposed algorithm detects abnormal pedestrians in surveillance video effectively and quickly.
Key words: surveillance video; abnormal pedestrians; multi-task CNN (Convolutional Neural Network); convolution kernel splitting method; one-class SVM (Support Vector Machine)

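Figure 1 Examples of abnormal pedestrians in surveillance scenes

1 Overview of Abnormal Pedestrian Detection

Figure 2 Architecture of the abnormal pedestrian detection system

1.1 Head-Facial Region Detection

1.2 Abnormal Pedestrian Discrimination

2 Algorithm Design and Implementation

2.1 Multi-Task Convolutional Neural Network

Figure 3 Initial multi-task CNN model (with a 120×100 input as an example)

2.2 Convolution Kernel Splitting

Figure 4 Convolution kernel splitting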

$\frac{n \cdot n \cdot M \cdot k + n \cdot n \cdot M \cdot k + n \cdot n \cdot N \cdot M}{n \cdot n \cdot N \cdot k \cdot k \cdot M} = \frac{2}{kN} + \frac{1}{k^2}$
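The ratio above can be checked numerically. The sketch below is only an illustration and rests on an assumed reading of the symbols, which are not defined in this part of the text: n is the side length of the output feature map, k the kernel size, N the number of input channels, and M the number of output channels. Under that reading, the denominator is the multiplication count of a standard k×k convolution and the numerator is the count after splitting. The function names are hypothetical.

```python
from fractions import Fraction


def standard_conv_cost(n, k, N, M):
    """Denominator of the ratio: n*n*N*k*k*M multiplications."""
    return n * n * N * k * k * M


def split_conv_cost(n, k, N, M):
    """Numerator of the ratio: two terms of n*n*M*k plus one of n*n*N*M."""
    return n * n * M * k + n * n * M * k + n * n * N * M


def cost_ratio(n, k, N, M):
    """Exact ratio of split cost to standard cost."""
    return Fraction(split_conv_cost(n, k, N, M), standard_conv_cost(n, k, N, M))


def closed_form(k, N):
    """Right-hand side of the formula: 2/(kN) + 1/k^2."""
    return Fraction(2, k * N) + Fraction(1, k * k)


# Illustrative layer sizes (assumed, not taken from the paper).
ratio = cost_ratio(n=28, k=3, N=64, M=128)
print(ratio, float(ratio))  # 35/288, roughly 0.12
```

For these assumed sizes the split form needs about 12% of the multiplications of the standard convolution. The ratio depends only on k and N, since n and M cancel.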

2.3 Training Datasets

Figure 5 Improved multi-task CNN model

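1) The network is pre-trained on the public face-attribute dataset CelebA[20]. Twelve of its attributes are selected as the output attributes of the multi-task network model: bags under the eyes, bald, bangs, black hair, blond hair, eyeglasses, gender, age group, mouth open, beard, hat, and necktie. Some of the attributes and corresponding samples are shown in Figure 6.

2) Starting from the pre-trained parameters, the network is fine-tuned with samples from real surveillance video. The output part of the multi-task network is changed to the following four classification tasks: whether glasses are worn, whether a hat is worn, whether the mouth is visible, and face orientation (front, side, or back), as shown in Figure 7.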
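The four output tasks of the fine-tuned network can be pictured as four small classification heads on top of shared features. The sketch below is a plain-Python illustration, not the authors' implementation. The task names, class labels, flat score layout, and the equally weighted sum of per-task cross-entropy losses are all assumptions made for the example; the text here does not state the loss that was actually used.

```python
import math

# Hypothetical task layout: names and class labels are illustrative only.
TASKS = {
    "glasses": ["no", "yes"],
    "hat": ["no", "yes"],
    "mouth_visible": ["no", "yes"],
    "orientation": ["front", "side", "back"],
}


def split_scores(scores):
    """Slice a flat score vector into one chunk per task, in TASKS order."""
    expected = sum(len(classes) for classes in TASKS.values())
    if len(scores) != expected:
        raise ValueError(f"expected {expected} scores, got {len(scores)}")
    chunks, start = {}, 0
    for name, classes in TASKS.items():
        chunks[name] = scores[start:start + len(classes)]
        start += len(classes)
    return chunks


def softmax(scores):
    """Numerically stable softmax over one task's scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Pick the highest-scoring class independently for each task."""
    result = {}
    for name, chunk in split_scores(scores).items():
        best = max(range(len(chunk)), key=chunk.__getitem__)
        result[name] = TASKS[name][best]
    return result


def multitask_loss(scores, targets):
    """Sum of per-task cross-entropy losses (equal weights are an assumption)."""
    loss = 0.0
    for name, chunk in split_scores(scores).items():
        probs = softmax(chunk)
        loss -= math.log(probs[TASKS[name].index(targets[name])])
    return loss


# Nine made-up scores: 2 + 2 + 2 + 3 classes across the four tasks.
example = [0.2, 1.5, 2.0, -1.0, -0.3, 0.9, 0.1, 0.4, 2.2]
print(predict(example))
```

Because every head reads the same shared features, the pre-trained CelebA weights can be reused and only the output part needs to change when moving from twelve attributes to these four tasks.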

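Figure 6 Sample examples from the CelebA dataset

Figure 7 Sample examples from real surveillance video

2.4 One-Class Classification Algorithm

3 Experiments and Analysis

3.1 Training and Evaluation of the Multi-Task CNN

3.2 Combinations of Image Features and One-Class Classifiers

3.3 Abnormal Pedestrian Detection System

Figure 8 Example of abnormal pedestrian detection

4 Conclusion and Future Work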

