计算机系统应用 (Computer Systems & Applications), 2020, Vol. 29, Issue (5): 202-208

Bird Nest Detection on Transmission Tower Based on Improved SSD Algorithm
QI Jie, JIAO Liang-Bao
Institute of Artificial Intelligence Industry Technology, Nanjing Institute of Technology, Nanjing 211167, China
Foundation item: National Natural Science Foundation of China (61703201); Natural Science Foundation of Jiangsu Province, China (BK20170765)
Abstract: As an important component of overhead transmission lines, the safety of transmission towers affects the operation of the entire power system. Bird-nest construction is one of the major factors disturbing the normal operation of transmission lines and therefore needs to be monitored. However, existing monitoring methods are not only inefficient but also require considerable manpower and material resources. To address this problem, this study proposes a real-time detection method based on the SSD algorithm. Within the SSD network structure, the backbone network VGGNet is replaced by ResNet-101 to improve feature-extraction capability; Focal loss replaces Softmax loss to alleviate the sample-imbalance problem in SSD; and data augmentation is used to increase sample diversity and improve the robustness of the model. Experimental results show that, compared with the original SSD algorithm, the proposed method improves precision by 3.17% and recall by 6.35%.
Key words: deep learning     SSD algorithm     bird nest detection     ResNet     Focal loss

 Figure 1  Flowchart of the traditional object detection algorithm

1 SSD Principle and Method

1.1 SSD Object Detection Model

SSD (Single Shot MultiBox Detector) is a one-stage, multi-box detection algorithm. Its network model is based on a feed-forward CNN that produces a fixed-size set of bounding boxes, scores the object categories present in those boxes, and then applies non-maximum suppression to produce the final detection results. The network structure is shown in Figure 2 [16].

 Figure 2  SSD network structure

(1) Detection on multi-scale feature maps.

(2) Convolutional predictors for detection.

(3) Default boxes with multiple aspect ratios.

A default box refers to one of a series of fixed-size boxes placed on every cell of a feature map [17]. In setting the aspect ratios of the default boxes, SSD borrows the anchor concept from Faster R-CNN: each predicted bounding box is regressed relative to a default box, which reduces the difficulty of training to some extent. The default box sizes are determined by the sizes of the feature maps output by six convolutional layers, namely Conv4_3, Conv7, Conv8_2, Conv9_2, Conv10_2, and Conv11_2, whose feature maps are 38×38, 19×19, 10×10, 5×5, 3×3, and 1×1 respectively. Because the feature maps differ, the scales and aspect ratios of the prior boxes also differ. The prior-box scales follow a linearly increasing rule, computed according to Eq. (1).

 ${s_k} = {s_{\min }} + \dfrac{{{s_{\max }} - {s_{\min }}}}{{m - 1}}(k - 1),\;k \in \left[ {1,\;m} \right]$ (1)
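The linear scale rule of Eq. (1) can be sketched directly. A minimal example, assuming s_min = 0.2 and s_max = 0.9 as in the original SSD paper, with m = 6 feature maps:

```python
# Sketch of Eq. (1): prior-box scales grow linearly from s_min to s_max
# across the m detection feature maps. The s_min/s_max defaults below are
# the common SSD settings, not values stated in this paper.
def prior_box_scales(m=6, s_min=0.2, s_max=0.9):
    """Return the prior-box scale s_k for k = 1..m."""
    return [s_min + (s_max - s_min) / (m - 1) * (k - 1) for k in range(1, m + 1)]

scales = prior_box_scales()
# Shallow maps (e.g. Conv4_3, 38x38) get small scales for small objects;
# the deepest map (1x1) gets the largest scale.
print([round(s, 2) for s in scales])  # [0.2, 0.34, 0.48, 0.62, 0.76, 0.9]
```

Each scale is a fraction of the input image size, so the shallowest layer detects the smallest nests.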

1.2 Objective Loss Function

When computing its loss function, SSD uses a weighted sum of two terms: the classification loss (Softmax loss) and the regression loss (smooth L1 loss).

 $L\left( {x,c,l,g} \right) = \frac{1}{N}\left( {{L_{\rm conf}}\left( {x,c} \right) + \alpha {L_{\rm loc}}\left( {x,l,g} \right)} \right)$ (2)
 $\left\{\begin{split} &{{L_{\rm conf}}(x,c) = \displaystyle\sum\limits_{i \in Pos}^N {x_{ij}^p\log (\hat c_i^p)} - \displaystyle\sum\limits_{i \in Neg}^{} {\log (\hat c_i^0)} }\\ &{\;\;\hat c_i^p = \frac{{\exp (c_i^p)}}{{\displaystyle\sum\nolimits_p {\exp (c_i^p)} }}} \end{split}\right.$ (3)
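The confidence term of Eq. (3) can be illustrated with a small numpy sketch: a softmax over class logits, cross-entropy for positive matches, and a background term (class 0) for the selected negatives. Shapes and variable names here are illustrative, not taken from the paper's implementation:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax along the class axis (the c-hat of Eq. (3)).
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def conf_loss(logits, labels, pos_mask):
    """Confidence loss of Eq. (3).

    logits:   (n_boxes, n_classes) raw class scores
    labels:   (n_boxes,) matched ground-truth class per box
    pos_mask: (n_boxes,) bool, True for positive (matched) boxes
    """
    c_hat = softmax(logits)
    pos = -np.log(c_hat[pos_mask, labels[pos_mask]]).sum()  # positive term
    neg = -np.log(c_hat[~pos_mask, 0]).sum()                # background term
    return pos + neg
```

In the full SSD loss this sum is divided by the number of matched boxes N, as in Eq. (2).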

1.3 Improvement Strategies

1.3.1 Improvement of the Backbone Network

In VGG-16, the layer used to extract small-object information is Conv4_3. As the shallowest of the detection layers, it inevitably suffers some information loss during propagation. ResNet alleviates this problem to a certain extent: by passing the input directly to the output through a shortcut connection, it preserves the integrity of the information, so the network only needs to learn the residual between input and output, which simplifies the learning objective and reduces its difficulty.
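The residual idea described above can be sketched in a few lines. This is a toy illustration with placeholder weight matrices, not the actual ResNet-101 layers used in the paper:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F(x) = w2 @ relu(w1 @ x) is the residual branch."""
    f = w2 @ relu(w1 @ x)   # the block only has to learn this residual F(x)
    return relu(f + x)      # identity shortcut carries x through unchanged

# With zero weights the residual branch vanishes and the block reduces to the
# identity (for non-negative inputs) - information from shallow layers is
# preserved rather than lost, which is what eases optimization of deep stacks.
x = np.array([1.0, 2.0, 3.0])
w_zero = np.zeros((3, 3))
print(residual_block(x, w_zero, w_zero))  # [1. 2. 3.]
```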

1.3.2 Improvement of the Loss Function

 $FL\left( {{P_t}} \right) = - {\alpha _t}{\left( {1 - {p_t}} \right)^\gamma }\log \left( {{p_t}} \right)$ (4)

 $L\left( {x,c,l,g} \right) = \frac{1}{N}\left( {FL\left( {{p_t}} \right) + \alpha {L_{\rm loc}}\left( {x,l,g} \right)} \right)$ (5)
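A minimal sketch of the focal loss in Eq. (4), assuming the common defaults alpha = 0.25 and gamma = 2 from Lin et al. [15] (the paper's exact hyper-parameters may differ); p_t is the predicted probability of the true class:

```python
import numpy as np

def focal_loss(p_t, alpha_t=0.25, gamma=2.0):
    """Eq. (4): FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t)."""
    p_t = np.clip(p_t, 1e-7, 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# The (1 - p_t)^gamma factor down-weights easy, well-classified examples:
# an easy example (p_t = 0.9) contributes far less loss than a hard one
# (p_t = 0.1), which counters the imbalance between easy negatives and
# scarce positives such as bird nests.
print(focal_loss(np.array([0.9, 0.1])))
```

Substituting this term for the Softmax confidence loss yields the modified objective of Eq. (5).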
2 Experiments

2.1 Experimental Platform

2.2 Data Preprocessing

 Figure 3  Example of image labels

2.3 Performance Evaluation Metrics

 $P = \dfrac{TP}{TP + FP}$ (6)
 $R = \dfrac{TP}{TP + FN}$ (7)
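Eqs. (6) and (7) compute directly from detection counts. The counts below are illustrative only, not the paper's experimental numbers:

```python
# Precision (Eq. 6) and recall (Eq. 7) from true positives (TP),
# false positives (FP), and false negatives (FN).
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

# e.g. 90 nests detected correctly, 10 false alarms, 30 nests missed:
print(precision(90, 10), recall(90, 30))  # 0.9 0.75
```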

2.4 Training

 Figure 4  Comparison of loss curves

 Figure 5  Detection results before and after the SSD improvements

3 Conclusion

References

[1] Shi P. Research on bird nest detection algorithms on transmission lines [Master's thesis]. Beijing: Beijing Jiaotong University, 2017.

[2] Castrillón M, Déniz O, Hernández D, et al. A comparison of face and facial feature detectors based on the Viola-Jones general object detection framework. Machine Vision and Applications, 2011, 22(3): 481–494. DOI:10.1007/s00138-010-0250-7

[3] Dalal N, Triggs B. Histograms of oriented gradients for human detection. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, CA, USA. 2005. 886–893.

[4] Chen PH, Lin CJ, Schölkopf B. A tutorial on ν-support vector machines. Applied Stochastic Models in Business and Industry, 2005, 21(2): 111–136. DOI:10.1002/asmb.537

[5] Felzenszwalb PF, Girshick RB, McAllester D, et al. Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(9): 1627–1645. DOI:10.1109/TPAMI.2009.167

[6] Jiao JL, Sun J, Satoshi N. A convolutional neural network based two-stage document deblurring. Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition. Kyoto, Japan. 2017. 703–707.

[7] Ren J, Chen XH, Liu JB, et al. Accurate single stage detector using recurrent rolling convolution. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA. 2017. 752–760.

[8] Girshick R. Fast R-CNN. Proceedings of the 2015 IEEE International Conference on Computer Vision. Santiago, Chile. 2015. 1440–1448.

[9] He CY. Multi-class obstacle detection and recognition in driving environments based on convolutional neural networks [Master's thesis]. Chongqing: Chongqing University of Posts and Telecommunications, 2017.

[10] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA. 2016. 779–788.

[11] Liu W, Anguelov D, Erhan D, et al. SSD: Single shot MultiBox detector. Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands. 2016. 21–37.

[12] Mahdianpari M, Salehi B, Rezaee M, et al. Very deep convolutional neural networks for complex land cover mapping using multispectral remote sensing imagery. Remote Sensing, 2018, 10(7): 1119. DOI:10.3390/rs10071119

[13] He KM, Zhang XY, Ren SQ, et al. Deep residual learning for image recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA. 2016. 770–778.

[14] Shi WW, Gong YH, Tao XY, et al. Fine-grained image classification using modified DCNNs trained by cascaded softmax and generalized large-margin losses. IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(3): 683–694. DOI:10.1109/TNNLS.2018.2852721

[15] Lin TY, Goyal P, Girshick R, et al. Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018. DOI:10.1109/TPAMI.2018.2858826

[16] Tang C, Ling YS, Zheng KD, et al. Multi-window SSD object detection method based on deep learning. Infrared and Laser Engineering, 2018, 47(1): 0126003.

[17] Nasr MB, Chtourou M. A constructive based hybrid training algorithm for feedforward neural networks. Proceedings of the 2009 6th International Multi-conference on Systems, Signals and Devices. Djerba, Tunisia. 2009. 1–4.

[18] He KM, Zhang XY, Ren SQ, et al. Identity mappings in deep residual networks. Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands. 2016. 630–645.

[19] Xiao Y. Research on video object detection and tracking technology for small flight platforms [Master's thesis]. Xi'an: Xidian University, 2018.