Bird Nest Detection on Transmission Tower Based on Improved SSD Algorithm
QI Jie, JIAO Liang-Bao
Institute of Artificial Intelligence Industry Technology, Nanjing Institute of Technology, Nanjing 211167, China
Foundation item: National Natural Science Foundation of China (61703201); Natural Science Foundation of Jiangsu Province, China (BK20170765)
Abstract: As an important part of overhead transmission lines, transmission towers affect the operation of the whole power system, so their safety matters. Bird nests built on towers are one of the important factors affecting the normal operation of transmission lines and need to be monitored. However, existing monitoring methods are inefficient and require considerable manpower and material resources. To address this problem, this study proposes a real-time detection method based on the SSD algorithm. Within the SSD network structure, the backbone network VGGNet is replaced by ResNet-101 to improve feature extraction. Focal loss replaces Softmax loss to mitigate the sample imbalance in SSD. Data augmentation is used to increase sample diversity and thereby improve the robustness of the model. Experimental results show that, compared with the original SSD algorithm, the proposed method improves accuracy and recall by 3.17% and 6.35%, respectively.
Key words: deep learning     SSD algorithm     bird nest detection     ResNet     Focal loss

 Fig. 1 Flowchart of the traditional object detection algorithm

1 SSD Principles and Methods
1.1 SSD Object Detection Model

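SSD (Single Shot MultiBox Detector) is a one-stage, multi-box detection algorithm. Its network model is built on a feed-forward CNN that produces a fixed-size collection of bounding boxes and scores the presence of object classes in those boxes; non-maximum suppression is then applied to produce the final detections. The network structure is shown in Fig. 2 [16].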

 Fig. 2 SSD network structure
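The non-maximum suppression step that turns SSD's scored boxes into final detections can be illustrated with a minimal sketch. The function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the threshold value 0.45 are assumptions made for illustration; they are not taken from this paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: return indices of kept boxes, highest score first.

    iou_threshold is an assumed value, not one reported in the paper.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

For example, `nms([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)], [0.9, 0.8, 0.7])` would discard the second box, which heavily overlaps the higher-scoring first one, and keep the other two.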

(1) Detection on multi-scale feature maps.

(2) Convolutional predictors for detection.

(3) Default boxes with multiple aspect ratios.

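A default box is one of a set of fixed-size boxes placed at each cell of a feature map [17]. In setting the aspect ratios of the default boxes, SSD borrows the anchor concept from Faster R-CNN: each predicted bounding box is expressed relative to a default box, which reduces the difficulty of training to some extent. The sizes of the default boxes are determined by the sizes of the feature maps output by six convolutional layers, namely Conv4_3, Conv7, Conv8_2, Conv9_2, Conv10_2, and Conv11_2, whose feature maps are 38×38, 19×19, 10×10, 5×5, 3×3, and 1×1, respectively. Because the feature maps differ, the scales and aspect ratios of the prior boxes also differ. The scale of the prior boxes increases linearly across layers and is computed by Eq. (1).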

 ${s_k} = {s_{\min }} + \dfrac{{{s_{\max }} - {s_{\min }}}}{{m - 1}}(k - 1),\;k \in \left[ {1,\;m} \right]$ (1)
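As a worked illustration of Eq. (1), the following sketch computes the scale of each detection layer. It assumes that m is the number of feature maps used for detection (six here) and uses s_min = 0.2 and s_max = 0.9, which are commonly used values and are not stated in this paper; the function name `default_box_scales` is hypothetical.

```python
def default_box_scales(m=6, s_min=0.2, s_max=0.9):
    """Linearly spaced default-box scales s_1..s_m following Eq. (1).

    m, s_min and s_max are assumed values, not taken from the paper.
    """
    if m < 2:
        raise ValueError("Eq. (1) needs at least two feature maps (m >= 2)")
    step = (s_max - s_min) / (m - 1)
    return [s_min + step * (k - 1) for k in range(1, m + 1)]


scales = default_box_scales()
for k, s in enumerate(scales, start=1):
    print(f"s_{k} = {s:.2f}")
```

Under these assumptions the scale grows by a constant step of (s_max − s_min)/(m − 1) = 0.14 from one detection layer to the next, so shallow, high-resolution feature maps get small boxes and deep, low-resolution maps get large ones.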

1.2 Objective Loss Function

SSD computes its loss as a weighted sum of two terms: a classification loss (Softmax loss) and a regression loss (smooth L1 loss).

 $L\left( {x,c,l,g} \right) = \frac{1}{N}\left( {{L_{\rm conf}}\left( {x,c} \right) + \alpha {L_{\rm loc}}\left( {x,l,g} \right)} \right)$ (2)
 $\left\{\begin{split} &{L_{\rm conf}}(x,c) = - \displaystyle\sum\limits_{i \in Pos}^N {x_{ij}^p\log (\hat c_i^p)} - \displaystyle\sum\limits_{i \in Neg} {\log (\hat c_i^0)} \\ &\hat c_i^p = \frac{\exp (c_i^p)}{\displaystyle\sum\nolimits_p {\exp (c_i^p)}} \end{split}\right.$ (3)
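The Softmax confidence loss of Eq. (3) can be sketched in a few lines. This is a simplified illustration: it assumes each box has already been matched to a single class label (0 for background, i.e. a negative sample) and it omits hard negative mining and the 1/N normalization of Eq. (2). The function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one box's class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_loss(logits, labels):
    """Sum of negative log-probabilities of the assigned class per box.

    logits: list of per-box class score lists (index 0 = background).
    labels: assigned class index per box; 0 marks a negative sample.
    """
    loss = 0.0
    for box_logits, label in zip(logits, labels):
        probs = softmax(box_logits)
        loss -= math.log(probs[label])
    return loss
```

Positive boxes contribute the negative log-probability of their matched class and negative boxes that of the background class, so the loss is non-negative and shrinks as the predictions become more confident in the correct class.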

1.3 Improvement Strategies
1.3.1 Improvement of the Backbone Network

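In VGG-16, the Conv4_3 layer is used to extract information about small objects. As the shallowest network layer, it inevitably suffers some loss or degradation of information during propagation. ResNet alleviates this problem to some extent: by passing the input directly to the output, it preserves the integrity of the information, so the network only needs to learn the difference (the residual) between input and output, which simplifies the learning objective and reduces its difficulty.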
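The shortcut idea described above can be illustrated with a toy sketch in which the block output is the input plus a learned residual. Here `residual_fn` is a hypothetical stand-in for the stacked convolutional layers of a real ResNet block, and plain Python lists stand in for feature tensors.

```python
def residual_block(x, residual_fn):
    """Return x + F(x), where F is the residual branch.

    x: list of floats standing in for a feature vector.
    residual_fn: hypothetical callable mapping x to a same-length list.
    """
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("residual branch must preserve the input shape")
    # The identity shortcut carries the input straight to the output.
    return [xi + fi for xi, fi in zip(x, fx)]
```

If the residual branch outputs zeros, the block reduces to the identity mapping, which is why the input information is preserved even when the branch has learned nothing useful yet.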

1.3.2 Improvement of the Loss Function

 $FL\left( {{p_t}} \right) = - {\alpha _t}{\left( {1 - {p_t}} \right)^\gamma }\log \left( {{p_t}} \right)$ (4)

 $L\left( {x,c,l,g} \right) = \frac{1}{N}\left( {FL\left( {{p_t}} \right) + \alpha {L_{\rm loc}}\left( {x,l,g} \right)} \right)$ (5)
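The focal loss term of Eq. (4) can be sketched for a single binary prediction as follows. The values alpha = 0.25 and gamma = 2.0 are commonly used defaults and are assumptions here, since the paper's settings are not given in this section; the function name `focal_loss` is hypothetical.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss of Eq. (4) for one prediction.

    p: predicted probability of the positive class, in [0, 1].
    y: ground-truth label, 1 for positive and 0 for negative.
    alpha, gamma: assumed default values, not taken from the paper.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

The factor (1 − p_t)^γ shrinks the loss of well-classified samples, so the many easy background boxes contribute little and training concentrates on hard samples. With γ = 0 the expression reduces to an α-weighted cross-entropy.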
2 Experiments
2.1 Experimental Platform

2.2 Data Preprocessing

 Fig. 3 Example of image labels

2.3 Performance Evaluation Metrics

 $P = \dfrac{TP}{TP + FP}$ (6)
 $R = \dfrac{TP}{TP + FN}$ (7)
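Eqs. (6) and (7) can be evaluated directly from detection counts, as in the minimal sketch below. The function name and the example counts are illustrative only and are not results from this paper.

```python
def precision_recall(tp, fp, fn):
    """Compute P (Eq. 6) and R (Eq. 7) from detection counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Returns 0.0 for a metric whose denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Illustrative counts only: 8 correct detections, 2 false alarms, 2 misses.
p, r = precision_recall(tp=8, fp=2, fn=2)
print(f"P = {p:.2f}, R = {r:.2f}")
```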

2.4 Training

 Fig. 4 Comparison of loss curves

 Fig. 5 Comparison of detection results before and after improving the SSD algorithm

3 Conclusion

