Computer Systems & Applications, 2018, Vol. 27, Issue 11: 142-148


Image CAPTCHA Recognition Based on Convolutional Neural Network
QIN Bo, GU Nai-Jie, ZHANG Xiao-Ci, LIN Chuan-Wen
Laboratory of Network Computing and High Efficient Algorithm, School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China;
Anhui Provincial Key Laboratory of Computing and Communication Software, Hefei 230027, China;
Institute of Advanced Technology, University of Science and Technology of China, Hefei 230027, China
Abstract: CAPTCHAs are widely used on the Internet as a security measure. This study proposes a CAPTCHA recognition method based on a convolutional neural network. By combining cascaded convolutional layers, residual learning, and global pooling, the method greatly reduces the number of network parameters without sacrificing recognition accuracy. The CAPTCHAs of a railway ticket website and of an educational administration system are used to evaluate the model. On the railway ticket CAPTCHA, the experimental results show that the method has the fewest parameters among the compared methods, with a recognition accuracy of 98.76% for images and 99.14% for Chinese phrases. On the educational system CAPTCHA, it again has the fewest parameters, with an accuracy of 87.30%.
Key words: image CAPTCHA recognition; convolutional neural network; residual learning; visualization

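1 Introduction

2 Related Work

3 Network Model Design

 Fig. 1 Network structure proposed in this paper

3.1 Cascaded Convolutional Layers

 Fig. 2 Cascaded convolutional layers

(1) Number of parameters

For a layer with $C$ input channels and $C$ output channels (biases ignored), the number of parameters of one $7 \times 7$ convolution kernel is: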

 $K{P_{7 \times 7}} = {7^2}{C^2} = 49{C^2}$ (1)

The number of parameters of three stacked $3 \times 3$ convolution kernels is:

 $3K{P_{3 \times 3}} = 3\left({3^2}{C^2}\right) = 27{C^2}$ (2)

 $RT = \frac{{K{P_{7 \times 7}}}}{{3K{P_{3 \times 3}}}} \approx 1.81$ (3)
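
As a quick arithmetic check of Eqs. (1)-(3), the following minimal Python sketch counts the weights of one $7 \times 7$ layer and of three stacked $3 \times 3$ layers. The helper name `conv_params` and the channel width `C = 64` are illustrative assumptions, not values taken from the paper; biases are ignored, as in the equations.

```python
def conv_params(ksize: int, c_in: int, c_out: int) -> int:
    """Weights in one ksize x ksize convolution layer (bias terms ignored)."""
    return ksize * ksize * c_in * c_out


C = 64  # assumed channel width for illustration; the ratio does not depend on it

kp_7x7 = conv_params(7, C, C)            # Eq. (1): 49 * C^2
kp_3x3_stack = 3 * conv_params(3, C, C)  # Eq. (2): 27 * C^2
rt = kp_7x7 / kp_3x3_stack               # Eq. (3): 49 / 27

print(kp_7x7, kp_3x3_stack, round(rt, 2))
```

Because $C^2$ cancels in the ratio, the saving of roughly 1.81 times holds for any channel width as long as the input and output widths match.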

(2) Receptive field

 $R{F_i} = \begin{cases} (R{F_{i - 1}} - 1) \times stride + ksize, & i > 0 \\ 1, & i = 0 \end{cases}$ (4)
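
The recursion in Eq. (4) can be evaluated with a few lines of Python. This is a sketch under one stated assumption: the recursion is applied from the deepest layer back toward the input, which matters only when strides differ from 1. The function name `receptive_field` and the `(ksize, stride)` tuple layout are hypothetical.

```python
def receptive_field(layers):
    """Receptive field on the input of one output unit.

    `layers` lists (ksize, stride) pairs from the input side to the output side.
    Assumption: Eq. (4) is applied from the deepest layer back to the input.
    """
    rf = 1  # RF_0 = 1: a single unit of the final feature map
    for ksize, stride in reversed(layers):
        rf = (rf - 1) * stride + ksize
    return rf


three_3x3 = receptive_field([(3, 1), (3, 1), (3, 1)])
one_7x7 = receptive_field([(7, 1)])
print(three_3x3, one_7x7)
```

With stride 1 throughout, three stacked $3 \times 3$ layers grow the receptive field as 3, 5, 7, so they cover the same $7 \times 7$ region as a single $7 \times 7$ layer.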

3.2 Residual Layers

 ${x_L} = {x_l} + \sum\limits_{i = l}^{L - 1} {F({x_i})}$ (5)

 $\frac{{\partial E}}{{\partial {x_l}}} = \frac{{\partial E}}{{\partial {x_L}}}\frac{{\partial {x_L}}}{{\partial {x_l}}} = \frac{{\partial E}}{{\partial {x_L}}}\left(1 + \frac{{\partial \displaystyle\sum\limits_{i = l}^{L - 1} {F({x_i})} }}{{\partial {x_l}}}\right)$ (6)
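
The unrolled form in Eq. (5) can be checked numerically. The sketch below chains three identity-shortcut units $x_{i+1} = x_i + F(x_i)$ on a scalar and confirms that the final output equals the input plus the sum of all residual-branch outputs. The toy branch functions and the name `residual_forward` are assumptions chosen for illustration; they are not the paper's residual modules.

```python
def residual_forward(x, residual_fns):
    """Chain units x_{i+1} = x_i + F_i(x_i); return the output and every F_i(x_i)."""
    branch_outputs = []
    for f in residual_fns:
        fx = f(x)                  # residual branch F(x_i)
        branch_outputs.append(fx)
        x = x + fx                 # identity shortcut adds the input back
    return x, branch_outputs


# Toy residual branches (hypothetical), applied to a scalar "feature".
fns = [lambda v: 0.5 * v, lambda v: v * v, lambda v: -0.25 * v]

x_l = 2.0
x_L, branches = residual_forward(x_l, fns)
print(x_L, x_l + sum(branches))
```

The constant 1 inside the parentheses of Eq. (6) comes from the same shortcut path: the gradient at $x_L$ reaches $x_l$ directly, independent of the residual branches.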

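 Fig. 3 Residual module

 Fig. 4 Grouped convolution

3.3 Classification Pooling Layer

 Fig. 5 Classification pooling layer

 Fig. 6 Global average pooling layer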

 ${y_{i,j}} = \frac{{\displaystyle\sum\limits_{k = 0}^{M - 1} {\displaystyle\sum\limits_{t = 0}^{N - 1} {{x_{i + k,j + t}}} } }}{{MN}}$ (7)
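
When the $M \times N$ window in Eq. (7) covers the whole feature map ($i = j = 0$), the operation becomes global average pooling: each feature map is reduced to a single value. A minimal pure-Python sketch follows; the function name `global_average_pool` and the nested-list layout are illustrative assumptions.

```python
def global_average_pool(feature_maps):
    """Average each M x N feature map (a list of rows) down to one value."""
    pooled = []
    for fmap in feature_maps:
        m, n = len(fmap), len(fmap[0])
        total = sum(sum(row) for row in fmap)  # double sum in Eq. (7)
        pooled.append(total / (m * n))         # divide by M * N
    return pooled


maps = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0
    [[0.0, 0.0], [0.0, 8.0]],  # channel 1
]
pooled = global_average_pool(maps)
print(pooled)  # [2.5, 2.0]
```

The pooling step itself has no trainable weights, which is consistent with the paper's goal of keeping the parameter count low.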
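
3.4 Network Structure

4 Experiments and Analysis

4.1 Platform

4.2 Datasets

(1) Ticket-booking website CAPTCHA

 Fig. 7 Ticket-booking website CAPTCHA

(2) Zhengfang educational administration system CAPTCHA

 Fig. 8 Zhengfang educational administration system CAPTCHA

4.3 Railway Ticket CAPTCHA Experiments

(1) Image CAPTCHA

(2) Chinese-phrase CAPTCHA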

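1) Crop the Chinese-character region, ChiWords, from the CAPTCHA;

2) Convert ChiWords to grayscale and binarize it to obtain BChiWords;

3) For each column of BChiWords in turn, count the number of black pixels, BNP;

4) Set thresholds $T_1$ and $T_2$. If BNP is below $T_1$, the column may be a split point; record the start and end positions $s$ and $t$ of each run of consecutive split points. If $t - s \geqslant {T_2}$, save $s$ and $t$;

5) Split the Chinese phrases according to the saved positions, fetch the next CAPTCHA, and return to step 1);

6) Stop when the termination condition is met.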
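
Steps 3) to 5) above amount to a column-projection split, which can be sketched as follows. Several details are assumptions rather than the paper's implementation: the image is taken to be already cropped and binarized (steps 1 and 2), a pixel value of 1 means black, the function names `find_gaps` and `split_phrases` are hypothetical, and the phrases are cut at the edges of each saved gap.

```python
def find_gaps(binary, t1, t2):
    """Return (s, t) column runs where BNP < t1 and t - s >= t2 (steps 3-4)."""
    width = len(binary[0])
    # BNP: number of black pixels (value 1) in each column.
    bnp = [sum(row[x] for row in binary) for x in range(width)]

    gaps, start = [], None
    for x in range(width + 1):            # one extra step closes a trailing run
        is_gap = x < width and bnp[x] < t1
        if is_gap and start is None:
            start = x                     # run of candidate split points begins
        elif not is_gap and start is not None:
            end = x - 1                   # run ended at the previous column
            if end - start >= t2:
                gaps.append((start, end))
            start = None
    return gaps


def split_phrases(binary, t1, t2):
    """Return [left, right) column ranges lying between the saved gaps (step 5)."""
    width = len(binary[0])
    segments, left = [], 0
    for s, t in find_gaps(binary, t1, t2):
        if s > left:
            segments.append((left, s))
        left = t + 1
    if left < width:
        segments.append((left, width))
    return segments


# Toy binarized strip: a wide gap (columns 3-6) and a narrow gap (column 9).
row = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1]
image = [row, row]
gaps = find_gaps(image, t1=1, t2=2)
segments = split_phrases(image, t1=1, t2=2)
print(gaps, segments)
```

In this toy strip only the wide gap satisfies $t - s \geqslant T_2$, so the narrow gap at column 9 does not cut the second phrase. The threshold values used here are illustrative only.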

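(3) Whole CAPTCHA

 Fig. 9 Overall recognition process

4.4 Zhengfang Educational Administration System CAPTCHA Experiments

4.5 Experimental Analysis

(1) Grouped convolution experiment

(2) Feature map visualization

 Fig. 10 Visualization of the last-layer feature maps

5 Conclusion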
