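1. Laboratory of Network Computing and High Efficient Algorithm, School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China;
2. Anhui Provincial Key Laboratory of Computing and Communication Software, Hefei 230027, China;
3. Institute of Advanced Technology, University of Science and Technology of China, Hefei 230027, China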

Image CAPTCHA Recognition Based on Convolutional Neural Network
QIN Bo, GU Nai-Jie, ZHANG Xiao-Ci, LIN Chuan-Wen
Laboratory of Network Computing and High Efficient Algorithm, School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China;
Anhui Provincial Key Laboratory of Computing and Communication Software, Hefei 230027, China;
Institute of Advanced Technology, University of Science and Technology of China, Hefei 230027, China
Abstract: CAPTCHAs are widely used on the Internet as a security measure. This study proposes a CAPTCHA recognition method based on a convolutional neural network. By using cascaded convolutional layers, residual learning, global pooling, and other techniques, the method greatly reduces the number of network parameters without lowering recognition accuracy. The CAPTCHAs of a railway ticketing website and of an educational administration system are used as examples to test the model. For the railway ticketing CAPTCHA, the experimental results show that the proposed method has the fewest parameters, with a recognition accuracy of 98.76% for the images and 99.14% for the Chinese phrases. For the educational administration system CAPTCHA, the method also has the fewest parameters, with an accuracy of 87.30%.
Key words: image CAPTCHA recognition     convolutional neural network     residual learning     visualization

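1 Introduction

2 Related Work

3 Network Model Design

 Fig. 1 Network structure of the proposed model

3.1 Cascaded Convolutional Layers

 Fig. 2 Cascaded convolutional layers

(1) Number of parameters

The number of parameters of one $7 \times 7$ convolution kernel, with $C$ input and $C$ output channels, is: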

 $K{P_{7 \times 7}} = {7^2}{C^2} = 49{C^2}$ (1)

The number of parameters of three $3 \times 3$ convolution kernels is:

 $3K{P_{3 \times 3}} = 3\left({3^2}{C^2}\right) = 27{C^2}$ (2)

 $RT = \frac{{K{P_{7 \times 7}}}}{{3K{P_{3 \times 3}}}} \approx 1.81$ (3)
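
As a quick check of Eqs. (1) to (3), the following sketch computes both parameter counts and their ratio. The channel count `C = 64` and the helper name `conv_params` are illustrative assumptions rather than values from the paper; biases are ignored, as in the equations.

```python
def conv_params(ksize: int, channels: int) -> int:
    """Weights of one ksize x ksize conv layer with `channels` inputs and outputs (no bias)."""
    return ksize * ksize * channels * channels


C = 64  # hypothetical channel count, chosen only for illustration

single_7x7 = conv_params(7, C)        # Eq. (1): 49 * C^2
stacked_3x3 = 3 * conv_params(3, C)   # Eq. (2): 27 * C^2
ratio = single_7x7 / stacked_3x3      # Eq. (3): 49 / 27, independent of C

print(single_7x7, stacked_3x3, round(ratio, 2))
```

The ratio does not depend on `C`, so the saving holds for any layer width as long as the input and output channel counts match.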

(2) Receptive field

 $R{F_i} = \begin{cases}(R{F_{i - 1}} - 1) \times stride + ksize, & i > 0\\ 1, & i = 0\end{cases}$ (4)
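
The recursion in Eq. (4) can be evaluated with a few lines of code. The sketch below assumes the recursion is applied from the deepest layer back towards the input, which is one common reading of this formula; the function name `receptive_field` and the `(ksize, stride)` layer encoding are hypothetical.

```python
def receptive_field(layers):
    """Receptive field of a conv stack.

    layers: list of (ksize, stride) tuples ordered from input to output.
    Applies Eq. (4) starting from RF_0 = 1, walking from the last layer
    back to the first (an assumed reading of the recursion).
    """
    rf = 1
    for ksize, stride in reversed(layers):
        rf = (rf - 1) * stride + ksize
    return rf


three_3x3 = receptive_field([(3, 1), (3, 1), (3, 1)])
one_7x7 = receptive_field([(7, 1)])
print(three_3x3, one_7x7)
```

Under these assumptions, three stacked $3 \times 3$ layers with stride 1 cover the same receptive field as a single $7 \times 7$ layer.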

3.2 Residual Layer

 ${x_L} = {x_l} + \sum\limits_{i = l}^{L - 1} {F({x_i})}$ (5)

 $\frac{{\partial E}}{{\partial {x_l}}} = \frac{{\partial E}}{{\partial {x_L}}}\frac{{\partial {x_L}}}{{\partial {x_l}}} = \frac{{\partial E}}{{\partial {x_L}}}\left(1 + \frac{{\partial \displaystyle\sum\limits_{i = l}^{L - 1} {F({x_i})} }}{{\partial {x_l}}}\right)$ (6)
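
Eqs. (5) and (6) can be checked numerically on a toy stack of residual units. This is only a sketch: the residual function `residual_branch` (a scaled `tanh`), the weights, and the input are all hypothetical stand-ins for the convolutional branches of the real network.

```python
import numpy as np


def residual_branch(x, w):
    """Hypothetical residual function F; a real network would use conv layers."""
    return w * np.tanh(x)


def forward(x_l, weights):
    """Run stacked residual units; return the output x_L and the sum of all F(x_i)."""
    x = x_l
    branch_sum = np.zeros_like(x_l)
    for w in weights:
        f = residual_branch(x, w)
        branch_sum = branch_sum + f
        x = x + f  # x_{i+1} = x_i + F(x_i)
    return x, branch_sum


weights = [0.5, -0.3, 0.8]         # illustrative values
x_l = np.array([0.2, -1.0, 1.5])   # illustrative input

# Eq. (5): the deep output is the shallow input plus all residual branches.
x_L, branch_sum = forward(x_l, weights)

# Eq. (6): central differences show d x_L / d x_l = 1 + d(sum of F) / d x_l.
eps = 1e-6
x_hi, sum_hi = forward(x_l + eps, weights)
x_lo, sum_lo = forward(x_l - eps, weights)
dxL = (x_hi - x_lo) / (2 * eps)
dsum = (sum_hi - sum_lo) / (2 * eps)
print(dxL, 1 + dsum)
```

The constant 1 in the derivative is the identity path: part of the gradient reaches $x_l$ directly, whatever the residual branches contribute.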

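 Fig. 3 Residual module

 Fig. 4 Grouped convolution

3.3 Classification Pooling Layer

 Fig. 5 Classification pooling layer

 Fig. 6 Global average pooling layer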

 ${y_{i,j}} = \frac{{\displaystyle\sum\limits_{k = 0}^{M - 1} {\displaystyle\sum\limits_{t = 0}^{N - 1} {{x_{i + k,j + t}}} } }}{{MN}}$ (7)

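
Global average pooling is the special case of Eq. (7) in which the $M \times N$ window covers the whole feature map, so each channel is reduced to a single value. A minimal NumPy sketch follows; the `(channels, M, N)` layout and the function name `global_average_pool` are assumptions made for illustration.

```python
import numpy as np


def global_average_pool(feature_maps):
    """Average each channel of a (channels, M, N) array over its full M x N extent."""
    channels, m, n = feature_maps.shape
    return feature_maps.sum(axis=(1, 2)) / (m * n)


# Two 3 x 4 feature maps holding the values 0..11 and 12..23.
fmap = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
pooled = global_average_pool(fmap)
print(pooled)
```

The layer has no trainable weights, which is consistent with the goal of keeping the parameter count low.
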
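3.4 Network Structure

4 Experiments and Analysis

4.1 Platform

4.2 Datasets

(1) Ticketing website CAPTCHA

 Fig. 7 Ticketing website CAPTCHA

(2) Zhengfang educational administration system CAPTCHA

 Fig. 8 Zhengfang educational administration system CAPTCHA

4.3 Railway Ticketing CAPTCHA Experiment

(1) Image CAPTCHA

(2) Chinese CAPTCHA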

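1) Crop the Chinese character region, ChiWords, from the CAPTCHA;

2) Convert ChiWords to grayscale and binarize it to obtain BChiWords;

3) Count the number of black pixels, BNP, in each column of BChiWords in turn;

4) Set thresholds $T_1$ and $T_2$. If BNP is less than $T_1$, the column may be a split point; record the start and end positions, $s$ and $t$, of each run of consecutive split points. If $t - s \geqslant {T_2}$, save $s$ and $t$.

5) Split the Chinese phrases according to the saved results, fetch the next CAPTCHA, and return to step 1).

6) Repeat until the termination condition is met, then stop.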
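
The column-projection steps above can be sketched as follows. This is not the authors' implementation: the binarization threshold, the values of `t1` and `t2`, the assumption that characters are dark on a light background, and the helper names `find_gaps` and `split_phrases` are all hypothetical. The input is a grayscale array that has already been cropped to the Chinese character region.

```python
import numpy as np


def find_gaps(gray, bin_thresh=128, t1=1, t2=3):
    """Return (s, t) column ranges that separate phrases (steps 2 to 4)."""
    binary = gray < bin_thresh   # True where the pixel is black (assumed ink)
    bnp = binary.sum(axis=0)     # black pixels per column (BNP)
    candidate = bnp < t1         # columns that may be split points
    gaps, s = [], None
    for col, is_gap in enumerate(candidate):
        if is_gap and s is None:
            s = col              # a run of candidate columns starts
        elif not is_gap and s is not None:
            t = col - 1          # the run ended at the previous column
            if t - s >= t2:
                gaps.append((s, t))
            s = None
    if s is not None:            # a run that reaches the right edge
        t = len(candidate) - 1
        if t - s >= t2:
            gaps.append((s, t))
    return gaps


def split_phrases(gray, gaps):
    """Return (start, end) column ranges of the phrases, end exclusive (step 5)."""
    segments, start = [], 0
    for s, t in gaps:
        if s > start:
            segments.append((start, s))
        start = t + 1
    if start < gray.shape[1]:
        segments.append((start, gray.shape[1]))
    return segments


# Synthetic 10 x 20 image: white background, ink in columns 2-6 and 12-16.
gray = np.full((10, 20), 255, dtype=np.uint8)
gray[:, 2:7] = 0
gray[:, 12:17] = 0

gaps = find_gaps(gray)
segments = split_phrases(gray, gaps)
print(gaps, segments)
```

In this sketch the narrow blank margins at the image edges are shorter than `t2`, so they are not treated as separators and stay attached to the neighbouring phrase.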

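(3) Whole CAPTCHA

 Fig. 9 Overall recognition process

4.4 Zhengfang Educational Administration System CAPTCHA Experiment

4.5 Experimental Analysis

(1) Grouped convolution experiment

(2) Feature map visualization

 Fig. 10 Visualization of the last-layer feature maps

5 Conclusion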

