Off-Line Handwritten Equation Recognition Based on Multiple Geometric Features and CNN
计算机系统应用 (Computer Systems & Applications), 2020, Vol. 29, Issue (8): 271-279

FU Peng-Bin, PENG Jing-Xuan, YANG Hui-Rong, LI Jian-Jun
Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
Foundation item: Natural Science Foundation of Beijing Municipality (4153058)
Abstract: To handle the handwritten equations with complex two-dimensional spatial structure that arise in primary and secondary school mathematics classes, this study proposes an off-line handwritten equation recognition method based on multiple geometric features and a Convolutional Neural Network (CNN). First, after image preprocessing, each handwritten character is recognized with a CNN classifier. Then, geometric features such as aspect ratio, centroid coordinates, centroid offset angle, center offset, and horizontal overlap interval ratio are used to recognize common handwritten expressions with complex spatial structure, namely decimals, fractions, exponents, and radicals, and a divide-and-conquer algorithm is used to recognize composite expressions formed by nesting and combining them. Finally, an off-line handwritten equation recognition system is designed and implemented. The experimental results show that, under certain illumination conditions, the recognition rate for handwritten equations in images of different resolutions and noise levels can reach 90.43%, which indicates practical application value.
Key words: image preprocessing     convolutional neural network     geometric features     handwritten equation recognition

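1 Related Work

1.1 Image Preprocessing

Fig. 1 Image preprocessing

1.2 Data Preparation

2 CNN-Based Character Recognition

2.1 Character Recognition

Fig. 2 CNN model

2.2 Experimental Tests and Analysis

Fig. 3 Character classification accuracy on the EQU-MNIST training set

3 Handwritten Equation Recognition Based on Multiple Geometric Features

Fig. 4 Flowchart of the handwritten equation recognition algorithm

Fig. 5 Sample composite handwritten equations

3.1 Multiple Geometric Feature Extraction

Fig. 6 Geometric features of a character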

$M_{pq} = \sum\limits_x \sum\limits_y g(x,y) \cdot x^p \cdot y^q$ (1)

$(x_{\rm mass}, y_{\rm mass}) = (M_{10}/M_{00},\; M_{01}/M_{00})$ (2)

$\alpha = \arctan\left(\dfrac{y2_{\rm mass} - y1_{\rm mass}}{x2_{\rm mass} - x1_{\rm mass}}\right)$ (3)

Fig. 7 Geometric model of a character pair
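
Equations (1) to (3) can be sketched in Python with NumPy. This is a minimal illustration rather than the authors' implementation: it assumes g is a binarized character image stored as a 2-D array with x as the column index and y as the row index, and the helper names raw_moment, centroid, and centroid_offset_angle are hypothetical. The guard for a vertical centroid pair (equal x coordinates) is an added assumption, because Eq. (3) is undefined in that case.

```python
import math

import numpy as np


def raw_moment(g, p, q):
    """Eq. (1): M_pq = sum_x sum_y g(x, y) * x**p * y**q (x = column, y = row)."""
    ys, xs = np.mgrid[0:g.shape[0], 0:g.shape[1]]
    return float(np.sum(g * (xs ** p) * (ys ** q)))


def centroid(g):
    """Eq. (2): centroid (x_mass, y_mass) = (M10 / M00, M01 / M00)."""
    m00 = raw_moment(g, 0, 0)
    if m00 == 0:
        raise ValueError("empty image: M00 is zero")
    return raw_moment(g, 1, 0) / m00, raw_moment(g, 0, 1) / m00


def centroid_offset_angle(c1, c2):
    """Eq. (3): angle of the line joining centroid c1 to centroid c2."""
    dx = c2[0] - c1[0]
    dy = c2[1] - c1[1]
    if dx == 0:
        # Assumption: treat a vertical pair as +/- pi/2 (0 if the points coincide).
        return math.copysign(math.pi / 2, dy) if dy != 0 else 0.0
    return math.atan(dy / dx)


# Four foreground pixels placed symmetrically around (2, 2).
g = np.zeros((5, 5))
g[1, 1] = g[1, 3] = g[3, 1] = g[3, 3] = 1
print(centroid(g))
```

Because image rows grow downward, a character that sits above and to the right of its neighbour yields a negative angle under this literal reading of Eq. (3); any sign convention used with later thresholds has to be chosen consistently.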

$\delta = \dfrac{y2_{\rm center} - y1_{\rm center}}{H1}$ (4)

$HOR = \dfrac{\mathrm{len}([\max(x1_{\min}, x2_{\min}),\; \min(x1_{\max}, x2_{\max})])}{\mathrm{len}([x1_{\min}, x1_{\max}])}$ (5)
$VOR = \dfrac{\mathrm{len}([\max(y1_{\min}, y2_{\min}),\; \min(y1_{\max}, y2_{\max})])}{\mathrm{len}([y1_{\min}, y1_{\max}])}$ (6)
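
The center offset of Eq. (4) and the overlap ratios of Eqs. (5) and (6) depend only on the bounding boxes of the two characters, so they can be illustrated with a few lines of Python. The Box type and the function names below are hypothetical. The sketch also assumes that H1 is the bounding-box height of character 1 and that the length of an empty overlap interval is 0.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box of one character (hypothetical helper type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def height(self):
        return self.y_max - self.y_min

    @property
    def y_center(self):
        return (self.y_min + self.y_max) / 2


def interval_overlap(a_min, a_max, b_min, b_max):
    """Length of the intersection of [a_min, a_max] and [b_min, b_max], 0 if disjoint."""
    return max(0.0, min(a_max, b_max) - max(a_min, b_min))


def center_offset(b1, b2):
    """Eq. (4): vertical center difference, normalized by the height of character 1."""
    return (b2.y_center - b1.y_center) / b1.height


def hor(b1, b2):
    """Eq. (5): horizontal overlap length relative to the width of character 1."""
    return interval_overlap(b1.x_min, b1.x_max, b2.x_min, b2.x_max) / (b1.x_max - b1.x_min)


def vor(b1, b2):
    """Eq. (6): vertical overlap length relative to the height of character 1."""
    return interval_overlap(b1.y_min, b1.y_max, b2.y_min, b2.y_max) / (b1.y_max - b1.y_min)


b1 = Box(0, 0, 10, 10)
b2 = Box(5, 5, 15, 25)
print(center_offset(b1, b2), hor(b1, b2), vor(b1, b2))
```

Both ratios are normalized by character 1 only, so they are not symmetric: hor(b1, b2) and hor(b2, b1) generally differ.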

$(X_{\rm center}, Y_{\rm center}) = \left(\dfrac{\min(x1_{\min}, \cdots, xn_{\min}) + \max(x1_{\max}, \cdots, xn_{\max})}{2},\; \dfrac{\min(y1_{\min}, \cdots, yn_{\min}) + \max(y1_{\max}, \cdots, yn_{\max})}{2}\right)$ (7)
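
Equation (7) is the center of the rectangle that encloses all n characters of a group. A minimal Python sketch, assuming each character is given as a hypothetical (x_min, y_min, x_max, y_max) tuple:

```python
def group_center(boxes):
    """Eq. (7): center of the bounding rectangle enclosing every box in `boxes`.

    Each box is an (x_min, y_min, x_max, y_max) tuple (assumed layout).
    """
    if not boxes:
        raise ValueError("at least one bounding box is required")
    x_center = (min(b[0] for b in boxes) + max(b[2] for b in boxes)) / 2
    y_center = (min(b[1] for b in boxes) + max(b[3] for b in boxes)) / 2
    return x_center, y_center


print(group_center([(0, 0, 4, 6), (6, 2, 10, 10)]))
```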
3.2 Decimal and Fraction Recognition

$\left\{\begin{split} &(1)\; y2_{\min} > y1_{\rm center}\;{\rm and}\; y2_{\min} > y3_{\rm center}\\ &(2)\; y2_{\min} > (y1_{\min} + y3_{\min})/2\\ &(3)\; H2 < (H1/2 + H3/2)/2\\ &(4)\; \text{both adjacent characters are digits} \end{split}\right.$

Fig. 8 Flowchart of the decimal recognition algorithm
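
The four decimal-point conditions above translate directly into a boolean test. The sketch below is illustrative only: it assumes image coordinates in which y grows downward, takes character 2 as the candidate point with characters 1 and 3 as its left and right neighbours, and uses hypothetical (x_min, y_min, x_max, y_max) tuples and a hypothetical function name.

```python
def is_decimal_point(prev_box, dot_box, next_box, prev_is_digit, next_is_digit):
    """Check conditions (1)-(4) for the middle character being a decimal point."""
    _, y1_min, _, y1_max = prev_box
    _, y2_min, _, y2_max = dot_box
    _, y3_min, _, y3_max = next_box
    h1, h2, h3 = y1_max - y1_min, y2_max - y2_min, y3_max - y3_min
    y1_center = (y1_min + y1_max) / 2
    y3_center = (y3_min + y3_max) / 2
    return (
        y2_min > y1_center and y2_min > y3_center  # (1) lies below both centers
        and y2_min > (y1_min + y3_min) / 2         # (2) lies below the mean top edge
        and h2 < (h1 / 2 + h3 / 2) / 2             # (3) much shorter than the neighbours
        and prev_is_digit and next_is_digit        # (4) both neighbours are digits
    )


# A small mark near the baseline between two digits.
print(is_decimal_point((0, 0, 10, 20), (12, 17, 14, 19), (16, 0, 26, 20), True, True))
```

A minus sign written at mid-height fails condition (1), which is what separates it from a decimal point in this test.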

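Step 1. Traverse the original recognition list oriList in reverse order and record the end index endIndex and the start index startIndex of the mixed fraction's integer coefficient in the list.

Step 2. Using these indices, extract the integer coefficient list of the mixed fraction from oriList and convert it to the integer string integerStr according to the label-to-character mapping in Table 1.

Step 3. Based on the end index endIndex, extract the proper-fraction string fractionStr of the mixed fraction from oriList.

Step 4. Add the integer coefficient string and the proper-fraction string to produce the result string resultStr="("+integerStr+"+"+fractionStr+")".

Fig. 9 Flowchart of the fraction recognition algorithm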
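
Steps 1 to 4 for mixed fractions can be sketched as follows. The data layout is an assumption made for illustration: oriList is taken to hold integer class labels for single characters plus an already assembled proper-fraction string at a known position frac_index, and label_to_char stands in for the mapping of Table 1. The function name and the frac_index parameter are hypothetical.

```python
def mixed_fraction_to_str(ori_list, frac_index, label_to_char):
    """Combine the integer coefficient before a fraction with the fraction itself."""
    # Step 1: walk backwards from the fraction to find the run of digit labels.
    end_index = frac_index - 1
    start_index = frac_index
    while start_index > 0 and label_to_char.get(ori_list[start_index - 1], "").isdigit():
        start_index -= 1
    # Step 3: the proper fraction sits right after the coefficient.
    fraction_str = ori_list[frac_index]
    if start_index > end_index:
        return fraction_str  # no integer coefficient in front of the fraction
    # Step 2: map the coefficient labels to characters.
    integer_str = "".join(label_to_char[label] for label in ori_list[start_index:end_index + 1])
    # Step 4: the mixed fraction is the sum of both parts.
    return "(" + integer_str + "+" + fraction_str + ")"


# Hypothetical mapping: labels 0-9 are digits, label 10 is "+".
label_to_char = {i: str(i) for i in range(10)}
label_to_char[10] = "+"
print(mixed_fraction_to_str([1, 10, 2, 3, "(1/2)"], 4, label_to_char))
```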

3.3 Optimized Exponent and Radical Recognition

$\left\{\begin{split} &(1)\; y2_{\min} < y1_{\min} - \tfrac{1}{7} H2\\ &(2)\; y2_{\min} + H2 < y1_{\min} + \tfrac{4}{7} H1\\ &(3)\; 2/7 < HR < 6/7\\ &(4)\; \pi/12 < \alpha < 5\pi/12 \end{split}\right.$
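
The exponent conditions can likewise be written as a single predicate. This is a sketch under stated assumptions, not the paper's isexponential routine: character 1 is the base and character 2 the exponent candidate, boxes are hypothetical (x_min, y_min, x_max, y_max) tuples in image coordinates (y grows downward), HR is taken to be the height ratio H2/H1, and alpha is passed in by the caller, measured so that a character above and to the right of the base gives a positive angle.

```python
import math


def satisfies_exponent_conditions(base_box, cand_box, alpha):
    """Check conditions (1)-(4) for cand_box being an exponent of base_box."""
    _, y1_min, _, y1_max = base_box
    _, y2_min, _, y2_max = cand_box
    h1 = y1_max - y1_min
    h2 = y2_max - y2_min
    hr = h2 / h1  # assumption: HR is the height ratio H2 / H1
    return (
        y2_min < y1_min - h2 / 7                     # (1) top edge clearly above the base
        and y2_min + h2 < y1_min + 4 * h1 / 7        # (2) bottom edge in the base's upper part
        and 2 / 7 < hr < 6 / 7                       # (3) noticeably smaller than the base
        and math.pi / 12 < alpha < 5 * math.pi / 12  # (4) positioned diagonally up-right
    )


print(satisfies_exponent_conditions((0, 20, 20, 60), (22, 5, 32, 25), math.pi / 4))
```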

Input: charImages, charLabels

Output: exponentialresult

baseLabel = []   # labels of the base characters
indexLabel = []  # labels of the exponent characters
i = 1            # start from the second character
while i < len(charImages):
    priorCharImage = charImages[i - 1]
    # Step 1: collect the labels of the exponent characters that follow priorCharImage
    tempI = i
    while tempI < len(charImages) and isexponential(priorCharImage, charImages[tempI]):
        indexLabel.append(charLabels[tempI])
        tempI = tempI + 1
    # Step 2: when an exponent run was found, the characters before it form the base
    if tempI != i:
        baseLabel.extend(charLabels[0:i])
        i = tempI - 1
    i = i + 1
# Step 3: assemble the exponent recognition result string
exponentialresult = ["pow(" + list2str(baseLabel) + "," + list2str(indexLabel) + ")"]

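Step 1. Use a corner detection algorithm to obtain the corner points of the character and number them (as shown in Fig. 10(a)). If there are fewer than 4 corner points, result=False; otherwise, go to Step 2.

Step 2. Define the effective region of the semi-enclosed structure from corner points ①, ②, and ③ of the radical sign (as shown in Fig. 10(b)).

Step 3. Use $HOR$ and $VOR$ to determine whether any other character lies inside the effective region of the semi-enclosed structure. If none does, result=False; otherwise, result=True.

Step 1. Traverse the character image set images and use Algorithm 2 to determine whether a radical sign is present. If so, go to Step 2; otherwise, return the radical recognition result string radicalStr.

Step 2. Extract the root index image based on corner points ② and ③ of the radical sign, classify it, and convert it to the root index string indexStr.

Step 3. Following the rule for converting a radical to a power, if indexStr is empty, set indexStr="1/2"; otherwise, set indexStr="1/"+indexStr.

Step 4. Using the effective region of the radical sign's semi-enclosed structure together with $HOR$ and $VOR$, extract the radicand image, classify it, and convert it to the radicand string baseStr.

Fig. 10 Structural features of the radical sign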
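
The radical-to-power rule in Step 3 of the radical recognition procedure is simple enough to show in isolation. The sketch below assumes that baseStr and indexStr have already been recognized. How the final radicalStr is assembled is not stated in the steps, so the "pow(base,index)" format here is an assumption borrowed from the exponent result string, and the function name is hypothetical.

```python
def radical_to_pow(base_str, index_str=""):
    """Rewrite a radical as a power: an empty root index means a square root."""
    if index_str == "":
        index_str = "1/2"
    else:
        index_str = "1/" + index_str
    # Assumed output format, mirroring the exponent result string.
    return "pow(" + base_str + "," + index_str + ")"


print(radical_to_pow("16"))       # square root of 16
print(radical_to_pow("27", "3"))  # cube root of 27
```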

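Step 1. Traverse the original list oriList in reverse order and record the end index endIndex and the start index startIndex of the coefficient in the list.

Step 2. Using these indices, extract the coefficient list from oriList and convert it to the coefficient string coefficientStr according to the label-to-character mapping in Table 1.

3.4 Composite Equation Recognition

Fig. 11 Composite equation recognition

4 Experimental Tests and Analysis

4.1 Experimental Environment and Data

4.2 Experimental Results and Analysis

$p = \dfrac{TN}{TN + FN} \times 100\%$ (8)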
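
As a worked illustration of Eq. (8), the recognition rate is one count divided by the sum of both counts, expressed as a percentage. The sketch assumes that TN is the number of correctly recognized equations and FN the number of incorrectly recognized ones. The function name and the sample counts are made up for the example and are not experimental data.

```python
def recognition_rate(tn, fn):
    """Eq. (8): p = TN / (TN + FN) * 100, returned as a percentage."""
    total = tn + fn
    if total == 0:
        raise ValueError("TN + FN must be positive")
    return 100.0 * tn / total


# Illustrative counts only: 3 correct and 1 incorrect gives 75 %.
print(recognition_rate(3, 1))
```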

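Fig. 12 Effect of image resolution on the equation recognition rate

Fig. 13 Effect of illumination on the equation recognition rate

Fig. 14 Effect of noise on the equation recognition rate

Fig. 15 Screenshots of the recognition results of the three systems

5 Conclusion and Outlook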
