Cigarette Laser Code Recognition Based on Dual-state Asymmetric Network
Authors: Liang Shangrong, Wang Huiqin, Ma Qi, Wang Ke, Wen Yudong
Funding:

Science and Technology Project of Xianyang Company, Shaanxi Tobacco Company (2022610425240008)


    Abstract:

    Cigarette laser code recognition is an important tool for tobacco inspection. This study proposes a cigarette code recognition method based on a dual-state asymmetric network. To address the weak generalization caused by insufficient training samples of distorted cigarette codes, a nonlinear local augmentation (NLA) method is designed: controllable reference points placed at the edges of cigarette code images drive spatial transformations that generate effective distorted training samples, enhancing the generalization ability of the model. To address the low recognition accuracy caused by the similarity between cigarette codes and their background patterns, a dual-state asymmetric network (DSANet) is proposed, which divides the convolutional layers of the CRNN into a training mode and a deployment mode. The training mode introduces asymmetric convolution to optimize the feature weight distribution and strengthen the extraction of key features. To preserve real-time performance, the deployment mode applies BN fusion and branch fusion: by computing fused weights and initializing convolution kernels, the convolutional layers are equivalently converted back to the original network structure, reducing user-side inference time. Finally, a self-attention mechanism is introduced into the recurrent layers, dynamically reweighting sequence features to further strengthen the extraction of cigarette code features. Comparative experiments show that the proposed method achieves higher recognition accuracy and speed, with recognition accuracy reaching 87.34%.
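The deployment-mode BN fusion described above is the standard conv–BN folding identity: scaling the convolution weights by γ/√(σ²+ε) and adjusting the bias absorbs the batch-norm layer into the convolution with no change in output. A minimal pure-Python sketch for a single-channel 1D case (function names are illustrative, not taken from the paper):

```python
import math

def conv1d(x, w, b):
    """Valid 1D convolution of signal x with kernel w and bias b."""
    k = len(w)
    return [sum(w[j] * x[i + j] for j in range(k)) + b
            for i in range(len(x) - k + 1)]

def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch norm with fixed (running) statistics."""
    s = gamma / math.sqrt(var + eps)
    return [s * (v - mean) + beta for v in y]

def fuse_conv_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold the BN parameters into the convolution weights and bias."""
    s = gamma / math.sqrt(var + eps)
    w_fused = [s * wi for wi in w]
    b_fused = s * (b - mean) + beta
    return w_fused, b_fused

# The fused convolution reproduces conv followed by BN.
x = [0.5, -1.0, 2.0, 3.0, -0.5]
w, b = [0.2, -0.1, 0.4], 0.3
gamma, beta, mean, var = 1.5, -0.2, 0.1, 0.8

y_ref = batchnorm(conv1d(x, w, b), gamma, beta, mean, var)
w_f, b_f = fuse_conv_bn(w, b, gamma, beta, mean, var)
y_fused = conv1d(x, w_f, b_f)
assert all(abs(a - c) < 1e-9 for a, c in zip(y_ref, y_fused))
```

Because the folding is exact, the deployed network needs no BN layers at all, which is where the inference-time saving comes from.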
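Branch fusion for the asymmetric convolutions can likewise be carried out purely on the kernels: since convolution is linear in the kernel, a 1×3 and a 3×1 branch can be zero-padded into the centre of a 3×3 kernel and summed with the square branch, yielding one convolution that reproduces the three-branch sum exactly (ACNet-style re-parameterization). A small sketch under those assumptions:

```python
def conv2d(x, w):
    """Valid 2D cross-correlation of image x with kernel w (lists of lists)."""
    kh, kw = len(w), len(w[0])
    H, W = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(w[a][b] * x[i + a][j + b]
                 for a in range(kh) for b in range(kw))
             for j in range(W)] for i in range(H)]

def embed(k, size=3):
    """Zero-pad a 1x3 or 3x1 kernel into the centre of a size x size kernel."""
    out = [[0.0] * size for _ in range(size)]
    r0, c0 = (size - len(k)) // 2, (size - len(k[0])) // 2
    for i, row in enumerate(k):
        for j, v in enumerate(row):
            out[r0 + i][c0 + j] = v
    return out

# Square, horizontal, and vertical branch kernels.
k33 = [[0.1, -0.2, 0.0], [0.3, 0.5, -0.1], [0.0, 0.2, 0.4]]
k13 = [[-0.3, 0.6, 0.1]]          # 1x3 branch
k31 = [[0.2], [-0.4], [0.5]]      # 3x1 branch

# Branch fusion: embed the asymmetric kernels and add all three elementwise.
e13, e31 = embed(k13), embed(k31)
fused = [[k33[i][j] + e13[i][j] + e31[i][j] for j in range(3)]
         for i in range(3)]

x = [[float(i * 4 + j) for j in range(4)] for i in range(4)]
lhs = conv2d(x, fused)  # single fused convolution
rhs = [[a + b + c for a, b, c in zip(r1, r2, r3)]  # sum of the branches
       for r1, r2, r3 in zip(conv2d(x, k33), conv2d(x, e13), conv2d(x, e31))]
max_diff = max(abs(a - b) for r1, r2 in zip(lhs, rhs) for a, b in zip(r1, r2))
assert max_diff < 1e-9
```

This is why the deployment mode can collapse the training-time branches back into the original single-kernel structure without any accuracy loss.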
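The self-attention added to the recurrent layers reweights per-timestep sequence features by their pairwise similarity. A minimal scaled dot-product sketch with identity Q/K/V projections (a simplification for illustration; in the actual model these projections would be learned):

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """Scaled dot-product self-attention over a sequence of feature vectors.
    Queries, keys, and values are the inputs themselves (identity projections)."""
    d = len(seq[0])
    out = []
    for q in seq:
        # Similarity of this timestep to every timestep, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
                  for key in seq]
        weights = softmax(scores)  # dynamic weights over the sequence
        out.append([sum(w * v[k] for w, v in zip(weights, seq))
                    for k in range(d)])
    return out

# Toy sequence of 4 timesteps with 3-dimensional features.
seq = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
attended = self_attention(seq)
```

Each output vector is a convex combination of the input features, so timesteps that resemble code strokes can reinforce each other while background-like timesteps are down-weighted.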

Cite this article:

Liang SR, Wang HQ, Ma Q, Wang K, Wen YD. Cigarette laser code recognition based on dual-state asymmetric network. Computer Systems & Applications, 2025, 34(1): 211–222.

History
  • Received: 2024-06-03
  • Revised: 2024-06-28
  • Published online: 2024-11-15