Computer Systems Applications, 2024, 33(6): 223-231
Degraded Calligraphic Document Binarization Based on Multidimensional Side Window Clustering Segmentation
(1. School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, China; 2. School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China; 3. Nanjing Technology R&D Center, Jiangsu Shao Er Chun Internet Education Technology Co., Ltd., Nanjing 210031, China)
Received: December 20, 2023    Revised: January 17, 2024
Abstract: Calligraphic document images captured under poor lighting show large variations in gray-level distribution: low-light regions have low contrast, and the morphological texture features of the strokes degrade. Traditional methods typically consider only local statistics such as mean, variance, and entropy, and pay little attention to morphological texture, which makes them insensitive to features in low-contrast regions. To address this problem, this paper proposes CS-SWF (clustering segmentation based SWF), a binarization method for degraded calligraphic documents based on multidimensional side window clustering segmentation. The method first uses multidimensional side-window filter (SWF) kernels to describe blocks of pixels with similar morphological features. It then applies several correction rules that use downsampling to extract low-dimensional information and correct the feature regions. Finally, the clustered blocks in the feature map are separated into foreground and background to produce the binarized result. The proposed method is compared with existing methods using F-measure (FM), peak signal-to-noise ratio (PSNR), and distance reciprocal distortion (DRD) as metrics. Experiments on a self-built dataset of 100 degraded handwritten document images show that the proposed method gives relatively stable binarization in low-contrast dark regions and outperforms the compared algorithms in accuracy and robustness.
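Two of the metrics named in the abstract, FM and PSNR, have standard definitions for binary images and can be sketched in a few lines. The following is a minimal illustration, not the authors' evaluation code: the function names `f_measure` and `psnr` are hypothetical, foreground (ink) pixels are assumed to be marked `True`, and the PSNR peak value is assumed to be 1 because the images are binary. DRD is omitted because it requires a weighted neighborhood computation that the abstract does not describe.

```python
import numpy as np

def f_measure(pred, gt):
    """F-measure of a binarization result against ground truth.

    Assumption: foreground (ink) pixels are True in both arrays.
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = np.sum(pred & gt)    # ink correctly detected
    fp = np.sum(pred & ~gt)   # background marked as ink
    fn = np.sum(~pred & gt)   # ink that was missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))

def psnr(pred, gt):
    """PSNR in dB between two binary images (assumed peak value 1)."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    mse = np.mean(pred != gt)  # fraction of mismatched pixels
    if mse == 0:
        return float("inf")
    return float(10 * np.log10(1.0 / mse))

# Toy 2x4 example: one ink pixel found, one missed, one false alarm.
gt = np.array([[1, 1, 0, 0], [0, 0, 0, 0]], dtype=bool)
pred = np.array([[1, 0, 1, 0], [0, 0, 0, 0]], dtype=bool)
print(f_measure(pred, gt), psnr(pred, gt))
```

Higher FM and PSNR indicate a result closer to the ground truth. FM is often reported as a percentage, in which case the value above would be multiplied by 100.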
Funding: Intelligent Evaluation Platform for Calligraphy Teaching in Primary and Secondary Schools (SRC202201)
Citation:
XU Zhan-Yang, ZHANG Jia-Rui, SHI Hong-Yan, QIN Fei-Yang, LIN Wei. Degraded Calligraphic Document Binarization Based on Multidimensional Side Window Clustering Segmentation. Computer Systems Applications, 2024, 33(6): 223-231