Interferential Line Elimination in Document Image Based on Greedy Algorithm
    Abstract:

    Documents often contain horizontal rules, hand-drawn lines, and similar marks that serve various special purposes. When such documents are scanned into a computer and must be recognized and converted to text, these lines interfere with OCR and lower the recognition rate of the document content. This study proposes a new algorithm for removing interference lines from document images. The image is first binarized, with the binarization taking uneven illumination into account; the foreground is then thinned to single-pixel width to reduce line thickness. Next, an improved greedy algorithm computes weights for the horizontal and vertical line segments, and segments with higher weights are identified as interference lines. Finally, each foreground pixel is evaluated according to its distance from the interference lines, yielding a complete restored document image. Simulation results show that the proposed algorithm removes interference lines effectively, especially where the lines touch the text, with little impact on document image quality, and that it achieves higher computing speed and a better removal effect. The result provides a good basis for subsequent OCR of the images.
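The pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the weight function, the thresholds, or the "improved" part of the greedy search, so everything here is an assumption. The function names (`binarize_local_mean`, `trace_right`, `find_horizontal_lines`, `remove_lines`) and the parameters (`radius`, `offset`, `min_weight`, `tol`) are hypothetical. The sketch uses a local-mean threshold as one common way to cope with uneven illumination, takes the length of a greedily traced path as the segment weight, handles horizontal lines only, skips the thinning step, and makes no attempt to preserve text strokes that cross a line.

```python
from typing import List, Set, Tuple

Grid = List[List[int]]
Pixel = Tuple[int, int]  # (row, column)


def binarize_local_mean(gray: Grid, radius: int = 1, offset: int = 10) -> Grid:
    """Mark a pixel as foreground (1) when it is darker than the mean of its
    neighbourhood by more than `offset` (assumed stand-in for the paper's
    illumination-aware binarization)."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            if gray[y][x] < sum(window) / len(window) - offset:
                out[y][x] = 1
    return out


def trace_right(binary: Grid, y: int, x: int, visited: Set[Pixel]) -> List[Pixel]:
    """Greedily follow unvisited foreground pixels one column at a time,
    preferring to continue straight, then one row up, then one row down."""
    h, w = len(binary), len(binary[0])
    path = [(y, x)]
    while x + 1 < w:
        for ny in (y, y - 1, y + 1):
            if 0 <= ny < h and binary[ny][x + 1] and (ny, x + 1) not in visited:
                y = ny
                break
        else:
            break  # no foreground neighbour in the next column: path ends
        x += 1
        path.append((y, x))
    return path


def find_horizontal_lines(binary: Grid, min_weight: int) -> List[List[Pixel]]:
    """Trace a path from every unvisited foreground pixel, scanning columns
    left to right, and keep paths whose weight (length) reaches `min_weight`."""
    h, w = len(binary), len(binary[0])
    visited: Set[Pixel] = set()
    lines = []
    for x in range(w):
        for y in range(h):
            if binary[y][x] and (y, x) not in visited:
                path = trace_right(binary, y, x, visited)
                visited.update(path)
                if len(path) >= min_weight:
                    lines.append(path)
    return lines


def remove_lines(binary: Grid, lines: List[List[Pixel]], tol: int = 0) -> Grid:
    """Clear every pixel within `tol` rows of a detected line path."""
    out = [row[:] for row in binary]
    h = len(binary)
    for path in lines:
        for y, x in path:
            for ny in range(max(0, y - tol), min(h, y + tol + 1)):
                out[ny][x] = 0
    return out
```

A caller would binarize the grayscale grid, pass the result to `find_horizontal_lines` with a `min_weight` chosen relative to the page width, and then call `remove_lines`. A larger `tol` would be needed if the foreground has not been thinned first, and vertical lines could be handled by running the same functions on the transposed grid.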

Get Citation

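Wang Ping, Zhang Xiaofeng, Wang Yihuai, Cheng Rengui. Interferential Line Elimination in Document Image Based on Greedy Algorithm. Computer Systems & Applications, 2019, 28(11): 238-244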

History
  • Received: March 29, 2019
  • Revised: April 26, 2019
  • Online: November 08, 2019