Abstract: To quickly identify image files containing printed Chinese text within large batches of image files, so that they can be passed to Optical Character Recognition (OCR) in practical application scenarios, this study proposes a recognition technique for printed Chinese text image files that handles tilt angles between -90° and 90°. The technique is based on the Stroke Width Transform (SWT) algorithm, adds heuristic rules designed around the inherent characteristics of Chinese text, and combines horizontal projection with the discrete Fourier transform. Experimental results on a test set of 1606 image files show an overall F-measure of 0.95 for text image file recognition and an average recognition time of 0.65 s.
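As a rough illustration of why horizontal projection and the discrete Fourier transform pair naturally for text detection, the sketch below computes a row-wise projection profile of a binarized image and uses the DFT to find its dominant period. Regularly spaced text lines produce a strongly periodic profile, whereas non-text images generally do not. The abstract does not say how the two steps are actually combined, so this is a minimal sketch under assumptions rather than the proposed method: the function names `horizontal_projection` and `dominant_period`, the synthetic page, and its dimensions are all hypothetical.

```python
import numpy as np

def horizontal_projection(binary):
    """Count foreground (text) pixels in each row of a 0/1 image."""
    return binary.sum(axis=1).astype(float)

def dominant_period(profile):
    """Estimate the dominant period (in rows) of a projection profile.

    Assumption: the strongest non-zero frequency bin of the DFT
    corresponds to the text line spacing.
    """
    centered = profile - profile.mean()       # remove the zero-frequency (DC) term
    spectrum = np.abs(np.fft.rfft(centered))  # magnitude spectrum of the real signal
    k = int(np.argmax(spectrum[1:])) + 1      # skip bin 0, which is now ~0
    return len(profile) / k

# Hypothetical synthetic page: 120 rows, one 6-row "text line" every 12 rows.
page = np.zeros((120, 80), dtype=np.uint8)
for top in range(0, 120, 12):
    page[top:top + 6, 10:70] = 1

profile = horizontal_projection(page)
period = dominant_period(profile)
print(period)
```

On this synthetic input the estimated period matches the 12-row line spacing used to build the page. A real pipeline would also need binarization and, for tilted pages, some form of rotation handling, both of which are outside this sketch.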