Abstract: Text detection in natural scenes is of great significance for the retrieval and management of large volumes of visual information such as video and images. To address the complex backgrounds, low resolution, and random text distribution that make detection in natural scenes difficult, a scene text detection method is proposed that combines the maximally stable extremal regions (MSER) algorithm with convolutional deep belief networks (CDBN). In this method, candidate text regions extracted by MSER are fed into the convolutional deep belief network for feature extraction, and the resulting features are then classified by a Softmax classifier. Experiments were carried out on the ICDAR and SVT datasets, and the results show that the proposed method helps improve the precision and recall of scene text detection.
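
As an illustration of the final classification step only, the following is a minimal sketch of how a Softmax classifier could label one candidate region as text or non-text from its feature vector. It is not the authors' implementation: the function names, the three-dimensional feature vector, and the weights and biases are all hypothetical values chosen for the example, and the MSER and CDBN stages are assumed to have already produced the features.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """Return (predicted class index, class probabilities) for one feature vector."""
    # One linear score per class: dot(weight_row, features) + bias.
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    return probs.index(max(probs)), probs


# Hypothetical inputs: a made-up feature vector for one candidate region and
# made-up parameters for two classes (0 = non-text, 1 = text).
WEIGHTS = [[-1.0, 0.5, 0.0], [1.0, -0.5, 0.2]]
BIASES = [0.0, 0.0]
label, probs = classify([2.0, 1.0, 0.5], WEIGHTS, BIASES)
```

In a full pipeline the weights would be learned from labeled text and non-text regions rather than set by hand, and each candidate region from the MSER stage would be passed through `classify` in turn.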