
Computer Systems Applications, 2020, 29(12): 257-262


Arbitrary Shape Scene Text Detection Based on Segmentation

CAI Xin-Xin, WANG Min

(College of Computer and Information, Hohai University, Nanjing 211100, China)

Received: May 1, 2020; Revised: May 27, 2020

Abstract: With the development of deep learning, the performance of natural scene text detection has improved significantly. Nonetheless, two main challenges remain: the trade-off between speed and accuracy, and the detection of arbitrary-shaped text instances. In this study, we propose a segmentation-based method that detects arbitrary-shaped scene text efficiently and accurately. Specifically, we use a segmentation head with low computational cost together with simple, efficient post-processing. The segmentation head consists of a Feature Pyramid Enhancement Module (FPEM) and a Feature Fusion Module (FFM). The FPEM introduces multi-level information to guide better segmentation. The FFM integrates the features produced by FPEMs of different depths into a final feature for segmentation. We also use a Differentiable Binarization (DB) module, which performs the binarization step inside the segmentation network. Optimized jointly with the DB module, the segmentation network sets the binarization thresholds adaptively when converting the probability map produced by segmentation into text regions, which both simplifies post-processing and improves text detection performance. On the standard datasets ICDAR2015 and Total-Text, the proposed method, using a lightweight backbone network such as ResNet18, achieves comparable results in both speed and accuracy.

Keywords: natural scene text detection; segmentation; feature pyramid enhancement module; feature fusion module; differentiable binarization module
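
The abstract describes differentiable binarization only in words. As a rough illustration, the sketch below contrasts a fixed hard threshold with a per-pixel soft threshold. The formula B = 1 / (1 + exp(-k (P - T))) and the amplification factor k = 50 are assumptions taken from the commonly cited formulation of differentiable binarization, not from this abstract, and the function names and the small 2x2 maps are hypothetical.

```python
import math

def hard_binarization(prob_map, thresh=0.3):
    """Fixed-threshold binarization: a step function, not differentiable."""
    return [[1.0 if p >= thresh else 0.0 for p in row] for row in prob_map]

def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Soft binarization with a per-pixel threshold.

    Assumed form: B = 1 / (1 + exp(-k * (P - T))), where P is the
    probability map, T the (learned) threshold map, and k a steepness
    factor (k = 50 is an assumption, not a value from the abstract).
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]

# Hypothetical 2x2 probability and threshold maps, values in [0, 1].
prob = [[0.9, 0.2], [0.55, 0.5]]
thresh = [[0.3, 0.3], [0.6, 0.4]]

hard = hard_binarization(prob)
soft = differentiable_binarization(prob, thresh)
```

With a fixed threshold of 0.3, the pixel with probability 0.55 counts as text. With the adaptive threshold of 0.6 at that location, its soft value falls below 0.5, which is the kind of per-pixel adjustment an adaptive threshold makes possible. Because the soft function is smooth, its gradient can flow back into both maps during training.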


Author Name	Affiliation	E-mail
CAI Xin-Xin	College of Computer and Information, Hohai University, Nanjing 211100, China	2360866893@qq.com
WANG Min	College of Computer and Information, Hohai University, Nanjing 211100, China


Cite this article:
CAI Xin-Xin, WANG Min. Arbitrary Shape Scene Text Detection Based on Segmentation. Computer Systems Applications, 2020, 29(12): 257-262