Computer Systems Applications, 2023, 32(4): 317-328

Spatial-temporal Progressive Learning for Video Salient Object Detection

WANG Xing-Chi (王星驰)¹, LI Jun-Xia (李军侠)²

(1. School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China; 2. School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China)

Received: August 24, 2022; Revised: September 27, 2022

Abstract: Video salient object detection (VSOD) must combine spatial and temporal information to continuously locate motion-related salient objects in a video sequence. The core problem is how to efficiently characterize the spatial-temporal features of moving objects. Most existing VSOD algorithms extract temporal features with optical flow, ConvLSTM, or 3D convolution, and lack the ability to learn temporal information continuously. To address this, a robust spatial-temporal progressive learning network (STPLNet) is proposed for efficient localization of salient objects in video sequences. In the spatial domain, a U-shaped structure encodes and decodes each video frame. In the temporal domain, the network learns the features of the main body and the deformation regions of moving objects between frames and progressively encodes the moving-object features, which lets it capture the temporal correlation and motion tendency of the objects. In a series of comparative experiments on four public datasets against 13 mainstream VSOD algorithms, the proposed model achieves the best results on multiple metrics (maxF, S-measure (S), and MAE), and it also shows good real-time performance in running speed.

Keywords: video salient object detection (VSOD); deep learning; spatial information; temporal information; static feature mining; motion feature; progressive learning
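The abstract names maxF and MAE as evaluation metrics without defining them. As a rough illustration of how these two metrics are commonly computed in the salient object detection literature, the following is a minimal sketch. The function names, the β² = 0.3 weighting, and the 256-step threshold sweep are conventional assumptions and are not details taken from this paper; S-measure is omitted for brevity.

```python
from typing import List, Sequence

# A saliency map or ground-truth mask: 2-D grid of values in [0, 1].
Map = Sequence[Sequence[float]]


def _flatten(m: Map) -> List[float]:
    """Flatten a 2-D map into a single list of floats."""
    return [float(v) for row in m for v in row]


def mae(pred: Map, gt: Map) -> float:
    """Mean absolute error between a saliency map and its ground truth."""
    p, g = _flatten(pred), _flatten(gt)
    if len(p) != len(g) or not p:
        raise ValueError("maps must be non-empty and the same size")
    return sum(abs(a - b) for a, b in zip(p, g)) / len(p)


def max_f_measure(pred: Map, gt: Map,
                  beta_sq: float = 0.3, steps: int = 256) -> float:
    """Largest F-beta score over a sweep of binarization thresholds.

    beta_sq = 0.3 and steps = 256 are assumed defaults (hypothetical
    here), chosen to mirror common practice rather than this paper.
    """
    p, g = _flatten(pred), _flatten(gt)
    if len(p) != len(g) or not p:
        raise ValueError("maps must be non-empty and the same size")
    gt_pos = [v > 0.5 for v in g]      # binarize the ground truth
    n_pos = sum(gt_pos)
    best = 0.0
    for k in range(steps):
        t = k / (steps - 1)            # threshold in [0, 1]
        n_pred = sum(1 for v in p if v >= t)
        tp = sum(1 for v, pos in zip(p, gt_pos) if v >= t and pos)
        if tp == 0:                    # precision or recall is zero or undefined
            continue
        precision = tp / n_pred
        recall = tp / n_pos
        f = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
        best = max(best, f)
    return best


if __name__ == "__main__":
    # Toy 2x2 example: left column is the salient object.
    prediction = [[0.9, 0.1], [0.8, 0.2]]
    ground_truth = [[1.0, 0.0], [1.0, 0.0]]
    print("MAE :", mae(prediction, ground_truth))
    print("maxF:", max_f_measure(prediction, ground_truth))
```

Lower MAE is better, while higher maxF is better. Benchmarks typically average both scores over all frames of a dataset.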

Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2018AAA0100400)

Author Name	Affiliation	E-mail
WANG Xing-Chi	School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China
LI Jun-Xia	School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China	jxli@nuist.edu.cn

Cite this article:
WANG Xing-Chi, LI Jun-Xia. Spatial-temporal Progressive Learning for Video Salient Object Detection. Computer Systems Applications, 2023, 32(4): 317-328