Abstract: Faced with insufficient labeled data in the field of video quality assessment, researchers have begun to turn to self-supervised learning methods, aiming to train video quality assessment models using large amounts of unlabeled data. However, existing self-supervised learning methods focus primarily on video distortion types and content information, while ignoring dynamic information and spatiotemporal features that evolve over time. This leads to unsatisfactory evaluation performance in complex dynamic scenes. To address these issues, a new self-supervised learning method is proposed. By adopting playback speed prediction as an auxiliary pretraining task, the model better captures the dynamic changes and spatiotemporal characteristics of videos. This task is further combined with distortion type prediction and contrastive learning to enhance the model’s sensitivity to video quality differences. In addition, a multi-scale spatiotemporal feature extraction module is designed to strengthen the model’s spatiotemporal modeling capability. Experimental results demonstrate that the proposed method significantly outperforms existing self-supervised learning-based approaches on the LIVE, CSIQ, and LIVE-VQC datasets. On the LIVE-VQC dataset, the proposed method achieves an average improvement of 7.90% and a maximum improvement of 17.70% in PLCC. It also remains competitive on the KoNViD-1k dataset. These results indicate that the proposed self-supervised learning framework effectively enhances the dynamic feature capture ability of the video quality assessment model and offers distinct advantages in processing complex dynamic videos.
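To make the playback speed prediction pretext task concrete, the following is a minimal sketch of the idea described above. The abstract does not specify the architecture or training details, so everything here is assumed: a torchvision `r3d_18` backbone (torchvision ≥ 0.13) stands in for the actual model, playback speeds are simulated by temporal subsampling, and the speed factors and clip length are hypothetical. This is an illustrative sketch, not the paper's implementation.

```python
# Illustrative sketch of a playback-speed prediction pretext task.
# Assumptions (not from the paper): r3d_18 backbone, 4 speed classes,
# speed simulated by frame subsampling, 16-frame clips.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

SPEED_FACTORS = (1, 2, 4, 8)  # hypothetical playback speeds (frame strides)

def make_speed_clip(video: torch.Tensor, speed_idx: int, clip_len: int = 16) -> torch.Tensor:
    """Simulate playback speed by temporal subsampling.
    video: (C, T, H, W); returns a clip of shape (C, clip_len, H, W)."""
    stride = SPEED_FACTORS[speed_idx]
    return video[:, ::stride, :, :][:, :clip_len]

class SpeedPredictionModel(nn.Module):
    """3D-CNN backbone with a playback-speed classification head."""
    def __init__(self, num_speeds: int = len(SPEED_FACTORS)):
        super().__init__()
        self.backbone = r3d_18(weights=None)  # train from scratch
        # Replace the Kinetics classifier with a speed-class head.
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, num_speeds)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        return self.backbone(clips)  # (B, num_speeds) logits

# Toy pretraining step: sample a speed label, build the clip, predict it.
model = SpeedPredictionModel()
criterion = nn.CrossEntropyLoss()
video = torch.randn(3, 128, 112, 112)            # stand-in raw video (C, T, H, W)
label = torch.randint(len(SPEED_FACTORS), (1,))  # speed label comes free from the data
clip = make_speed_clip(video, label.item()).unsqueeze(0)  # (1, C, 16, H, W)
loss = criterion(model(clip), label)
loss.backward()
```

The key property that makes this task self-supervised is that the speed label is generated automatically from the unlabeled video itself; in the full framework this loss would be combined with the distortion type prediction and contrastive learning objectives mentioned above.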