Object Detection Based on Adaptive Token Pooling and Enhanced Set Prediction

doi:10.15888/j.cnki.csa.009765


Volume 34, Issue 2, 2025, pp. 74-83

                    
Author:
LIU Yao, CHEN Dong-Fang, WANG Xiao-Feng
School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430081, China

Abstract:

Transformer-based object detection algorithms often suffer from insufficient accuracy and slow convergence. Many studies have proposed improvements that partly address these problems, but most overlook two key shortcomings of applying the Transformer structure to object detection. First, the results of self-attention computation lack diversity. Second, the complexity of set prediction makes the models unstable during target matching. To overcome these deficiencies, this study proposes three enhancements. First, an adaptive token pooling module is designed to increase the diversity of self-attention weights. Second, a rough-prediction-based anchor box localization module is introduced, which provides positional prior information for queries and thereby improves stability during bipartite matching. Third, a group-based denoising task is designed, which trains the model to distinguish between positive and negative queries near the target, improving the model's ability to perform set prediction. Experimental results on the COCO dataset show that the improved algorithm achieves better training results and significantly outperforms the baseline model in both detection accuracy and convergence speed.
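To make the bipartite matching step mentioned in the abstract concrete, the following is a minimal sketch of one-to-one matching between predicted boxes and ground-truth boxes. It is not the authors' implementation. The function names `l1_cost` and `match_queries`, the `(cx, cy, w, h)` box format, and the pure L1 box cost are all assumptions made for illustration; practical set-prediction detectors typically combine classification and box terms in the cost and solve the assignment with the Hungarian algorithm rather than brute force.

```python
from itertools import permutations


def l1_cost(pred, gt):
    """L1 distance between two boxes given as (cx, cy, w, h). Assumed cost, for illustration only."""
    return sum(abs(p - g) for p, g in zip(pred, gt))


def match_queries(pred_boxes, gt_boxes):
    """Find the one-to-one assignment of predictions to ground-truth boxes with minimal total cost.

    Returns (assignment, cost), where assignment[t] is the index of the
    prediction matched to ground-truth box t. This brute-force search is
    only practical for tiny examples.
    """
    assert len(pred_boxes) >= len(gt_boxes)
    best_cost, best_assign = float("inf"), None
    # Enumerate every way to pick a distinct prediction for each ground-truth box.
    for perm in permutations(range(len(pred_boxes)), len(gt_boxes)):
        cost = sum(l1_cost(pred_boxes[q], gt_boxes[t]) for t, q in enumerate(perm))
        if cost < best_cost:
            best_cost, best_assign = cost, perm
    return best_assign, best_cost


# Hypothetical predictions from three queries and two ground-truth boxes.
pred_boxes = [
    (0.50, 0.50, 0.20, 0.20),
    (0.10, 0.10, 0.10, 0.10),
    (0.80, 0.30, 0.30, 0.40),
]
gt_boxes = [
    (0.82, 0.32, 0.30, 0.40),
    (0.48, 0.52, 0.20, 0.20),
]

assignment, total_cost = match_queries(pred_boxes, gt_boxes)
print(assignment, round(total_cost, 2))
```

In this toy example the first ground-truth box is matched to query 2 and the second to query 0, while query 1 stays unmatched and would be treated as background. When queries start close to their targets, as a positional prior is intended to achieve, small changes in the predictions are less likely to flip the optimal assignment between training steps.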

Key words: object detection; query initialization mode; self-attention; training strategy

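Citation:

LIU Yao, CHEN Dong-Fang, WANG Xiao-Feng. Object Detection Based on Adaptive Token Pooling and Enhanced Set Prediction. Computer Systems & Applications (计算机系统应用), 2025, 34(2): 74-83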


History

Received: July 17, 2024
Revised: August 13, 2024
Online: December 16, 2024
