Decoupled Knowledge Distillation Based on Perception Reconstruction

doi:10.15888/j.cnki.csa.009773


Volume 34, Issue 2, 2025, pp. 11-18

Authors: ZHU Ying-Ce, ZHU Zi-Qi
Affiliation: School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China


Abstract:

In knowledge distillation (KD), feature-based methods can effectively extract the rich knowledge embedded in the teacher model, whereas logit-based methods often suffer from insufficient knowledge transfer and low efficiency. Decoupled knowledge distillation (DKD) distills by splitting the logits output by the teacher and student models into target and non-target classes. Although this improves distillation accuracy, DKD operates on single instances and therefore fails to capture the dynamic relationships among samples within a batch. In particular, when the output distributions of the teacher and student models differ significantly, decoupled distillation alone cannot effectively bridge the gap. To address these limitations of DKD, this study proposes a perception reconstruction method. The method introduces a perception matrix that uses the model's representational capability to recalibrate the logits, analyze intra-class dynamic relationships in detail, and reconstruct finer-grained inter-class relationships. Since the objective of the student model is to minimize representational disparity, the method is extended to decoupled knowledge distillation: the outputs of the teacher and student models are mapped onto the perception matrix, enabling the student model to learn richer knowledge from the teacher model. Experiments on the CIFAR-100 and ImageNet-1K datasets show that a student model trained with this method achieves a classification accuracy of 74.98% on CIFAR-100, 0.87 percentage points higher than baseline methods, improving the student model's image classification performance. Comparative experiments with various methods further verify the superiority of the method.

Keywords: model compression; knowledge distillation (KD); decoupled knowledge distillation (DKD); perception reconstruction; intra-class relationship matching
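
The abstract describes DKD as splitting the teacher and student logits into a target-class part and a non-target-class part. The following is a minimal sketch of that decomposition only, written in plain Python. The function names, the weights `alpha` and `beta`, and the `temperature` default are illustrative assumptions that do not come from this page, and the sketch does not implement the perception matrix, whose construction the abstract does not specify.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def dkd_terms(student_logits, teacher_logits, target, temperature=1.0):
    """Return (target-class term, non-target-class term) for one sample."""
    p_s = softmax(student_logits, temperature)
    p_t = softmax(teacher_logits, temperature)

    # Target-class term: compare the binary distributions
    # [P(target), P(any non-target class)] of teacher and student.
    tckd = kl_divergence([p_t[target], 1.0 - p_t[target]],
                         [p_s[target], 1.0 - p_s[target]])

    # Non-target term: softmax restricted to the non-target logits,
    # i.e. the distribution among the non-target classes only.
    rest_s = [z for i, z in enumerate(student_logits) if i != target]
    rest_t = [z for i, z in enumerate(teacher_logits) if i != target]
    nckd = kl_divergence(softmax(rest_t, temperature),
                         softmax(rest_s, temperature))
    return tckd, nckd


def decoupled_kd_loss(student_logits, teacher_logits, target,
                      alpha=1.0, beta=8.0, temperature=4.0):
    """Weighted sum of the two terms (weights are assumed values)."""
    tckd, nckd = dkd_terms(student_logits, teacher_logits, target, temperature)
    return (alpha * tckd + beta * nckd) * temperature ** 2
```

For a single sample, the classical KD loss KL(teacher || student) equals the target-class term plus the non-target term weighted by the teacher's non-target probability mass. Decoupling lets the two terms be weighted independently, but the loss is still computed per sample, which is the limitation the abstract attributes to DKD.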

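Citation:

ZHU Ying-Ce, ZHU Zi-Qi. Decoupled Knowledge Distillation Based on Perception Reconstruction. Computer Systems & Applications (计算机系统应用), 2025, 34(2): 11-18.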


History

Received: July 29, 2024
Revised: August 20, 2024
Online: December 19, 2024
