Abstract: This study addresses the problem that the effectiveness of relational knowledge distillation degrades when the capacity gap between the teacher network and the student network is too large. A stepwise neural network compression method based on relational distillation is proposed. The key idea is to insert an intermediate network between the teacher network and the student network and to perform relational distillation step by step. In addition, during each distillation step, monomer (individual-sample) information is incorporated to further optimize and strengthen the learning ability of the student model. Experimental results show that the classification accuracy of the proposed method on the CIFAR-10 and CIFAR-100 image classification datasets improves by about 0.2% over the original relational knowledge distillation method.
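The abstract does not give the loss formulation, so the following is only a minimal PyTorch sketch of one distillation step under common assumptions: the relational term is taken to be the distance-wise RKD loss of Park et al., and the added "monomer" term is assumed to be standard per-sample soft-target distillation plus cross-entropy on the labels. The function names (`pairwise_distances`, `rkd_distance_loss`, `individual_kd_loss`, `step_loss`) and the weighting hyperparameters are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def pairwise_distances(e):
    # e: (batch, dim) embeddings -> (batch, batch) Euclidean distance matrix
    prod = e @ e.t()
    sq = prod.diag()
    dist = (sq.unsqueeze(0) + sq.unsqueeze(1) - 2.0 * prod).clamp(min=0).sqrt()
    return dist

def rkd_distance_loss(student_emb, teacher_emb):
    # Relational term (assumed distance-wise RKD): match mean-normalized
    # pairwise distance structures of student and teacher embeddings.
    with torch.no_grad():
        t_d = pairwise_distances(teacher_emb)
        t_d = t_d / (t_d[t_d > 0].mean() + 1e-12)
    s_d = pairwise_distances(student_emb)
    s_d = s_d / (s_d[s_d > 0].mean() + 1e-12)
    return F.smooth_l1_loss(s_d, t_d)

def individual_kd_loss(student_logits, teacher_logits, T=4.0):
    # Assumed "monomer" (per-sample) term: soft-target distillation on logits.
    return F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)

def step_loss(student_logits, student_emb, teacher_logits, teacher_emb, labels,
              w_rkd=1.0, w_kd=1.0, w_ce=1.0, T=4.0):
    # One step of the stepwise scheme; weights are illustrative placeholders.
    return (w_rkd * rkd_distance_loss(student_emb, teacher_emb)
            + w_kd * individual_kd_loss(student_logits, teacher_logits, T)
            + w_ce * F.cross_entropy(student_logits, labels))
```

In the stepwise scheme described above, such a step loss would presumably be applied twice: first to train the intermediate network against the teacher, and then to train the final student against the trained intermediate network.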