Abstract: Existing super-resolution reconstruction methods based on convolutional neural networks are limited by their receptive fields, making it difficult to fully exploit the rich contextual information and autocorrelation in remote sensing images and resulting in suboptimal reconstruction performance. To address this issue, this study proposes MDT, a remote sensing image super-resolution reconstruction network based on multi-distillation and a Transformer. First, the network combines multi-distillation with a dual attention mechanism to progressively extract multi-scale features from low-resolution images, thereby reducing feature loss. Next, a Transformer based on convolutional modulation is constructed to capture global information in the images, recovering more complex texture details and enhancing the visual quality of the reconstructed images. Finally, a global residual path is added during upsampling to improve the propagation efficiency of features through the network, effectively reducing image distortion and artifacts. Experiments on the AID and UCMerced datasets demonstrate that the proposed method achieves a peak signal-to-noise ratio (PSNR) of 29.10 dB and a structural similarity index (SSIM) of 0.7807 on the ×4 super-resolution task. The reconstructed images show a marked improvement in quality, with better preservation of fine details.
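To make the pipeline described above concrete, the following is a minimal PyTorch sketch of the three stages (progressive multi-distillation, a convolutional-modulation Transformer block, and pixel-shuffle upsampling with a global residual path). The module names, channel counts, and distillation ratio are illustrative assumptions for exposition only, not the authors' implementation.

```python
# Illustrative sketch only; names and hyperparameters are assumed, not from the paper.
import torch
import torch.nn as nn


class DistillationBlock(nn.Module):
    """Splits features into a distilled (retained) part and a refined part."""
    def __init__(self, channels, distill_ratio=0.25):
        super().__init__()
        self.d = int(channels * distill_ratio)
        self.distill = nn.Conv2d(channels, self.d, 1)          # retained features
        self.refine = nn.Conv2d(channels, channels, 3, padding=1)  # passed to next block
        self.act = nn.GELU()

    def forward(self, x):
        return self.act(self.distill(x)), self.act(self.refine(x))


class ConvModTransformerBlock(nn.Module):
    """Convolutional-modulation stand-in for self-attention (global mixing)."""
    def __init__(self, channels):
        super().__init__()
        self.norm = nn.GroupNorm(1, channels)
        self.context = nn.Conv2d(channels, channels, 11, padding=5, groups=channels)
        self.value = nn.Conv2d(channels, channels, 1)
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        y = self.norm(x)
        # Large-kernel depthwise context modulates the value branch instead of attention.
        return x + self.proj(self.context(y) * self.value(y))


class MDTSketch(nn.Module):
    def __init__(self, channels=48, n_blocks=4, scale=4):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.blocks = nn.ModuleList(DistillationBlock(channels) for _ in range(n_blocks))
        self.fuse = nn.Conv2d(self.blocks[0].d * n_blocks, channels, 1)
        self.transformer = ConvModTransformerBlock(channels)
        self.up = nn.Sequential(                     # pixel-shuffle upsampling
            nn.Conv2d(channels, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale))
        self.scale = scale

    def forward(self, lr):
        feat = self.head(lr)
        shallow = feat                               # global residual path
        distilled = []
        for blk in self.blocks:                      # progressive multi-distillation
            d, feat = blk(feat)
            distilled.append(d)
        feat = self.fuse(torch.cat(distilled, dim=1))
        feat = self.transformer(feat) + shallow
        sr = self.up(feat)
        upsampled = nn.functional.interpolate(
            lr, scale_factor=self.scale, mode="bicubic", align_corners=False)
        return sr + upsampled                        # reconstruct residual over bicubic input
```

As a usage check, `MDTSketch()(torch.randn(1, 3, 64, 64))` returns a `1 x 3 x 256 x 256` tensor for the ×4 setting; the dual attention mechanism mentioned in the abstract is omitted here for brevity.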