Abstract: To address the insufficient attention to receptive-field scale and the inadequate extraction of feature-channel information in existing super-resolution reconstruction models for optical remote sensing images, this study proposes a new super-resolution reconstruction model based on multi-scale feature extraction and coordinate attention. Building on a deep residual network structure, cascaded multi-scale feature and coordinate attention blocks (MFCABs) are designed in the high-frequency branch of the network to fully exploit the high-frequency features of the input low-resolution images. First, an Inception submodule is introduced into each MFCAB to capture spatial features under different receptive fields using convolution kernels of different scales. Second, a coordinate attention submodule is added after the Inception submodule to attend to both the channel and coordinate dimensions, yielding a stronger channel attention effect. Finally, the features extracted by each MFCAB are fused along multiple paths to achieve an effective fusion of multi-scale spatial information and multi-channel attention information. For ×2 and ×3 upscaling on the NWPU4500 dataset, the model reaches PSNR values of 34.73 dB and 30.12 dB, which are 0.66 dB and 0.01 dB higher than EDSR, respectively. For ×2, ×3, and ×4 upscaling on the AID1600 dataset, it reaches 34.71 dB, 30.58 dB, and 28.44 dB, which are 0.09 dB, 0.03 dB, and 0.04 dB higher than EDSR, respectively. The experimental results show that the proposed model reconstructs optical remote sensing images better than mainstream super-resolution reconstruction models.
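The abstract describes each MFCAB as an Inception-style multi-scale convolution stage followed by a coordinate attention stage inside a residual block. The PyTorch sketch below illustrates one possible layout under those assumptions; the branch widths, kernel sizes, and the `reduction` ratio are illustrative choices, not values taken from the paper.

```python
# A minimal sketch of one multi-scale feature & coordinate attention block (MFCAB),
# assuming an Inception-style multi-branch convolution followed by coordinate
# attention and a residual connection. Hyperparameters here are illustrative.
import torch
import torch.nn as nn


class CoordinateAttention(nn.Module):
    """Coordinate attention: pool along H and W separately, then re-weight features."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        # Direction-aware pooling: (N, C, H, 1) over width, (N, C, W, 1) over height
        x_h = x.mean(dim=3, keepdim=True)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)
        y = torch.cat([x_h, x_w], dim=2)                 # (N, C, H+W, 1)
        y = self.act(self.conv1(y))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                        # (N, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))    # (N, C, 1, W)
        return x * a_h * a_w


class MFCAB(nn.Module):
    """Inception-style multi-scale convolutions + coordinate attention + residual."""
    def __init__(self, channels=64):
        super().__init__()
        branch = channels // 4
        self.b1 = nn.Conv2d(channels, branch, kernel_size=1)
        self.b3 = nn.Conv2d(channels, branch, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(channels, branch, kernel_size=5, padding=2)
        self.b7 = nn.Conv2d(channels, branch, kernel_size=7, padding=3)
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)
        self.ca = CoordinateAttention(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # Capture spatial features under different receptive fields in parallel
        feats = torch.cat([self.b1(x), self.b3(x), self.b5(x), self.b7(x)], dim=1)
        feats = self.act(self.fuse(feats))
        # Channel/coordinate re-weighting, then a residual connection
        return x + self.ca(feats)


if __name__ == "__main__":
    block = MFCAB(channels=64)
    out = block(torch.randn(1, 64, 48, 48))
    print(out.shape)  # torch.Size([1, 64, 48, 48])
```

In the full model, several such blocks would be cascaded in the high-frequency branch and their outputs fused across multiple paths, as the abstract describes.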