Abstract: Synthetic aperture radar (SAR) and optical image fusion aims to exploit the complementary imaging characteristics of satellite sensors to produce more comprehensive geomorphological information. However, existing network models often yield low fusion accuracy because the data distributions of the individual satellite sensors are heterogeneous and their physical imaging mechanisms differ. This study proposes DNAP-Fusion, a novel SAR and optical image fusion network that incorporates dual non-local attention perception. The proposed method uses a dual non-local perceptual attention module to extract structural information from SAR images and texture details from optical images within a multi-level image pyramid of gradually decreasing spatial scale, and then fuses their complementary features in both the spatial and channel dimensions. Subsequently, the fused features are injected into the upsampled optical image through image reconstruction to produce the final fusion result. Additionally, before network training, image encapsulation decisions are employed to enhance the commonality between objects in SAR and optical images of the same scene. Qualitative and quantitative experimental results demonstrate that the proposed method outperforms state-of-the-art (SOTA) multisensor fusion methods: among the objective evaluation indices, the correlation coefficient (CC) is 0.9906 and the peak signal-to-noise ratio (PSNR) is 32.1560 dB. Moreover, the proposed method effectively fuses the complementary features of SAR and optical images, offering a useful approach for improving the accuracy and effectiveness of remote sensing image fusion.
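For readers unfamiliar with the two objective indices reported above, the following is a minimal sketch of how CC and PSNR are commonly computed between a reference image and a fused image. It is not the authors' evaluation code: the function names, the flattened-list image representation, and the default peak value of 255 are assumptions made for illustration, and the abstract does not state which image serves as the reference.

```python
import math


def correlation_coefficient(reference, fused):
    """Pearson correlation between two equally sized, flattened images."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_f = sum(fused) / n
    # Covariance and variances, left unnormalized because the 1/n factors cancel.
    cov = sum((r - mean_r) * (f - mean_f) for r, f in zip(reference, fused))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_f = sum((f - mean_f) ** 2 for f in fused)
    return cov / math.sqrt(var_r * var_f)


def psnr(reference, fused, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value is the assumed peak pixel value."""
    mse = sum((r - f) ** 2 for r, f in zip(reference, fused)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A CC close to 1 indicates that the fused image is strongly linearly related to the reference, and a higher PSNR indicates a lower mean squared error relative to the peak pixel value.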