To further improve the fusion of visible and infrared images, this paper proposes an image fusion model based on multi-scale convolution operators and DenseNet. The model first applies multi-scale convolution operators to extract direct multi-scale features from the source images. DenseNet is then used to compute indirect multi-scale features. To obtain fusion weights for pixel information at different scales, the DenseNet branches at different scales are combined by stacking, and the fusion weights of the two source images are derived from the resulting activity maps. Finally, the fused image is computed as a combination of the source images according to these weights. Experimental results on the THO and CMA sets show that the recognition rate is high.
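The weight-based fusion step described above can be illustrated with a minimal NumPy sketch. It is not the proposed model: the learned multi-scale convolution operators and the DenseNet are replaced here by a hand-crafted stand-in (absolute differences between the image and mean-filtered copies at several window sizes), and the names `box_blur`, `activity_map`, `fuse`, and the default `kernel_sizes` are assumptions introduced for illustration only. The sketch shows only how per-pixel activity maps can be normalized into fusion weights and applied to two registered grayscale images.

```python
import numpy as np


def box_blur(img, k):
    """Mean filter with an odd k x k window and edge padding."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    # Accumulate the k*k shifted copies of the padded image.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def activity_map(img, kernel_sizes=(3, 5, 7)):
    """Hypothetical stand-in for learned features: the absolute detail
    response (image minus its local mean), summed over several scales."""
    img = img.astype(np.float64)
    return sum(np.abs(img - box_blur(img, k)) for k in kernel_sizes)


def fuse(visible, infrared, kernel_sizes=(3, 5, 7), eps=1e-8):
    """Fuse two registered grayscale images with per-pixel weights
    obtained by normalizing their activity maps."""
    a_vis = activity_map(visible, kernel_sizes)
    a_ir = activity_map(infrared, kernel_sizes)
    # eps keeps the weights defined (0.5 each) where both maps are zero.
    w_vis = (a_vis + eps) / (a_vis + a_ir + 2 * eps)
    w_ir = 1.0 - w_vis
    fused = w_vis * visible + w_ir * infrared
    return fused, w_vis, w_ir
```

Because the two weights are non-negative and sum to one at every pixel, each fused pixel lies between the corresponding source pixels. In the model described above, the activity maps would instead come from the stacked multi-scale DenseNet features.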