Abstract: In semantic segmentation, the downsampling performed by the encoder reduces feature resolution and discards fine spatial detail in the image. As a result, segmentation along object edges can become discontinuous or incorrect, which degrades overall segmentation performance. To address these issues, this paper proposes EASSNet, an image semantic segmentation model based on edge features and attention mechanisms. First, an edge detection operator computes an edge map of the original image, and edge features are extracted from this map through pooling-based downsampling and convolution. Next, the edge features are fused with the deep semantic features extracted by the encoder to restore the spatial detail lost in the downsampled feature maps, and attention mechanisms strengthen the informative features, improving both the accuracy of object edge segmentation and overall semantic segmentation performance. In experiments, EASSNet achieves a mean intersection over union (mIoU) of 85.9% and 76.7% on the PASCAL VOC 2012 and Cityscapes datasets, respectively. Compared with currently popular semantic segmentation networks, EASSNet shows significant advantages in both overall segmentation performance and object edge segmentation.
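The first stage of the edge branch (edge map, then pooling downsampling) can be illustrated with a minimal sketch. The abstract does not name the edge detection operator or the pooling type, so the Sobel operator and 2x2 max pooling used here are assumptions, and the function names `sobel_edge_map` and `max_pool2x2` are hypothetical. The subsequent convolution, feature fusion, and attention steps are not shown.

```python
def sobel_edge_map(img):
    """Sobel gradient magnitude of a 2D grayscale image given as a list of rows.

    Assumption: Sobel stands in for the unspecified edge detection operator.
    """
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = gy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Replicate border pixels by clamping coordinates.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    v = img[yy][xx]
                    gx += kx[dy + 1][dx + 1] * v
                    gy += ky[dy + 1][dx + 1] * v
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def max_pool2x2(fmap):
    """2x2 max pooling with stride 2 (assumed pooling type); odd borders are cropped."""
    h = len(fmap) // 2 * 2
    w = len(fmap[0]) // 2 * 2
    return [
        [
            max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


# Toy 4x6 image: dark left half, bright right half (one vertical edge).
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]
edges = sobel_edge_map(image)    # full-resolution edge map, 4x6
pooled = max_pool2x2(edges)      # downsampled edge map, 2x3
```

On this toy input the edge response is concentrated in the two columns adjacent to the intensity step, and max pooling halves the resolution while keeping that response in the middle column of the pooled map. This is the kind of spatial cue the edge branch is meant to carry down to the resolution of the deep semantic features.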