Image Semantic Segmentation Based on Edge Features and Attention Mechanism
Author:
Funding: National Natural Science Foundation of China (41975183)

    Abstract:

    In semantic segmentation tasks, the downsampling process of the encoder reduces resolution and discards spatial detail, so segmentation at object edges becomes discontinuous or incorrect, degrading overall segmentation performance. To address this problem, an image semantic segmentation model based on edge features and attention mechanisms, EASSNet, is proposed. First, an edge detection operator computes the edge map of the original image, and edge features are extracted from it through pooling-based downsampling and convolution. Next, the edge features are fused into the deep semantic features extracted by the encoder, restoring the spatial detail lost in the downsampled feature maps, while an attention mechanism strengthens meaningful information; this improves the accuracy of object-edge segmentation and, in turn, overall segmentation performance. Finally, EASSNet achieves mean intersection-over-union (mIoU) scores of 85.9% and 76.7% on the PASCAL VOC 2012 and Cityscapes datasets, respectively. Compared with current popular semantic segmentation networks, EASSNet shows clear advantages in both overall segmentation performance and object-edge segmentation quality.
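The pipeline the abstract describes can be sketched roughly as follows. This is a minimal illustrative NumPy sketch, not the paper's implementation: the choice of the Sobel operator, the additive fusion, and the parameter-free sigmoid channel gate are all assumptions standing in for EASSNet's actual edge detector, fusion module, and attention module, which are specified only in the full text.

```python
import numpy as np

def sobel_edge_map(img):
    """Edge map of a grayscale image via the Sobel operator
    (one common choice; the abstract does not name the operator)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(img, 1, mode="edge")
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    return np.hypot(gx, gy)  # gradient magnitude

def max_pool2(x):
    """2x2 max pooling: downsample the edge map toward the
    resolution of the encoder's deep feature map."""
    h, w = x.shape
    return x[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def channel_attention(features):
    """SE-style channel attention on a (C, H, W) tensor: squeeze by
    global average pooling, gate with a sigmoid (the FC layers of a
    real SE block are omitted in this sketch)."""
    squeeze = features.mean(axis=(1, 2))
    gate = 1.0 / (1.0 + np.exp(-squeeze))
    return features * gate[:, None, None]

# Toy run: an 8x8 image with a vertical step edge between columns 3 and 4.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edge = sobel_edge_map(img)                 # responds only at the step edge
edge_feat = max_pool2(max_pool2(edge))     # downsampled twice -> 2x2
deep_feat = np.ones((4, 2, 2))             # stand-in for encoder output
fused = deep_feat + edge_feat[None]        # naive additive fusion (assumption)
out = channel_attention(fused)             # reweighted fused features
```

A real implementation would learn the edge-feature convolutions and the attention weights end to end; the sketch only shows how an edge map can be brought to the encoder's resolution and injected into the deep features.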

Cite this article:

Wang J, Zhang JY, Cheng Y. Image semantic segmentation based on edge features and attention mechanism. Computer Systems & Applications, 2024, 33(7): 63–73.
History
  • Received: 2024-02-22
  • Revised: 2024-03-19
  • Published online: 2024-05-31