Abstract: This study proposes a semantic segmentation network called LDPANet to address the challenges posed by large variations in target size and the difficulty of efficiently extracting multi-scale features in street-scene image segmentation. First, atrous (dilated) convolution is combined with depthwise separable convolution inside a residual learning unit to optimize the encoder structure, which reduces computational complexity and alleviates the vanishing-gradient problem. Second, the network uses a layer-wise iterative atrous spatial pyramid to sequentially fuse top-down feature information, enhancing the effective interaction of contextual information. After multi-scale feature fusion, an attention module is introduced to suppress redundant information and strengthen important features. Furthermore, channel-expansion upsampling replaces bilinear interpolation upsampling in the decoder to further improve the resolution of the feature maps. Finally, LDPANet reaches an accuracy of 91.8% on the Cityscapes dataset and 87.52% on the CamVid dataset. Compared with recent network models, the proposed model extracts pixel position and spatial dimension information more accurately and improves semantic segmentation accuracy.
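To make the encoder idea concrete, the sketch below shows one plausible form of a residual unit that combines atrous convolution with depthwise separable convolution, as the abstract describes. This is a minimal illustration, not the authors' implementation: the class name `AtrousDSResidualUnit`, the channel count, and the dilation rate are all assumptions chosen for clarity.

```python
import torch
import torch.nn as nn

class AtrousDSResidualUnit(nn.Module):
    """Illustrative residual unit: an atrous depthwise convolution followed
    by a pointwise convolution, wrapped in an identity skip connection.
    (Hypothetical sketch; not LDPANet's published implementation.)"""
    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        # Depthwise 3x3 atrous convolution: one filter per channel
        # (groups=channels); padding=dilation keeps the spatial size.
        self.depthwise = nn.Conv2d(
            channels, channels, kernel_size=3,
            padding=dilation, dilation=dilation,
            groups=channels, bias=False)
        # Pointwise 1x1 convolution mixes information across channels,
        # completing the depthwise separable factorization.
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn(self.pointwise(self.depthwise(x))))
        # Residual (identity) connection eases gradient flow,
        # which is how such units mitigate vanishing gradients.
        return x + out

# Quick shape check on a dummy street-scene feature map.
x = torch.randn(1, 64, 128, 256)
print(AtrousDSResidualUnit(64)(x).shape)  # torch.Size([1, 64, 128, 256])
```

Factoring the 3x3 convolution into depthwise and pointwise stages cuts parameter count and FLOPs roughly by the kernel area, while the dilation enlarges the receptive field without downsampling, consistent with the efficiency and multi-scale goals stated above.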