Semantic Segmentation of Street View Image Based on Attention and Multi-scale Features

doi:10.15888/j.cnki.csa.009513

Volume 33, Issue 5, 2024, pp. 94-102

Author:
HONG Jun, LIU Xiao-Nan, LIU Zhen-Yu
Affiliation:
School of Information Science and Engineering, Shenyang University of Technology, Shenyang 110870, China

Abstract:

This study addresses two problems that the traditional U-Net faces in semantic segmentation of street scene images: low segmentation accuracy for object categories that appear at multiple scales, and weak correlation among image context features. To this end, it proposes AS-UNet, an improved U-Net semantic segmentation network, to achieve accurate segmentation of street scene images. First, the spatial and channel squeeze & excitation (scSE) attention module is integrated into U-Net to guide the convolutional neural network to focus, in both the channel and spatial dimensions, on the semantic categories relevant to the segmentation task, so that more effective semantic information is extracted. Second, to obtain the global context information of the image, the atrous spatial pyramid pooling (ASPP) multi-scale feature fusion module is embedded into U-Net, aggregating multi-scale feature maps for feature enhancement. Finally, the cross-entropy loss and the Dice loss are combined to address the class imbalance among targets in street scenes, further improving segmentation accuracy. Experimental results show that, compared with the traditional U-Net, the mean intersection over union (MIoU) of AS-UNet increases by 3.9% on the Cityscapes dataset and by 3.0% on the CamVid dataset. The improved network model significantly improves the segmentation of street scene images.
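
The combined loss and the MIoU metric mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not state how the two loss terms are weighted, so the `dice_weight` parameter, the function names, and the (C, H, W) array layout are all assumptions made for illustration.

```python
import numpy as np


def softmax(logits, axis=0):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def one_hot(labels, num_classes):
    # labels: (H, W) integer class ids -> (C, H, W) one-hot array.
    return np.eye(num_classes)[labels].transpose(2, 0, 1)


def cross_entropy(probs, target, eps=1e-7):
    # Mean per-pixel cross-entropy; probabilities are clipped to avoid log(0).
    p = np.clip(probs, eps, 1.0)
    return float(-(target * np.log(p)).sum(axis=0).mean())


def dice_loss(probs, target, eps=1e-7):
    # Soft Dice loss averaged over classes. Each class contributes equally
    # regardless of how many pixels it covers, which is why Dice is often
    # paired with cross-entropy when classes are imbalanced.
    inter = (probs * target).sum(axis=(1, 2))
    denom = probs.sum(axis=(1, 2)) + target.sum(axis=(1, 2))
    dice = (2.0 * inter + eps) / (denom + eps)
    return float(1.0 - dice.mean())


def combined_loss(logits, labels, dice_weight=1.0):
    # logits: (C, H, W) raw scores; labels: (H, W) integer class ids.
    # dice_weight is a hypothetical knob, not a value taken from the paper.
    probs = softmax(logits, axis=0)
    target = one_hot(labels, logits.shape[0])
    return cross_entropy(probs, target) + dice_weight * dice_loss(probs, target)


def mean_iou(pred, labels, num_classes):
    # Mean intersection over union, skipping classes that appear in
    # neither the prediction nor the ground truth.
    ious = []
    for c in range(num_classes):
        p, t = pred == c, labels == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))
```

In this sketch the cross-entropy term supplies a per-pixel training signal while the Dice term scores region overlap per class; a real training setup would use a framework with automatic differentiation rather than plain NumPy.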

Keywords: image semantic segmentation; street scene; U-Net; attention mechanism; multi-scale feature fusion

Get Citation

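HONG Jun, LIU Xiao-Nan, LIU Zhen-Yu. Semantic Segmentation of Street View Image Based on Attention and Multi-scale Features. Computer Systems & Applications (计算机系统应用), 2024, 33(5): 94-102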

History

Received: December 06, 2023
Revised: January 09, 2024
Online: April 07, 2024
