Encoder-Decoder for Semi-Supervised Image Semantic Segmentation
Authors: Liu Beibei, Hua Bei
    Abstract:

    Image semantic segmentation methods based on deep convolutional neural networks require large amounts of pixel-level annotated training data, and producing these annotations is time-consuming and laborious. This paper proposes a semi-supervised image semantic segmentation method with an encoder-decoder structure, built on generative adversarial networks, in which the encoder-decoder module serves as the generator. The entire network is trained by coupling the standard multi-class cross-entropy loss with an adversarial loss. To make full use of the rich semantic information contained in the shallow layers, features at different scales in the encoder are fed into the classifier, and the resulting classification results of different granularities are fused to refine object boundaries. In addition, the discriminator identifies trusted regions in the segmentation results of unlabeled data and uses them as an additional supervisory signal, which enables semi-supervised learning. Experiments on PASCAL VOC 2012 and Cityscapes show that the proposed method outperforms existing semi-supervised image semantic segmentation methods.
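
The training signals named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact loss formulas, the fusion rule, or any hyperparameters, so the averaging in `fuse_multiscale`, the weight `lambda_adv`, the `threshold` value, and all function names here are hypothetical. Pixels are represented as flat lists of class scores, and the discriminator output `disc_conf` is assumed to be a per-pixel probability in [0, 1].

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Multi-class cross-entropy for one pixel given its class index."""
    return -math.log(probs[label] + 1e-12)

def fuse_multiscale(score_maps):
    """Fuse per-scale class scores by averaging (assumed fusion rule).

    score_maps[s][i][k] is the score of class k at pixel i for scale s,
    with all scales already resized to the same resolution.
    """
    n_scales = len(score_maps)
    return [[sum(vals) / n_scales for vals in zip(*pixel_scores)]
            for pixel_scores in zip(*score_maps)]

def supervised_loss(pixel_logits, labels, disc_conf, lambda_adv=0.01):
    """Labeled image: mean cross-entropy plus a weighted adversarial term.

    The adversarial term rewards predictions that the discriminator
    scores as ground-truth-like (lambda_adv is an assumed weight).
    """
    n = len(labels)
    ce = sum(cross_entropy(softmax(l), y)
             for l, y in zip(pixel_logits, labels)) / n
    adv = sum(-math.log(c + 1e-12) for c in disc_conf) / n
    return ce + lambda_adv * adv

def trusted_region_loss(pixel_logits, disc_conf, threshold=0.2):
    """Unlabeled image: cross-entropy against the generator's own argmax,
    restricted to pixels whose discriminator confidence exceeds threshold."""
    losses = []
    for logits, conf in zip(pixel_logits, disc_conf):
        if conf > threshold:
            probs = softmax(logits)
            pseudo = max(range(len(probs)), key=probs.__getitem__)
            losses.append(cross_entropy(probs, pseudo))
    return sum(losses) / len(losses) if losses else 0.0

# Two pixels, three classes; only the first pixel is trusted.
logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]]
labeled = supervised_loss(logits, labels=[0, 1], disc_conf=[0.9, 0.1])
unlabeled = trusted_region_loss(logits, disc_conf=[0.9, 0.1])
```

In this sketch, labeled images contribute `supervised_loss`, while unlabeled images contribute only `trusted_region_loss`, so pixels the discriminator does not trust add no gradient signal.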

Cite this article

Liu Beibei, Hua Bei. Encoder-Decoder for Semi-Supervised Image Semantic Segmentation. Computer Systems & Applications, 2019, 28(11):182-187

History
  • Received: 2019-04-17
  • Last revised: 2019-05-21
  • Published online: 2019-11-08
  • Publication date: 2019-11-15