Bi-branch Remote Sensing Cloud and Shadow Detection Network Based on ViT-D-UNet
    Abstract:

    Effective segmentation of clouds and their shadows is a critical problem in remote sensing image processing, playing a significant role in surface feature extraction, climate monitoring, atmospheric correction, and related tasks. However, clouds and cloud shadows in remote sensing images exhibit complex characteristics: diverse, irregular distributions and fuzzy boundaries that are easily confused with the background, which makes accurate feature extraction challenging. Moreover, few networks are designed specifically for this task. To address these issues, this study proposes a dual-branch network combining a vision Transformer (ViT) and D-UNet. The network is divided into two branches: one is a convolutional local-feature extraction branch built on the dilated convolution module of D-UNet, which introduces multi-scale atrous spatial pyramid pooling (ASPP) to extract multi-dimensional features; the other captures global contextual semantics through the vision Transformer, enhancing feature extraction. Finally, a feature fusion decoder performs the upsampling. The model achieves superior performance on both a self-built cloud and cloud shadow dataset and the publicly available HRC_WHU dataset, leading the second-best model by 0.52% and 0.44% in MIoU and reaching 92.05% and 85.37%, respectively.
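    The ASPP idea used in the convolutional branch can be sketched in plain NumPy: parallel 3×3 convolutions whose taps are sampled at different dilation rates over the same feature map, stacked as output channels. This is a minimal illustrative sketch, not the paper's implementation; the function names and padding scheme are assumptions.

```python
import numpy as np

def dilated_conv2d(x, kernel, rate):
    """Valid-mode 2D convolution with a dilated 3x3 kernel.
    Effective receptive field is (1 + 2*rate) in each dimension."""
    H, W = x.shape
    eff = 1 + 2 * rate  # extent of the dilated 3x3 kernel
    out = np.zeros((H - eff + 1, W - eff + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + eff:rate, j:j + eff:rate]  # 3x3 taps with gaps
            out[i, j] = np.sum(patch * kernel)
    return out

def aspp(x, kernels, rates):
    """Run parallel dilated convolutions at several rates and
    stack the results as channels: (num_rates, H, W)."""
    branches = []
    for k, r in zip(kernels, rates):
        xp = np.pad(x, r, mode="constant")  # pad so each branch keeps H x W
        branches.append(dilated_conv2d(xp, k, r))
    return np.stack(branches, axis=0)
```

    Because the padding equals the dilation rate, every branch preserves the spatial size, so the multi-rate responses can be concatenated directly, which is the property the fusion decoder relies on.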

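    The global branch's core mechanism, self-attention over image patches, can be sketched as a single-head toy version in NumPy. This assumes a square single-channel input and uses plain matrices `Wq`, `Wk`, `Wv` to stand in for learned projections; it is a conceptual sketch, not the network's actual ViT.

```python
import numpy as np

def patchify(img, p):
    """Split an HxW image into non-overlapping p x p patches,
    flattened row-major into a token matrix (num_patches, p*p)."""
    H, W = img.shape
    return (img.reshape(H // p, p, W // p, p)
               .transpose(0, 2, 1, 3)
               .reshape(-1, p * p))

def self_attention(tokens, Wq, Wk, Wv):
    """Single-head scaled dot-product attention: every patch token
    attends to every other token, giving each output global context."""
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)   # softmax over tokens
    return w @ V
```

    As a sanity check, with zero query/key projections the attention weights are uniform, so every output token becomes the mean of all value tokens; even this degenerate case shows how attention mixes information across the whole image, which is what lets the ViT branch model long-range cloud structure that local convolutions miss.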
Get Citation

Li YL, Wang JX, Fan XT, Zhou X, Wu MX. Bi-branch remote sensing cloud and shadow detection network based on ViT-D-UNet. Computer Systems & Applications, 2024, 33(8): 68–77. (in Chinese)
History
  • Received: February 28, 2024
  • Revised: March 28, 2024
  • Online: June 28, 2024
Copyright: Institute of Software, Chinese Academy of Sciences