Point Cloud Classification and Segmentation Based on Self-attention Mechanism
    Abstract:

    Because 3D point clouds are unordered and lack topological information, their classification and segmentation remain challenging. To address this, this study designs a network based on the self-attention mechanism that learns point cloud features for object classification and segmentation. First, a self-attention module suited to point clouds is designed for feature extraction. Next, a neighborhood graph is constructed to enhance the input embedding, and local features are extracted and aggregated with the self-attention mechanism. Finally, the local features are combined through multi-layer perceptron and encoder-decoder structures to perform 3D point cloud classification and segmentation. The method takes the local context of each point into account during input embedding, builds a network structure that captures long-range dependencies within local regions, and ultimately yields more discriminative results. Experiments on datasets such as ShapeNetPart and RoofN3D show that the proposed method achieves improved classification and segmentation performance.
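
    The pipeline described in the abstract (neighborhood graph, enhanced input embedding, self-attention over local features, aggregation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`knn_indices`, `local_self_attention`), the embedding that concatenates each center point with its relative neighbor offsets, the scaled dot-product attention form, the random projection matrices, and the max-pooling aggregation are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
import numpy as np


def knn_indices(points, k):
    """Indices of the k nearest neighbours of every point, excluding the point itself."""
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)          # (N, N) squared distances
    np.fill_diagonal(dist2, np.inf)             # a point is not its own neighbour
    return np.argsort(dist2, axis=1)[:, :k]     # (N, k)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def local_self_attention(points, k, w_q, w_k, w_v):
    """Aggregate one feature vector per point from its k-NN neighbourhood.

    points: (N, 3) coordinates; w_q, w_k, w_v: (6, d) projections (hypothetical).
    Returns an (N, d) array of local features.
    """
    idx = knn_indices(points, k)                          # neighbourhood graph
    rel = points[idx] - points[:, None, :]                # (N, k, 3) relative offsets
    centre = np.broadcast_to(points[:, None, :], rel.shape)
    emb = np.concatenate([centre, rel], axis=-1)          # (N, k, 6) assumed embedding

    q, key, v = emb @ w_q, emb @ w_k, emb @ w_v           # each (N, k, d)
    scores = q @ np.swapaxes(key, -1, -2) / np.sqrt(q.shape[-1])  # (N, k, k)
    attn = softmax(scores, axis=-1)                       # attention within each neighbourhood
    attended = attn @ v                                   # (N, k, d)
    return attended.max(axis=1)                           # order-independent pooling
```

    Calling `local_self_attention(points, 8, w_q, w_k, w_v)` on an `(N, 3)` array returns one feature vector per point. Because attention is computed only among the neighbors of each point and the result is max-pooled, reordering the input points reorders the output rows in the same way, which is the property an unordered point set requires.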


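Citation: 孟繁林, 何晓曦, 刘应浒, 李茄濡, 朱群. Point Cloud Classification and Segmentation Based on Self-attention Mechanism. Computer Systems & Applications (计算机系统应用), 2024, 33(1): 177-184.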

History
  • Received: May 24, 2023
  • Revised: June 26, 2023
  • Online: November 28, 2023
  • Published: January 05, 2024