Pneumonia Assisted Diagnosis Based on Hybrid Model of CNN and Transformer
    Abstract:

    Pneumonia is a prevalent respiratory disease for which early diagnosis is crucial to effective treatment. This study proposes CTFNet, a hybrid model that combines a convolutional neural network (CNN) with a Transformer to support accurate and efficient pneumonia diagnosis. The model integrates a convolutional tokenizer and a focused linear attention mechanism. The convolutional tokenizer uses convolution operations to extract more compact features, retaining key local image features while reducing computational complexity and enhancing the model's expressiveness. The focused linear attention mechanism lowers the computational cost of the Transformer and optimizes the attention framework, improving model performance. On the Chest X-ray Images dataset, CTFNet achieves an accuracy of 99.32%, a precision of 99.55%, a recall of 99.55%, and an F1-score of 99.55% on the pneumonia classification task, which highlights the model's potential for clinical application. To assess generalization, the model is also evaluated on the COVID-19 Radiography Database, where it achieves an accuracy above 98% on multiple binary classification tasks. These results indicate that CTFNet generalizes well and performs reliably across a range of pneumonia image classification tasks.
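The abstract does not give CTFNet's exact attention formulation, so the following is only a minimal sketch of the general idea behind linear attention with a "focusing" feature map. It assumes a ReLU followed by an element-wise power `p` with norm-preserving rescaling as the feature map, and it omits any additional components the actual mechanism may include. The function names (`focus_map`, `linear_attention`, `quadratic_attention`) and the default `p=3` are illustrative choices, not taken from the paper. The sketch shows why the cost drops: the key-value summary is computed once and reused for every query, so no N x N similarity matrix is ever formed.

```python
import math


def focus_map(x, p=3, eps=1e-6):
    """Assumed feature map: ReLU, element-wise power p, rescaled to keep the norm."""
    r = [max(v, 0.0) + eps for v in x]          # non-negative, strictly positive
    powered = [v ** p for v in r]               # sharpens the direction of the vector
    norm_r = math.sqrt(sum(v * v for v in r))
    norm_p = math.sqrt(sum(v * v for v in powered))
    return [v * norm_r / norm_p for v in powered]


def linear_attention(Q, K, V, p=3):
    """Computes phi(Q) @ (phi(K)^T @ V), normalized, without an N x N matrix."""
    phi_q = [focus_map(q, p) for q in Q]
    phi_k = [focus_map(k, p) for k in K]
    n, d, d_v = len(K), len(phi_k[0]), len(V[0])
    # Key-value summary of shape (d, d_v): built once, shared by all queries.
    kv = [[sum(phi_k[j][a] * V[j][b] for j in range(n)) for b in range(d_v)]
          for a in range(d)]
    # Sum of mapped keys, used for the per-query normalizer.
    k_sum = [sum(phi_k[j][a] for j in range(n)) for a in range(d)]
    out = []
    for q in phi_q:
        denom = sum(q[a] * k_sum[a] for a in range(d))
        out.append([sum(q[a] * kv[a][b] for a in range(d)) / denom
                    for b in range(d_v)])
    return out


def quadratic_attention(Q, K, V, p=3):
    """Reference version: same similarity function, explicit N x N weights."""
    phi_q = [focus_map(q, p) for q in Q]
    phi_k = [focus_map(k, p) for k in K]
    n, d_v = len(K), len(V[0])
    out = []
    for q in phi_q:
        w = [sum(a * b for a, b in zip(q, k)) for k in phi_k]
        total = sum(w)
        out.append([sum(w[j] * V[j][b] for j in range(n)) / total
                    for b in range(d_v)])
    return out
```

Both functions return the same values up to floating-point error; only the order of multiplication differs. For N tokens of dimension d, the linear form scales with N·d² rather than N²·d, which is where the saving comes from when N is large relative to d.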

Get Citation

Yun K, Jia RH, Wei GH, Zhao S, Li XH, Ma ZQ. Pneumonia assisted diagnosis based on hybrid model of CNN and Transformer. Computer Systems & Applications, 2025, 34(2): 216–224.

History
  • Received: July 9, 2024
  • Revised: August 1, 2024
  • Online: December 16, 2024