Pneumonia Assisted Diagnosis Based on Hybrid Model of CNN and Transformer
Authors: 贠恺, 贾荣浩, 魏国辉, 赵爽, 李学辉, 马志庆
Funding: National Natural Science Foundation of China (61702087); Shandong Province Postgraduate Education Quality Improvement Program (SDYJG1943); Scientific Research Fund of Shandong University of Traditional Chinese Medicine (KYZK2024Q30)

    Abstract:

    Pneumonia is a prevalent respiratory disease for which early diagnosis is crucial to effective treatment. This study proposes CTFNet, a hybrid model that combines a convolutional neural network (CNN) with a Transformer to support efficient and accurate computer-aided diagnosis of pneumonia. The model integrates a convolutional tokenizer and a focused linear attention mechanism. The convolutional tokenizer extracts more compact features through convolution operations, retaining key local image features while reducing computational complexity and enhancing the model's expressiveness. The focused linear attention mechanism eases the computational demands of the Transformer and optimizes the attention framework, significantly improving model performance. On the Chest X-ray Images dataset, CTFNet performs strongly on the pneumonia classification task, achieving an accuracy of 99.32%, a precision of 99.55%, a recall of 99.55%, and an F1-score of 99.55%. This performance highlights the model's potential for clinical application. To assess its generalization ability, CTFNet is further evaluated on the COVID-19 Radiography Database, where it achieves accuracies above 98% on multiple binary classification tasks. These results indicate that CTFNet offers good generalization ability and reliability across a range of pneumonia image classification tasks.
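
The abstract describes the two building blocks only at a high level. The sketch below is a minimal, hypothetical PyTorch illustration of what a convolutional tokenizer and a simplified focused linear attention layer can look like; it is not the authors' implementation, and all names (ConvTokenizer, FocusedLinearAttention, focusing_power, embed_dim) as well as the exact focusing function are assumptions made for illustration.

```python
# Minimal, hypothetical sketch (not the authors' code): a convolutional tokenizer
# followed by a simplified focused linear attention layer in PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvTokenizer(nn.Module):
    """Maps an image to a token sequence with stacked strided convolutions,
    keeping local structure while shrinking the spatial resolution."""
    def __init__(self, in_chans=3, embed_dim=256):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Conv2d(in_chans, embed_dim // 2, kernel_size=3, stride=2, padding=1),
            nn.GELU(),
            nn.Conv2d(embed_dim // 2, embed_dim, kernel_size=3, stride=2, padding=1),
        )

    def forward(self, x):                      # x: (B, C, H, W)
        x = self.proj(x)                       # (B, D, H/4, W/4)
        return x.flatten(2).transpose(1, 2)    # (B, N, D) token sequence


class FocusedLinearAttention(nn.Module):
    """Linear attention with a 'focusing' feature map: queries and keys pass through
    phi(x) = ReLU(x)**p rescaled to the original norm, which sharpens the attention
    distribution while keeping the cost linear in the number of tokens."""
    def __init__(self, dim, focusing_power=3.0, eps=1e-6):
        super().__init__()
        self.qkv = nn.Linear(dim, dim * 3)
        self.out = nn.Linear(dim, dim)
        self.p = focusing_power
        self.eps = eps

    def _phi(self, x):
        x = F.relu(x) + self.eps
        norm = x.norm(dim=-1, keepdim=True)
        x = x ** self.p
        return x / (x.norm(dim=-1, keepdim=True) + self.eps) * norm

    def forward(self, x):                      # x: (B, N, D)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k = self._phi(q), self._phi(k)
        kv = torch.einsum("bnd,bne->bde", k, v)           # sum_m phi(k_m) v_m^T, O(N*D^2)
        z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + self.eps)
        out = torch.einsum("bnd,bde,bn->bne", q, kv, z)   # normalized linear attention
        return self.out(out)


if __name__ == "__main__":
    tokens = ConvTokenizer()(torch.randn(2, 3, 224, 224))    # (2, 56*56, 256)
    print(FocusedLinearAttention(256)(tokens).shape)          # torch.Size([2, 3136, 256])
```

The point of the linear form is that the key-value product is accumulated once over all tokens, so the cost grows linearly with the token count rather than quadratically as in standard softmax attention, which is what makes such a block attractive for high-resolution chest X-ray inputs.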

Cite this article:

贠恺, 贾荣浩, 魏国辉, 赵爽, 李学辉, 马志庆. Pneumonia Assisted Diagnosis Based on Hybrid Model of CNN and Transformer. 计算机系统应用 (Computer Systems & Applications), 2025, 34(2): 216–224.
History
  • Received: 2024-07-09
  • Revised: 2024-08-01
  • Published online: 2024-12-16