面向目标用户的深度学习模型可视化综述
作者: 胡凯茜, 李欣, 裴炳森
基金项目: 国家重点研发计划(2020AAA0107705)
Review on Visualization of Deep Learning Models for Target Users
Author:
    摘要:

    深度学习模型在某些场景的实际应用中要求其具备一定的可解释性, 而视觉是人类认识周围世界的基本工具, 可视化技术能够将模型训练过程从不可见的黑盒状态转换为可交互分析的视觉过程, 从而有效提高模型的可信性和可解释度. 目前, 国内外相关领域缺少有关深度学习模型可视化工具的综述, 也缺乏对不同用户实际需求的研究和使用体验的评估. 因此, 本文通过调研近年来学术界模型可解释性和可视化相关文献, 总结可视化工具在不同领域的应用现状, 提出面向目标用户的可视化工具分类方法及依据, 对每一类工具从可视化内容、计算成本等方面进行介绍和对比, 以便不同用户选取与部署合适的工具. 最后在此基础上讨论可视化领域存在的问题并加以展望.

    Abstract:

    In certain scenarios, practical applications of deep learning models demand a degree of interpretability, and vision is a basic tool through which humans understand the surrounding world. Visualization technology can transform the model training process from an invisible black box into an interactive, analyzable visual process, effectively improving the credibility and interpretability of the model. At present, related fields lack a review of deep learning model visualization tools, as well as research on the actual needs of different users and evaluation of their user experience. Therefore, by surveying recent academic literature on model interpretability and visualization, this study summarizes the application status of visualization tools in different fields and proposes a classification method and criteria for target-user-oriented visualization tools. It then introduces and compares each category of tools in terms of visualization content, computational cost, and other aspects, so that different users can select and deploy suitable tools. Finally, on this basis, open problems in the visualization field are discussed and future directions are outlined.
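    The abstract describes visualization as turning the otherwise opaque training process into an inspectable, interactive one. As a minimal sketch of that idea (not any specific tool surveyed in this paper; the class and method names below are purely illustrative), the following code records per-epoch metrics during a simulated training loop and renders them as a coarse text sparkline, the kind of raw data a visualization front end would plot:

```python
# Minimal sketch of the idea behind training-process visualization:
# instead of treating training as a black box, record per-step metrics
# so a front end can render and inspect them. All names are illustrative.

class TrainingRecorder:
    """Collects per-epoch metrics so they can be plotted or inspected."""

    def __init__(self):
        self.history = {}  # metric name -> list of recorded values

    def log(self, metric, value):
        self.history.setdefault(metric, []).append(float(value))

    def sparkline(self, metric):
        """Render one metric as a coarse ASCII sparkline for quick inspection."""
        values = self.history[metric]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero for constant metrics
        blocks = "▁▂▃▄▅▆▇█"
        return "".join(
            blocks[int((v - lo) / span * (len(blocks) - 1))] for v in values
        )


# Simulated training loop: the loss decays toward zero over 8 epochs.
rec = TrainingRecorder()
for epoch in range(8):
    rec.log("loss", 1.0 / (epoch + 1))

print(rec.sparkline("loss"))  # prints █▄▂▂▁▁▁▁-style decay curve
```

    Real tools surveyed in the paper expose the same kind of recorded state (losses, activations, gradients, attention weights) through richer interactive views rather than text output.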

引用本文

胡凯茜, 李欣, 裴炳森. 面向目标用户的深度学习模型可视化综述. 计算机系统应用, 2023, 32(11): 36-47

历史
  • 收稿日期:2023-04-05
  • 最后修改日期:2023-05-06
  • 在线发布日期: 2023-08-09