Computer Systems Applications, 2021, 30(5): 1-11
Microservice Fault Diagnosis Method Based on Similarity Matching
Fault Diagnosis Method Based on Trace Similarity Matching
(1. Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China; 3. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China)
Received: August 31, 2020    Revised: September 23, 2020
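Abstract (translated from Chinese): With the rapid development of Internet services, distributed microservice applications have gradually replaced traditional monolithic applications as one of the main forms of Internet applications. Microservice applications offer scalability, fault tolerance, and high availability, but they also pose challenges such as cumbersome builds, complex deployment, and difficult maintenance. Monitoring and operating microservices in cloud computing environments is an active research topic, yet existing approaches remain coarse-grained and locate faults inaccurately. To address these problems, this paper proposes a microservice fault diagnosis method based on pattern matching. First, a proxy is injected to forward request traffic, which allows the trace information of the microservices to be collected and modeled. Next, state information is collected while the system runs normally, and known faults are injected to collect and characterize the application's running state after a fault occurs. Finally, the execution trace of an unknown fault is matched against the execution traces of known faults, with string edit distance as the similarity measure, to diagnose the likely cause of the fault. Experimental results show that the method effectively characterizes how requests are processed and executed and accurately locates the cause of application faults at microservice granularity.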
Keywords: cloud computing  fault diagnosis  execution trace  microservice
Abstract: With the rapid development of Internet services, distributed microservice-based applications have gradually replaced traditional monolithic applications as one of the main forms of Internet applications. Such applications offer scalability, fault tolerance, and high availability, but they are also cumbersome to build, complicated to deploy, and difficult to maintain. Kubernetes, the most popular container-based cluster management system, still suffers from coarse granularity and inaccurate fault localization, among other weaknesses. To address these issues, this study proposes a fault diagnosis method based on trace similarity matching. First, an injected proxy forwards request traffic so that trace information about the microservices can be collected. Then, state information is collected during normal operation, and known faults are injected to record how the system behaves after a failure occurs. Finally, the execution trace of an unknown fault is compared with the execution traces of known faults, using string edit distance as the similarity measure, to identify the possible cause of the failure. Experimental results show that the method accurately describes how requests are processed and executed and finds the cause of system failures at microservice granularity.
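The matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how traces are encoded, so the sketch assumes each trace is a sequence of service names. The fault labels, the service names, and the normalization by the longer trace length are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed normalization: 1.0 for identical traces, 0.0 for fully different ones."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def diagnose(unknown: Sequence[str],
             known: Dict[str, List[str]]) -> List[Tuple[str, float]]:
    """Rank known fault traces by similarity to the unknown trace, best first."""
    ranked = [(label, similarity(unknown, trace)) for label, trace in known.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical traces recorded after injecting known faults.
known_faults = {
    "cart-service down": ["gateway", "frontend", "cart", "ERROR"],
    "payment-service down": ["gateway", "frontend", "cart", "payment", "ERROR"],
}
# Hypothetical trace observed for a fault of unknown cause.
unknown_trace = ["gateway", "frontend", "cart", "payment", "retry", "ERROR"]

for label, score in diagnose(unknown_trace, known_faults):
    print(f"{label}: {score:.3f}")
```

In this example the unknown trace differs from the hypothetical payment fault trace by one inserted call and from the cart fault trace by two, so the payment fault ranks first. A real system would also need a threshold below which no known fault is reported as a match.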
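Funding: National Key Research and Development Program of China (2017YFB1400804); National Natural Science Foundation of China (61872344); Beijing Natural Science Foundation (4182070); Talent Program of the Youth Innovation Promotion Association, Chinese Academy of Sciences (2018144)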
Cite this article:
CHEN Hao, XU Yuan-Jia, WANG Tao, ZHANG Wen-Bo. Fault Diagnosis Method Based on Trace Similarity Matching. Computer Systems Applications, 2021, 30(5): 1-11