
A fault diagnosis framework for microservice systems in cloud-native networks (Chinese title: 面向云原生网络的微服务系统故障诊断方法)
Abstract: The complexity and dynamics of cloud-native data center networks place higher demands on multisource data-driven fault detection and fine-grained analysis. Effective fault detection, together with detailed fault correlation and analysis, is key to improving the resilience of cloud-native networks. However, existing methods lack correlation analysis between faults and multisource data features, and they struggle to generate fault reports tied to specific network components and instances. In this study, we propose a fault diagnosis method for microservice systems in cloud-native networks that aims to improve network resilience through fault detection, correlation, and analysis. The method analyzes the correlation between three types of faults and multisource data, and constructs feature sequences by combining key metrics, span duration, and offset features. It builds a dynamic multilayer microservice architecture graph from trace data and network resource interfaces, associates the graph with the fault detection results output by a deep learning model, and finally generates fine-grained fault analysis reports. Experimental results show that the method outperforms the baseline methods in fault detection performance and provides engineers with fault correlation and analysis reports for cloud-native networks, helping to improve network resilience.
Authors: Nan FU (付楠); Guang CHENG (程光); Guangye DAI (戴广晔); Yue TENG (滕跃) (School of Cyber Science and Engineering, Southeast University, Nanjing 211189, China; Jiangsu Province Engineering Research Center of Security for Ubiquitous Network, Nanjing 211189, China; Purple Mountain Laboratories, Nanjing 211111, China)
Source: Scientia Sinica (Informationis) (《中国科学:信息科学》), PKU Core journal, 2025, Issue 8, pp. 1888-1905 (18 pages)
Funding: Supported by the Joint Funds of the National Natural Science Foundation of China (Grant No. U22B2025).
Keywords: cloud-native; microservice; network resilience; fault diagnosis; AIOps
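Note on the feature sequences: the abstract says they combine key metrics, span duration, and offset features, but it does not define these precisely. The sketch below is only one plausible reading, not the authors' implementation. It assumes that "offset" means a span's start time relative to the earliest span in its trace, and that a single per-service metric (here a CPU utilization value) stands in for the "key metrics". The `Span` type, its fields, `build_feature_sequence`, and `cpu_by_service` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Span:
    # Hypothetical span record; field names are assumptions, not from the paper.
    trace_id: str
    service: str
    start_ms: float
    duration_ms: float


def build_feature_sequence(spans, cpu_by_service):
    """Return one [metric, duration, offset] row per span, ordered by start time.

    Offset is assumed to be the span start relative to the earliest span
    start in the same trace; the abstract does not give its definition.
    """
    if not spans:
        return []
    ordered = sorted(spans, key=lambda s: s.start_ms)
    trace_start = ordered[0].start_ms
    return [
        [cpu_by_service.get(s.service, 0.0), s.duration_ms, s.start_ms - trace_start]
        for s in ordered
    ]


# Two spans of one trace, deliberately given out of order.
spans = [
    Span("t1", "cart", 105.0, 20.0),
    Span("t1", "gateway", 100.0, 40.0),
]
features = build_feature_sequence(spans, {"gateway": 0.35, "cart": 0.80})
```

Each row could then be fed, in order, to a sequence model for fault detection; which model and which metrics the paper actually uses are not stated in this record.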