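Abstract: As software service systems grow larger and more complex, log-based fault diagnosis has become essential to ensuring the reliability of software services. Existing log-based fault diagnosis methods can determine the fault type, but they cannot explain their reasoning in a way that convinces operations engineers, which makes them difficult to deploy in real production environments. To address this, this paper proposes LogCoT (Log Chain of Thought), a novel framework that performs log-based fault diagnosis by automatically constructing chain-of-thought instruction prompts (log Chain of Thought-Prompting, CoT-Prompting). LogCoT uses a two-stage chain-of-thought prompt-engineering algorithm (Auto-Few-Shot-CoT, Auto-FSC) to extract the semantic information of logs through a large language model (LLM) and thereby generate interpretable root-cause analysis reports. In addition, LogCoT combines prompt-tuning on data without class labels and preference-tuning on data with class labels to fine-tune the Mistral base model. It then applies the Large-Language Model feedback Identity Preference Optimisation (LLMf-IPO) algorithm to correct erroneous diagnoses generated by Mistral, so as to better align with user intent. Finally, LogCoT is comprehensively evaluated on two log datasets collected from the production environments of an Internet service provider and a cloud service provider. The experimental results show that LogCoT outperforms representative baseline models on all three metrics (Accuracy, Macro-F1, and Weighted-F1), and that its Accuracy exceeds that of the best existing model by 31.88 and 10.51 percentage points on the two datasets, respectively.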
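The three metrics named in the abstract can be illustrated with a minimal sketch. It uses the standard definitions of Accuracy, Macro-F1, and Weighted-F1 for multi-class classification; the function name `f1_scores` and the fault labels in the usage note are hypothetical and do not come from the paper's datasets or evaluation code.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (accuracy, macro_f1, weighted_f1) for two equal-length label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    per_class_f1 = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        per_class_f1[label] = 2 * tp / denom if denom else 0.0

    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    # Macro-F1: unweighted mean over classes, so rare fault types count equally.
    macro_f1 = sum(per_class_f1.values()) / len(labels)
    # Weighted-F1: mean over classes weighted by each class's true support.
    weighted_f1 = sum(per_class_f1[l] * support[l] for l in labels) / len(y_true)
    return accuracy, macro_f1, weighted_f1
```

For example, `f1_scores(["disk", "disk", "net", "cpu"], ["disk", "net", "net", "cpu"])` scores a four-sample toy run. Macro-F1 and Weighted-F1 diverge when fault classes are imbalanced, which is presumably why both are reported alongside Accuracy.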
Abstract: With the increasing popularity and complexity of Web applications and the emergence of new characteristics, testing and maintaining large, complex Web applications is becoming more difficult. Web applications generally contain many pages and are used by an enormous number of users. Statistical testing is an effective way of ensuring their quality. Web usage can be accurately described by a Markov chain, which has been shown to be an ideal model for software statistical testing. The results of unit testing can be reused in later stages, which is an important strategy for bottom-up integration testing; the other improvement of the extended Markov chain model (EMM) is the introduction of an error type vector, which is treated as part of a page node. This paper also proposes an algorithm for generating test cases from usage paths. Finally, optional usage reliability evaluation methods and an incremental usability regression testing model for testing and evaluation are presented.
Key words: statistical testing; evaluation for Web usability; extended Markov chain model (EMM); Web log mining; reliability evaluation
CLC number: TP311.5
Foundation item: Supported by the National Defence Research Project (No. 41315.9.2) and National Science and Technology Plan (2001BA102A04-02-03)
Biography: MAO Cheng-ying (1978-), male, Ph.D. candidate, research direction: software testing. Research direction: advanced database system, software testing, component technology and data mining.
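The idea of describing Web usage with a Markov chain and deriving test cases from usage paths can be sketched as follows. This is a plain first-order Markov chain, not the paper's extended model (it has no error type vector), and the page names, transition probabilities, and function names are all hypothetical. In practice the probabilities would be estimated from Web logs.

```python
import random

# Hypothetical usage model: page -> {next page: transition probability}.
USAGE_MODEL = {
    "home":   {"search": 0.6, "login": 0.4},
    "search": {"item": 0.7, "home": 0.2, "exit": 0.1},
    "login":  {"home": 0.9, "exit": 0.1},
    "item":   {"search": 0.5, "exit": 0.5},
}


def generate_usage_path(model, start="home", end="exit", max_steps=50, rng=None):
    """Random walk over the usage model; each path is one candidate test case."""
    rng = rng or random.Random()
    path = [start]
    while path[-1] != end and len(path) < max_steps:
        successors = model[path[-1]]
        pages = list(successors)
        weights = [successors[page] for page in pages]
        # Pick the next page in proportion to its observed usage probability.
        path.append(rng.choices(pages, weights=weights, k=1)[0])
    return path


def path_probability(model, path):
    """Probability of a path under the model: product of its transition probabilities."""
    prob = 1.0
    for src, dst in zip(path, path[1:]):
        prob *= model[src].get(dst, 0.0)
    return prob
```

Calling `generate_usage_path(USAGE_MODEL, rng=random.Random(0))` yields a reproducible path, and `path_probability` indicates how representative that path is of real usage, which is what lets statistical testing concentrate effort on frequently used paths.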
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60603027 (国家自然科学基金); the Science-Technology Development Project of Tianjin of China under Grant No. 04310941R (天津市科技发展计划); and the Applied Basic Research Project of Tianjin of China under Grant No. 05YFJMJC11700 (天津市应用基础研究计划)
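Abstract: This paper proposes a spelling correction method based on a discriminative model. It reranks the output of an existing spelling correction system, Aspell, using the discriminative model Ranking SVM to improve its performance. Well-established spelling correction techniques (edit distance, letter-based n-grams, phonetic similarity, and the noisy channel model) are integrated into the model as features. This significantly improves the initial ranking quality of the baseline system Aspell and also outperforms the spelling correction modules of some commercial systems (such as Microsoft Word 2003). In addition, a method is proposed for automatically extracting spelling correction training pairs from search engine query log chains. The model trained on pairs extracted this way performs close to the model trained on manually labeled data; the two reduce the baseline system's error rate by 32.2% and 32.6%, respectively.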
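The notion of feeding established techniques into a reranker as features can be sketched with two of the features the abstract names: edit distance and letter-based n-grams. This is an illustration only. The linear scoring function stands in for a trained Ranking SVM, the weights are hypothetical rather than learned, and the candidate list is supplied by hand rather than taken from Aspell.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def char_bigrams(word):
    """Set of letter bigrams, with ^ and $ marking the word boundaries."""
    padded = f"^{word}$"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def features(misspelling, candidate):
    """Feature vector for one (misspelling, candidate) pair."""
    a, b = char_bigrams(misspelling), char_bigrams(candidate)
    return [
        -edit_distance(misspelling, candidate),  # fewer edits is better
        len(a & b) / len(a | b),                 # Jaccard overlap of bigrams
    ]


# Hypothetical weights; a Ranking SVM would learn these from training pairs.
WEIGHTS = [1.0, 2.0]


def rerank(misspelling, candidates, weights=WEIGHTS):
    """Sort candidates by a linear score over their features, best first."""
    def score(candidate):
        return sum(w * f for w, f in zip(weights, features(misspelling, candidate)))
    return sorted(candidates, key=score, reverse=True)
```

A call such as `rerank("speling", ["spewing", "spelling", "sapling"])` reorders a candidate list by combined evidence. Two surface features alone can still misrank transposition errors, which is one reason to add phonetic and noisy-channel features and to learn the weights from data.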