Funding: Supported by the National Natural Science Foundation of China (No. 62306281), the Natural Science Foundation of Zhejiang Province (Nos. LQ23E060006 and LTGG24E050005), and the Key Research Plan of Jiaxing City (No. 2024BZ20016).
Abstract: In the era of big data, industry increasingly relies on data-driven technologies for autonomous learning and intelligent decision-making. However, the problem of "small samples in big data" arises when datasets lack the comprehensive information needed to handle complex scenarios, which hampers adaptability. Enhancing data completeness is therefore essential. Knowledge-guided virtual sample generation transforms domain knowledge into extensive virtual datasets, thereby reducing dependence on limited real samples and enabling zero-sample fault diagnosis. This study takes building air conditioning systems as a case study. We used a large language model (LLM) to acquire the domain knowledge for sample generation, which significantly lowers the cost of knowledge acquisition and establishes a generalized framework for knowledge acquisition in engineering applications. The acquired knowledge guided the design of the diffusion boundaries in mega-trend diffusion (MTD), and the Monte Carlo method was used to sample within the diffusion function to create information-rich virtual samples. In addition, a noise-adding technique was introduced to increase the information entropy of these samples, thereby improving the robustness of neural networks trained on them. Experimental results showed that a diagnostic model trained exclusively on virtual samples achieved an accuracy of 72.80%, significantly surpassing traditional small-sample supervised learning in terms of generalization. This underscores the quality and completeness of the generated virtual samples.
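The generation step described in the abstract (knowledge-defined diffusion boundaries, Monte Carlo sampling within the diffusion function, then added noise) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a triangular diffusion function, as commonly used in MTD, acceptance-rejection as the Monte Carlo sampler, and zero-mean Gaussian noise scaled to the boundary width. The feature names, boundary values, and function names are hypothetical placeholders, not values taken from the study.

```python
import random

# Hypothetical knowledge-derived boundaries for one fault class:
# feature -> (lower bound, most plausible value, upper bound).
KNOWLEDGE_BOUNDS = {
    "supply_air_temp_c": (10.0, 13.0, 18.0),
    "chilled_water_valve_pct": (40.0, 70.0, 100.0),
}


def triangular_membership(x, lower, center, upper):
    """Triangular diffusion function: 0 at the bounds, 1 at the center."""
    if x <= lower or x >= upper:
        return 0.0
    if x <= center:
        return (x - lower) / (center - lower)
    return (upper - x) / (upper - center)


def sample_feature(lower, center, upper, rng):
    """Draw one value by Monte Carlo acceptance-rejection.

    A uniform candidate inside the bounds is accepted with probability
    equal to its membership value, so values near the center are more
    likely. Requires lower < center < upper.
    """
    while True:
        candidate = rng.uniform(lower, upper)
        if rng.random() < triangular_membership(candidate, lower, center, upper):
            return candidate


def generate_virtual_samples(bounds, n_samples, noise_fraction=0.02, seed=None):
    """Generate virtual samples, then perturb each value with Gaussian noise.

    noise_fraction is the noise standard deviation as a fraction of the
    boundary width (an assumption of this sketch). Noisy values may fall
    slightly outside the original bounds.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        sample = {}
        for name, (lower, center, upper) in bounds.items():
            value = sample_feature(lower, center, upper, rng)
            value += rng.gauss(0.0, noise_fraction * (upper - lower))
            sample[name] = value
        samples.append(sample)
    return samples
```

Calling `generate_virtual_samples(KNOWLEDGE_BOUNDS, 500, seed=0)` would return 500 dictionaries, one per virtual sample, which could then be labeled with the corresponding fault class and used to train a classifier. Setting `noise_fraction=0.0` disables the noise step, which makes it easy to compare models trained with and without it.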