Funding: This work was supported by the National Science Foundation of China, Grant No. 61762092, "Dynamic multi-objective requirement optimization based on transfer learning"; the Open Foundation of the Key Laboratory in Software Engineering of Yunnan Province, Grant No. 2017SE204, "Research on extracting software feature models using transfer learning"; and the National Science Foundation of China, Grant No. 61762089, "The key research of high order tensor decomposition in a distributed environment".
Abstract: With the rise of open-source software, the social development paradigm occupies an indispensable position in the current software development process. This paper proposes a variant of the PageRank algorithm to build an importance assessment model that provides quantifiable importance metrics for new Java projects that are based on Java open-source projects or components. The key step of the model is to use crawlers to collect information about Java open-source projects from the GitHub open-source community and to build a domain knowledge graph from it. Each project is measured along three dimensions: project influence, project activity, and project popularity. The paper evaluates the importance of 4512 Java open-source projects obtained from GitHub and reports good results.
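The abstract does not state how PageRank is modified, so the following is only a minimal sketch of one common variant, personalized PageRank, in which the teleportation step follows a per-project prior instead of a uniform distribution. The prior stands in for a combined influence, activity, and popularity score; the function name, the project names, and all numeric values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def weighted_pagerank(
    edges: Dict[str, List[str]],
    prior: Dict[str, float],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Personalized PageRank: teleportation follows `prior`, not a uniform vector.

    `edges` maps a project to the projects it depends on (assumed edge meaning).
    `prior` is a non-negative per-project score (assumed to combine the
    influence, activity, and popularity dimensions).
    """
    nodes = sorted(set(edges) | {t for targets in edges.values() for t in targets})
    total_prior = sum(prior.get(n, 0.0) for n in nodes)
    if total_prior <= 0:
        raise ValueError("prior must contain at least one positive score")
    teleport = {n: prior.get(n, 0.0) / total_prior for n in nodes}

    rank = dict(teleport)
    for _ in range(iterations):
        # Rank held by projects without outgoing edges is redistributed via the prior.
        dangling = sum(rank[n] for n in nodes if not edges.get(n))
        new_rank = {
            n: (1.0 - damping + damping * dangling) * teleport[n] for n in nodes
        }
        for src in nodes:
            targets = edges.get(src, [])
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical dependency graph and prior scores, for illustration only.
example_edges = {
    "app-a": ["lib-core"],
    "app-b": ["lib-core", "lib-util"],
    "lib-util": ["lib-core"],
    "lib-core": [],
}
example_prior = {"app-a": 0.2, "app-b": 0.3, "lib-util": 0.5, "lib-core": 0.9}
example_ranks = weighted_pagerank(example_edges, example_prior)
```

In this toy graph every project depends directly or indirectly on `lib-core`, so it accumulates the most rank. The scores always sum to 1, which makes them comparable across projects.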
Funding: This work was funded by the National Natural Science Foundation of China (72104224, L2424237, 71974107, L2224059, L2124002, and 91646102); the Beijing Natural Science Foundation (9232015); the Beijing Social Science Foundation (24GLC058); the Construction Project of China Knowledge Center for Engineering Sciences and Technology (CKCEST-2023-1-7); the MOE (Ministry of Education in China) Project of Humanities and Social Sciences (16JDGC011); the Tsinghua University Initiative Scientific Research Program (2019Z02CAU); and the Tsinghua University Project of Volvo-Supported Green Economy and Sustainable Development (20183910020).
Abstract: As large language models (LLMs) continue to demonstrate their potential in handling complex tasks, their value in knowledge-intensive industrial scenarios is becoming increasingly evident. Fault diagnosis, a critical domain in the industrial sector, has long faced the dual challenges of managing vast amounts of experiential knowledge and improving the efficiency of human-machine collaboration. Traditional fault diagnosis systems, which are primarily based on expert systems, suffer from three major limitations: (1) ineffective organization of fault diagnosis knowledge, (2) a lack of adaptability between static knowledge frameworks and dynamic engineering environments, and (3) difficulties in integrating expert knowledge with real-time data streams. These systemic shortcomings restrict the ability of conventional approaches to handle uncertainty. In this study, we proposed an intelligent computer numerical control (CNC) fault diagnosis system that integrates LLMs with a knowledge graph (KG). First, we constructed a comprehensive KG that consolidated multi-source data into a structured representation. Second, we designed a retrieval-augmented generation (RAG) framework that leverages the KG to support multi-turn interactive fault diagnosis while incorporating real-time engineering data into the decision-making process. Finally, we introduced a learning mechanism to facilitate dynamic knowledge updates. The experimental results demonstrated that our system significantly improved fault diagnosis accuracy, outperforming engineers with two years of professional experience on our constructed benchmark datasets. By integrating LLMs and the KG, our framework surpassed the limitations of traditional expert systems rooted in symbolic reasoning, offering a novel approach to the cognitive paradox of unstructured knowledge modeling and dynamic environment adaptation in industrial settings.
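The retrieval step of a KG-backed RAG pipeline can be sketched as follows. This is a toy illustration under stated assumptions, not the system described in the abstract: the triples, relation names, substring-based entity matching, and prompt wording are all hypothetical, and no LLM is called.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def retrieve_facts(query: str, kg: List[Triple], limit: int = 5) -> List[Triple]:
    """Return triples whose head or tail entity is mentioned in the query.

    Matching is a plain case-insensitive substring test (an assumption made
    to keep the sketch short; a real system would use entity linking).
    """
    text = query.lower()
    hits = [t for t in kg if t[0].lower() in text or t[2].lower() in text]
    return hits[:limit]


def build_prompt(query: str, facts: List[Triple]) -> str:
    """Format the retrieved triples as grounding context for a language model."""
    lines = [f"- {h} {r.replace('_', ' ')} {t}" for h, r, t in facts]
    context = "\n".join(lines) if lines else "- (no matching facts)"
    return (
        f"Known facts:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the facts above."
    )


# Hypothetical CNC fault knowledge, for illustration only.
example_kg: List[Triple] = [
    ("spindle overheating", "caused_by", "coolant failure"),
    ("coolant failure", "fixed_by", "replace coolant pump"),
    ("axis drift", "caused_by", "encoder fault"),
]
example_query = "The machine reports spindle overheating"
example_prompt = build_prompt(example_query, retrieve_facts(example_query, example_kg))
```

In a multi-turn setting, the entities returned in one turn (here, "coolant failure") could be added to the next turn's query so that follow-up facts become retrievable.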
Abstract: Generally, knowledge extraction technology is used to obtain nodes and relationships from unstructured and structured data, which are then fused with the original knowledge graph to extend it. Because the concepts and knowledge structures expressed on the Internet suffer from multi-source heterogeneity and low accuracy, it is usually difficult to achieve good results with knowledge extraction alone. Considering that domain knowledge depends heavily on expert knowledge, this paper expands domain knowledge through crowdsourcing. The method splits the domain knowledge system into knowledge subgraphs according to their corresponding concepts, forms subtasks of moderate granularity, and uses crowdsourcing to acquire and integrate the knowledge subgraphs, thereby improving the knowledge system.
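The splitting step described above can be sketched roughly as follows. The abstract does not define how concepts are assigned or what "moderate granularity" means, so this sketch assumes that each triple belongs to the concept of its head entity and that a subtask is capped at a fixed number of triples; the function name, the cap, and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def split_into_subtasks(
    triples: List[Triple],
    concept_of: Dict[str, str],
    max_triples: int = 3,
) -> List[Tuple[str, List[Triple]]]:
    """Group triples by the concept of their head entity, then chunk each group.

    Each returned (concept, triples) pair is one crowdsourcing subtask holding
    at most `max_triples` triples (an assumed proxy for moderate granularity).
    """
    by_concept: Dict[str, List[Triple]] = defaultdict(list)
    for triple in triples:
        head = triple[0]
        by_concept[concept_of.get(head, "unassigned")].append(triple)

    subtasks: List[Tuple[str, List[Triple]]] = []
    for concept in sorted(by_concept):
        group = by_concept[concept]
        for start in range(0, len(group), max_triples):
            subtasks.append((concept, group[start:start + max_triples]))
    return subtasks


# Hypothetical domain knowledge, for illustration only.
example_triples: List[Triple] = [
    ("quicksort", "is_a", "sorting algorithm"),
    ("quicksort", "has_complexity", "O(n log n) average"),
    ("mergesort", "is_a", "sorting algorithm"),
    ("mergesort", "is_stable", "true"),
    ("hash table", "is_a", "data structure"),
]
example_concepts = {
    "quicksort": "algorithms",
    "mergesort": "algorithms",
    "hash table": "data structures",
}
example_subtasks = split_into_subtasks(example_triples, example_concepts)
```

With a cap of three, the four "algorithms" triples form two subtasks and the single "data structures" triple forms a third; every triple lands in exactly one subtask, so the workers' results can be merged back without overlap.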