Funding: Supported by the National Natural Science Foundation of China Joint Fund for Enterprise Innovation Development (U23B2029), the National Natural Science Foundation of China (62076167, 61772020), the Key Scientific Research Project of Higher Education Institutions in Henan Province (24A520058, 24A520060, 23A520022), and the Postgraduate Education Reform and Quality Improvement Project of Henan Province (YJS2024AL053).
Abstract: In cross-domain named entity recognition, large language models face a scarcity of labeled data in the specific target domain. Entity information varies between domains, and the resulting entity bias makes large language models prone to spurious correlations when handling domain-specific entities. To address this problem, this paper proposes a cross-domain named entity recognition method based on causal graph structure enhancement. A causal learning and intervention module captures the cross-domain invariant causal structural representations between the feature representations of text sequences and annotation sequences. This improves the large language model's use of causal structural features in the target domain and thus alleviates the entity bias triggered by spurious correlations. In addition, a semantic feature fusion module combines the semantic information of the source and target domains. The results show improvements of 2.47% and 4.12% in the political and medical domains, respectively, over the benchmark model, as well as strong performance in small-sample scenarios, which supports the effectiveness of causal graph structure enhancement in improving the accuracy of cross-domain entity recognition and reducing spurious correlations.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60972150 and 10926197.
Abstract: Detection and clarification of cause-effect relationships among variables is an important problem in time series analysis. Traditional causality inference methods have a salient limitation: the model must be linear with Gaussian noise. Although additive model regression can effectively infer the nonlinear causal relationships of additive nonlinear time series, it requires the contemporaneous causal relationships among variables to be linear, and its tests of conditional independence relations are not always valid. This paper provides a nonparametric method that employs both mutual information and conditional mutual information to identify the causal structure of a class of nonlinear time series models, extending additive nonlinear time series to nonlinear structural vector autoregressive models. An algorithm is developed to learn the contemporaneous and the lagged causal relationships of variables. Simulations demonstrate the effectiveness of the proposed method.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60972150, 10926197, and 61201323.
Abstract: Detection and clarification of cause-effect relationships among variables is an important problem in time series analysis. This paper provides a method that employs both mutual information and conditional mutual information to identify the causal structure of multivariate time series causal graphical models. A three-step procedure is developed to learn the contemporaneous and the lagged causal relationships of time series causal graphs. Unlike conventional constraint-based algorithms, the proposed algorithm does not assume any particular kind of distribution and is nonparametric. These properties are especially appealing for inference of time series causal graphs when prior knowledge about the data model is not available. Simulations and a case analysis demonstrate the effectiveness of the method.
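As a rough illustration of how mutual information and conditional mutual information can separate direct from indirect dependence, the following minimal sketch uses a plug-in (frequency-count) estimator on discrete toy data. It is not the estimator or the three-step procedure of the paper above; the function names, the binary chain X -> Y -> Z, and the 10% flip rate are assumptions made only for this example.

```python
import numpy as np
from collections import Counter


def entropy(*columns):
    """Plug-in Shannon entropy (in nats) of the joint distribution of discrete columns."""
    joint = list(zip(*columns))
    counts = np.array(list(Counter(joint).values()), dtype=float)
    p = counts / len(joint)
    return float(-np.sum(p * np.log(p)))


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


# Hypothetical toy chain X -> Y -> Z: each arrow copies its parent and
# flips the bit with probability 0.1.
rng = np.random.default_rng(0)
n = 5000
x = rng.integers(0, 2, n)
y = np.where(rng.random(n) < 0.1, 1 - x, x)
z = np.where(rng.random(n) < 0.1, 1 - y, y)

mi_xz = mutual_information(x, z)                          # X and Z are dependent
cmi_xz_given_y = conditional_mutual_information(x, z, y)  # but not once Y is known
```

In this toy chain the marginal dependence between X and Z is clearly positive, while the conditional mutual information given Y is close to zero, which is the kind of evidence a constraint-based procedure would use to drop a direct X-Z edge. Continuous time series would need a different (for example nearest-neighbor or kernel) estimator and a significance test, neither of which is shown here.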
Funding: Supported by an internal fund from Macao Polytechnic University (RP/FCSD-02/2022).
Abstract: Objective: Chronic fatigue syndrome (CFS) is a prevalent symptom of post-coronavirus disease 2019 (COVID-19) and is associated with unclear disease mechanisms. The herbal medicine Qingjin Yiqi granules (QJYQ) constitute a clinically approved formula for treating post-COVID-19; however, its potential as a drug target for treating CFS remains largely unknown. This study aimed to identify novel causal factors for CFS and elucidate the potential targets and pharmacological mechanisms of action of QJYQ in treating CFS. Methods: This prospective cohort analysis included 4,212 adults aged ≥65 years who were followed up for 7 years, with 435 incident CFS cases. Causal modeling and multivariate logistic regression analysis were performed to identify the potential causal determinants of CFS. A proteome-wide, two-sample Mendelian randomization (MR) analysis was employed to explore the proteins associated with the identified causal factors of CFS, which may serve as potential drug targets. Furthermore, we performed a virtual screening analysis to assess the binding affinity between the bioactive compounds in QJYQ and CFS-associated proteins. Results: Among 4,212 participants (47.5% men) with a median age of 69 years (interquartile range: 69–70 years) enrolled in 2004, 435 developed CFS by 2011. Causal graph analysis with multivariate logistic regression identified frequent cough (odds ratio: 1.74, 95% confidence interval [CI]: 1.15–2.63) and insomnia (odds ratio: 2.59, 95% CI: 1.77–3.79) as novel causal factors of CFS. Proteome-wide MR analysis revealed that the upregulation of endothelial cell-selective adhesion molecule (ESAM) was causally linked to both chronic cough (odds ratio: 1.019, 95% CI: 1.012–1.026, P = 2.75e−05) and insomnia (odds ratio: 1.015, 95% CI: 1.008–1.022, P = 4.40e−08) in CFS. The major bioactive compounds of QJYQ, ginsenoside Rb2 (docking score: −6.03) and RG4 (docking score: −6.15), bound to ESAM with high affinity based on virtual screening. Conclusions: Our integrated analytical framework combining epidemiological, genetic, and in silico data provides a novel strategy for elucidating complex disease mechanisms, such as CFS, and informing models of action of traditional Chinese medicines, such as QJYQ. Further validation in animal models is warranted to confirm the potential pharmacological effects of QJYQ on ESAM and as a treatment for CFS.
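For readers who want to relate the reported odds ratios to their confidence intervals, the sketch below back-computes the log-odds coefficient, its approximate standard error, and a z statistic from the published figures. It assumes the intervals are Wald-type, that is, symmetric on the log scale with a critical value of 1.96; the abstract does not state how the intervals were obtained, and the function name `log_scale_summary` is hypothetical.

```python
import math


def log_scale_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Return (log odds ratio, approximate standard error, z statistic).

    Assumes a Wald-type interval: log(OR) +/- z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_or, se, log_or / se


# Figures quoted in the abstract for the logistic regression results.
cough = log_scale_summary(1.74, 1.15, 2.63)
insomnia = log_scale_summary(2.59, 1.77, 3.79)
```

Under the stated assumption, both z statistics exceed 1.96, consistent with intervals that exclude an odds ratio of 1. A quick consistency check is that the geometric mean of the interval bounds should land near the reported odds ratio.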
Abstract: Fault diagnostics is important for the safe operation of nuclear power plants (NPPs). In recent years, data-driven approaches have been proposed and implemented to tackle the problem, e.g., neural networks, fuzzy and neurofuzzy approaches, support vector machines, K-nearest neighbor classifiers, and inference methodologies. Among these methods, the dynamic uncertain causality graph (DUCG) has been proved effective in many practical cases. However, the causal graph construction behind the DUCG is complicated and, in many cases, redundant in the symptoms needed to correctly classify the fault. In this paper, we propose a method to simplify causal graph construction in an automatic way. The method consists in transforming the expert knowledge-based DUCG into a fuzzy decision tree (FDT) by extracting from the DUCG a fuzzy rule base that summarizes the symptoms used at the basis of the FDT. A genetic algorithm (GA) is then used to optimize the FDT by performing a wrapper search around the FDT: the set of symptoms selected during the iterative search is taken as the best set of symptoms for the diagnosis of the faults that can occur in the system. The effectiveness of the approach is shown with respect to a DUCG model initially built to diagnose 23 faults, originally using 262 symptoms, of Unit-1 in the Ningde NPP of the China Guangdong Nuclear Power Corporation. The results show that the FDT, with GA-optimized symptoms and diagnosis strategy, can drive the construction of the DUCG and lower the computational burden without loss of accuracy in diagnosis.
Funding: Supported by the Medical and Health Research Program of Zhejiang Province (No. 2015KYB128) and the Zhejiang Provincial Natural Science Foundation (No. LQ15H030004), China.
Abstract: Jaundice is a common and complex clinical symptom potentially occurring in hepatology, general surgery, pediatrics, infectious diseases, gynecology, and obstetrics, and it is fairly difficult to distinguish the cause of jaundice in clinical practice, especially for general practitioners in less developed regions. With collaboration between physicians and artificial intelligence engineers, a comprehensive knowledge base relevant to jaundice was created based on demographic information, symptoms, physical signs, laboratory tests, imaging diagnosis, medical histories, and risk factors. Then a diagnostic modeling and reasoning system using the dynamic uncertain causality graph was proposed. A modularized modeling scheme was presented to reduce the complexity of model construction, providing multiple perspectives and arbitrary granularity for disease causality representations. A "chaining" inference algorithm and weighted logic operation mechanism were employed to guarantee the exactness and efficiency of diagnostic reasoning under situations of incomplete and uncertain information. Moreover, the causal interactions among diseases and symptoms intuitively demonstrated the reasoning process in a graphical manner. Verification was performed using 203 randomly pooled clinical cases, and the accuracy was 99.01% and 84.73% with and without laboratory tests in the model, respectively. The solutions were more explicable and convincing than common methods such as Bayesian networks, further increasing the objectivity of clinical decision-making. The promising results indicated that our model could potentially be used in intelligent diagnosis and help decrease public health expenditure.
Abstract: Aim: To improve the causal diagnosis method presented by Bandekar and propose a new method of ordering the root faults according to fault possibility by means of numerical calculation. Methods: Based on the causal graph, and using a fuzzified threshold value and a fuzzy discrimination matrix, a fuzzy causal diagnosis method was given and the fault possibility of each element in the root fault candidate set (RFCS) was obtained. Results and Conclusion: The order of each element in the RFCS can be obtained from the fault possibility, which makes locating the fault much easier. The diagnosis speed of this method is quite high, and by means of the fuzzified threshold value and fuzzy discrimination matrix, the result is more robust to noise and poor parameter choices.