Wave velocities in haloanhydrites are difficult to determine and depend significantly on mineralogy. We used petrophysical parameters to study wave velocity in haloanhydrites in the Amur Darya Basin and constructed a template relating haloanhydrite mineralogy (anhydrite, salt, mudstone, and pore water) to wave velocities. Using the relation between the P-wave moduli ratio and porosity as a constraint, we constructed a graphical model (petrophysical template) for the relation between wave velocity, mineral content, and porosity. We tested the graphical model using rock core and well logging data.
Using Bayesian networks to model promising solutions from the current population of an evolutionary algorithm can make the search for the optimum efficient and intelligent. However, constructing a Bayesian network that fits a given dataset is an NP-hard problem and consumes substantial computational resources. This paper develops a methodology for constructing a graphical model based on the Bayesian Dirichlet metric. Our approach is derived from a set of propositions and theorems obtained by studying the local metric relationships of networks that match the dataset. The paper presents an algorithm that uses this approach to construct a tree model from a set of potential solutions. The method is important not only for evolutionary algorithms based on graphical models, but also for machine learning and data mining. The experimental results show that the exact theoretical results and the approximations match very well.
In video multi-target tracking, the common particle filter cannot deal well with uncertain relations among multiple targets. To solve this problem, many researchers use data association methods to reduce the multi-target uncertainty. However, traditional data association methods have difficulty tracking accurately when a target is occluded. To handle occlusion in the video, this paper combines the theory of data association with a probabilistic graphical model for multi-target modeling and analysis of the relationships among targets in the particle filter framework. Experimental results show that the proposed algorithm handles the occlusion problem better than the traditional algorithm.
Gaussian graphical models (GGMs) are widely used as intuitive and efficient tools for data analysis in several application domains. To address the reproducibility of structure learning for a GGM, it is essential to control the false discovery rate (FDR) of the estimated edge set of the graph. Hence, in recent years, the problem of GGM estimation with FDR control has received increasing attention. In this paper, we propose a new GGM estimation method based on multiple data splitting. Instead of using node-by-node regressions to estimate each row of the precision matrix, we suggest directly estimating the entire precision matrix using the graphical Lasso within the multiple data splitting, which makes our computation p times faster than the previous approach. We show that the proposed method can asymptotically control the FDR and has significant advantages in computational efficiency. Finally, we demonstrate the usefulness of the proposed method through a real data analysis.
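For reference, the graphical Lasso named in this abstract is usually stated as the penalized likelihood problem below. This is the standard textbook form, not a formula taken from the paper. The symbols are assumptions of this note: $S$ is the sample covariance matrix, $\lambda \ge 0$ is a tuning parameter, and the penalty is placed on the off-diagonal entries, which is one common convention.

```latex
\hat{\Omega} = \arg\min_{\Omega \succ 0}
  \left\{ \operatorname{tr}(S\Omega) - \log\det\Omega
          + \lambda \sum_{j \neq k} |\omega_{jk}| \right\}
```

An edge $(j,k)$ enters the estimated graph when $\hat{\omega}_{jk} \neq 0$. A single fit of this kind estimates all $p$ rows of the precision matrix at once, whereas node-by-node regression needs $p$ separate fits.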
The rapid expansion of offshore wind energy necessitates robust and cost-effective electrical collector system (ECS) designs that prioritize lifetime operational reliability. Traditional optimization approaches often simplify reliability considerations or fail to holistically integrate them with economic and technical constraints. This paper introduces a novel, two-stage optimization framework for offshore wind farm (OWF) ECS planning that systematically incorporates reliability. The first stage employs Mixed-Integer Linear Programming (MILP) to determine an optimal radial network topology, considering linearized reliability approximations and geographical constraints. The second stage enhances this design by strategically placing tie-lines using a Mixed-Integer Quadratically Constrained Program (MIQCP). This stage leverages a dynamic-aware adaptation of Multi-Source Multi-Terminal Network Reliability (MSMT-NR) assessment, with its inherent nonlinear equations successfully transformed into a solvable MIQCP form for loopy networks. A benchmark case study demonstrates the framework’s efficacy, illustrating how increasing the emphasis on reliability leads to more distributed and interconnected network topologies, effectively balancing investment costs against enhanced system resilience.
This paper proposes a simple and discriminative framework that uses a graphical model and 3D geometry to understand the diversity of urban scenes with varying viewpoints. Our algorithm constructs a conditional random field (CRF) network using over-segmented superpixels and learns the appearance model from different sets of features for specific classes of interest. We also introduce a training algorithm to learn a model for the edge potential among these superpixel areas based on their feature differences. The proposed algorithm gives competitive and visually pleasing results for urban scene segmentation. We show that inference from our trained network improves class labeling performance compared with using the appearance model alone.
In this paper, we consider the problem of estimating a high-dimensional precision matrix of a Gaussian graphical model. Taking advantage of the connection between multivariate linear regression and the entries of the precision matrix, we propose a Bayesian Lasso together with a neighborhood regression estimate for the Gaussian graphical model. This method can perform parameter estimation and model selection simultaneously. Moreover, the proposed method can provide symmetric confidence intervals for all entries of the precision matrix.
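The connection between regression and the precision matrix that this abstract relies on is a standard result, sketched here in notation chosen for this note (not taken from the paper): $X \sim \mathcal{N}(0, \Sigma)$ with $\Omega = \Sigma^{-1} = (\omega_{jk})$.

```latex
X_j = \sum_{k \neq j} \beta_{jk} X_k + \varepsilon_j,
\qquad
\beta_{jk} = -\frac{\omega_{jk}}{\omega_{jj}},
\qquad
\operatorname{Var}(\varepsilon_j) = \frac{1}{\omega_{jj}}
```

Hence $\beta_{jk} = 0$ exactly when $\omega_{jk} = 0$, so selecting the nonzero coefficients in the regression of each variable on all the others (neighborhood regression) recovers the edges of the graph.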
In this paper, we combine Leimer's algorithm with the MCS-M algorithm to decompose graphical models into marginal models on prime blocks. Experiments show that our method is easier to implement and faster than Leimer's algorithm.
A new procedure for learning in Gaussian graphical models is proposed under the assumption that samples are possibly dependent. This assumption, which applies in practice in various areas of multivariate analysis ranging from bioinformatics to finance, makes standard Gaussian graphical models (GGMs) unsuitable. We demonstrate that the advantage of modeling dependence among samples is that the true discovery rate and positive predictive value are substantially higher than when standard GGMs are applied and the dependence among samples is ignored. The new method, called matrix-variate Gaussian graphical models (MGGMs), involves simultaneously modeling variable and sample dependencies with the matrix-normal distribution. The computation is carried out using a Markov chain Monte Carlo (MCMC) sampling scheme for graphical model determination and parameter estimation. Simulation studies and two real-world examples in biology and finance further illustrate the benefits of the new models.
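As background, the matrix-normal distribution for an $n \times p$ data matrix $X$ (rows are samples, columns are variables) has the standard density below. The symbols are notation chosen for this note rather than the paper's: $M$ is the mean matrix, $U$ ($n \times n$) is the covariance among samples, and $V$ ($p \times p$) is the covariance among variables.

```latex
p(X \mid M, U, V) =
  \frac{\exp\!\left( -\tfrac{1}{2}\,
        \operatorname{tr}\!\left[ V^{-1} (X - M)^{\top} U^{-1} (X - M) \right] \right)}
       {(2\pi)^{np/2}\, |V|^{n/2}\, |U|^{p/2}},
\qquad
\operatorname{vec}(X) \sim \mathcal{N}\!\left( \operatorname{vec}(M),\; V \otimes U \right)
```

Setting $U = I_n$ recovers the usual case of independent samples. Zeros in $V^{-1}$ and $U^{-1}$ correspond to conditional independence among variables and among samples, respectively.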
The development of complex equipment involves multi-stage business processes, multi-level product architecture, and multi-disciplinary physical processes. The relationship between its system model and the various disciplinary models is extremely complicated. In the modeling and integration process, extensive customized development is needed to realize model integration and interoperability in different business scenarios. Meanwhile, the differences in modeling and interaction between modeling tools make it difficult to support a consistent representation of models in complex scenarios. To improve the efficiency of system modeling and integration in complex business scenarios, a system modeling and integration method was proposed. This method took the SysML language kernel as the core and system model function integration as the main line. Through model view separation, an abstract operation interface, and model view configuration, modeling and integration for multiple users, multiple models, multiple views, and different business logic in complex business scenarios were realized.
The problem of estimating high-dimensional Gaussian graphical models has gained much attention in recent years. Most existing methods can be considered one-step approaches, being either regression-based or likelihood-based. In this paper, we propose a two-step method for estimating the high-dimensional Gaussian graphical model. Specifically, the first step serves as a screening step, in which many entries of the concentration matrix are identified as zeros and thus removed from further consideration. In the second step, we focus on the remaining entries of the concentration matrix and perform selection and estimation for its nonzero entries. Since the dimension of the parameter space is effectively reduced by the screening step, the accuracy of the estimated concentration matrix can potentially be improved. We show that the proposed method enjoys desirable asymptotic properties. Numerical comparisons with several existing methods indicate that the proposed method works well. We also apply the proposed method to a breast cancer microarray data set and obtain some biologically meaningful results.
This paper focuses on the support recovery of the Gaussian graphical model (GGM) with false discovery rate (FDR) control. The graceful symmetrized data aggregation (SDA) technique, which involves sample splitting, data screening, and information pooling, is applied in a node-based way. A matrix of test statistics with a symmetry property is constructed, and a data-driven threshold is chosen to control the FDR for the support recovery of the GGM. The proposed method is shown to control the FDR asymptotically under some mild conditions. Extensive simulation studies and a real-data example demonstrate that it yields better FDR control while offering reasonable power in most cases.
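The idea of a data-driven threshold for symmetric statistics can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the rule used here is the common knockoff-style form, the smallest t > 0 with (1 + #{W <= -t}) / max(#{W >= t}, 1) <= alpha, and the abstract does not say whether the paper includes the +1 offset. The function names and the example statistics are hypothetical.

```python
def symmetric_fdr_threshold(stats, alpha):
    """Smallest t > 0 such that the estimated false discovery proportion
    (1 + #{W <= -t}) / max(#{W >= t}, 1) is at most alpha.

    Assumption: under the null, each statistic is symmetric about zero,
    so the count of large negative values estimates the number of
    false positives among the large positive values.
    """
    candidates = sorted({abs(w) for w in stats if w != 0})
    for t in candidates:
        negatives = sum(1 for w in stats if w <= -t)
        positives = sum(1 for w in stats if w >= t)
        if (1 + negatives) / max(positives, 1) <= alpha:
            return t
    return float("inf")  # no threshold achieves the target level


def select_edges(stats_by_edge, alpha):
    """Return the edges whose statistic is at or above the threshold."""
    t = symmetric_fdr_threshold(list(stats_by_edge.values()), alpha)
    return sorted(edge for edge, w in stats_by_edge.items() if w >= t)


if __name__ == "__main__":
    # Hypothetical edge statistics for a 5-node graph (made-up numbers).
    toy_stats = {
        (0, 1): 5.0, (0, 2): 4.0, (1, 2): 3.5, (0, 3): 3.0, (1, 3): 2.5,
        (2, 3): -0.4, (0, 4): 0.3, (1, 4): -0.2, (2, 4): 0.1, (3, 4): 6.0,
    }
    print(select_edges(toy_stats, alpha=0.2))
```

Large positive statistics are evidence for an edge, while the negative tail serves as a built-in estimate of how many of those are spurious, which is why no p-values are needed.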
Probabilistic graphical models (PGMs) can effectively deal with the problems of energy consumption and occupancy prediction, fault detection and diagnosis, reliability analysis, and optimization in energy systems. Compared with black-box models, PGMs show advantages in model interpretability, scalability, and reliability. They have great potential to realize true artificial intelligence in next-generation energy systems. This paper provides a comprehensive review of the PGM-based approaches published in recent decades. It reveals the advantages, limitations, and potential future research directions of PGM-based approaches for energy systems. Two types of PGMs are summarized in this review: static models (SPGMs) and dynamic models (DPGMs). SPGMs can conduct probabilistic inference based on incomplete, uncertain, or even conflicting information. SPGM-based approaches have been proposed to deal with various management tasks in energy systems. They show outstanding performance in fault detection and diagnosis of energy systems. DPGMs can represent a dynamic and stochastic process by describing how its state changes with time. DPGM-based approaches have high accuracy in predicting the energy consumption, occupancy, and failures of energy systems. For the future, a unified framework is suggested to fuse knowledge-driven and data-driven PGMs to achieve better performance. Universal PGM-based approaches that can be adapted to various energy systems are needed. Hybrid algorithms would outperform basic PGMs by integrating advanced techniques such as deep learning and first-order logic.
Sample compression schemes were first proposed by Littlestone and Warmuth in 1986. The undirected graphical model is a powerful tool for classification in statistical learning. In this paper, we consider labelled compression schemes for concept classes induced by discrete undirected graphical models. For the undirected graph of two vertices with no edge, where one vertex takes two values and the other vertex can take any finite number of values, we propose an algorithm to establish a labelled compression scheme whose size equals the VC dimension of the associated concept class. Further, we extend the result to two other types of undirected graphical models and show the existence of labelled compression schemes whose size equals the VC dimension of the induced concept classes. This work is a step forward in solving the sample compression problem for concept classes induced by a general discrete undirected graphical model.
Graphical models are widely used to describe conditional dependence relationships among interacting random variables. Among the statistical inference problems of a graphical model, one of particular interest is utilizing its interaction structure to reduce model complexity. As an important approach to utilizing structural information, decomposition allows a statistical inference problem to be divided into sub-problems of lower complexity. In this paper, to investigate the decomposition of covariate-dependent graphical models, we propose some useful definitions of decomposition for covariate-dependent graphical models with categorical data in the form of contingency tables. Based on such a decomposition, a covariate-dependent graphical model can be split into sub-models, and the maximum likelihood estimation of the model can be factorized into the maximum likelihood estimations of the sub-models. Moreover, some necessary and sufficient conditions for the proposed definitions of decomposition are studied.
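For orientation, the classical result for graphical models on contingency tables without covariates has the form below; the covariate-dependent definitions studied in the paper are not reproduced here. Notation is chosen for this note: $(A, B, C)$ is a decomposition of the graph, meaning $C$ is complete and separates $A$ from $B$; $n(\cdot)$ denotes marginal counts and $\hat{m}$ denotes fitted expected counts.

```latex
p(x) = \frac{p(x_{A \cup C})\; p(x_{B \cup C})}{p(x_C)},
\qquad
\hat{m}(i) = \frac{\hat{m}_{A \cup C}(i_{A \cup C})\;\hat{m}_{B \cup C}(i_{B \cup C})}{n(i_C)}
```

Here $\hat{m}_{A \cup C}$ and $\hat{m}_{B \cup C}$ are the maximum likelihood fits of the sub-models on the corresponding marginal tables, so the full estimation problem splits into two smaller ones.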
BACKGROUND: Recent research has linked Helicobacter pylori (H. pylori) stomach infection to colonic inflammation, mediated by toxin production, potentially affecting the occurrence of colorectal cancer. AIM: To investigate the risk factors for H. pylori infection after colon polyp surgery and its correlation with pathologic type. METHODS: Eighty patients who underwent colon polypectomy in our hospital between January 2019 and January 2023 were retrospectively selected. They were randomly split into modeling (n=56) and validation (n=24) sets using R. The modeling cohort was divided into an H. pylori-infected group (n=37) and an H. pylori-uninfected group (n=19). Binary logistic regression was used to analyze the factors influencing the occurrence of H. pylori infection after colon polyp surgery. A nomogram prediction model was established and validated. Finally, the correlation between the pathological types of colon polyps and the occurrence of H. pylori infection after colon polyp surgery was analyzed. RESULTS: Univariate analysis showed that age, body mass index (BMI), literacy, alcohol consumption, polyp pathology type, high-risk adenomas, and heavy diet were all factors influencing the development of H. pylori infection after intestinal polypectomy. Multivariate binary logistic regression showed that age, BMI, and polyp pathology type were independent predictors of H. pylori infection after intestinal polypectomy. The area under the receiver operating characteristic curve was 0.969 [95% confidence interval (95%CI): 0.928–1.000] and 0.898 (95%CI: 0.773–1.000) in the modeling and validation sets, respectively. The slope of the calibration curve was close to 1, and the goodness-of-fit test gave P>0.05 in both sets. The decision curve analysis showed a high net benefit in both sets. The correlation analysis between pathological types and the occurrence of H. pylori infection after colon polyp surgery showed that hyperplastic polyps and inflammatory polyps were not significantly correlated with H. pylori infection. In contrast, adenomatous polyps showed a significant positive correlation with the occurrence of H. pylori infection. CONCLUSION: Age, BMI, and polyps of the adenomatous type were independent predictors of H. pylori infection after intestinal polypectomy. Moreover, the nomogram prediction model constructed for H. pylori infection after intestinal polypectomy showed good predictive ability.
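The area under the ROC curve reported in this abstract can be read as the probability that a randomly chosen infected patient receives a higher predicted risk than a randomly chosen uninfected one. The sketch below computes it that way (the Mann-Whitney formulation). The function name and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = event occurred).
    scores: predicted risks, one per label; higher means more likely.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up example: two events, two non-events.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.969 and 0.898 should be read.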
The main aim of this paper is to compare the stability, in terms of systemic risk, of conventional and Islamic banking systems. To this end, we propose correlation network models for stock market returns based on graphical Gaussian distributions, which allow us to capture the contagion effects that spread across countries. We also consider Bayesian graphical models to account for model uncertainty in the measurement of the interconnectedness of financial systems. Our proposed model is applied to the banking sector of the Middle East and North Africa (MENA) region, which is characterized by the presence of both conventional and Islamic banks, for the period from 2007 to the beginning of 2014. Our empirical findings show that there are differences in the systemic risk and stability of the two banking systems during crisis times. In addition, the differences are subject to country-specific effects that are amplified during crisis periods.
Topic models such as Latent Dirichlet Allocation (LDA) have been successfully applied to many text mining tasks for extracting topics embedded in corpora. However, existing topic models generally cannot discover bursty topics, that is, topics that experience a sudden increase during a period of time. In this paper, we propose a new topic model named Burst-LDA, which simultaneously discovers topics and reveals their burstiness by explicitly modeling each topic's burst states with a first-order Markov chain and using the chain to generate the topic proportions of documents in a Logistic Normal fashion. A Gibbs sampling algorithm is developed for posterior inference in the proposed model. Experimental results on a news data set show that our model can efficiently discover bursty topics, outperforming the state-of-the-art method.
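To make the "burst states as a first-order Markov chain" idea concrete, the sketch below simulates a two-state chain (0 = normal, 1 = bursting) for a single topic. It illustrates only the state process; it is not the Burst-LDA model or its Gibbs sampler. The function names and the transition probabilities `p_enter` and `p_exit` are hypothetical.

```python
import random


def simulate_burst_states(n_steps, p_enter, p_exit, seed=0):
    """Simulate a two-state first-order Markov chain of burst states.

    State 0 is 'normal', state 1 is 'bursting'. The next state depends
    only on the current one: a normal topic starts bursting with
    probability p_enter, and a bursting topic calms down with
    probability p_exit. The chain starts in the normal state.
    """
    rng = random.Random(seed)
    state = 0
    path = []
    for _ in range(n_steps):
        path.append(state)
        if state == 0:
            state = 1 if rng.random() < p_enter else 0
        else:
            state = 0 if rng.random() < p_exit else 1
    return path


def stationary_burst_probability(p_enter, p_exit):
    """Long-run fraction of time spent bursting for the chain above."""
    if p_enter + p_exit == 0:
        raise ValueError("chain never changes state; no unique stationary law")
    return p_enter / (p_enter + p_exit)


if __name__ == "__main__":
    print(simulate_burst_states(20, p_enter=0.1, p_exit=0.4, seed=1))
    print(stationary_burst_probability(0.1, 0.4))
```

A small `p_enter` with a moderate `p_exit` yields rare, short bursts, which matches the intuition of a topic that spikes briefly and then returns to its baseline.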
Online automatic fault diagnosis in industrial systems is essential for guaranteeing safe, reliable, and efficient operations. However, difficulties associated with computational overload, ubiquitous uncertainties, and insufficient fault samples hamper the engineering application of intelligent fault diagnosis technology. To address these problems, this paper introduces the dynamic uncertain causality graph method, a new attempt to model the complex behaviors of real-world systems under uncertainty. The visual representation of causality pathways and the self-relied "chaining" inference mechanisms are analyzed. In particular, some solutions are investigated for the diagnostic reasoning algorithm, with the aim of reducing its computational complexity and improving its robustness to potential losses and imprecision in observations. To evaluate the effectiveness and performance of this method, experiments are conducted using both synthetic calculation cases and generator faults of a nuclear power plant. The results demonstrate high diagnostic accuracy and efficiency, suggesting practical significance for large-scale industrial applications.
Magnetic resonance imaging (MRI) is a clinically relevant, real-time imaging modality that is frequently utilized to assess stroke type and severity. However, specific MRI biomarkers that can be used to predict long-term functional recovery remain a critical need. Consequently, the present study sought to examine the prognostic value of commonly utilized MRI parameters for predicting functional outcomes in a porcine model of ischemic stroke. Stroke was induced via permanent middle cerebral artery occlusion. At 24 hours post-stroke, MRI analysis revealed focal ischemic lesions, decreased diffusivity, hemispheric swelling, and white matter degradation. Functional deficits, including behavioral abnormalities in open field and novel object exploration as well as spatiotemporal gait impairments, were observed at 4 weeks post-stroke. Gaussian graphical models identified specific MRI outputs and functional recovery variables, including white matter integrity and gait performance, that exhibited strong conditional dependencies. Canonical correlation analysis revealed a prognostic relationship between lesion volume and white matter integrity on the one hand and novel object exploration and gait performance on the other. Consequently, these analyses may also have the potential to predict patient recovery at chronic time points, as pigs and humans share many anatomical similarities (e.g., white matter composition) that have proven to be critical in ischemic stroke pathophysiology. The study was approved by the University of Georgia (UGA) Institutional Animal Care and Use Committee (IACUC; Protocol Number: A2014-07-021-Y3-A11 and 2018-01-029-Y1-A5) on November 22, 2017.
Funding (haloanhydrite wave-velocity study): Supported by the National Major Scientific and Technological Special Project (No. 2011ZX05029-003) and the project of the Research Institute of Petroleum Exploration & Development (No. 2012Y-058).
Funding (Bayesian Dirichlet metric tree-model study): Supported by the National Natural Science Foundation of China (No. 60574075) and the Natural Science Foundation of Shaanxi Province (No. 2005A07).
Funding (video multi-target tracking study): Supported by the National High Technology Research and Development Program of China (No. 2007AA11Z227), the Natural Science Foundation of Jiangsu Province of China (No. BK2009352), and the Fundamental Research Funds for the Central Universities of China (No. 2010B16414).
Funding (GGM estimation with multiple data splitting): Partially supported by the National Natural Science Foundation of China (Grant No. 12171079) and the National Key R&D Program of China (Grant No. 2020YFA0714102); partially supported by the National Natural Science Foundation of China (Grant No. 12101116) and the National Key Research and Development Program of China (Grant No. 2022YFA1003701).
Funding (offshore wind farm ECS planning study): Supported by the Science and Technology Project of China South Power Grid Co., Ltd. (Grant Nos. 036000KK52222044, GDKJXM20222430).
Funding (urban scene segmentation study): Supported by the National Natural Science Foundation of China (60803103), the Research Fund for the Doctoral Program of Higher Education of China (200800131026), and the Fundamental Research Funds for the Central Universities (2009RC0603, 2009RC0601).
Funding (Bayesian Lasso precision matrix study): Supported by the National Natural Science Foundation of China (No. 11571080).
Funding: Supported by the National Natural Science Foundation of China (Nos. 10871038, 10926186, 11025102, 11071026 and 11101052) and the Jilin Project (No. 20100401).
Abstract: In this paper, we combine Leimer's algorithm with the MCS-M algorithm to decompose graphical models into marginal models on prime blocks. Experiments show that our method is easier to implement and faster than Leimer's algorithm.
Abstract: A new procedure for learning in Gaussian graphical models is proposed under the assumption that samples are possibly dependent. This assumption, which applies in practice across many areas of multivariate analysis ranging from bioinformatics to finance, makes standard Gaussian graphical models (GGMs) unsuitable. We demonstrate that modeling dependence among samples substantially improves the true discovery rate and positive predictive value compared with applying standard GGMs and ignoring the dependence among samples. The new method, called matrix-variate Gaussian graphical models (MGGMs), simultaneously models variable and sample dependencies with the matrix-normal distribution. The computation is carried out using a Markov chain Monte Carlo (MCMC) sampling scheme for graphical model determination and parameter estimation. Simulation studies and two real-world examples in biology and finance further illustrate the benefits of the new models.
Abstract: The development process of complex equipment involves multi-stage business processes, multi-level product architecture, and multi-disciplinary physical processes. The relationship between its system model and the various disciplinary models is extremely complicated. In the modeling and integration process, extensive customized development is needed to achieve model integration and interoperability in different business scenarios. Meanwhile, differences in modeling and interaction between modeling tools make it difficult to support a consistent representation of models in complex scenarios. To improve the efficiency of system modeling and integration in complex business scenarios, a system modeling and integration method is proposed. The method takes the SysML language kernel as its core and system model function integration as its main line. Through model-view separation, an abstract operation interface, and model-view configuration, it supports modeling and integration for multiple users, multiple models, multiple views, and different business logic in complex business scenarios.
Funding: National Natural Science Foundation of China (Grant No. 11671059).
Abstract: The problem of estimating high-dimensional Gaussian graphical models has gained much attention in recent years. Most existing methods can be considered one-step approaches, being either regression-based or likelihood-based. In this paper, we propose a two-step method for estimating the high-dimensional Gaussian graphical model. The first step serves as a screening step, in which many entries of the concentration matrix are identified as zeros and removed from further consideration. In the second step, we focus on the remaining entries of the concentration matrix and perform selection and estimation for its nonzero entries. Since the screening step effectively reduces the dimension of the parameter space, the accuracy of the estimated concentration matrix can potentially be improved. We show that the proposed method enjoys desirable asymptotic properties. Numerical comparisons with several existing methods indicate that the proposed method works well. We also apply the proposed method to a breast cancer microarray data set and obtain some biologically meaningful results.
Funding: Supported partially by the China National Key R&D Program under Grant Nos. 2019YFC1908502, 2022YFA1003703, 2022YFA1003802, and 2022YFA1003803, and the National Natural Science Foundation of China under Grant Nos. 11925106, 12231011, 11931001, and 11971247.
Abstract: This paper focuses on support recovery for the Gaussian graphical model (GGM) with false discovery rate (FDR) control. The symmetrized data aggregation (SDA) technique, which involves sample splitting, data screening, and information pooling, is applied in a node-based way. A matrix of test statistics with a symmetry property is constructed, and a data-driven threshold is chosen to control the FDR for the support recovery of the GGM. The proposed method is shown to control the FDR asymptotically under some mild conditions. Extensive simulation studies and a real-data example demonstrate that it yields better FDR control while offering reasonable power in most cases.
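The abstract does not state how the data-driven threshold is computed. As a rough sketch, a common rule for statistics that are symmetric about zero under the null is to use the count of large negative statistics as an estimate of false positives among large positive ones. The function names `symmetric_fdr_threshold` and `select_edges`, and the example values, are hypothetical and may differ from the rule used in the paper.

```python
def symmetric_fdr_threshold(stats, alpha):
    """Smallest t > 0 with (1 + #{W <= -t}) / max(#{W >= t}, 1) <= alpha.

    Assumes null statistics are symmetric about zero, so the number of
    statistics below -t estimates the number of false positives above t.
    Returns infinity when no candidate threshold meets the target level.
    """
    candidates = sorted({abs(w) for w in stats if w != 0})
    for t in candidates:
        negatives = sum(1 for w in stats if w <= -t)
        positives = sum(1 for w in stats if w >= t)
        if (1 + negatives) / max(positives, 1) <= alpha:
            return t
    return float("inf")


def select_edges(stats, alpha):
    """Indices of statistics at or above the data-driven threshold."""
    t = symmetric_fdr_threshold(stats, alpha)
    return [j for j, w in enumerate(stats) if w >= t]


# Hypothetical test statistics: six strong signals and four near-zero nulls.
example_stats = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0, -0.5, 0.4, -0.3, 0.2]
selected = select_edges(example_stats, alpha=0.2)
```

With these example values the threshold settles at 2.0, so the six large statistics are selected and the four small ones are not. With a stricter level such as 0.1, no threshold qualifies and nothing is selected, which shows how the "+1" in the numerator makes the rule conservative for small problems.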
Funding: Supported by the National Key Research and Development Program of China (No. 2018YFE0116300) and the National Natural Science Foundation of China (No. 51978601).
Abstract: Probabilistic graphical models (PGMs) can effectively address energy consumption and occupancy prediction, fault detection and diagnosis, reliability analysis, and optimization in energy systems. Compared with black-box models, PGMs offer advantages in interpretability, scalability, and reliability, and they have great potential to enable artificial intelligence in next-generation energy systems. This paper provides a comprehensive review of the PGM-based approaches published in the last decades and discusses their advantages, limitations, and potential future research directions for energy systems. Two types of PGMs are summarized in this review: static models (SPGMs) and dynamic models (DPGMs). SPGMs can conduct probabilistic inference based on incomplete, uncertain, or even conflicting information. SPGM-based approaches have been proposed for various management tasks in energy systems and show outstanding performance in fault detection and diagnosis. DPGMs can represent a dynamic and stochastic process by describing how its state changes with time. DPGM-based approaches achieve high accuracy in predicting the energy consumption, occupancy, and failures of energy systems. For future work, a unified framework is suggested to fuse knowledge-driven and data-driven PGMs for better performance. Universal PGM-based approaches that can be adapted to various energy systems are needed. Hybrid algorithms that integrate advanced techniques such as deep learning and first-order logic are expected to outperform the basic PGMs.
Funding: This work was supported by the National Natural Science Foundation of China (Research Plan in Shaanxi Province [Grant Number 12171382]), the Natural Science Basic Research Plan in Shaanxi Province [Grant Number 2020JM-188], and the Fundamental Research Funds for the Central Universities [Grant Number QTZX23002].
Abstract: Sample compression schemes were first proposed by Littlestone and Warmuth in 1986. The undirected graphical model is a powerful tool for classification in statistical learning. In this paper, we consider labelled compression schemes for concept classes induced by discrete undirected graphical models. For the undirected graph of two vertices with no edge, where one vertex takes two values and the other vertex can take any finite number of values, we propose an algorithm that establishes a labelled compression scheme whose size equals the VC dimension of the associated concept class. Further, we extend the result to two other types of undirected graphical models and show the existence of labelled compression schemes whose size equals the VC dimension of the induced concept classes. This work is a step toward solving the sample compression problem for concept classes induced by a general discrete undirected graphical model.
Funding: Supported by the National Key R&D Program of China (Grant 2020YFA0714102) and the National Natural Science Foundation of China (Grant 12171079).
Abstract: Graphical models are widely used to describe conditional dependence relationships among interacting random variables. Among the statistical inference problems for a graphical model, one of particular interest is using its interaction structure to reduce model complexity. As an important way of exploiting structural information, decomposition allows a statistical inference problem to be divided into sub-problems of lower complexity. In this paper, to investigate the decomposition of covariate-dependent graphical models, we propose definitions of decomposition for covariate-dependent graphical models with categorical data in the form of contingency tables. Based on such a decomposition, a covariate-dependent graphical model can be split into sub-models, and the maximum likelihood estimation of the model can be factorized into the maximum likelihood estimations of the sub-models. Moreover, some necessary and sufficient conditions for the proposed definitions of decomposition are studied.
Abstract: BACKGROUND: Recent research has linked Helicobacter pylori (H. pylori) stomach infection to colonic inflammation, mediated by toxin production, potentially affecting colorectal cancer occurrence. AIM: To investigate the risk factors for H. pylori infection after colon polyp surgery and its correlation with pathologic type. METHODS: Eighty patients who underwent colon polypectomy in our hospital between January 2019 and January 2023 were retrospectively selected. They were randomly split into modeling (n=56) and model validation (n=24) sets using R. The modeling cohort was divided into an H. pylori-infected group (n=37) and an H. pylori-uninfected group (n=19). Binary logistic regression analysis was used to analyze the factors influencing the occurrence of H. pylori infection after colon polyp surgery. A nomogram prediction model was established and validated. Finally, the correlation between the different pathological types of colon polyps and the occurrence of H. pylori infection after colon polyp surgery was analyzed. RESULTS: Univariate results showed that age, body mass index (BMI), literacy, alcohol consumption, polyp pathology type, high-risk adenomas, and heavy diet were all influential factors in the development of H. pylori infection after intestinal polypectomy. Binary multifactorial logistic regression analysis showed that age, BMI, and polyp pathology type were independent predictors of H. pylori infection after intestinal polypectomy. The area under the receiver operating characteristic curve was 0.969 [95% confidence interval (95% CI): 0.928–1.000] in the modeling set and 0.898 (95% CI: 0.773–1.000) in the validation set. The slope of the calibration curve was close to 1, and the goodness-of-fit test gave P > 0.05 in both sets. Decision curve analysis showed a high net benefit in both sets. The correlation analysis between pathological types and H. pylori infection after colon polyp surgery showed that hyperplastic polyps and inflammatory polyps were not significantly correlated with the occurrence of H. pylori infection. In contrast, adenomatous polyps showed a significant positive correlation with the occurrence of H. pylori infection. CONCLUSION: Age, BMI, and adenomatous polyp type were independent predictors of H. pylori infection after intestinal polypectomy. Moreover, the nomogram prediction model constructed for H. pylori infection after intestinal polypectomy showed good predictive ability.
Abstract: The main aim of this paper is to compare the stability, in terms of systemic risk, of conventional and Islamic banking systems. To this end, we propose correlation network models for stock market returns based on graphical Gaussian distributions, which allow us to capture the contagion effects that spread across countries. We also consider Bayesian graphical models to account for model uncertainty in measuring the interconnectedness of financial systems. Our proposed model is applied to the banking sector of the Middle East and North Africa (MENA) region, which is characterized by the presence of both conventional and Islamic banks, for the period from 2007 to the beginning of 2014. Our empirical findings show that the two banking systems differ in systemic risk and stability during crisis times. In addition, the differences are subject to country-specific effects that are amplified during crisis periods.
Funding: Supported by the National High Technology Research and Development Program of China (No. 2012AA011005).
Abstract: Topic models such as Latent Dirichlet Allocation (LDA) have been successfully applied to many text mining tasks for extracting topics embedded in corpora. However, existing topic models generally cannot discover bursty topics, which experience a sudden increase during a period of time. In this paper, we propose a new topic model named Burst-LDA, which simultaneously discovers topics and reveals their burstiness by explicitly modeling each topic's burst states with a first-order Markov chain and using the chain to generate the topic proportions of documents in a logistic-normal fashion. A Gibbs sampling algorithm is developed for posterior inference in the proposed model. Experimental results on a news data set show that our model can efficiently discover bursty topics, outperforming the state-of-the-art method.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61050005 and 61273330), the Research Foundation for the Doctoral Program of China Ministry of Education (No. 20120002110037), the 2014 Teaching Reform Project of Shandong Normal University, and the Development Project of China Guangdong Nuclear Power Group (No. CNPRI-ST10P005).
Abstract: Online automatic fault diagnosis in industrial systems is essential for guaranteeing safe, reliable, and efficient operations. However, difficulties associated with computational overload, ubiquitous uncertainties, and insufficient fault samples hamper the engineering application of intelligent fault diagnosis technology. To address these problems, this paper introduces the dynamic uncertain causality graph, a new attempt to model the complex behaviors of real-world systems under uncertainty. The visual representation of causality pathways and the self-relied "chaining" inference mechanisms are analyzed. In particular, solutions are investigated for reducing the computational complexity of the diagnostic reasoning algorithm and improving its robustness to potential losses and imprecision in observations. To evaluate the effectiveness and performance of this method, experiments are conducted using both synthetic calculation cases and generator faults of a nuclear power plant. The results demonstrate high diagnostic accuracy and efficiency, suggesting practical significance for large-scale industrial applications.
Funding: This work was supported by the National Institutes of Health, National Institute of Neurological Disorders and Stroke grant R01NS093314, as well as Small Business Innovation Research grant 1R43NS103596-01.
Abstract: Magnetic resonance imaging (MRI) is a clinically relevant, real-time imaging modality that is frequently used to assess stroke type and severity. However, specific MRI biomarkers that can be used to predict long-term functional recovery are still critically needed. The present study therefore examined the prognostic value of commonly used MRI parameters for predicting functional outcomes in a porcine model of ischemic stroke. Stroke was induced via permanent middle cerebral artery occlusion. At 24 hours post-stroke, MRI analysis revealed focal ischemic lesions, decreased diffusivity, hemispheric swelling, and white matter degradation. Functional deficits, including behavioral abnormalities in open field and novel object exploration as well as spatiotemporal gait impairments, were observed at 4 weeks post-stroke. Gaussian graphical models identified specific MRI outputs and functional recovery variables, including white matter integrity and gait performance, that exhibited strong conditional dependencies. Canonical correlation analysis revealed a prognostic relationship between lesion volume and white matter integrity on the one hand and novel object exploration and gait performance on the other. Consequently, these analyses may also have the potential to predict patient recovery at chronic time points, as pigs and humans share many anatomical similarities (e.g., white matter composition) that have proven to be critical in ischemic stroke pathophysiology. The study was approved by the University of Georgia (UGA) Institutional Animal Care and Use Committee (IACUC; Protocol Number: A2014-07-021-Y3-A11 and 2018-01-029-Y1-A5) on November 22, 2017.
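In a Gaussian graphical model, the "conditional dependencies" referred to in this abstract are usually read off as partial correlations, which come from the inverse covariance (precision) matrix. The sketch below illustrates that general relationship only. The helper names `invert` and `partial_correlations` and the three-variable example covariance are hypothetical, and the study's own estimation procedure is not described in the abstract.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] is reduced to [I | A^-1].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(cov):
    """Partial correlation of each pair given all other variables.

    Uses rho_ij = -omega_ij / sqrt(omega_ii * omega_jj), where omega is the
    precision matrix; a zero entry means no edge between i and j in the graph.
    """
    prec = invert(cov)
    n = len(prec)
    return [[1.0 if i == j else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
             for j in range(n)] for i in range(n)]


# Hypothetical chain X1 - X2 - X3: X1 and X3 are marginally correlated (0.25)
# but conditionally independent given X2.
example_cov = [[1.0, 0.5, 0.25],
               [0.5, 1.0, 0.5],
               [0.25, 0.5, 1.0]]
pc = partial_correlations(example_cov)
```

In this example the first and third variables have a nonzero marginal correlation but a partial correlation of zero, so the graph has no edge between them. This is the distinction a Gaussian graphical model draws between direct and indirect association.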