4 articles found
Reproducible Learning of Gaussian Graphical Models via Graphical Lasso Multiple Data Splitting
1
Authors: Kang Hu, Danning Li, Binghui Liu. Acta Mathematica Sinica, English Series, 2025, Issue 2, pp. 553-568 (16 pages)
Gaussian graphical models (GGMs) are widely used as intuitive and efficient tools for data analysis in several application domains. To address the reproducibility of structure learning for a GGM, it is essential to control the false discovery rate (FDR) of the estimated edge set of the graph. Hence, in recent years, GGM estimation with FDR control has received increasing attention. In this paper, we propose a new GGM estimation method based on multiple data splitting. Instead of using node-by-node regressions to estimate each row of the precision matrix, we suggest directly estimating the entire precision matrix using the graphical Lasso within the multiple data splitting, which makes the computation p times faster than the previous approach. We show that the proposed method asymptotically controls the FDR and has significant advantages in computational efficiency. Finally, we demonstrate the usefulness of the proposed method through a real data analysis.
Keywords: false discovery rate; Gaussian graphical model; multiple data splitting; graphical Lasso
Predictors of the Aggregate of COVID-19 Cases and Its Case-Fatality: A Global Investigation Involving 120 Countries (Cited by: 1)
2
Authors: Sarah Al-Gahtani, Mohamed Shoukri, Maha Al-Eid. Open Journal of Statistics, 2021, Issue 2, pp. 259-277 (19 pages)
Objective: Since the identification of COVID-19 in December 2019 as a pandemic, over 4500 research papers have been published with the term "COVID-19" in their titles. Many of these reports on the COVID-19 pandemic suggested that the coronavirus was associated with more serious chronic diseases and mortality, particularly in patients with chronic diseases, regardless of country and age. Therefore, there is a need to understand how common comorbidities and other factors are associated with the risk of death due to COVID-19 infection. Specifically, our analysis aimed to explore the relationship between the total number of COVID-19 cases and mortality associated with COVID-19 infection, accounting for other risk factors. Methods: Due to the presence of overdispersion, Negative Binomial Regression is used to model the aggregate number of COVID-19 cases. Case fatality associated with this infection is modeled as an outcome variable using machine learning predictive multivariable regression. The data we used are the COVID-19 cases and associated deaths from the start of the pandemic up to December 2, 2020, the day Pfizer was granted approval for their new COVID-19 vaccine. Results: Our analysis found significant regional variation in case fatality. Moreover, the aggregate number of cases had several risk factors, including chronic kidney disease, population density and the percentage of gross domestic product spent on healthcare. Conclusions: There are important regional variations in COVID-19 case fatality. We identified three factors to be significantly correlated with case fatality.
Keywords: Intraclass Correlation Coefficient; Hierarchical Data Structure; Negative Binomial Regression; Data Splitting; Mixed Effects Linear Regression Model
Resampling in neural networks with application to spatial analysis
3
Authors: Bruno Póvoa Rodrigues, Vinicius Francisco Rofatto, Marcelo Tomio Matsuoka, Talita Teles Assunção. Geo-Spatial Information Science (SCIE, EI, CSCD), 2022, Issue 3, pp. 413-424 (12 pages)
In developing Artificial Neural Networks (ANNs), the available dataset is split into three categories: training, validation and testing. However, an important problem arises: how can one trust the prediction provided by a particular ANN? Due to the randomness related to the network itself (architecture, initialization and learning procedure), there is usually no best choice. Considering this issue, we provide a framework that captures the randomness related to the network itself. The idea is to perform several training and test trials based on the Jackknife resampling method. Jackknife consists of iteratively deleting a single observation from the sample and recomputing the ANN on the rest of the sample data. Consequently, interval prediction is available instead of point prediction. The proposed method was applied and tested using pH, Ca and P data obtained by analyzing 118 georeferenced soil points. The results, based on the dataset size simulation, showed that a 60% reduction in the available dataset offers accuracy comparable to that of the full dataset, and therefore a higher cost of sampling in the field would not be necessary. The resampling method spatially characterizes the points of greater or lesser accuracy and uncertainty. It also increased the success rate by using interval prediction instead of using the mean as the most probable value. Although we restrict it to the regression neural network model, the proposed resampling method can also be extended to other modern statistical tools, such as Kriging, Least Squares Collocation (LSC), Convolutional Neural Network (CNN), and so on.
Keywords: Artificial Neural Network (ANN); data splitting; resampling; delete-1 Jackknife; spatial analysis
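The delete-1 Jackknife described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: a simple least-squares line stands in for the ANN, the toy data are hypothetical, and the normal-approximation interval (with `z = 1.96`) is an assumption made here for illustration. The function names `fit_line`, `jackknife_predictions` and `jackknife_interval` are likewise invented for this example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit y = a + b*x (an assumed stand-in for training an ANN)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return a, b


def jackknife_predictions(xs, ys, x_new):
    """Delete each observation once, refit, and predict at x_new each time."""
    preds = []
    for i in range(len(xs)):
        xs_i = xs[:i] + xs[i + 1:]  # sample with observation i removed
        ys_i = ys[:i] + ys[i + 1:]
        a, b = fit_line(xs_i, ys_i)
        preds.append(a + b * x_new)
    return preds


def jackknife_interval(preds, z=1.96):
    """Interval from the jackknife standard error (normal approximation assumed)."""
    n = len(preds)
    center = mean(preds)
    se = ((n - 1) / n * sum((p - center) ** 2 for p in preds)) ** 0.5
    return center - z * se, center + z * se


# Hypothetical toy data, not taken from the paper.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
preds = jackknife_predictions(xs, ys, x_new=7.0)
low, high = jackknife_interval(preds)
print(f"interval prediction at x=7: [{low:.3f}, {high:.3f}]")
```

Each of the n refits yields one prediction, so the spread of those predictions gives an interval rather than a single point value, which is the idea the abstract describes.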
Conditional dependence learning with high-dimensional conditioning variables
4
Authors: Jianxin Bi, Xingdong Feng, Jingyuan Liu. Science China Mathematics, 2025, Issue 8, pp. 1779-1806 (28 pages)
Conditional dependence plays a crucial role in various statistical procedures, including variable selection, network analysis and causal inference. However, there remains a paucity of relevant research in the context of high-dimensional conditioning variables, a common challenge in the era of big data. To address this issue, many existing studies impose certain model structures, yet high-dimensional conditioning variables often introduce spurious correlations in these models. In this paper, we systematically study the estimation biases inherent in widely used measures of conditional dependence when spurious variables are present in high-dimensional settings. We discuss the estimation inconsistency both intuitively and theoretically, demonstrating that the conditional dependencies can be either overestimated or underestimated under different scenarios. To mitigate these biases and attain consistency, we introduce a measure based on data splitting and refitting techniques for high-dimensional conditional dependence. A conditional independence test is also developed using the newly advocated measure, with a tuning-free asymptotic null distribution. Furthermore, the proposed test is applied to generating high-dimensional network graphs in graphical modeling. The superior performance of the newly proposed methods is illustrated both theoretically and through simulation studies. We also use the method to construct gene-gene networks from a dataset of breast invasive carcinoma, which yields interesting discoveries that are worth further scientific exploration.
Keywords: conditional dependence; high-dimensional data; refitted cross-validation; data splitting; graphical modeling