To address chronic heart failure risk prediction in the elderly, a Bayesian logistic regression modeling framework is proposed, aimed at the poor generalization that overfitting causes when traditional logistic regression is fitted to small samples. Using 16 multi-dimensional clinical indicators (age, gender, BMI, alcohol history, etc.) from 20 elderly patients with chronic heart failure, the initial feature set was screened for multicollinearity with the variance inflation factor (VIF) test, and the high-collinearity variables with VIF ≥ 10 (such as fall risk and frailty assessment) were retained, so as to reduce the interference of redundant information with model stability. Subsequently, the entropy weight method was used to weight the screened variables: the information contribution of each indicator was quantified by its information entropy, and standardized weighted data were generated to optimize the allocation of feature importance and alleviate residual collinearity. Finally, based on the weighted data, Spearman correlation analysis was used to quantify the strength of association between each variable and the heart failure classification, identifying balance and gait ability (correlation coefficient 0.52) and physical function status as the core predictors. The results show that although the traditional logistic model achieves 100% accuracy on the training set, its parameter estimates are markedly abnormal because the Hessian matrix is singular, indicating a serious risk of overfitting. To address this, a Bayesian framework was introduced in which the regression coefficients are constrained by a normal prior with mean 0 and standard deviation 10, and the posterior distribution of the parameters is obtained by Markov chain Monte Carlo (MCMC) sampling, which balances model complexity against the likelihood of the data. Experimentally, Bayesian logistic regression reaches a classification accuracy of 85% on the independent test set. The confusion matrix shows that misclassification is confined to categories with overlapping features (one case in category 2 is assigned to category 1), the F1 scores are significantly improved (category 1: 0.86, category 2: 0.80, category 3: 1.00), and the Hessian singularity is avoided. This study confirms that Bayesian logistic regression, through probabilistic regularization and uncertainty quantification, provides a highly robust solution for modeling chronic heart failure in small elderly populations.
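The entropy weighting step described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so this assumes the commonly used formulation (min-max scaling, column-wise proportions, normalized Shannon entropy, weights proportional to one minus entropy). The function name `entropy_weights` and the toy data are hypothetical and are not taken from the study.

```python
import math


def entropy_weights(rows):
    """Entropy-method weights for the columns of `rows`.

    `rows` is a list of equal-length lists (samples x indicators).
    Returns (weights, weighted_rows). Assumed formulation, not the
    study's own code.
    """
    n, m = len(rows), len(rows[0])
    scaled_cols, divergences = [], []
    for j in range(m):
        col = [row[j] for row in rows]
        lo, span = min(col), max(col) - min(col)
        # Min-max scale to [0, 1]; a constant column carries no information.
        scaled = [(v - lo) / span if span > 0 else 0.0 for v in col]
        scaled_cols.append(scaled)
        total = sum(scaled)
        if total == 0:
            divergences.append(0.0)
            continue
        probs = [v / total for v in scaled]
        # Shannon entropy normalized by ln(n), with 0 * ln(0) taken as 0.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    d_sum = sum(divergences)
    if d_sum == 0:
        raise ValueError("all indicators are constant; weights are undefined")
    weights = [d / d_sum for d in divergences]
    weighted = [[scaled_cols[j][i] * weights[j] for j in range(m)]
                for i in range(n)]
    return weights, weighted


# Hypothetical toy data: 4 samples, 3 indicators (the middle one is constant).
toy = [[1, 10, 5], [2, 10, 5], [3, 10, 9], [4, 10, 5]]
weights, weighted = entropy_weights(toy)
```

In this formulation an indicator whose values are concentrated in a few samples has low entropy and therefore receives a larger weight, while a constant indicator receives weight zero.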
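The Bayesian step can also be sketched in a few lines. The abstract names the prior (normal, mean 0, standard deviation 10) and MCMC, but not the sampler or software, so this sketch assumes a plain random-walk Metropolis sampler and, for brevity, a binary outcome rather than the three categories in the study. The names `log_posterior` and `metropolis`, the step size, and the toy data are all hypothetical.

```python
import math
import random


def log_posterior(beta, X, y, prior_sd=10.0):
    """Unnormalized log posterior of binary logistic regression
    with independent N(0, prior_sd^2) priors on the coefficients."""
    log_prior = -sum(b * b for b in beta) / (2.0 * prior_sd ** 2)
    log_lik = 0.0
    for xi, yi in zip(X, y):
        z = sum(b * v for b, v in zip(beta, xi))
        # log p(y | z) = y*z - log(1 + exp(z)), in a numerically stable form.
        log_lik += yi * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return log_prior + log_lik


def metropolis(X, y, n_iter=5000, burn_in=1000, step=0.5, seed=0):
    """Random-walk Metropolis sampler (an assumed choice of MCMC method)."""
    rng = random.Random(seed)
    beta = [0.0] * len(X[0])
    current = log_posterior(beta, X, y)
    samples = []
    for it in range(n_iter):
        proposal = [b + rng.gauss(0.0, step) for b in beta]
        candidate = log_posterior(proposal, X, y)
        # Accept with probability min(1, posterior ratio);
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            beta, current = proposal, candidate
        if it >= burn_in:
            samples.append(list(beta))
    return samples


# Hypothetical, perfectly separable toy data: [intercept, feature].
X_toy = [[1, -2.0], [1, -1.0], [1, -0.5], [1, 0.5], [1, 1.0], [1, 2.0]]
y_toy = [0, 0, 0, 1, 1, 1]
samples = metropolis(X_toy, y_toy)
posterior_mean = [sum(s[j] for s in samples) / len(samples) for j in range(2)]
```

On perfectly separable data like this toy set, the maximum likelihood estimate does not exist because the likelihood keeps increasing as the slope grows. The normal prior keeps the posterior proper, which is the regularizing effect the abstract attributes to the Bayesian framework.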