Subgroup Analysis of a Single-Index Threshold Penalty Quantile Regression Model Based on Variable Selection
1
Authors: QI Hui, XUE Yaxin. Wuhan University Journal of Natural Sciences, 2025, No. 2, pp. 169-183 (15 pages)
In clinical research, subgroup analysis can help identify patient groups that respond better or worse to specific treatments, improve therapeutic effect and safety, and is of great significance in precision medicine. This article considers subgroup analysis methods for longitudinal data containing multiple covariates and biomarkers. We divide subgroups based on whether a linear combination of these biomarkers exceeds a predetermined threshold, and assess the heterogeneity of treatment effects across subgroups using the interaction between subgroups and exposure variables. Quantile regression is used to better characterize the global distribution of the response variable, and sparsity penalties are imposed to achieve variable selection of covariates and biomarkers. The effectiveness of our proposed methodology for both variable selection and parameter estimation is verified through random simulations. Finally, we demonstrate the application of this method by analyzing data from the PA.3 trial, further illustrating the practicality of the method proposed in this paper.
Keywords: longitudinal data; subgroup analysis; threshold model; quantile regression; variable selection
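The quantile regression mentioned in the abstract above rests on the check (pinball) loss, rho_tau(u) = u * (tau - 1[u < 0]). The sketch below is illustrative only and is not the authors' penalized single-index threshold estimator: it shows that minimizing the average check loss over a constant recovers a sample quantile. The function names and the toy data are assumptions made for this example.

```python
def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def mean_check_loss(y, q, tau):
    """Average check loss when the constant q is predicted for every response."""
    return sum(check_loss(v - q, tau) for v in y) / len(y)


def best_constant(y, tau):
    """Constant, searched among the observed values, that minimizes the mean check loss."""
    return min(sorted(set(y)), key=lambda q: mean_check_loss(y, q, tau))


# Toy responses (hypothetical data, for illustration only).
y = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(y, tau=0.5)  # the median, unaffected by the outlier
upper_fit = best_constant(y, tau=0.9)   # an upper quantile, pulled to the outlier
```

With these toy values the tau = 0.5 fit is 3.0 and the tau = 0.9 fit is 100.0, which is why modelling several quantiles describes the response distribution more fully than a mean regression would.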
Temperature error compensation method for fiber optic gyroscope based on a composite model of k-means, support vector regression and particle swarm optimization
2
Authors: CAO Yin, LI Lijing, LIANG Sheng. Journal of Systems Engineering and Electronics, 2025, No. 2, pp. 510-522 (13 pages)
As the core component of inertial navigation systems, fiber optic gyroscopes (FOGs), with technical advantages such as low power consumption, long lifespan, fast startup speed, and flexible structural design, are widely used in aerospace, unmanned driving, and other fields. However, due to the temperature sensitivity of optical devices, the influence of environmental temperature causes errors in FOGs, thereby greatly limiting their output accuracy. This work investigates machine-learning-based temperature error compensation techniques for FOGs. Specifically, it focuses on compensating for the bias errors generated in the fiber ring due to the Shupe effect. This work proposes a composite model based on k-means clustering, support vector regression, and particle swarm optimization algorithms. It significantly reduces redundancy within the samples by adopting the interval sequence sample. Moreover, metrics such as root mean square error (RMSE), mean absolute error (MAE), bias stability, and Allan variance are selected to evaluate the model’s performance and compensation effectiveness. This work effectively enhances the consistency between data and models across different temperature ranges and temperature gradients, improving the bias stability of the FOG from 0.022 °/h to 0.006 °/h. Compared to the existing methods utilizing a single machine learning model, the proposed method increases the bias stability of the compensated FOG from 57.11% to 71.98%, and enhances the suppression of rate ramp noise coefficient from 2.29% to 14.83%. This work improves the accuracy of FOG after compensation, providing theoretical guidance and technical references for sensor error compensation work in other fields.
Keywords: fiber optic gyroscope (FOG); temperature error compensation; composite model; machine learning; clustering; regression
Assessing Ecological Impacts of Urban Land Valuation: AI and Regression Models for Sustainable Land Management
3
Authors: Yana Volkova, Elena Bykowa, Oksana Pirogova, Sergey Barykin, Dmitriy Rodionov, Ilya Sonts, Angela Mottaeva, Alexey Mikhaylov, Dmitry Morkovkin, N.B.A. Yousif, Tomonobu Senjyu, Farooq Ahmed Shah. Research in Ecology, 2025, No. 2, pp. 192-208 (17 pages)
The results of mass appraisal in many countries are used as a basis for calculating the amount of real estate tax; therefore, regardless of the methods used to calculate it, the resulting value should be as close as possible to the market value of the real estate to maintain a balance of interests between the state and the rights holders. In practice, this condition is not always met, since, firstly, the quality of market data is often very low, and secondly, some markets are characterized by low activity, which is expressed in a deficit of information on asking prices. The aim of the work is ecological valuation of land use: how regression-based mass appraisal can inform ecological conservation, land degradation, and sustainable land management. Four multiple regression models were constructed for an AI-generated map of land plots for recreational use in St. Petersburg (Russia) with different volumes of market information (32, 30, 20 and 15 units of market information with four price-forming factors). During the analysis of the quality of the models, it was revealed that the best result is shown by the model built on the maximum sample size, then the model based on 15 analogs, which proves that a larger number of analog objects does not always allow us to achieve better results, since the more analog objects there are.
Keywords: Land Use Sustainability; Ecological Valuation; Regression Modeling; AI in Ecology; Landscape Conservation
Stability analysis of distributed Kalman filtering algorithm for stochastic regression model
4
Authors: Siyu Xie, Die Gan, Zhixin Liu. Control Theory and Technology, 2025, No. 2, pp. 161-175 (15 pages)
The work proposes a distributed Kalman filtering (KF) algorithm to track a time-varying unknown signal process for a stochastic regression model over network systems in a cooperative way. We provide the stability analysis of the proposed distributed KF algorithm without independent and stationary signal assumptions, which implies that the theoretical results can be applied to stochastic feedback systems. Note that the main difficulty of stability analysis lies in analyzing the properties of the product of non-independent and non-stationary random matrices involved in the error equation. We employ analysis techniques such as stochastic Lyapunov functions, stability theory of stochastic systems, and algebraic graph theory to deal with the above issue. The stochastic spatio-temporal cooperative information condition shows the cooperative property of multiple sensors: even though any local sensor cannot track the time-varying unknown signal, the distributed KF algorithm can be utilized to finish the filtering task in a cooperative way. Finally, we illustrate the property of the proposed distributed KF algorithm by a simulation example.
Keywords: Distributed Kalman filtering algorithm; Stochastic cooperative information condition; Sensor networks; L_p-exponential stability; Stochastic regression model
A comparison of model choice strategies for logistic regression
5
Author: Markku Karhunen. Journal of Data and Information Science (CSCD), 2024, No. 1, pp. 37-52 (16 pages)
Purpose: The purpose of this study is to develop and compare model choice strategies in the context of logistic regression. Model choice means the choice of the covariates to be included in the model. Design/methodology/approach: The study is based on Monte Carlo simulations. The methods are compared in terms of three measures of accuracy: specificity and two kinds of sensitivity. A loss function combining sensitivity and specificity is introduced and used for a final comparison. Findings: The choice of method depends on how much the users emphasize sensitivity against specificity. It also depends on the sample size. For a typical logistic regression setting with a moderate sample size and a small to moderate effect size, either BIC, BICc or Lasso seems to be optimal. Research limitations: Numerical simulations cannot cover the whole range of data-generating processes occurring with real-world data. Thus, more simulations are needed. Practical implications: Researchers can refer to these results if they believe that their data-generating process is somewhat similar to some of the scenarios presented in this paper. Alternatively, they could run their own simulations and calculate the loss function. Originality/value: This is a systematic comparison of model choice algorithms and heuristics in the context of logistic regression. The distinction between two types of sensitivity and a comparison based on a loss function are methodological novelties.
Keywords: Model choice; Logistic regression; Logit regression; Monte Carlo simulations; Sensitivity; Specificity
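One of the criteria named in the abstract above, BIC, can be written as BIC = k ln(n) - 2 ln(L), where k is the number of parameters, n the number of observations, and L the maximized likelihood. The sketch below is an illustration under stated assumptions, not the paper's simulation code: the outcomes and the "fitted" probabilities of the larger model are hypothetical values chosen by hand rather than the output of an actual fit, and the function names are invented for this example.

```python
import math


def bernoulli_log_likelihood(y, p):
    """Log-likelihood of binary outcomes y under predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_lik


# Hypothetical binary outcomes.
y = [1, 0, 1, 1, 0, 0, 1, 0]

# Candidate A: intercept-only model (1 parameter), predicts the base rate 0.5.
p_a = [0.5] * len(y)
# Candidate B: intercept plus two covariates (3 parameters).
# These probabilities are made up for illustration, not estimated from data.
p_b = [0.8, 0.3, 0.7, 0.6, 0.4, 0.2, 0.7, 0.3]

bic_a = bic(bernoulli_log_likelihood(y, p_a), n_params=1, n_obs=len(y))
bic_b = bic(bernoulli_log_likelihood(y, p_b), n_params=3, n_obs=len(y))
preferred = "B" if bic_b < bic_a else "A"
```

The ln(n) penalty is what makes BIC more conservative than AIC as the sample grows, which is consistent with the abstract's remark that the preferred strategy depends on the sample size.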
Optimization of Artificial Viscosity in Production Codes Based on Gaussian Regression Surrogate Models
6
Authors: Vitaliy Gyrya, Evan Lieberman, Mark Kenamond, Mikhail Shashkov. Communications on Applied Mathematics and Computation (EI), 2024, No. 3, pp. 1521-1550 (30 pages)
To accurately model flows with shock waves using staggered-grid Lagrangian hydrodynamics, the artificial viscosity has to be introduced to convert kinetic energy into internal energy, thereby increasing the entropy across shocks. Determining the appropriate strength of the artificial viscosity is an art and strongly depends on the particular problem and experience of the researcher. The objective of this study is to pose the problem of finding the appropriate strength of the artificial viscosity as an optimization problem and solve this problem using machine learning (ML) tools, specifically using surrogate models based on Gaussian Process regression (GPR) and Bayesian analysis. We describe the optimization method and discuss various practical details of its implementation. The shock-containing problems for which we apply this method all have been implemented in the LANL code FLAG (Burton in Connectivity structures and differencing techniques for staggered-grid free-Lagrange hydrodynamics, Tech. Rep. UCRL-JC-110555, Lawrence Livermore National Laboratory, Livermore, CA, 1992, 1992, in Consistent finite-volume discretization of hydrodynamic conservation laws for unstructured grids, Tech. Rep. CRL-JC-118788, Lawrence Livermore National Laboratory, Livermore, CA, 1992, 1994, Multidimensional discretization of conservation laws for unstructured polyhedral grids, Tech. Rep. UCRL-JC-118306, Lawrence Livermore National Laboratory, Livermore, CA, 1992, 1994, in FLAG, a multi-dimensional, multiple mesh, adaptive free-Lagrange, hydrodynamics code. In: NECDC, 1992). First, we apply ML to find optimal values to isolated shock problems of different strengths.
Second, we apply ML to optimize the viscosity for a one-dimensional (1D) propagating detonation problem based on Zel’dovich-von Neumann-Doring (ZND) (Fickett and Davis in Detonation: theory and experiment. Dover books on physics. Dover Publications, Mineola, 2000) detonation theory using a reactive burn model. We compare results for default (currently used values in FLAG) and optimized values of the artificial viscosity for these problems, demonstrating the potential for significant improvement in the accuracy of computations.
Keywords: Optimization; Artificial viscosity; Gaussian regression surrogate model
Driving factors of CO₂ emissions in South American countries: An application of Seemingly Unrelated Regression model
7
Authors: Gadir BAYRAMLI, Turan KARIMLI. Regional Sustainability, 2024, No. 4, pp. 120-132 (13 pages)
Carbon emissions have become a critical concern in the global effort to combat climate change, with each country or region contributing differently based on its economic structures, energy sources, and industrial activities. The factors influencing carbon emissions vary across countries and sectors. This study examined the factors influencing CO₂ emissions in the 7 South American countries including Argentina, Brazil, Chile, Colombia, Ecuador, Peru, and Venezuela. We used the Seemingly Unrelated Regression (SUR) model to analyse the relationship of CO₂ emissions with gross domestic product (GDP), renewable energy use, urbanization, industrialization, international tourism, agricultural productivity, and forest area based on data from 2000 to 2022. According to the SUR model, we found that GDP and industrialization had a moderate positive effect on CO₂ emissions, whereas renewable energy use had a moderate negative effect on CO₂ emissions. International tourism generally had a positive impact on CO₂ emissions, while forest area tended to decrease CO₂ emissions. Different variables had different effects on CO₂ emissions in the 7 South American countries. In Argentina and Venezuela, GDP, international tourism, and agricultural productivity significantly affected CO₂ emissions. In Colombia, GDP and international tourism had a negative impact on CO₂ emissions. In Brazil, CO₂ emissions were primarily driven by GDP, while in Chile, Ecuador, and Peru, international tourism had a negative effect on CO₂ emissions. Overall, this study highlights the importance of country-specific strategies for reducing CO₂ emissions and emphasizes the varying roles of these driving factors in shaping environmental quality in the 7 South American countries.
Keywords: CO₂ emissions; Urbanization; Industrialization; International tourism; Agricultural productivity; Seemingly Unrelated Regression (SUR) model; South American countries
Country-based modelling of COVID-19 case fatality rate: A multiple regression analysis
8
Authors: Soodeh Sagheb, Ali Gholamrezanezhad, Elizabeth Pavlovic, Mohsen Karami, Mina Fakhrzadegan. World Journal of Virology, 2024, No. 1, pp. 84-94 (11 pages)
BACKGROUND The spread of the severe acute respiratory syndrome coronavirus 2 outbreak worldwide has caused concern regarding the mortality rate caused by the infection. The determinants of mortality on a global scale cannot be fully understood due to lack of information. AIM To identify key factors that may explain the variability in case lethality across countries. METHODS We identified 21 potential risk factors for coronavirus disease 2019 (COVID-19) case fatality rate for all the countries with available data. We examined univariate relationships of each variable with case fatality rate (CFR), and all independent variables, to identify candidate variables for our final multiple model. Multiple regression analysis was used to assess the strength of the relationship. RESULTS The mean COVID-19 mortality was 1.52 ± 1.72%. There was a statistically significant inverse correlation of health expenditure and number of computed tomography scanners per 1 million with CFR, and a significant direct correlation of literacy and air pollution with CFR. The final model can predict approximately 97% of the changes in CFR. CONCLUSION The current study recommends some new predictors that may explain the mortality rate. Thus, it could help decision-makers develop health policies to fight COVID-19.
Keywords: COVID-19; SARS-CoV-2; Case fatality rate; Predictive model; Multiple regression
Extended linear regression model for vessel trajectory prediction with a-priori AIS information
9
Authors: Christiaan Neil Burger, Waldo Kleynhans, Trienko Lups Grobler. Geo-Spatial Information Science (CSCD), 2024, No. 1, pp. 202-220 (19 pages)
As maritime activities increase globally, there is a greater dependency on technology in monitoring, control, and surveillance of vessel activity. One of the most prominent systems for monitoring vessel activity is the Automatic Identification System (AIS). An increase in both vessels fitted with AIS transponders and satellite and terrestrial AIS receivers has resulted in a significant increase in AIS messages received globally. This resultant rich spatial and temporal data source related to vessel activity provides analysts with the ability to perform enhanced vessel movement analytics, of which a pertinent example is the improvement of vessel location predictions. In this paper, we propose a novel strategy for predicting future locations of vessels making use of historic AIS data. The proposed method uses a Linear Regression Model (LRM) and utilizes historic AIS movement data in the form of a-priori generated spatial maps of the course over ground (LRMAC). The LRMAC is an accurate low-complexity first-order method that is easy to implement operationally and shows promising results in areas where there is a consistency in the directionality of historic vessel movement. In areas where the historic directionality of vessel movement is diverse, such as areas close to harbors and ports, the LRMAC defaults to the LRM. The proposed LRMAC method is compared to the Single-Point Neighbor Search (SPNS), which is also a first-order method and has a similar level of computational complexity. For the use case of predicting tanker and cargo vessel trajectories up to 8 hours into the future, the LRMAC showed improved results both in terms of prediction accuracy and execution time.
Keywords: Automatic Identification System (AIS) data; Linear Regression Model (LRM); trajectory mining; spatial map; historic data; trajectory prediction
A Hybrid Model Evaluation Based on PCA Regression Schemes Applied to Seasonal Precipitation Forecast
10
Authors: Pedro M. González-Jardines, Aleida Rosquete-Estévez, Maibys Sierra-Lorenzo, Arnoldo Bezanilla-Morlot. Atmospheric and Climate Sciences, 2024, No. 3, pp. 328-353 (26 pages)
Possible changes in the structure and seasonal variability of the subtropical ridge may lead to changes in the rainfall variability modes over the Caribbean region. This generates additional difficulties around water resource planning; therefore, obtaining seasonal prediction models that allow these variations to be characterized in detail is a concern, especially for island states. This research proposes the construction of statistical-dynamic models based on PCA regression methods. The accumulated monthly precipitation is used as the predictand, while the six predictors are extracted from the ECMWF-SEAS5 ensemble mean forecasts with a lag of one month with respect to the target month. In the construction of the models, two sequential training schemes are evaluated, and only the shorter one preserves the seasonal characteristics of the predictand. The evaluation metrics used, which combine cell-point and dichotomous methodologies, suggest that the predictors related to sea surface temperatures do not adequately represent the seasonal variability of the predictand; however, others such as the temperature at 850 hPa and the Outgoing Longwave Radiation are represented with a good approximation regardless of the model chosen. In this sense, the models built with the nearest neighbor methodology were the most efficient. Using the individual models with the best results, an ensemble is built that improves the individual skill of the models selected as members by correcting the underestimation of precipitation in the dynamic model during the wet season, although problems of overestimation persist for thresholds lower than 50 mm.
Keywords: Seasonal Forecast; Principal Component Regression; Statistical-Dynamic Models
Modeling of Total Dissolved Solids (TDS) and Sodium Absorption Ratio (SAR) in the Edwards-Trinity Plateau and Ogallala Aquifers in the Midland-Odessa Region Using Random Forest Regression and eXtreme Gradient Boosting
11
Authors: Azuka I. Udeh, Osayamen J. Imarhiagbe, Erepamo J. Omietimi. Journal of Geoscience and Environment Protection, 2024, No. 5, pp. 218-241 (24 pages)
Efficient water quality monitoring and ensuring the safety of drinking water by government agencies in areas where the resource is constantly depleted due to anthropogenic or natural factors cannot be overemphasized. This holds for West Texas, and for Midland and Odessa in particular. Two machine learning regression algorithms (Random Forest and XGBoost) were employed to develop models for the prediction of total dissolved solids (TDS) and sodium absorption ratio (SAR) for efficient water quality monitoring of two vital aquifers: the Edwards-Trinity (Plateau) and Ogallala aquifers. These two aquifers have contributed immensely to providing water for domestic, agricultural, industrial, and other uses. The data was obtained from the Texas Water Development Board (TWDB). The XGBoost and Random Forest models used in this study gave an accurate prediction of observed data (TDS and SAR) for both the Edwards-Trinity (Plateau) and Ogallala aquifers, with R² values consistently greater than 0.83. The Random Forest model gave a better prediction of TDS and SAR concentration, with an average R, MAE, RMSE and MSE of 0.977, 0.015, 0.029 and 0.00, respectively. For XGBoost, an average R, MAE, RMSE, and MSE of 0.953, 0.016, 0.037 and 0.00, respectively, were achieved. The overall performance of the models produced was impressive. This study shows that Random Forest and XGBoost are appropriate for water quality prediction and monitoring in an area of high hydrocarbon activity such as Midland and Odessa, and West Texas at large.
Keywords: Water Quality Prediction; Predictive Modeling; Aquifers; Machine Learning; Regression; eXtreme Gradient Boosting
Utilization of Logistical Regression to the Modified Sine-Gordon Model in the MST Experiment
12
Authors: Nizar J. Alkhateeb, Hameed K. Ebraheem, Eman M. Al-Otaibi. Open Journal of Modelling and Simulation, 2024, No. 2, pp. 43-58 (16 pages)
In this paper, a logistical regression (LR) statistical analysis is presented for a set of variables used in experimental measurements in reversed field pinch (RFP) machines, commonly known as “slinky mode” (SM), observed to travel around the torus in the Madison Symmetric Torus (MST). The LR analysis is used with the modified Sine-Gordon dynamic equation model to predict with high confidence whether the slinky mode will lock or not lock when compared to the experimentally measured motion of the slinky mode. It is observed that under certain conditions, the slinky mode “locks” at or near the intersection of poloidal and/or toroidal gaps in MST. However, locked modes cease to travel around the torus, while unlocked modes keep traveling without a change in energy, making it hard to determine an exact set of conditions to predict locking/unlocking behaviour. The significant key model parameters determined by LR analysis are shown to improve the Sine-Gordon model’s ability to determine the locking/unlocking of magnetohydrodynamic (MHD) modes. The LR analysis of measured variables provides high confidence in anticipating locking versus unlocking of the slinky mode, as shown by relational comparisons between simulations and the experimentally measured motion of the slinky mode in MST.
Keywords: Madison Symmetric Torus (MST); Magnetohydrodynamic (MHD); Sine-Gordon; Toroidal; Dynamic Modelling; Reversed Field Pinch (RFP); Logistical Regression
Research on the Relationship Between Average Cigarette Price per Box and Government Procurement in City A Based on a Regression Model
13
Authors: Yao Nie, Hongbo Wan, Mingming Mao. Proceedings of Business and Economic Studies, 2024, No. 5, pp. 68-72 (5 pages)
This study aims to analyze and predict the relationship between the average price per box in the cigarette market of City A and government procurement, providing a scientific basis and support for decision-making. By reviewing relevant theories and literature, qualitative prediction methods, regression prediction models, and other related theories were explored. Through the analysis of annual cigarette sales data and government procurement data in City A, a comprehensive understanding of the development of the tobacco industry and the economic trends of tobacco companies in the county was obtained. By predicting and analyzing the average price per box of cigarette sales across different years, corresponding prediction results were derived and compared with actual sales data. The prediction results indicate that the correlation coefficient between the average price per box of cigarette sales and government procurement is 0.982, implying that government procurement accounts for 96.4% of the changes in the average price per box of cigarettes. These findings offer an in-depth exploration of the relationship between the average price per box of cigarettes in City A and government procurement, providing a scientific foundation for corporate decision-making and market operations.
Keywords: Cigarette marketing; Regression model; Predictive model; Government purchasing
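The two figures quoted in the abstract above are linked by a standard identity: in a simple linear regression with one predictor, the coefficient of determination equals the squared Pearson correlation, so r = 0.982 gives r² ≈ 0.964. The sketch below checks this identity on toy numbers. The data are hypothetical and are not the study's sales or procurement figures, and the helper names are assumptions for illustration.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ols_r_squared(x, y):
    """R^2 of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


# The figure reported in the abstract.
r_reported = 0.982
share_explained = round(r_reported ** 2, 3)

# Hypothetical data: procurement volume versus average price per box.
procurement = [10.0, 12.0, 15.0, 18.0, 20.0]
price = [100.0, 108.0, 121.0, 130.0, 141.0]
r_toy = pearson_r(procurement, price)
r2_toy = ols_r_squared(procurement, price)
```

Note that the identity holds only for a single predictor with an intercept; with several predictors, R² is the squared multiple correlation instead.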
Establishment and Effect Evaluation of Prediction Models of Ozone Concentration in Baoding City
14
Authors: Xiangru KONG, Jiajia ZHANG, Luntao YAO, Tianning YANG, Rongfang YANG. Meteorological and Environmental Research, 2025, No. 3, pp. 44-50 (7 pages)
Firstly, based on the data of air quality and the meteorological data in Baoding City from 2017 to 2021, the correlations of meteorological elements and pollutants with O₃ concentration were explored to determine the forecast factors of forecast models. Secondly, the O₃-8h concentration in Baoding City in 2021 was predicted based on the constructed models of multiple linear regression (MLR), backward propagation neural network (BPNN), and auto regressive integrated moving average (ARIMA), and the predicted values were compared with the observed values to test their prediction effects. The results show that overall, the MLR, BPNN and ARIMA models were able to forecast the changing trend of O₃-8h concentration in Baoding in 2021, but the BPNN model gave better forecast results than the ARIMA and MLR models, especially for the prediction of the high values of O₃-8h concentration, and the correlation coefficients between the predicted values and the observed values were all higher than 0.9 during June-September. The mean error (ME), mean absolute error (MAE), and root mean square error (RMSE) of the predicted values and the observed values of daily O₃-8h concentration based on the BPNN model were 0.45, 19.11 and 24.41 μg/m³, respectively, which were significantly better than those of the MLR and ARIMA models. The prediction effects of the MLR, BPNN and ARIMA models were the best at the pollution level, followed by the excellent level, and the worst at the good level. In comparison, the prediction effect of the BPNN model was better than that of the MLR and ARIMA models as a whole, especially for the pollution and excellent levels. The TS scores of the BPNN model were all above 66%, and the PC values were above 86%. The BPNN model can forecast the changing trend of O₃ concentration more accurately, and has a good practical application value, but at the same time, the predicted high values of O₃ concentration should be appropriately increased according to error characteristics of the model.
Keywords: Ozone (O₃); Multiple linear regression model; Back propagation neural network model; Auto regressive integrated moving average model; TS
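The three error measures reported in the abstract above (ME, MAE, RMSE) have standard definitions, sketched below. The sign convention for ME (predicted minus observed) is an assumption, since the abstract does not state it, and the sample concentrations are hypothetical values rather than data from the study.

```python
import math


def mean_error(pred, obs):
    """Mean error (bias): average of predicted minus observed."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def mean_absolute_error(pred, obs):
    """Mean absolute error: average magnitude of the errors."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def root_mean_square_error(pred, obs):
    """Root mean square error: penalizes large errors more than MAE does."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Hypothetical daily O3-8h concentrations in micrograms per cubic metre.
predicted = [100.0, 120.0, 80.0]
observed = [90.0, 130.0, 80.0]

me = mean_error(predicted, observed)
mae = mean_absolute_error(predicted, observed)
rmse = root_mean_square_error(predicted, observed)
```

In this toy case the over- and under-predictions cancel, so ME is zero while MAE and RMSE are not. The same pattern appears in the abstract, where a small ME (0.45) sits alongside much larger MAE and RMSE values.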
Enhancing patient rehabilitation predictions with a hybrid anomaly detection model:Density-based clustering and interquartile range methods
15
作者 Murad Ali Khan Jong-Hyun Jang +5 位作者 Naeem Iqbal Harun Jamil Syed Shehryar Ali Naqvi Salabat Khan Jae-Chul Kim Do-Hyeun Kim 《CAAI Transactions on Intelligence Technology》 2025年第4期983-1006,共24页
In recent years, there has been a concerted effort to improve anomaly detection techniques, particularly in the context of high-dimensional, distributed clinical data. Analysing patient data within clinical settings reveals a pronounced focus on refining diagnostic accuracy, personalising treatment plans, and optimising resource allocation to enhance clinical outcomes. Nonetheless, this domain faces unique challenges, such as irregular data collection, inconsistent data quality, and patient-specific structural variations. To address these challenges, this paper proposes a novel hybrid approach that integrates heuristic and stochastic methods for anomaly detection in patient clinical data. The strategy uses HPO-based optimal Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to cluster patient exercise data, facilitating efficient anomaly identification. Subsequently, a stochastic method based on the Interquartile Range (IQR) filters unreliable data points, ensuring that medical tools and professionals receive only the most pertinent and accurate information. The primary objective of this study is to equip healthcare professionals and researchers with a robust tool for managing extensive, high-dimensional clinical datasets, enabling effective isolation and removal of aberrant data points. Furthermore, a regression model has been developed using Automated Machine Learning (AutoML) to assess the impact of the ensemble abnormal pattern detection approach. Various statistical error estimation techniques validate the efficacy of the hybrid approach alongside AutoML. Experimental results show that implementing this hybrid model on patient rehabilitation data leads to a notable enhancement in AutoML performance, with an average improvement of 0.041 in the R² score, surpassing the effectiveness of traditional regression models.
Keywords: anomaly detection; deep learning; density-based clustering; hybrid model; IQR; regression
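The interquartile-range filtering step named in the abstract above can be illustrated with a minimal sketch. It shows only the generic Tukey-fence rule and is not the authors' implementation; the function name `iqr_filter`, the multiplier `k=1.5`, and the sample readings are assumptions made for illustration.

```python
import statistics


def iqr_filter(values, k=1.5):
    """Keep only the values inside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional default; it is an assumption here,
    not a setting taken from the paper.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical exercise readings with one obviously aberrant point.
readings = [10, 12, 11, 13, 12, 11, 95]
kept = iqr_filter(readings)
print(kept)
```

In a pipeline like the one described, a filter of this kind would presumably run after clustering, on the points within each cluster; that ordering is taken from the abstract, while the per-cluster application is an assumption.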
Construction and validation of a machine learning algorithm-based predictive model for difficult colonoscopy insertion
16
Authors: Ren-Xuan Gao, Xin-Lei Wang, Ming-Jie Tian, Xiao-Ming Li, Jia-Jia Zhang, Jun-Jing Wang, Jing Gao, Chao Zhang, Zhi-Ting Li. World Journal of Gastrointestinal Endoscopy, 2025, No. 7, pp. 149-161 (13 pages)
BACKGROUND: Difficulty of colonoscopy insertion (DCI) significantly affects colonoscopy effectiveness and serves as a key quality indicator. Predicting and evaluating DCI risk preoperatively is crucial for optimizing intraoperative strategies. AIM: To evaluate the predictive performance of machine learning (ML) algorithms for DCI by comparing three modeling approaches, identify factors influencing DCI, and develop a preoperative prediction model using ML algorithms to enhance colonoscopy quality and efficiency. METHODS: This cross-sectional study enrolled 712 patients who underwent colonoscopy at a tertiary hospital between June 2020 and May 2021. Demographic data, past medical history, medication use, and psychological status were collected. The endoscopist assessed DCI using the visual analogue scale. After univariate screening, predictive models were developed using multivariable logistic regression, least absolute shrinkage and selection operator (LASSO) regression, and random forest (RF) algorithms. Model performance was evaluated based on discrimination, calibration, and decision curve analysis (DCA), and results were visualized using nomograms. RESULTS: A total of 712 patients (53.8% male; mean age 54.5 ± 12.9 years) were included. Logistic regression analysis identified constipation [odds ratio (OR) = 2.254, 95% confidence interval (CI): 1.289-3.931], abdominal circumference (AC) (77.5–91.9 cm, OR = 1.895, 95% CI: 1.065-3.350; AC ≥ 92 cm, OR = 1.271, 95% CI: 0.730-2.188), and anxiety (OR = 1.071, 95% CI: 1.044-1.100) as predictive factors for DCI, validated by LASSO and RF methods. Model performance revealed training/validation sensitivities of 0.826/0.925, 0.924/0.868, and 1.000/0.981; specificities of 0.602/0.511, 0.510/0.562, and 0.977/0.526; and corresponding area under the receiver operating characteristic curves (AUCs) of 0.780 (0.737-0.823)/0.726 (0.654-0.799), 0.754 (0.710-0.798)/0.723 (0.656-0.791), and 1.000 (1.000-1.000)/0.754 (0.688-0.820), respectively. DCA indicated optimal net benefit within probability thresholds of 0-0.9 and 0.05-0.37. The RF model demonstrated superior diagnostic accuracy, reflected by perfect training sensitivity (1.000) and highest validation AUC (0.754), outperforming other methods in clinical applicability. CONCLUSION: The RF-based model exhibited superior predictive accuracy for DCI compared to multivariable logistic and LASSO regression models. This approach supports individualized preoperative optimization, enhancing colonoscopy quality through targeted risk stratification.
Keywords: colonoscopy; difficulty of colonoscopy insertion; machine learning algorithms; predictive model; logistic regression; least absolute shrinkage and selection operator regression; random forest
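The sensitivity and specificity figures quoted in the abstract above are standard confusion-matrix ratios. The sketch below shows how such values are computed from binary labels; the function name and the toy label vectors are assumptions for illustration and have no connection to the study data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels: 1 = difficult insertion, 0 = not difficult (hypothetical).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

Computing both ratios separately on training and validation splits gives pairs in the same training/validation form as those reported in the abstract.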
Semiparametric expectile regression for high-dimensional heavy-tailed and heterogeneous data
17
Authors: ZHAO Jun, YAN Guan-ao, ZHANG Yi. Applied Mathematics (A Journal of Chinese Universities), 2025, No. 1, pp. 53-77 (25 pages)
High-dimensional heterogeneous data have acquired increasing attention and discussion in the past decade. In the context of heterogeneity, semiparametric regression emerges as a popular method to model this type of data in statistics. In this paper, we leverage the benefits of expectile regression for computational efficiency and analytical robustness in heterogeneity, and propose a regularized partially linear additive expectile regression model with a nonconvex penalty, such as SCAD or MCP, for high-dimensional heterogeneous data. We focus on a more realistic scenario where the regression error exhibits a heavy-tailed distribution with only finite moments. This scenario challenges the classical sub-Gaussian distribution assumption and is more prevalent in practical applications. Under certain regular conditions, we demonstrate that with probability tending to one, the oracle estimator is one of the local minima of the induced optimization problem. Our theoretical analysis suggests that the dimensionality of linear covariates that our estimation procedure can handle is fundamentally limited by the moment condition of the regression error. Computationally, given the nonconvex and nonsmooth nature of the induced optimization problem, we have developed a two-step algorithm. Finally, our method’s effectiveness is demonstrated through its high estimation accuracy and effective model selection, as evidenced by Monte Carlo simulation studies and a real-data application. Furthermore, by taking various expectile weights, our method effectively detects heterogeneity and explores the complete conditional distribution of the response variable, underscoring its utility in analyzing high-dimensional heterogeneous data.
Keywords: expectile regression; heterogeneity; heavy tail; partially linear additive model
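For readers unfamiliar with expectiles, the sketch below computes the plain sample expectile that underlies expectile regression, by minimizing the asymmetric squared loss |tau - 1(u < 0)| * u^2 with a reweighted-mean iteration. It covers only the unpenalized, covariate-free case and is not the regularized partially linear additive estimator of the paper; the function name, tolerance, and sample values are assumptions.

```python
def expectile(sample, tau=0.5, max_iter=100, tol=1e-12):
    """Sample tau-expectile via iteratively reweighted means.

    Observations above the current estimate get weight tau, the others
    weight 1 - tau; the weighted mean is recomputed until it stabilizes.
    Requires 0 < tau < 1.
    """
    m = sum(sample) / len(sample)  # start from the ordinary mean
    for _ in range(max_iter):
        weights = [tau if y > m else 1.0 - tau for y in sample]
        new_m = sum(w * y for w, y in zip(weights, sample)) / sum(weights)
        if abs(new_m - m) < tol:
            break
        m = new_m
    return m


data = [1.0, 2.0, 3.0, 4.0, 10.0]  # hypothetical right-skewed sample
centre = expectile(data, tau=0.5)  # coincides with the mean at tau = 0.5
upper = expectile(data, tau=0.9)   # pulled toward the upper tail
print(centre, upper)
```

Varying `tau` in this way is the scalar analogue of "taking various expectile weights" to probe different parts of the response distribution.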
Integrated spatial generalized additive modeling for forest fire prediction:a case study in Fujian Province,China
18
Authors: Chunhui Li, Zhangwen Su, Rongyu Ni, Guangyu Wang, Yiyun Ouyang, Aicong Zeng, Futao Guo. Journal of Forestry Research, 2025, No. 3, pp. 208-223 (16 pages)
The increasing frequency of extreme weather events raises the likelihood of forest wildfires. Therefore, establishing an effective fire prediction model is vital for protecting human life and property, and the environment. This study aims to build a prediction model to understand the spatial characteristics and piecewise effects of forest fire drivers. Using monthly grid data from 2006 to 2020, a modeling study analyzed fire occurrences during the September to April fire season in Fujian Province, China. We compared the fitting performance of the logistic regression model (LRM), the generalized additive logistic model (GALM), and the spatial generalized additive logistic model (SGALM). The results indicate that SGALMs had the best fitting results and the highest prediction accuracy. Meteorological factors significantly impacted forest fires in Fujian Province. Areas with high fire incidence were mainly concentrated in the northwest and southeast. SGALMs improved the fitting effect of fire prediction models by considering spatial effects and the flexible fitting ability of nonlinear interpretation. This model provides piecewise interpretations of forest wildfire occurrences, which can be valuable for relevant departments and will assist forest managers in refining prevention measures based on temporal and spatial differences.
Keywords: forest fire prediction; logistic regression; spatial generalized additive model; spline functions; piecewise effects
Development of a Model Material Suitable for Reservoir Landslide Model Tests
19
Authors: Minghao Miao, Huiming Tang, Sha Lu, Changdong Li, Kun Fang, Yixiao Gu, Chunyan Tang. Journal of Earth Science, 2025, No. 5, pp. 1989-2004 (16 pages)
In the physical model test of landslides, the selection of analogous materials is the key, and it is difficult to consider the similarity of mechanical properties and seepage performance at the same time. To develop a model material suitable for analysing the deformation and failure of reservoir landslides, based on the existing research foundation of analogous materials, 5 materials and 5 physical-mechanical parameters were selected to design an orthogonal test. The factor sensitivity of each component ratio and its influence on the physical-mechanical indices were studied by range analysis and stepwise regression analysis, and the proportioning method was determined. Finally, the model material was developed, and a model test was carried out considering Huangtupo as the prototype application. The results showed that (1) the model material composed of sand, barite powder, glass beads, clay, and bentonite had a wide distribution of physical-mechanical parameters, which could be applied to model tests under different conditions; (2) the physical-mechanical parameters of analogous materials matched the application prototype; and (3) the mechanical properties and seepage performance of the model material sample met the requirements of reservoir landslide model tests, which could be used to simulate landslide evolution and analyse the deformation process.
Keywords: analogous material; physical model test; reservoir landslide; range analysis; stepwise regression; stage division; PIVlab; landslides; engineering geology
Machine learning-based models for prediction of in-hospital mortality in patients with dengue shock syndrome
20
Authors: Luan Thanh Vo, Thien Vu, Thach Ngoc Pham, Tung Huu Trinh, Thanh Tat Nguyen. World Journal of Methodology, 2025, No. 3, pp. 89-99 (11 pages)
BACKGROUND: Severe dengue in children with critical complications has been associated with high mortality rates, varying from approximately 1% to over 20%. To date, there is a lack of data on machine-learning-based algorithms for predicting the risk of in-hospital mortality in children with dengue shock syndrome (DSS). AIM: To develop machine-learning models to estimate the risk of death in hospitalized children with DSS. METHODS: This single-center retrospective study was conducted at the tertiary Children’s Hospital No. 2 in Viet Nam between 2013 and 2022. The primary outcome was the in-hospital mortality rate in children with DSS admitted to the pediatric intensive care unit (PICU). Nine significant features were predetermined for further analysis using machine learning models. An oversampling method was used to enhance the model performance. Supervised models, including logistic regression, Naïve Bayes, Random Forest (RF), K-nearest neighbors, Decision Tree and Extreme Gradient Boosting (XGBoost), were employed to develop predictive models. The Shapley Additive Explanation was used to determine the degree of contribution of the features. RESULTS: In total, 1278 PICU-admitted children with complete data were included in the analysis. The median patient age was 8.1 years (interquartile range: 5.4-10.7). Thirty-nine patients (3%) died. The RF and XGBoost models demonstrated the highest performance. The Shapley Additive Explanations model revealed that the most important predictive features included younger age, female patients, presence of underlying diseases, severe transaminitis, severe bleeding, low platelet counts requiring platelet transfusion, elevated levels of international normalized ratio, blood lactate and serum creatinine, large volume of resuscitation fluid and a high vasoactive inotropic score (>30). CONCLUSION: We developed robust machine learning-based models to estimate the risk of death in hospitalized children with DSS. The study findings are applicable to the design of management schemes to enhance survival outcomes of patients with DSS.
Keywords: dengue shock syndrome; dengue mortality; machine learning; supervised models; logistic regression; random forest; K-nearest neighbors; support vector machine; Extreme Gradient Boost; Shapley additive explanations
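With 39 deaths among 1278 patients, the outcome classes in the study above are heavily imbalanced, which is why an oversampling step is mentioned. The abstract does not say which method was used; the sketch below assumes simple random oversampling with replacement, and the function name, seed, and toy data are all hypothetical.

```python
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size.

    Random oversampling is an assumed stand-in; the paper's actual
    oversampling method is not stated in the abstract.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: five survivors (0) and one death (1), one feature each.
rows = [[8.1], [5.4], [10.7], [7.0], [9.2], [3.5]]
labels = [0, 0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(balanced_labels.count(0), balanced_labels.count(1))
```

Resampling of this kind is normally applied to the training split only, so that duplicated minority cases do not leak into the evaluation data.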