To cater the need for real-time crack monitoring of infrastructural facilities,a CNN-regression model is proposed to directly estimate the crack properties from patches.RGB crack images and their corresponding masks o...To cater the need for real-time crack monitoring of infrastructural facilities,a CNN-regression model is proposed to directly estimate the crack properties from patches.RGB crack images and their corresponding masks obtained from a public dataset are cropped into patches of 256 square pixels that are classified with a pre-trained deep convolution neural network,the true positives are segmented,and crack properties are extracted using two different methods.The first method is primarily based on active contour models and level-set segmentation and the second method consists of the domain adaptation of a mathematical morphology-based method known as FIL-FINDER.A statistical test has been performed for the comparison of the stated methods and a database prepared with the more suitable method.An advanced convolution neural network-based multi-output regression model has been proposed which was trained with the prepared database and validated with the held-out dataset for the prediction of crack-length,crack-width,and width-uncertainty directly from input image patches.The pro-posed model has been tested on crack patches collected from different locations.Huber loss has been used to ensure the robustness of the proposed model selected from a set of 288 different variations of it.Additionally,an ablation study has been conducted on the top 3 models that demonstrated the influence of each network component on the pre-diction results.Finally,the best performing model HHc-X among the top 3 has been proposed that predicted crack properties which are in close agreement to the ground truths in the test data.展开更多
In recent years,machine learning(ML)techniques have been shown to be effective in accelerating the development process of optoelectronic devices.However,as"black box"models,they have limited theoretical inte...In recent years,machine learning(ML)techniques have been shown to be effective in accelerating the development process of optoelectronic devices.However,as"black box"models,they have limited theoretical interpretability.In this work,we leverage symbolic regression(SR)technique for discovering the explicit symbolic relationship between the structure of the optoelectronic Fabry-Perot(FP)laser and its optical field distribution,which greatly improves model transparency compared to ML.We demonstrated that the expressions explored through SR exhibit lower errors on the test set compared to ML models,which suggests that the expressions have better fitting and generalization capabilities.展开更多
On the basis of experimental observations on animals, applications to clinical data on patients and theoretical statistical reasoning, the author developed a com-puter-assisted general mathematical model of the ‘prob...On the basis of experimental observations on animals, applications to clinical data on patients and theoretical statistical reasoning, the author developed a com-puter-assisted general mathematical model of the ‘probacent’-probability equation, Equation (1) and death rate (mortality probability) equation, Equation (2) derivable from Equation (1) that may be applica-ble as a general approximation method to make use-ful predictions of probable outcomes in a variety of biomedical phenomena [1-4]. Equations (1) and (2) contain a constant, γ and c, respectively. In the pre-vious studies, the author used the least maximum- difference principle to determine these constants that were expected to best fit reported data, minimizing the deviation. In this study, the author uses the method of computer-assisted least sum of squares to determine the constants, γ and c in constructing the ‘probacent’-related formulas best fitting the NCHS- reported data on survival probabilities and death rates in the US total adult population for 2001. The results of this study reveal that the method of com-puter-assisted mathematical analysis with the least sum of squares seems to be simple, more accurate, convenient and preferable than the previously used least maximum-difference principle, and better fit-ting the NCHS-reported data on survival probabili-ties and death rates in the US total adult population. The computer program of curved regression for the ‘probacent’-probability and death rate equations may be helpful in research in biomedicine.展开更多
This article presents a mathematical model addressing a scenario involving a hybrid nanofluid flow between two infinite parallel plates.One plate remains stationary,while the other moves downward at a squeezing veloci...This article presents a mathematical model addressing a scenario involving a hybrid nanofluid flow between two infinite parallel plates.One plate remains stationary,while the other moves downward at a squeezing velocity.The space between these plates contains a Darcy-Forchheimer porous medium.A mixture of water-based fluid with gold(Au)and silicon dioxide(Si O2)nanoparticles is formulated.In contrast to the conventional Fourier's heat flux equation,this study employs the Cattaneo-Christov heat flux equation.A uniform magnetic field is applied perpendicular to the flow direction,invoking magnetohydrodynamic(MHD)effects.Further,the model accounts for Joule heating,which is the heat generated when an electric current passes through the fluid.The problem is solved via NDSolve in MATHEMATICA.Numerical and statistical analyses are conducted to provide insights into the behavior of the nanomaterials between the parallel plates with respect to the flow,energy transport,and skin friction.The findings of this study have potential applications in enhancing cooling systems and optimizing thermal management strategies.It is observed that the squeezing motion generates additional pressure gradients within the fluid,which enhances the flow rate but reduces the frictional drag.Consequently,the fluid is pushed more vigorously between the plates,increasing the flow velocity.As the fluid experiences higher flow rates due to the increased squeezing effect,it spends less time in the region between the plates.The thermal relaxation,however,abruptly changes the temperature,leading to a decrease in the temperature fluctuations.展开更多
Power converters are essential components in modern life,being widely used in industry,automation,transportation,and household appliances.In many critical applications,their failure can lead not only to financial loss...Power converters are essential components in modern life,being widely used in industry,automation,transportation,and household appliances.In many critical applications,their failure can lead not only to financial losses due to operational downtime but also to serious risks to human safety.The capacitors forming the output filter,typically aluminumelectrolytic capacitors(AECs),are among the most critical and susceptible components in power converters.The electrolyte in AECs often evaporates over time,causing the internal resistance to rise and the capacitance to drop,ultimately leading to component failure.Detecting this fault requires measuring the current in the capacitor,rendering the method invasive and frequently impractical due to spatial constraints or operational limitations imposed by the integration of a current sensor in the capacitor branch.This article proposes the implementation of an online noninvasive fault diagnosis technique for estimating the Equivalent Series Resistance(ESR)and Capacitance(C)values of the capacitor,employing a combination of signal processing techniques(SPT)and machine learning(ML)algorithms.This solution relies solely on the converter’s input and output signals,therefore making it a non-invasive approach.The ML algorithm used was linear regression,applied to 27 attributes,21 of which were generated through feature engineering to enhance the model’s performance.The proposed solution demonstrates an R^(2) score greater than 0.99 in the estimation of both ESR and C.展开更多
Multinomial logistic regression (MNL) is an attractive statistical approach in modeling the vehicle crash severity as it does not require the assumption of normality, linearity, or homoscedasticity compared to other a...Multinomial logistic regression (MNL) is an attractive statistical approach in modeling the vehicle crash severity as it does not require the assumption of normality, linearity, or homoscedasticity compared to other approaches, such as the discriminant analysis which requires these assumptions to be met. Moreover, it produces sound estimates by changing the probability range between 0.0 and 1.0 to log odds ranging from negative infinity to positive infinity, as it applies transformation of the dependent variable to a continuous variable. The estimates are asymptotically consistent with the requirements of the nonlinear regression process. The results of MNL can be interpreted by both the regression coefficient estimates and/or the odd ratios (the exponentiated coefficients) as well. In addition, the MNL can be used to improve the fitted model by comparing the full model that includes all predictors to a chosen restricted model by excluding the non-significant predictors. As such, this paper presents a detailed step by step overview of incorporating the MNL in crash severity modeling, using vehicle crash data of the Interstate I70 in the State of Missouri, USA for the years (2013-2015).展开更多
Gastric cancer is the third leading cause of cancer-related mortality and remains a major global health issue^([1]).Annually,approximately 479,000individuals in China are diagnosed with gastric cancer,accounting for a...Gastric cancer is the third leading cause of cancer-related mortality and remains a major global health issue^([1]).Annually,approximately 479,000individuals in China are diagnosed with gastric cancer,accounting for almost 45%of all new cases worldwide^([2]).展开更多
In this study,we examine the problem of sliced inverse regression(SIR),a widely used method for sufficient dimension reduction(SDR).It was designed to find reduced-dimensional versions of multivariate predictors by re...In this study,we examine the problem of sliced inverse regression(SIR),a widely used method for sufficient dimension reduction(SDR).It was designed to find reduced-dimensional versions of multivariate predictors by replacing them with a minimally adequate collection of their linear combinations without loss of information.Recently,regularization methods have been proposed in SIR to incorporate a sparse structure of predictors for better interpretability.However,existing methods consider convex relaxation to bypass the sparsity constraint,which may not lead to the best subset,and particularly tends to include irrelevant variables when predictors are correlated.In this study,we approach sparse SIR as a nonconvex optimization problem and directly tackle the sparsity constraint by establishing the optimal conditions and iteratively solving them by means of the splicing technique.Without employing convex relaxation on the sparsity constraint and the orthogonal constraint,our algorithm exhibits superior empirical merits,as evidenced by extensive numerical studies.Computationally,our algorithm is much faster than the relaxed approach for the natural sparse SIR estimator.Statistically,our algorithm surpasses existing methods in terms of accuracy for central subspace estimation and best subset selection and sustains high performance even with correlated predictors.展开更多
Satellite-based precipitation products have been widely used to estimate precipitation, especially over regions with sparse rain gauge networks. However, the low spatial resolution of these products has limited their ...Satellite-based precipitation products have been widely used to estimate precipitation, especially over regions with sparse rain gauge networks. However, the low spatial resolution of these products has limited their application in localized regions and watersheds.This study investigated a spatial downscaling approach, Geographically Weighted Regression Kriging(GWRK), to downscale the Tropical Rainfall Measuring Mission(TRMM) 3 B43 Version 7 over the Lancang River Basin(LRB) for 2001–2015. Downscaling was performed based on the relationships between the TRMM precipitation and the Normalized Difference Vegetation Index(NDVI), the Land Surface Temperature(LST), and the Digital Elevation Model(DEM). Geographical ratio analysis(GRA) was used to calibrate the annual downscaled precipitation data, and the monthly fractions derived from the original TRMM data were used to disaggregate annual downscaled and calibrated precipitation to monthly precipitation at 1 km resolution. The final downscaled precipitation datasets were validated against station-based observed precipitation in 2001–2015. Results showed that: 1) The TRMM 3 B43 precipitation was highly accurate with slight overestimation at the basin scale(i.e., CC(correlation coefficient) = 0.91, Bias = 13.3%). Spatially, the accuracies of the upstream and downstream regions were higher than that of the midstream region. 2) The annual downscaled TRMM precipitation data at 1 km spatial resolution obtained by GWRK effectively captured the high spatial variability of precipitation over the LRB. 3) The annual downscaled TRMM precipitation with GRA calibration gave better accuracy compared with the original TRMM dataset. 4) The final downscaled and calibrated precipitation had significantly improved spatial resolution, and agreed well with data from the validated rain gauge stations, i.e., CC = 0.75, RMSE(root mean square error) = 182 mm, MAE(mean absolute error) = 142 mm, and Bias = 0.78%for annual precipitation and CC = 0.95, RMSE = 25 mm, MAE = 16 mm, and Bias = 0.67% for monthly precipitation.展开更多
This study explored and reviewed the logistic regression (LR) model, a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, with emphasis on m...This study explored and reviewed the logistic regression (LR) model, a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, with emphasis on medical research. Thirty seven research articles published between 2000 and 2018 which employed logistic regression as the main statistical tool as well as six text books on logistic regression were reviewed. Logistic regression concepts such as odds, odds ratio, logit transformation, logistic curve, assumption, selecting dependent and independent variables, model fitting, reporting and interpreting were presented. Upon perusing the literature, considerable deficiencies were found in both the use and reporting of LR. For many studies, the ratio of the number of outcome events to predictor variables (events per variable) was sufficiently small to call into question the accuracy of the regression model. Also, most studies did not report on validation analysis, regression diagnostics or goodness-of-fit measures;measures which authenticate the robustness of the LR model. Here, we demonstrate a good example of the application of the LR model using data obtained on a cohort of pregnant women and the factors that influence their decision to opt for caesarean delivery or vaginal birth. It is recommended that researchers should be more rigorous and pay greater attention to guidelines concerning the use and reporting of LR models.展开更多
Triaxial tests,a staple in rock engineering,are labor-intensive,sample-demanding,and costly,making their optimization highly advantageous.These tests are essential for characterizing rock strength,and by adopting a fa...Triaxial tests,a staple in rock engineering,are labor-intensive,sample-demanding,and costly,making their optimization highly advantageous.These tests are essential for characterizing rock strength,and by adopting a failure criterion,they allow for the derivation of criterion parameters through regression,facilitating their integration into modeling programs.In this study,we introduce the application of an underutilized statistical technique—orthogonal regression—well-suited for analyzing triaxial test data.Additionally,we present an innovation in this technique by minimizing the Euclidean distance while incorporating orthogonality between vectors as a constraint,for the case of orthogonal linear regression.Also,we consider the Modified Least Squares method.We exemplify this approach by developing the necessary equations to apply the Mohr-Coulomb,Murrell,Hoek-Brown,andÚcar criteria,and implement these equations in both spreadsheet calculations and R scripts.Finally,we demonstrate the technique's application using five datasets of varied lithologies from specialized literature,showcasing its versatility and effectiveness.展开更多
Knowing the influence of the size of datasets for regression models can help in improving the accuracy of a solar power forecast and make the most out of renewable energy systems.This research explores the influence o...Knowing the influence of the size of datasets for regression models can help in improving the accuracy of a solar power forecast and make the most out of renewable energy systems.This research explores the influence of dataset size on the accuracy and reliability of regression models for solar power prediction,contributing to better forecasting methods.The study analyzes data from two solar panels,aSiMicro03036 and aSiTandem72-46,over 7,14,17,21,28,and 38 days,with each dataset comprising five independent and one dependent parameter,and split 80–20 for training and testing.Results indicate that Random Forest consistently outperforms other models,achieving the highest correlation coefficient of 0.9822 and the lowest Mean Absolute Error(MAE)of 2.0544 on the aSiTandem72-46 panel with 21 days of data.For the aSiMicro03036 panel,the best MAE of 4.2978 was reached using the k-Nearest Neighbor(k-NN)algorithm,which was set up as instance-based k-Nearest neighbors(IBk)in Weka after being trained on 17 days of data.Regression performance for most models(excluding IBk)stabilizes at 14 days or more.Compared to the 7-day dataset,increasing to 21 days reduced the MAE by around 20%and improved correlation coefficients by around 2.1%,highlighting the value of moderate dataset expansion.These findings suggest that datasets spanning 17 to 21 days,with 80%used for training,can significantly enhance the predictive accuracy of solar power generation models.展开更多
Bigeye tuna is a protein-rich fish that is susceptible to spoilage during cold storage,however,there is limited information on untargeted metabolomic profiling of bigeye tuna concerning spoilage-associated enzymes and...Bigeye tuna is a protein-rich fish that is susceptible to spoilage during cold storage,however,there is limited information on untargeted metabolomic profiling of bigeye tuna concerning spoilage-associated enzymes and metabolites.This study aimed to investigate how cold storage affects enzyme activities,nutrient composition,tissue microstructures and spoilage metabolites of bigeye tuna.The activities of cathepsins B,H,L increased,while Na^(+)/K^(+)-ATPase and Mg^(2+)-ATPase decreased,α-glucosidase,lipase and lipoxygenase first increased and then decreased during cold storage,suggesting that proteins undergo degradation and ATP metabolism occurs at a faster rate during cold storage.Nutrient composition(moisture and lipid content),total amino acids decreased,suggesting that the nutritional value of bigeye tuna was reduced.Besides,a logistic regression equation has been established as a food analysis tool and assesses the dynamics and correlation of the enzyme of bigeye tuna during cold storage.Based on untargeted metabolomic profiling analysis,a total of 524 metabolites were identified in the bigeye tuna contained several spoilage metabolites involved in lipid metabolism(glycerophosphocholine and choline phosphate),amino acid metabolism(L-histidine,5-deoxy-5′-(methylthio)adenosine,5-methylthioadenosine),carbohydrate metabolism(D-gluconic acid,α-D-fructose 1,6-bisphosphate,D-glyceraldehyde 3-phosphate).The results of tissue microstructures of tuna showed a looser network and visible deterioration of tissue fiber during cold storage.Therefore,metabolomic analysis and tissue microstructures provide insight into the spoilage mechanism investigations on bigeye tuna during cold storage.展开更多
As the core component of inertial navigation systems, fiber optic gyroscope (FOG), with technical advantages such as low power consumption, long lifespan, fast startup speed, and flexible structural design, are widely...As the core component of inertial navigation systems, fiber optic gyroscope (FOG), with technical advantages such as low power consumption, long lifespan, fast startup speed, and flexible structural design, are widely used in aerospace, unmanned driving, and other fields. However, due to the temper-ature sensitivity of optical devices, the influence of environmen-tal temperature causes errors in FOG, thereby greatly limiting their output accuracy. This work researches on machine-learn-ing based temperature error compensation techniques for FOG. Specifically, it focuses on compensating for the bias errors gen-erated in the fiber ring due to the Shupe effect. This work pro-poses a composite model based on k-means clustering, sup-port vector regression, and particle swarm optimization algo-rithms. And it significantly reduced redundancy within the sam-ples by adopting the interval sequence sample. Moreover, met-rics such as root mean square error (RMSE), mean absolute error (MAE), bias stability, and Allan variance, are selected to evaluate the model’s performance and compensation effective-ness. This work effectively enhances the consistency between data and models across different temperature ranges and tem-perature gradients, improving the bias stability of the FOG from 0.022 °/h to 0.006 °/h. Compared to the existing methods utiliz-ing a single machine learning model, the proposed method increases the bias stability of the compensated FOG from 57.11% to 71.98%, and enhances the suppression of rate ramp noise coefficient from 2.29% to 14.83%. This work improves the accuracy of FOG after compensation, providing theoretical guid-ance and technical references for sensors error compensation work in other fields.展开更多
Sonic Hedgehog Medulloblastoma(SHH-MB)is one of the four primary molecular subgroups of Medulloblastoma.It is estimated to be responsible for nearly one-third of allMB cases.Using transcriptomic and DNA methylation pr...Sonic Hedgehog Medulloblastoma(SHH-MB)is one of the four primary molecular subgroups of Medulloblastoma.It is estimated to be responsible for nearly one-third of allMB cases.Using transcriptomic and DNA methylation profiling techniques,new developments in this field determined four molecular subtypes for SHH-MB.SHH-MB subtypes show distinct DNAmethylation patterns that allow their discrimination fromoverlapping subtypes and predict clinical outcomes.Class overlapping occurs when two or more classes share common features,making it difficult to distinguish them as separate.Using the DNA methylation dataset,a novel classification technique is presented to address the issue of overlapping SHH-MBsubtypes.Penalizedmultinomial regression(PMR),Tomek links(TL),and singular value decomposition(SVD)were all smoothly integrated into a single framework.SVD and group lasso improve computational efficiency,address the problem of high-dimensional datasets,and clarify class distinctions by removing redundant or irrelevant features that might lead to class overlap.As a method to eliminate the issues of decision boundary overlap and class imbalance in the classification task,TL enhances dataset balance and increases the clarity of decision boundaries through the elimination of overlapping samples.Using fivefold cross-validation,our proposed method(TL-SVDPMR)achieved a remarkable overall accuracy of almost 95%in the classification of SHH-MB molecular subtypes.The results demonstrate the strong performance of the proposed classification model among the various SHH-MB subtypes given a high average of the area under the curve(AUC)values.Additionally,the statistical significance test indicates that TL-SVDPMR is more accurate than both SVM and random forest algorithms in classifying the overlapping SHH-MB subtypes,highlighting its importance for precision medicine applications.Our findings emphasized the success of combining SVD,TL,and PMRtechniques to improve the classification performance for biomedical applications with many features and overlapping subtypes.展开更多
In modern complex systems,real-time regression prediction plays a vital role in performance evaluation and risk warning.Nevertheless,existing methods still face challenges in maintaining stability and predictive accur...In modern complex systems,real-time regression prediction plays a vital role in performance evaluation and risk warning.Nevertheless,existing methods still face challenges in maintaining stability and predictive accuracy under complex conditions.To address these limitations,this study proposes an online prediction approach that integrates event tracking sensitivity analysis with machine learning.Specifically,a real-time event tracking sensitivity analysis method is employed to capture and quantify the impact of key events on system outputs.On this basis,a mutualinformation–based self-extraction mechanism is introduced to construct prior weights,which are then incorporated into a LightGBM prediction model.Furthermore,iterative optimization of the feature selection threshold is performed to enhance both stability and accuracy.Experiments on composite microsensor data demonstrate that the proposed method achieves robust and efficient real-time prediction,with potential extension to industrial monitoring and control applications.展开更多
Metabolic endoscopy represents a promising alternative in the management of steatotic liver disease,particularly metabolic dysfunction-associated steatohep-atitis(MASH),a progressive form of metabolic dysfunction-asso...Metabolic endoscopy represents a promising alternative in the management of steatotic liver disease,particularly metabolic dysfunction-associated steatohep-atitis(MASH),a progressive form of metabolic dysfunction-associated steatotic liver disease(MASLD).With the rising global prevalence of MASLD—affecting over one-third of the adult population—and its close association with obesity,insulin resistance,and metabolic syndrome,there is an urgent need for inno-vative,minimally invasive therapies that can reverse liver fibrosis and prevent progression to cirrhosis and hepatocellular carcinoma.Traditional management of MASLD relies on lifestyle modifications and bariatric surgery,yet these app-roaches are hampered by issues of adherence,invasiveness,and accessibility.This review examines endoscopic bariatric metabolic therapies including endoscopic sleeve gastroplasty(ESG),intragastric balloons(IGB),duodenal mucosal resur-facing(DMR),and duodeno-jejunal bypass liners(DJBL),as well as revisional procedures like endoscopic revisional gastroplasty(ERG)and transoral outlet reduction(TORe).Clinical studies and meta-analyses indicate that metabolic en-doscopy is safe and effective for liver fibrosis in MASH.ESG appears to offer the greatest fibrosis reduction,while IGB and DJBL yield modest improvements,and DMR shows no significant effect.Among revisional therapies,ERG has dem-onstrated fibrosis reduction,although the benefits of TORe remain to be fully evaluated.展开更多
Bangladesh is a subtropical monsoon climate characterized by wide seasonal variations in rainfall, moderately warm temperatures, and high humidity. Rainfall is the main source of irrigation water everywhere in the Ban...Bangladesh is a subtropical monsoon climate characterized by wide seasonal variations in rainfall, moderately warm temperatures, and high humidity. Rainfall is the main source of irrigation water everywhere in the Bangladesh where the inhabitants derive their income primarily from farming. Stochastic rainfall models were concerned with the occurrence of wet day and depth of rainfall for different regions to model the daily occurrence of rainfall and achieved satisfactory results around the world. In connection to the Markov chain of different order, logistic regression is conducted to visualize the dependence of current rainfall upon the rainfall of previous two-time period. It had been shown that wet day of the previous two time period compared to the dry day of previous two time period influences positively the wet day of current time period, that is the dependency of dry-wet spell for the occurrence of rain in the rainy season from April to September in the study area. Daily data are collected from meteorological department of about 26 years on rainfall of Dhaka station during the period January 1985-August 2011 to conduct the study. The test result shows that the occurrence of rainfall follows a second order Markov chain and logistic regression also tells that dry followed by dry and wet followed by wet is more likely for the rainfall of Dhaka station and also the model could perform adequately for many applications of rainfall data satisfactorily.展开更多
Least Absolute Shrinkage and Selection Operator (LASSO) is used for variable selection as well as for handling the multicollinearity problem simultaneously in the linear regression model. LASSO produces estimates havi...Least Absolute Shrinkage and Selection Operator (LASSO) is used for variable selection as well as for handling the multicollinearity problem simultaneously in the linear regression model. LASSO produces estimates having high variance if the number of predictors is higher than the number of observations and if high multicollinearity exists among the predictor variables. To handle this problem, Elastic Net (ENet) estimator was introduced by combining LASSO and Ridge estimator (RE). The solutions of LASSO and ENet have been obtained using Least Angle Regression (LARS) and LARS-EN algorithms, respectively. In this article, we proposed an alternative algorithm to overcome the issues in LASSO that can be combined LASSO with other exiting biased estimators namely Almost Unbiased Ridge Estimator (AURE), Liu Estimator (LE), Almost Unbiased Liu Estimator (AULE), Principal Component Regression Estimator (PCRE), r-k class estimator and r-d class estimator. Further, we examine the performance of the proposed algorithm using a Monte-Carlo simulation study and real-world examples. The results showed that the LARS-rk and LARS-rd algorithms,?which are combined LASSO with r-k class estimator and r-d class estimator,?outperformed other algorithms under the moderated and severe multicollinearity.展开更多
As optimization of parameters affects prediction accuracy and generalization ability of support vector regression(SVR) greatly and the predictive model often mismatches nonlinear system model predictive control,a mult...As optimization of parameters affects prediction accuracy and generalization ability of support vector regression(SVR) greatly and the predictive model often mismatches nonlinear system model predictive control,a multi-step model predictive control based on online SVR(OSVR) optimized by multi-agent particle swarm optimization algorithm(MAPSO) is put forward. By integrating the online learning ability of OSVR, the predictive model can self-correct and adapt to the dynamic changes in nonlinear process well.展开更多
文摘To cater the need for real-time crack monitoring of infrastructural facilities,a CNN-regression model is proposed to directly estimate the crack properties from patches.RGB crack images and their corresponding masks obtained from a public dataset are cropped into patches of 256 square pixels that are classified with a pre-trained deep convolution neural network,the true positives are segmented,and crack properties are extracted using two different methods.The first method is primarily based on active contour models and level-set segmentation and the second method consists of the domain adaptation of a mathematical morphology-based method known as FIL-FINDER.A statistical test has been performed for the comparison of the stated methods and a database prepared with the more suitable method.An advanced convolution neural network-based multi-output regression model has been proposed which was trained with the prepared database and validated with the held-out dataset for the prediction of crack-length,crack-width,and width-uncertainty directly from input image patches.The pro-posed model has been tested on crack patches collected from different locations.Huber loss has been used to ensure the robustness of the proposed model selected from a set of 288 different variations of it.Additionally,an ablation study has been conducted on the top 3 models that demonstrated the influence of each network component on the pre-diction results.Finally,the best performing model HHc-X among the top 3 has been proposed that predicted crack properties which are in close agreement to the ground truths in the test data.
基金supported by the National Natural Science Foundation of China(No.92370117)the CAS Project for Young Scientists in Basic Research(No.YSBR-090)。
文摘In recent years,machine learning(ML)techniques have been shown to be effective in accelerating the development process of optoelectronic devices.However,as"black box"models,they have limited theoretical interpretability.In this work,we leverage symbolic regression(SR)technique for discovering the explicit symbolic relationship between the structure of the optoelectronic Fabry-Perot(FP)laser and its optical field distribution,which greatly improves model transparency compared to ML.We demonstrated that the expressions explored through SR exhibit lower errors on the test set compared to ML models,which suggests that the expressions have better fitting and generalization capabilities.
文摘On the basis of experimental observations on animals, applications to clinical data on patients and theoretical statistical reasoning, the author developed a com-puter-assisted general mathematical model of the ‘probacent’-probability equation, Equation (1) and death rate (mortality probability) equation, Equation (2) derivable from Equation (1) that may be applica-ble as a general approximation method to make use-ful predictions of probable outcomes in a variety of biomedical phenomena [1-4]. Equations (1) and (2) contain a constant, γ and c, respectively. In the pre-vious studies, the author used the least maximum- difference principle to determine these constants that were expected to best fit reported data, minimizing the deviation. In this study, the author uses the method of computer-assisted least sum of squares to determine the constants, γ and c in constructing the ‘probacent’-related formulas best fitting the NCHS- reported data on survival probabilities and death rates in the US total adult population for 2001. The results of this study reveal that the method of com-puter-assisted mathematical analysis with the least sum of squares seems to be simple, more accurate, convenient and preferable than the previously used least maximum-difference principle, and better fit-ting the NCHS-reported data on survival probabili-ties and death rates in the US total adult population. The computer program of curved regression for the ‘probacent’-probability and death rate equations may be helpful in research in biomedicine.
文摘This article presents a mathematical model addressing a scenario involving a hybrid nanofluid flow between two infinite parallel plates.One plate remains stationary,while the other moves downward at a squeezing velocity.The space between these plates contains a Darcy-Forchheimer porous medium.A mixture of water-based fluid with gold(Au)and silicon dioxide(Si O2)nanoparticles is formulated.In contrast to the conventional Fourier's heat flux equation,this study employs the Cattaneo-Christov heat flux equation.A uniform magnetic field is applied perpendicular to the flow direction,invoking magnetohydrodynamic(MHD)effects.Further,the model accounts for Joule heating,which is the heat generated when an electric current passes through the fluid.The problem is solved via NDSolve in MATHEMATICA.Numerical and statistical analyses are conducted to provide insights into the behavior of the nanomaterials between the parallel plates with respect to the flow,energy transport,and skin friction.The findings of this study have potential applications in enhancing cooling systems and optimizing thermal management strategies.It is observed that the squeezing motion generates additional pressure gradients within the fluid,which enhances the flow rate but reduces the frictional drag.Consequently,the fluid is pushed more vigorously between the plates,increasing the flow velocity.As the fluid experiences higher flow rates due to the increased squeezing effect,it spends less time in the region between the plates.The thermal relaxation,however,abruptly changes the temperature,leading to a decrease in the temperature fluctuations.
文摘Power converters are essential components in modern life,being widely used in industry,automation,transportation,and household appliances.In many critical applications,their failure can lead not only to financial losses due to operational downtime but also to serious risks to human safety.The capacitors forming the output filter,typically aluminumelectrolytic capacitors(AECs),are among the most critical and susceptible components in power converters.The electrolyte in AECs often evaporates over time,causing the internal resistance to rise and the capacitance to drop,ultimately leading to component failure.Detecting this fault requires measuring the current in the capacitor,rendering the method invasive and frequently impractical due to spatial constraints or operational limitations imposed by the integration of a current sensor in the capacitor branch.This article proposes the implementation of an online noninvasive fault diagnosis technique for estimating the Equivalent Series Resistance(ESR)and Capacitance(C)values of the capacitor,employing a combination of signal processing techniques(SPT)and machine learning(ML)algorithms.This solution relies solely on the converter’s input and output signals,therefore making it a non-invasive approach.The ML algorithm used was linear regression,applied to 27 attributes,21 of which were generated through feature engineering to enhance the model’s performance.The proposed solution demonstrates an R^(2) score greater than 0.99 in the estimation of both ESR and C.
文摘Multinomial logistic regression (MNL) is an attractive statistical approach in modeling the vehicle crash severity as it does not require the assumption of normality, linearity, or homoscedasticity compared to other approaches, such as the discriminant analysis which requires these assumptions to be met. Moreover, it produces sound estimates by changing the probability range between 0.0 and 1.0 to log odds ranging from negative infinity to positive infinity, as it applies transformation of the dependent variable to a continuous variable. The estimates are asymptotically consistent with the requirements of the nonlinear regression process. The results of MNL can be interpreted by both the regression coefficient estimates and/or the odd ratios (the exponentiated coefficients) as well. In addition, the MNL can be used to improve the fitted model by comparing the full model that includes all predictors to a chosen restricted model by excluding the non-significant predictors. As such, this paper presents a detailed step by step overview of incorporating the MNL in crash severity modeling, using vehicle crash data of the Interstate I70 in the State of Missouri, USA for the years (2013-2015).
基金supported by the Natural Science Foundation of Shanghai(23ZR1463600)Shanghai Pudong New Area Health Commission Research Project(PW2021A-69)Research Project of Clinical Research Center of Shanghai Health Medical University(22MC2022002)。
文摘Gastric cancer is the third leading cause of cancer-related mortality and remains a major global health issue^([1]).Annually,approximately 479,000individuals in China are diagnosed with gastric cancer,accounting for almost 45%of all new cases worldwide^([2]).
文摘In this study,we examine the problem of sliced inverse regression(SIR),a widely used method for sufficient dimension reduction(SDR).It was designed to find reduced-dimensional versions of multivariate predictors by replacing them with a minimally adequate collection of their linear combinations without loss of information.Recently,regularization methods have been proposed in SIR to incorporate a sparse structure of predictors for better interpretability.However,existing methods consider convex relaxation to bypass the sparsity constraint,which may not lead to the best subset,and particularly tends to include irrelevant variables when predictors are correlated.In this study,we approach sparse SIR as a nonconvex optimization problem and directly tackle the sparsity constraint by establishing the optimal conditions and iteratively solving them by means of the splicing technique.Without employing convex relaxation on the sparsity constraint and the orthogonal constraint,our algorithm exhibits superior empirical merits,as evidenced by extensive numerical studies.Computationally,our algorithm is much faster than the relaxed approach for the natural sparse SIR estimator.Statistically,our algorithm surpasses existing methods in terms of accuracy for central subspace estimation and best subset selection and sustains high performance even with correlated predictors.
基金Under the auspices of the National Natural Science Foundation of China(No.41661099)the National Key Research and Development Program of China(No.Grant 2016YFA0601601)
文摘Satellite-based precipitation products have been widely used to estimate precipitation, especially over regions with sparse rain gauge networks. However, the low spatial resolution of these products has limited their application in localized regions and watersheds.This study investigated a spatial downscaling approach, Geographically Weighted Regression Kriging(GWRK), to downscale the Tropical Rainfall Measuring Mission(TRMM) 3 B43 Version 7 over the Lancang River Basin(LRB) for 2001–2015. Downscaling was performed based on the relationships between the TRMM precipitation and the Normalized Difference Vegetation Index(NDVI), the Land Surface Temperature(LST), and the Digital Elevation Model(DEM). Geographical ratio analysis(GRA) was used to calibrate the annual downscaled precipitation data, and the monthly fractions derived from the original TRMM data were used to disaggregate annual downscaled and calibrated precipitation to monthly precipitation at 1 km resolution. The final downscaled precipitation datasets were validated against station-based observed precipitation in 2001–2015. Results showed that: 1) The TRMM 3 B43 precipitation was highly accurate with slight overestimation at the basin scale(i.e., CC(correlation coefficient) = 0.91, Bias = 13.3%). Spatially, the accuracies of the upstream and downstream regions were higher than that of the midstream region. 2) The annual downscaled TRMM precipitation data at 1 km spatial resolution obtained by GWRK effectively captured the high spatial variability of precipitation over the LRB. 3) The annual downscaled TRMM precipitation with GRA calibration gave better accuracy compared with the original TRMM dataset. 4) The final downscaled and calibrated precipitation had significantly improved spatial resolution, and agreed well with data from the validated rain gauge stations, i.e., CC = 0.75, RMSE(root mean square error) = 182 mm, MAE(mean absolute error) = 142 mm, and Bias = 0.78%for annual precipitation and CC = 0.95, RMSE = 25 mm, MAE = 16 mm, and Bias = 0.67% for monthly precipitation.
文摘This study explored and reviewed the logistic regression (LR) model, a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, with emphasis on medical research. Thirty seven research articles published between 2000 and 2018 which employed logistic regression as the main statistical tool as well as six text books on logistic regression were reviewed. Logistic regression concepts such as odds, odds ratio, logit transformation, logistic curve, assumption, selecting dependent and independent variables, model fitting, reporting and interpreting were presented. Upon perusing the literature, considerable deficiencies were found in both the use and reporting of LR. For many studies, the ratio of the number of outcome events to predictor variables (events per variable) was sufficiently small to call into question the accuracy of the regression model. Also, most studies did not report on validation analysis, regression diagnostics or goodness-of-fit measures;measures which authenticate the robustness of the LR model. Here, we demonstrate a good example of the application of the LR model using data obtained on a cohort of pregnant women and the factors that influence their decision to opt for caesarean delivery or vaginal birth. It is recommended that researchers should be more rigorous and pay greater attention to guidelines concerning the use and reporting of LR models.
文摘Triaxial tests,a staple in rock engineering,are labor-intensive,sample-demanding,and costly,making their optimization highly advantageous.These tests are essential for characterizing rock strength,and by adopting a failure criterion,they allow for the derivation of criterion parameters through regression,facilitating their integration into modeling programs.In this study,we introduce the application of an underutilized statistical technique—orthogonal regression—well-suited for analyzing triaxial test data.Additionally,we present an innovation in this technique by minimizing the Euclidean distance while incorporating orthogonality between vectors as a constraint,for the case of orthogonal linear regression.Also,we consider the Modified Least Squares method.We exemplify this approach by developing the necessary equations to apply the Mohr-Coulomb,Murrell,Hoek-Brown,andÚcar criteria,and implement these equations in both spreadsheet calculations and R scripts.Finally,we demonstrate the technique's application using five datasets of varied lithologies from specialized literature,showcasing its versatility and effectiveness.
文摘Knowing the influence of the size of datasets for regression models can help in improving the accuracy of a solar power forecast and make the most out of renewable energy systems.This research explores the influence of dataset size on the accuracy and reliability of regression models for solar power prediction,contributing to better forecasting methods.The study analyzes data from two solar panels,aSiMicro03036 and aSiTandem72-46,over 7,14,17,21,28,and 38 days,with each dataset comprising five independent and one dependent parameter,and split 80–20 for training and testing.Results indicate that Random Forest consistently outperforms other models,achieving the highest correlation coefficient of 0.9822 and the lowest Mean Absolute Error(MAE)of 2.0544 on the aSiTandem72-46 panel with 21 days of data.For the aSiMicro03036 panel,the best MAE of 4.2978 was reached using the k-Nearest Neighbor(k-NN)algorithm,which was set up as instance-based k-Nearest neighbors(IBk)in Weka after being trained on 17 days of data.Regression performance for most models(excluding IBk)stabilizes at 14 days or more.Compared to the 7-day dataset,increasing to 21 days reduced the MAE by around 20%and improved correlation coefficients by around 2.1%,highlighting the value of moderate dataset expansion.These findings suggest that datasets spanning 17 to 21 days,with 80%used for training,can significantly enhance the predictive accuracy of solar power generation models.
基金supported by the Shanghai Sailing Program(22YF1416300)Youth Fund Project of National Natural Science Foundation of China(32202117)+1 种基金National Key Research and Development Program of China(2022YFD2100104)the China Agriculture Research System(CARS-47).
文摘Bigeye tuna is a protein-rich fish that is susceptible to spoilage during cold storage,however,there is limited information on untargeted metabolomic profiling of bigeye tuna concerning spoilage-associated enzymes and metabolites.This study aimed to investigate how cold storage affects enzyme activities,nutrient composition,tissue microstructures and spoilage metabolites of bigeye tuna.The activities of cathepsins B,H,L increased,while Na^(+)/K^(+)-ATPase and Mg^(2+)-ATPase decreased,α-glucosidase,lipase and lipoxygenase first increased and then decreased during cold storage,suggesting that proteins undergo degradation and ATP metabolism occurs at a faster rate during cold storage.Nutrient composition(moisture and lipid content),total amino acids decreased,suggesting that the nutritional value of bigeye tuna was reduced.Besides,a logistic regression equation has been established as a food analysis tool and assesses the dynamics and correlation of the enzyme of bigeye tuna during cold storage.Based on untargeted metabolomic profiling analysis,a total of 524 metabolites were identified in the bigeye tuna contained several spoilage metabolites involved in lipid metabolism(glycerophosphocholine and choline phosphate),amino acid metabolism(L-histidine,5-deoxy-5′-(methylthio)adenosine,5-methylthioadenosine),carbohydrate metabolism(D-gluconic acid,α-D-fructose 1,6-bisphosphate,D-glyceraldehyde 3-phosphate).The results of tissue microstructures of tuna showed a looser network and visible deterioration of tissue fiber during cold storage.Therefore,metabolomic analysis and tissue microstructures provide insight into the spoilage mechanism investigations on bigeye tuna during cold storage.
基金supported by the National Natural Science Foundation of China(62375013).
文摘As the core component of inertial navigation systems, fiber optic gyroscope (FOG), with technical advantages such as low power consumption, long lifespan, fast startup speed, and flexible structural design, are widely used in aerospace, unmanned driving, and other fields. However, due to the temper-ature sensitivity of optical devices, the influence of environmen-tal temperature causes errors in FOG, thereby greatly limiting their output accuracy. This work researches on machine-learn-ing based temperature error compensation techniques for FOG. Specifically, it focuses on compensating for the bias errors gen-erated in the fiber ring due to the Shupe effect. This work pro-poses a composite model based on k-means clustering, sup-port vector regression, and particle swarm optimization algo-rithms. And it significantly reduced redundancy within the sam-ples by adopting the interval sequence sample. Moreover, met-rics such as root mean square error (RMSE), mean absolute error (MAE), bias stability, and Allan variance, are selected to evaluate the model’s performance and compensation effective-ness. This work effectively enhances the consistency between data and models across different temperature ranges and tem-perature gradients, improving the bias stability of the FOG from 0.022 °/h to 0.006 °/h. Compared to the existing methods utiliz-ing a single machine learning model, the proposed method increases the bias stability of the compensated FOG from 57.11% to 71.98%, and enhances the suppression of rate ramp noise coefficient from 2.29% to 14.83%. This work improves the accuracy of FOG after compensation, providing theoretical guid-ance and technical references for sensors error compensation work in other fields.
基金funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No.(DGSSR-2024-02-01137).
文摘Sonic Hedgehog Medulloblastoma(SHH-MB)is one of the four primary molecular subgroups of Medulloblastoma.It is estimated to be responsible for nearly one-third of allMB cases.Using transcriptomic and DNA methylation profiling techniques,new developments in this field determined four molecular subtypes for SHH-MB.SHH-MB subtypes show distinct DNAmethylation patterns that allow their discrimination fromoverlapping subtypes and predict clinical outcomes.Class overlapping occurs when two or more classes share common features,making it difficult to distinguish them as separate.Using the DNA methylation dataset,a novel classification technique is presented to address the issue of overlapping SHH-MBsubtypes.Penalizedmultinomial regression(PMR),Tomek links(TL),and singular value decomposition(SVD)were all smoothly integrated into a single framework.SVD and group lasso improve computational efficiency,address the problem of high-dimensional datasets,and clarify class distinctions by removing redundant or irrelevant features that might lead to class overlap.As a method to eliminate the issues of decision boundary overlap and class imbalance in the classification task,TL enhances dataset balance and increases the clarity of decision boundaries through the elimination of overlapping samples.Using fivefold cross-validation,our proposed method(TL-SVDPMR)achieved a remarkable overall accuracy of almost 95%in the classification of SHH-MB molecular subtypes.The results demonstrate the strong performance of the proposed classification model among the various SHH-MB subtypes given a high average of the area under the curve(AUC)values.Additionally,the statistical significance test indicates that TL-SVDPMR is more accurate than both SVM and random forest algorithms in classifying the overlapping SHH-MB subtypes,highlighting its importance for precision medicine applications.Our findings emphasized the success of combining SVD,TL,and PMRtechniques to improve the classification performance for biomedical applications with many features and overlapping subtypes.
基金financial support from the National Natural Science Foundation of China(Grants No.U2330206,No.U2230206,and No.62173068)the Natural Science Foundation of Guangxi Province(Grant No.AB24010157)+1 种基金the Sichuan Forestry and Grassland Bureau(Grant Nos.G202206012 and G202206012-2)Sichuan Science and Technology Program(Grant Nos.2024NSFSC1483,2024ZYD0156,2023NSFC1962,and DQ202412).
文摘In modern complex systems,real-time regression prediction plays a vital role in performance evaluation and risk warning.Nevertheless,existing methods still face challenges in maintaining stability and predictive accuracy under complex conditions.To address these limitations,this study proposes an online prediction approach that integrates event tracking sensitivity analysis with machine learning.Specifically,a real-time event tracking sensitivity analysis method is employed to capture and quantify the impact of key events on system outputs.On this basis,a mutualinformation–based self-extraction mechanism is introduced to construct prior weights,which are then incorporated into a LightGBM prediction model.Furthermore,iterative optimization of the feature selection threshold is performed to enhance both stability and accuracy.Experiments on composite microsensor data demonstrate that the proposed method achieves robust and efficient real-time prediction,with potential extension to industrial monitoring and control applications.
文摘Metabolic endoscopy represents a promising alternative in the management of steatotic liver disease,particularly metabolic dysfunction-associated steatohep-atitis(MASH),a progressive form of metabolic dysfunction-associated steatotic liver disease(MASLD).With the rising global prevalence of MASLD—affecting over one-third of the adult population—and its close association with obesity,insulin resistance,and metabolic syndrome,there is an urgent need for inno-vative,minimally invasive therapies that can reverse liver fibrosis and prevent progression to cirrhosis and hepatocellular carcinoma.Traditional management of MASLD relies on lifestyle modifications and bariatric surgery,yet these app-roaches are hampered by issues of adherence,invasiveness,and accessibility.This review examines endoscopic bariatric metabolic therapies including endoscopic sleeve gastroplasty(ESG),intragastric balloons(IGB),duodenal mucosal resur-facing(DMR),and duodeno-jejunal bypass liners(DJBL),as well as revisional procedures like endoscopic revisional gastroplasty(ERG)and transoral outlet reduction(TORe).Clinical studies and meta-analyses indicate that metabolic en-doscopy is safe and effective for liver fibrosis in MASH.ESG appears to offer the greatest fibrosis reduction,while IGB and DJBL yield modest improvements,and DMR shows no significant effect.Among revisional therapies,ERG has dem-onstrated fibrosis reduction,although the benefits of TORe remain to be fully evaluated.
文摘Bangladesh is a subtropical monsoon climate characterized by wide seasonal variations in rainfall, moderately warm temperatures, and high humidity. Rainfall is the main source of irrigation water everywhere in the Bangladesh where the inhabitants derive their income primarily from farming. Stochastic rainfall models were concerned with the occurrence of wet day and depth of rainfall for different regions to model the daily occurrence of rainfall and achieved satisfactory results around the world. In connection to the Markov chain of different order, logistic regression is conducted to visualize the dependence of current rainfall upon the rainfall of previous two-time period. It had been shown that wet day of the previous two time period compared to the dry day of previous two time period influences positively the wet day of current time period, that is the dependency of dry-wet spell for the occurrence of rain in the rainy season from April to September in the study area. Daily data are collected from meteorological department of about 26 years on rainfall of Dhaka station during the period January 1985-August 2011 to conduct the study. The test result shows that the occurrence of rainfall follows a second order Markov chain and logistic regression also tells that dry followed by dry and wet followed by wet is more likely for the rainfall of Dhaka station and also the model could perform adequately for many applications of rainfall data satisfactorily.
文摘Least Absolute Shrinkage and Selection Operator (LASSO) is used for variable selection as well as for handling the multicollinearity problem simultaneously in the linear regression model. LASSO produces estimates having high variance if the number of predictors is higher than the number of observations and if high multicollinearity exists among the predictor variables. To handle this problem, Elastic Net (ENet) estimator was introduced by combining LASSO and Ridge estimator (RE). The solutions of LASSO and ENet have been obtained using Least Angle Regression (LARS) and LARS-EN algorithms, respectively. In this article, we proposed an alternative algorithm to overcome the issues in LASSO that can be combined LASSO with other exiting biased estimators namely Almost Unbiased Ridge Estimator (AURE), Liu Estimator (LE), Almost Unbiased Liu Estimator (AULE), Principal Component Regression Estimator (PCRE), r-k class estimator and r-d class estimator. Further, we examine the performance of the proposed algorithm using a Monte-Carlo simulation study and real-world examples. The results showed that the LARS-rk and LARS-rd algorithms,?which are combined LASSO with r-k class estimator and r-d class estimator,?outperformed other algorithms under the moderated and severe multicollinearity.
基金the National Natural Science Foundation of China(No.60905066)the Natural Science Foundation of Chongqing(No.cstc2018jcyjA0667)
文摘As optimization of parameters affects prediction accuracy and generalization ability of support vector regression(SVR) greatly and the predictive model often mismatches nonlinear system model predictive control,a multi-step model predictive control based on online SVR(OSVR) optimized by multi-agent particle swarm optimization algorithm(MAPSO) is put forward. By integrating the online learning ability of OSVR, the predictive model can self-correct and adapt to the dynamic changes in nonlinear process well.