This paper considers approaches and methods for reducing the influence of multicollinearity. Particular attention is paid to the use of shrinkage estimators for this purpose. Two classes of regression models are investigated: the first corresponds to systems with negative feedback, while the second represents systems without feedback. In the first case the use of shrinkage estimators, especially the Principal Component estimator, is inappropriate; in the second case it is possible, given the right choice of the regularization parameter or of the number of principal components included in the regression model. This is substantiated by a study of the distribution of the random variable , where b is the LS estimate and β is the true coefficient, since the form of this distribution is the basic characteristic of the specified classes. For this study, a regression approximation of the distribution of the event based on the Edgeworth series was developed. Alternative approaches to resolving the multicollinearity issue are also examined, including an application of the known Inequality Constrained Least Squares method and the Dual estimator method proposed by the author. It is shown that with a priori information the Euclidean distance between the estimates and the true coefficients can be significantly reduced.
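As an illustration of the shrinkage idea in this abstract, the following minimal sketch compares the least-squares (LS) estimate with a ridge estimate on two nearly collinear predictors. It assumes ridge regression as the shrinkage estimator and uses synthetic data; the function names, the penalty value `lam=1.0` and the simulated model are hypothetical choices for illustration, not the author's method.

```python
import random

def fit_two_predictors(x1, x2, y, lam=0.0):
    """Solve (X'X + lam*I) b = X'y for two predictors without an intercept.

    lam = 0 gives the LS estimate; lam > 0 gives a ridge (shrinkage) estimate.
    """
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

def distance(b, beta):
    """Euclidean distance between an estimate b and a reference vector beta."""
    return sum((u - v) ** 2 for u, v in zip(b, beta)) ** 0.5

# Synthetic data (assumption): x2 is almost a copy of x1, so X'X is ill-conditioned.
random.seed(0)
beta = (1.0, 1.0)
x1 = [random.gauss(0, 1) for _ in range(50)]
x2 = [a + random.gauss(0, 0.01) for a in x1]
y = [beta[0] * a + beta[1] * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

b_ls = fit_two_predictors(x1, x2, y)
b_ridge = fit_two_predictors(x1, x2, y, lam=1.0)
print("LS estimate:   ", b_ls, "distance to beta:", distance(b_ls, beta))
print("Ridge estimate:", b_ridge, "distance to beta:", distance(b_ridge, beta))
```

The ridge estimate always has a norm no larger than the LS estimate. Whether it lies closer to the true coefficients depends on the data and on the penalty, which is the choice of regularization parameter the abstract refers to.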
Power converters are essential components in modern life, being widely used in industry, automation, transportation, and household appliances. In many critical applications, their failure can lead not only to financial losses due to operational downtime but also to serious risks to human safety. The capacitors forming the output filter, typically aluminum electrolytic capacitors (AECs), are among the most critical and susceptible components in power converters. The electrolyte in AECs often evaporates over time, causing the internal resistance to rise and the capacitance to drop, ultimately leading to component failure. Detecting this fault requires measuring the current in the capacitor, which makes the method invasive and frequently impractical because of the spatial constraints or operational limitations imposed by integrating a current sensor in the capacitor branch. This article proposes an online non-invasive fault diagnosis technique for estimating the Equivalent Series Resistance (ESR) and Capacitance (C) values of the capacitor, employing a combination of signal processing techniques (SPT) and machine learning (ML) algorithms. The solution relies solely on the converter's input and output signals, which makes it non-invasive. The ML algorithm used was linear regression, applied to 27 attributes, 21 of which were generated through feature engineering to enhance the model's performance. The proposed solution achieves an R² score greater than 0.99 in the estimation of both ESR and C.
Gastric cancer is the third leading cause of cancer-related mortality and remains a major global health issue [1]. Annually, approximately 479,000 individuals in China are diagnosed with gastric cancer, accounting for almost 45% of all new cases worldwide [2].
This study aimed to create a model for calculating the total reference crop evapotranspiration (ETo) in Mersin Province from May 2015 to 2020 and to generate maps using spatial analysis. Among citrus crops, lemons play a significant role in Mersin's agriculture, and because lemons are sensitive to temperature, ETo is essential for them. It was observed that the ETo value calculated using the Penman-Monteith (PM) method (EToPM) increased over the years. A model was developed using data from 36 Automated Weather Observing Systems (AWOS) in Mersin, Turkiye, which is located in a semi-arid climate zone. The model was created using Multiple Linear Regression (MLR) and artificial neural network (ANN) methods. The station climate data were divided into training and test datasets separately and collectively, and ETo values were estimated with different combinations using three scenarios and six model constructs. The dataset was divided into training (2015-2018) and testing (2019-2020) periods. ANN1 and MLR1 are analyses of individual AWOS, while the other models are analyses of all AWOS together. The statistical performance analysis compared the R², Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE) values. The results indicated that the ANN1 (0.9997, 0.0105, 0.2718%, and 0.0162, respectively) and ANN2 (0.9958, 0.0678, 1.5341%, and 0.0864, respectively) models performed well statistically for both individual and combined AWOS. The models were visually evaluated using the Inverse Distance Weighting (IDW) interpolation method, and maps of plant water consumption were generated. The relationships between both models and years in the monthly total ETo maps allowed for a clearer comparison.
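The four performance measures named in this abstract can be computed as in the sketch below. It assumes R² is the coefficient of determination (the study may instead have used a squared correlation coefficient), and the function name and sample values are hypothetical.

```python
import math

def regression_metrics(observed, predicted):
    """Return R2, MAE, MAPE (in percent) and RMSE for paired values.

    Assumes observed values are non-zero (needed for MAPE) and not all equal
    (needed for R2).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n,
        "RMSE": math.sqrt(ss_res / n),
    }

# Made-up daily ETo values in mm, for illustration only.
observed = [4.1, 5.0, 6.2, 5.5]
predicted = [4.0, 5.1, 6.0, 5.6]
print(regression_metrics(observed, predicted))
```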
Starting from the main assumptions of the Rouse formula, this paper analyzes the applicability of the Rouse distribution in the coastal region. Based on the classical Rouse formula, its linear form, and the transport characteristics of offshore sediment, ln z/h, ln c_a, c_a, u, ln u and z/h were taken as the independent variables. The multiple linear regression method was used to analyze the influence of the independent variables on the vertical distribution of sediment concentration. Using a significance test, the factor with less influence on sediment concentration (ln u) was eliminated from the six variables. The correlation coefficient between the calculated and measured sediment concentrations indicates that the adopted variables can reflect the vertical concentration distribution of fine nearshore sediment under complex dynamic conditions.
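For orientation, the classical Rouse profile and its logarithmic (linear) form are written out below in standard textbook notation. The symbols a (reference height), w_s (settling velocity), κ (von Kármán constant) and u_* (shear velocity) are assumptions of this sketch and do not appear in the abstract; the exact variable set used in the paper's regression may differ.

```latex
% Classical Rouse profile: concentration c at height z above the bed,
% water depth h, reference concentration c_a at reference height a.
\[
  \frac{c(z)}{c_a} = \left[ \frac{h - z}{z} \cdot \frac{a}{h - a} \right]^{P},
  \qquad P = \frac{w_s}{\kappa u_*}
\]
% Taking logarithms gives a relation that is linear in ln((h - z)/z),
% which is what makes a linear regression on log-transformed variables natural.
\[
  \ln c(z) = \ln c_a + P \ln\frac{a}{h - a} + P \ln\frac{h - z}{z}
\]
```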
The rock matrix bulk modulus or its inverse, the compressive coefficient, is an important input parameter for fluid substitution by the Biot-Gassmann equation in reservoir prediction. However, it is not easy to estimate the bulk modulus accurately using conventional methods. In this paper, we present a new linear regression equation for calculating the parameter. To obtain this equation, we first derive a simplified Gassmann equation under the reasonable assumption that the compressive coefficient of the saturated pore fluid is much greater than that of the rock matrix, and second, we use the Eshelby-Walsh relation to replace the equivalent modulus of a dry rock in the Gassmann equation. Results from the rock physics analysis of rock samples from a carbonate area show that the rock matrix compressive coefficients calculated from water-saturated and dry rock samples using the linear regression method are very close (their error is less than 1%). This means the new method is accurate and reliable.
As traditional methods and technical means cannot meet the needs of quantitative research on urban land use patterns, quantitative research methods for urban land use patterns are established using the GIS (geographic information system) technique combined with related theories and models. Taking the city of Nanjing as an example, a spatial database of urban land use and other environmental and socio-economic data is constructed. A multiple linear regression model is developed to determine the statistically significant factors affecting the distribution of residential land use. To explain the spatial variations of urban land use patterns, geographically weighted regression (GWR) is employed to establish spatial associations between these significant factors and the distribution of urban residential land use. The results demonstrate that GWR provides an effective approach to exploring the spatial patterns of urban land use and also provides useful spatial information for planning residential development and other types of urban land use.
Siberian-Arctic heatwaves (SAHs) disrupt ecosystems by increasing wildfires, thawing permafrost, and threatening Arctic communities. As SAHs become more frequent and intense, accurate prediction is crucial for preparedness and for mitigating their impacts. We demonstrate that April surface temperatures in the Siberian Arctic can be predicted one month in advance with a skill of 0.75 (1979-2022) using a regression model based on Arctic stratospheric ozone, the Arctic Oscillation, and sea ice in the Kara Sea. This model successfully predicts six of seven SAHs, identifying three driven by extreme ozone depletion and three by significant sea-ice loss. Additionally, from 1979 to 1997, warming was primarily caused by ozone depletion, while from 1998 to 2022, sea-ice loss became the main factor. Our findings indicate that SAHs are predictable, and we recommend this model for real-time monitoring and forecasting, highlighting its potential to enhance preparedness and reduce adverse effects.
Few studies compare regression methods with machine learning (ML) methods in terms of their ability to interpret and describe how the components of a bituminous mixture affect mechanistic performance. At the same time, artificial intelligence (AI)-driven approaches are becoming more popular in analysing asphalt mixtures. Consequently, this study presents a comparison of AI and statistical approaches for predicting bituminous mixture properties such as stiffness, fatigue resistance, and tensile strength. Important input features include bitumen content, crumb rubber content, and air void content. The research uses a random forest model (RFM), a linear regression model (LRM), and a polynomial regression model (PRM). RFM and PRM achieved an R² as high as 0.94, with a mean absolute error (MAE) of less than 2.5, and are therefore good predictive models. Interestingly, RFM works best in one-third of instances, particularly when dealing with outliers, whereas traditional statistical models work better in two-thirds of instances. The results highlight AI's value in bituminous mixture optimisation, where RFM showed good prediction accuracy. In 30% of the cases, AI models outperformed the conventional statistical approaches. At the same time, the analyses show that model performance varies significantly between scenarios and that, even if AI models capture complex nonlinear relationships, they must not override design of experiments (DOE) principles.
AIM: To evaluate long-term visual field (VF) prediction using K-means clustering in patients with primary open angle glaucoma (POAG). METHODS: Patients who had undergone at least 10 24-2 VF tests were included in this study. Using 52 total deviation values (TDVs) from the first 10 VF tests of the training dataset, VF points were clustered into several regions using hierarchical ordered partitioning and collapsing hybrid (HOPACH) and K-means clustering. Based on the clustering results, a linear regression analysis was applied to each clustered region of the testing dataset to predict the TDVs of the 10th VF test. Three to nine VF tests were used to predict the 10th VF test, and the prediction errors (root mean square error, RMSE) of each clustering method and of pointwise linear regression (PLR) were compared. RESULTS: The training group consisted of 228 patients (mean age, 54.20 ± 14.38 y; 123 males and 105 females), and the testing group included 81 patients (mean age, 54.88 ± 15.22 y; 43 males and 38 females). All subjects were diagnosed with POAG. Fifty-two VF points were clustered into 11 and nine regions using HOPACH and K-means clustering, respectively. K-means clustering had a lower prediction error than PLR when n=1:3 and 1:4 (both P≤0.003). The prediction errors of K-means clustering were lower than those of HOPACH in all sections (n=1:4 to 1:9; all P≤0.011), except for n=1:3 (P=0.680). PLR outperformed K-means clustering only when n=1:8 and 1:9 (both P≤0.020). CONCLUSION: K-means clustering can predict long-term VF test results more accurately in patients with POAG with limited VF data.
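The prediction step described in METHODS can be sketched as a straight-line fit to the early tests that is extrapolated to the 10th test. The sketch below assumes that a cluster is summarized by the mean TDV of its points at each test before the line is fitted; that averaging step, the function names and the sample values are assumptions for illustration, since the abstract does not state how the regression was applied within a region.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_test(series, target_test):
    """Pointwise prediction: fit TDV against test index (1, 2, ...) and extrapolate."""
    tests = list(range(1, len(series) + 1))
    slope, intercept = fit_line(tests, series)
    return slope * target_test + intercept

def predict_cluster(point_series, target_test):
    """Cluster-wise prediction: average the points of one cluster at each test,
    then extrapolate the averaged series (an assumed reading of the method)."""
    n_tests = len(point_series[0])
    cluster_mean = [
        sum(p[i] for p in point_series) / len(point_series) for i in range(n_tests)
    ]
    return predict_test(cluster_mean, target_test)

# Made-up TDVs (dB) for two VF points in the same cluster over the first four tests.
cluster = [[-1.0, -1.5, -2.2, -2.4], [-2.0, -2.1, -3.0, -3.3]]
print("Pointwise prediction for test 10:", predict_test(cluster[0], 10))
print("Cluster prediction for test 10:  ", predict_cluster(cluster, 10))
```

Averaging within a cluster reduces the noise of individual points, which is one plausible reason a clustered fit could do better than a pointwise fit when only a few tests are available.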
Using the method of stepwise multivariate linear regression (SMLR), the quantitative structure-activity relationships (QSAR) of two isomeric series of taxol and its derivatives have been studied. It was found that the molar refractivity of the C3′ substituent of the C13 side chain correlates significantly with activity. We deduce that structural changes in the C3′ substituents may be critical to the anticancer function. This finding should be useful for the design and synthesis of taxol-like compounds with improved activity.
Rainfall is an important factor in estimating the event mean concentration (EMC), which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads can also be calculated using rainfall, catchment area and runoff coefficient. In this study, runoff quantity and quality data gathered during 28 months of monitoring at road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as a function of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models reflect the high uncertainty of NPSs; accurate estimation of EMCs and loads may require water quality sampling, or long-term monitoring to gather more data for the development of estimation models.
In this paper, based on the theory of parameter estimation, we give a selection method that we consider very reasonable in the sense that it leads to parameter estimates with good properties. Moreover, we give a method for calculating the selection statistic and an applied example.
The method for constructing the background value in the original multi-variable grey model (MGM(1,m)) is improved by addressing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors in the observational data of all variables, and a prediction model combining it with multiple linear regression is established in order to improve simulation and prediction accuracy. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed for an example. The results show that the model performs well in simulation and prediction.
Many properties of fruit are influenced by plant nutrition. Fruit firmness is one of the most important fruit characteristics and determines the post-harvest life of the fruit. In recent decades, artificial intelligence systems have been employed to develop predictive models for estimating and predicting many agricultural processes. In the present study, the predictive capabilities of multiple linear regression (MLR) and artificial neural networks (ANNs) are evaluated for estimating fruit firmness at six months, using four input datasets: individual nutrient concentrations (nitrogen (N), potassium (K), calcium (Ca) and magnesium (Mg)) alone (P1), a combination of nutrient concentrations (P2), nutrient concentration ratios alone (P3), and a combination of nutrient concentrations and nutrient concentration ratios (P4). The results showed that the MLR model estimated fruit firmness more accurately than the ANN model for three datasets (P1, P2 and P4). However, with P3 (N/Ca ratio) as the input dataset, the ANN model predicted fruit firmness better than the MLR model. The correlation coefficient and root mean squared error (RMSE) between the measured data and the data estimated by the ANN model were 0.850 and 0.539, respectively. Generally, the ANN model showed greater potential in determining the relationship between six-month fruit firmness and nutrient concentrations.
The multiple linear regression (MLR) method was applied to quantify the effects of the net heat flux (NHF), the net freshwater flux (NFF) and the wind stress on the mixed layer depth (MLD) of the South China Sea (SCS), based on the simple ocean data assimilation (SODA) dataset. The spatio-temporal distributions of the MLD, the buoyancy flux (combining the NHF and the NFF) and the wind stress of the SCS were presented. Then, using an oceanic vertical mixing model, the MLD after a certain time was simulated under the same initial conditions but various combinations of boundary conditions (the three factors). Applying the MLR method to the results, regression equations modeling the relationship between the simulated MLD and the three factors were calculated. The equations indicate that when the NHF was negative, it was the primary driver of mixed layer deepening; when the NHF was positive, the wind stress played a more important role than the NHF, while the NFF had the least effect. When the NHF was positive, the relative quantitative effects of the wind stress, the NHF, and the NFF were about 10, 6 and 2. These conclusions were applied to explain the spatio-temporal distributions of the MLD in the SCS and were thus shown to be valid.
In agricultural systems, regular monitoring of Soil Organic Matter (SOM) dynamics is essential. This task is costly and time-consuming with the conventional method, especially in a very fragmented area with intensive agricultural activity, such as the area of Sidi Bennour. The study area is located in the Doukkala irrigated perimeter in Morocco. Satellite data can provide an alternative and fill this gap at a low cost. Models for predicting SOM from a satellite image, whether linear or nonlinear, have attracted considerable interest. This study aims to compare SOM prediction using Multiple Linear Regression (MLR) and Artificial Neural Networks (ANN). A total of 368 points were collected at a depth of 0-30 cm and analyzed in the laboratory. An image at 15 m resolution (MSPAN) was produced from a 30 m resolution (MS) Landsat-8 image by pansharpening with the panchromatic band (15 m). The results show that the MLR models predicted SOM with (training/validation) R² values of 0.62/0.63 and 0.64/0.65 and RMSE values of 0.23/0.22 and 0.22/0.21 for the MS and MSPAN images, respectively. In contrast, the ANN models predicted SOM with R² values of 0.65/0.66 and 0.69/0.71 and RMSE values of 0.22/0.10 and 0.21/0.18 for the MS and MSPAN images, respectively. Image pansharpening improved the prediction accuracy by 2.60% and 4.30% and reduced the estimation error by 0.80% and 1.30% for the MLR and ANN models, respectively.
The mechanical properties of TiAl alloy prepared by directional solidification were predicted with a machine learning model. The composition, input power, and pulling speed were chosen as input variables representing the factors that influence mechanical properties, and multiple linear regression analysis was conducted on data collected from the literature. In this study, the R² value of the prediction was 0.7159 for tensile strength, 0.8459 for elongation, 0.7573 for nanoindentation hardness, and 0.9674 for interlamellar spacing. As the R² value obtained for elongation was higher than that for tensile strength, it was confirmed that elongation is more closely related to the input variables (composition, input power, pulling speed) than tensile strength is. When elongation was added as an input variable for predicting tensile strength, the R² value increased further. The tensile test prediction results were divided into four groups: the group with the lowest residual value (predicted value minus actual value) was designated group A, and the group with the largest residual value was designated group D. Comparing group A with group D, more overpredictions occurred in group A, while more underpredictions occurred in group D. Using the residuals and R² values, the causes of accurate prediction were studied, and through this the relationship between the mechanical properties and the microstructure was quantitatively investigated.
In this paper, a quantitative structure-activity relationship (QSAR) study was performed to predict the acute toxicity of aromatic amines. A set of 56 compounds was randomly divided into a training set of 46 compounds and a test set of 10 compounds. The electronic and topological descriptors computed by the Scigress package and Dragon software were used as predictor variables. Multiple linear regression (MLR) and support vector machine (SVM) were used to build the linear and nonlinear QSAR models, respectively. The resulting models with five descriptors show strong predictive ability. The linear model fits the training set with R² = 0.71, while the SVM model gives a higher value of R² = 0.77. The validation results obtained from the test set indicate that the SVM model is comparable or superior to the MLR model in terms of both prediction ability and robustness.
A class of estimators of the mean survival time with interval-censored data is studied using the unbiased transformation method. The estimators are constructed from the observations so as to ensure unbiasedness, in the sense that the estimators in a certain class have the same expectation as the mean survival time. The estimators have good properties such as strong consistency (with the rate O(n^(-1/2) (log log n)^(1/2))) and asymptotic normality. The application to linear regression is considered and simulation results are reported.
Funding (gastric cancer study): Supported by the Natural Science Foundation of Shanghai (23ZR1463600), the Shanghai Pudong New Area Health Commission Research Project (PW2021A-69), and the Research Project of the Clinical Research Center of Shanghai Health Medical University (22MC2022002).
基金supported by the National Nature Science Foundation of China (Grant Noss 40739907 and 40774064)National Science and Technology Major Project (Grant No. 2008ZX05025-003)
Abstract: The rock matrix bulk modulus or its inverse, the compressive coefficient, is an important input parameter for fluid substitution by the Biot-Gassmann equation in reservoir prediction. However, it is not easy to estimate the bulk modulus accurately using conventional methods. In this paper, we present a new linear regression equation for calculating this parameter. To obtain the equation, we first derive a simplified Gassmann equation under the reasonable assumption that the compressive coefficient of the saturated pore fluid is much greater than that of the rock matrix; second, we use the Eshelby-Walsh relation to replace the equivalent modulus of a dry rock in the Gassmann equation. Results from the rock physics analysis of rock samples from a carbonate area show that the rock matrix compressive coefficients calculated with water-saturated and dry rock samples using the linear regression method are very close (their error is less than 1%). This means the new method is accurate and reliable.
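For orientation, the standard textbook form of the Gassmann equation that this abstract starts from can be written as below. The symbols are assumed notation, not necessarily the paper's: $K_{sat}$ is the saturated-rock bulk modulus, $K_{dry}$ the dry-rock modulus, $K_m$ the rock matrix modulus, $K_f$ the pore-fluid modulus, and $\phi$ the porosity. The simplified equation and the regression form derived in the paper are not given in the abstract, so they are not reproduced here.

```latex
K_{sat} = K_{dry} + \frac{\left(1 - \dfrac{K_{dry}}{K_m}\right)^{2}}
                         {\dfrac{\phi}{K_f} + \dfrac{1-\phi}{K_m} - \dfrac{K_{dry}}{K_m^{2}}},
\qquad
C_m = \frac{1}{K_m}, \quad C_f = \frac{1}{K_f}
```

Here $C_m$ and $C_f$ are the compressive coefficients (compressibilities) of the matrix and the fluid; the assumption stated in the abstract corresponds to $C_f \gg C_m$.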
Funding: The National Natural Science Foundation of China (No. 51378099)
Abstract: As traditional methods and technical means cannot meet the needs of quantitative research on urban land use patterns, quantitative research methods for the urban land use pattern are established via the GIS (geographic information system) technique combined with related theories and models. Taking the city of Nanjing as an example, a spatial database of urban land use and other environmental and socio-economic data is constructed. A multiple linear regression model is developed to determine the statistically significant factors affecting residential land use distributions. To explain the spatial variations of urban land use patterns, geographically weighted regression (GWR) is employed to establish spatial associations between these significant factors and the distribution of urban residential land use. The results demonstrate that GWR provides an effective approach to exploring the spatial patterns of urban land use and also provides useful spatial information for planning residential development and other types of urban land use.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2023YFF0805104), the National Natural Science Foundation of China (NSFC) under Grant Nos. 41925022, 42105016 and 42375070, the NSFC under Grant No. 41888101, and the Natural Sciences and Engineering Research Council of Canada (Grant No. RGPIN-2019-04511).
Abstract: Siberian-Arctic heatwaves (SAHs) disrupt ecosystems by increasing wildfires, thawing permafrost, and threatening Arctic communities. As SAHs become more frequent and intense, accurate prediction is crucial for preparedness and for mitigating their impacts. We demonstrate that April surface temperatures in the Siberian Arctic can be predicted one month in advance with a skill of 0.75 (1979-2022) using a regression model based on Arctic stratospheric ozone, the Arctic Oscillation, and sea ice in the Kara Sea. This model successfully predicts six of seven SAHs, identifying three driven by extreme ozone depletion and three by significant sea-ice loss. Additionally, from 1979 to 1997, warming was primarily caused by ozone depletion, while from 1998 to 2022, sea-ice loss became the main factor. Our findings indicate that SAHs are predictable, and we recommend this model for real-time monitoring and forecasting, highlighting its potential to enhance preparedness and reduce adverse effects.
Funding: sustained them with this research (including Eng. Giuseppe Colicchio) and the European Commission for its financial contribution to the LIFE SILENT project “Sustainable Innovations for Long-life Environmental Noise Technologies” (LIFE22-ENV-IT-LIFE-SILENT/101114310. Acronym: LIFE22-ENV-ITLIFE SILENT) and the LIFE SNEAK Project “Optimised Surfaces Against Noise and Vibrations Produced by Tramway Track and Road Traffic” (LIFE20 ENV/IT/000181. Acronym: LIFE SNEAK).
Abstract: Few studies compare regression methods and machine learning (ML) methods in terms of their ability to interpret and describe how the components of a bituminous mixture affect mechanistic performance, even as artificial intelligence (AI)-driven approaches become more popular in analysing asphalt mixtures. Consequently, this study presents a comparison of AI and statistical approaches for predicting bituminous mixture properties such as stiffness, fatigue resistance, and tensile strength. Important input features include bitumen content, crumb rubber content, and air void content. The research uses a random forest model (RFM), a linear regression model (LRM), and a polynomial regression model (PRM). RFM and PRM achieved an R^(2) as high as 0.94, with a mean absolute error (MAE) of less than 2.5, and are therefore good predictive models. Interestingly, RFM works best in one-third of instances, particularly when dealing with outliers, whereas traditional statistical models work better in two-thirds of instances. The results highlight AI's value in bituminous mixture optimisation, where RFM showed good prediction accuracy. In 30% of the cases, AI models outperformed the conventional statistical approaches. At the same time, the analyses show that model performance varies significantly across scenarios and that, even if AI models capture complex nonlinear relationships, they must not override DOE principles.
Funding: Supported by the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), the Ministry of Health & Welfare, Republic of Korea (No. RS-2020-KH088726); the Patient-Centered Clinical Research Coordinating Center (PACEN), the Ministry of Health and Welfare, Republic of Korea (No. HC19C0276); and the National Research Foundation of Korea (NRF), the Korea Government (MSIT) (No. RS-2023-00247504).
Abstract: AIM: To evaluate long-term visual field (VF) prediction using K-means clustering in patients with primary open angle glaucoma (POAG). METHODS: Patients who had undergone at least ten 24-2 VF tests were included in this study. Using 52 total deviation values (TDVs) from the first 10 VF tests of the training dataset, VF points were clustered into several regions using the hierarchical ordered partitioning and collapsing hybrid (HOPACH) and K-means clustering. Based on the clustering results, a linear regression analysis was applied to each clustered region of the testing dataset to predict the TDVs of the 10th VF test. Three to nine VF tests were used to predict the 10th VF test, and the prediction errors (root mean square error, RMSE) of each clustering method and of pointwise linear regression (PLR) were compared. RESULTS: The training group consisted of 228 patients (mean age, 54.20±14.38 y; 123 males and 105 females), and the testing group included 81 patients (mean age, 54.88±15.22 y; 43 males and 38 females). All subjects were diagnosed with POAG. Fifty-two VF points were clustered into 11 and nine regions using HOPACH and K-means clustering, respectively. K-means clustering had a lower prediction error than PLR when n=1:3 and 1:4 (both P≤0.003). The prediction errors of K-means clustering were lower than those of HOPACH in all sections (n=1:4 to 1:9; all P≤0.011), except for n=1:3 (P=0.680). PLR outperformed K-means clustering only when n=1:8 and 1:9 (both P≤0.020). CONCLUSION: K-means clustering can predict long-term VF test results more accurately in patients with POAG with limited VF data.
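The pointwise linear regression (PLR) baseline described above, fitting a line to the first n tests and extrapolating to the 10th, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the test index (1, 2, 3, ...) is used as the regressor in place of elapsed time, the function name `predict_tdv` is hypothetical, and the sample values are made up. The same function could be applied to a cluster's mean TDV instead of a single point, although the abstract does not specify how the regression is applied within a cluster.

```python
def predict_tdv(tdv_series, target_test=10):
    """Extrapolate a total deviation value (TDV) to a later VF test.

    tdv_series holds the TDVs of one VF point at tests 1..n (n >= 2).
    A straight line is fitted by ordinary least squares against the
    test index (an assumption; elapsed time could be used instead)
    and evaluated at target_test.
    """
    n = len(tdv_series)
    if n < 2:
        raise ValueError("at least two tests are needed to fit a line")
    xs = range(1, n + 1)
    x_mean = (n + 1) / 2
    y_mean = sum(tdv_series) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, tdv_series))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept + slope * target_test


# Made-up TDVs (dB) from three tests, extrapolated to the 10th test.
print(predict_tdv([-1.0, -1.5, -2.0]))
```

With only three or four tests, a single point's slope is sensitive to test-retest noise, which is consistent with the abstract's finding that pooling points into clusters helps most when few tests are available.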
Abstract: Using stepwise multivariate linear regression (SMLR), the quantitative structure-activity relationships (QSAR) of two isomeric series of taxol and its derivatives have been studied. It was found that the molar refractivity of the C3′ substituent of the C13 side chain correlates significantly with activity. We deduce that structural changes in the C3′ substituents may be critical to the anticancer function. This finding would be useful for the design and synthesis of taxol-like compounds with improved activities.
Funding: Provided by the Korean Ministry of Environment and Eco Star Project
Abstract: Rainfall is an important factor in estimating the event mean concentration (EMC), which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads can also be calculated using rainfall, catchment area and runoff coefficient. In this study, runoff quantity and quality data gathered from 28 months of monitoring conducted at road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as a function of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models reflect the high uncertainties of NPSs; accurate estimation of EMCs and loads may require water quality sampling, or long-term monitoring may be needed to gather more data for developing estimation models.
Funding: Supported by the Natural Science Foundation of Anhui Education Committee
Abstract: In this paper, based on the theory of parameter estimation, we give a selection method and argue that it is reasonable in the sense that the resulting parameter estimates have good properties. Moreover, we offer a method for calculating the selection statistic and an applied example.
Funding: Supported by the National Natural Science Foundation of China (71071077), the Ministry of Education Key Project of National Educational Science Planning (DFA090215), the China Postdoctoral Science Foundation (20100481137), and the Funding of Jiangsu Innovation Program for Graduate Education (CXZZ11-0226)
Abstract: The construction of the background value in the original multi-variable grey model (MGM(1,m)) is improved by addressing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors in the observational data of all variables, and a combined prediction model incorporating multiple linear regression is established in order to improve simulation and prediction accuracy. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed for an example. The results show that the model performs well in simulation and prediction.
Abstract: Many properties of fruit are influenced by plant nutrition. Fruit firmness is one of the most important fruit characteristics and determines the post-harvest life of the fruit. In recent decades, artificial intelligence systems have been employed to develop predictive models for estimating and predicting many agricultural processes. In the present study, the predictive capabilities of multiple linear regression (MLR) and artificial neural networks (ANNs) are evaluated for estimating six-month fruit firmness from four input datasets: each of the nutrient concentrations (nitrogen (N), potassium (K), calcium (Ca) and magnesium (Mg)) alone (P1), a combination of nutrient concentrations (P2), nutrient concentration ratios alone (P3), and a combination of nutrient concentrations and nutrient concentration ratios (P4). The results showed that the MLR model estimated fruit firmness more accurately than the ANN model for three datasets (P1, P2 and P4). However, with P3 (N/Ca ratio) as the input dataset, the ANN model predicted fruit firmness better than the MLR model. The correlation coefficient and root mean squared error (RMSE) between the measured data and the data estimated by the ANN model were 0.850 and 0.539, respectively. Generally, the ANN model showed greater potential in determining the relationship between six-month fruit firmness and nutrient concentrations.
Funding: The National Natural Science Foundation of China under contract No. 11174235; the Science and Technology Development Project of Shaanxi Province of China under contract No. 2010KJXX-02; the Program for New Century Excellent Talents in University of China under contract No. NCET-08-0455; the Science and Technology Innovation Foundation of Northwestern Polytechnical University of China; the Doctorate Foundation of Northwestern Polytechnical University of China under contract No. CX201226.
Abstract: The multiple linear regression (MLR) method was applied to quantify the effects of the net heat flux (NHF), the net freshwater flux (NFF) and the wind stress on the mixed layer depth (MLD) of the South China Sea (SCS) based on the simple ocean data assimilation (SODA) dataset. The spatio-temporal distributions of the MLD, the buoyancy flux (combining the NHF and the NFF) and the wind stress of the SCS were presented. Then, using an oceanic vertical mixing model, the MLD after a certain time was simulated under the same initial conditions but various combinations of boundary conditions (the three factors). Applying the MLR method to the results, regression equations that model the relationship between the simulated MLD and the three factors were calculated. The equations indicate that when the NHF was negative, it was the primary driver of mixed layer deepening, and when the NHF was positive, the wind stress played a more important role than the NHF, while the NFF had the least effect. When the NHF was positive, the relative quantitative effects of the wind stress, the NHF, and the NFF were about 10, 6 and 2. These conclusions were applied to explain the spatio-temporal distributions of the MLD in the SCS and were thus shown to be valid.
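The regression step in the abstract above, relating a response (the simulated MLD) to three factors, amounts to ordinary least squares with three predictors. The sketch below solves the normal equations directly in plain Python. The function name `fit_mlr` and all numbers are hypothetical and chosen only so that the known coefficients can be recovered; they are not values from the SODA dataset or the paper.

```python
def fit_mlr(rows, targets):
    """Ordinary least squares for y = b0 + b1*x1 + ... + bk*xk.

    Solves the normal equations (X^T X) b = X^T y by Gaussian
    elimination with partial pivoting and returns [b0, b1, ..., bk].
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(design[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(p)]
        + [sum(r[i] * y for r, y in zip(design, targets))]
        for i in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]
    coef = [0.0] * p
    for i in reversed(range(p)):          # back substitution
        tail = sum(aug[i][j] * coef[j] for j in range(i + 1, p))
        coef[i] = (aug[i][p] - tail) / aug[i][i]
    return coef


# Made-up predictors (wind stress, NHF, NFF) and a response generated
# exactly from y = 1 + 2*x1 - 3*x2 + 0.5*x3, for illustration only.
rows = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1), (2, 1, 0)]
targets = [1 + 2 * a - 3 * b + 0.5 * c for a, b, c in rows]
print(fit_mlr(rows, targets))
```

Raw coefficients depend on the units of each predictor, so comparing the relative effects of factors, as the abstract does, would normally require standardizing the predictors first; whether the authors did so is not stated.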
Abstract: In agricultural systems, the regular monitoring of Soil Organic Matter (SOM) dynamics is essential. This task is costly and time-consuming when using the conventional method, especially in a very fragmented area with intensive agricultural activity, such as the area of Sidi Bennour. The study area is located in the Doukkala irrigated perimeter in Morocco. Satellite data can provide an alternative and fill this gap at a low cost. Models that predict SOM from a satellite image, whether linear or nonlinear, have attracted considerable interest. This study aims to compare SOM prediction using Multiple Linear Regression (MLR) and Artificial Neural Networks (ANN). A total of 368 points were collected at a depth of 0-30 cm and analyzed in the laboratory. An image at 15 m resolution (MSPAN) was produced from a 30 m resolution (MS) Landsat-8 image using image pansharpening and the panchromatic band (15 m). The results show that the MLR models predicted SOM with (training/validation) R^(2) values of 0.62/0.63 and 0.64/0.65 and RMSE values of 0.23/0.22 and 0.22/0.21 for the MS and MSPAN images, respectively. In contrast, the ANN models predicted SOM with R^(2) values of 0.65/0.66 and 0.69/0.71 and RMSE values of 0.22/0.10 and 0.21/0.18 for the MS and MSPAN images, respectively. Image pansharpening improved the prediction accuracy by 2.60% and 4.30% and reduced the estimation error by 0.80% and 1.30% for the MLR and ANN models, respectively.
Funding: Financially supported by the National Natural Science Foundation of China (Nos. 51671072 and 51471062).
Abstract: The mechanical properties of TiAl alloy prepared by directional solidification were predicted using a machine learning model. The composition, input power, and pulling speed were designated as input variables, as representative factors influencing mechanical properties, and multiple linear regression analysis was conducted on data collected from the literature. In this study, the R^(2) value of the prediction was 0.7159 for tensile strength, 0.8459 for elongation, 0.7573 for nanoindentation hardness, and 0.9674 for interlamellar spacing. As the R^(2) value obtained for elongation was higher than that for tensile strength, it was confirmed that elongation had a closer relationship with the input variables (composition, input power, pulling speed) than tensile strength did. When elongation was added as an input variable for predicting tensile strength, the R^(2) value increased further. The tensile test prediction results were divided into four groups: the group with the lowest residual value (predicted value - actual value) was designated as group A, and the group with the largest residual value was designated as group D. When comparing group A and group D, more overpredictions occurred in group A, while more underpredictions occurred in group D. Using the residuals and R^(2) values, the causes of accurate prediction were studied, and through this, the relationship between the mechanical properties and the microstructure was quantitatively investigated.
Funding: Supported by the Ministry of Environmental Protection of China (No. 2011467037)
Abstract: In the current paper, a quantitative structure-activity relationship (QSAR) study was performed for the prediction of the acute toxicity of aromatic amines. A set of 56 compounds was randomly divided into a training set of 46 compounds and a test set of 10 compounds. The electronic and topological descriptors computed by the Scigress package and Dragon software were used as predictor variables. Multiple linear regression (MLR) and support vector machine (SVM) were utilized to build the linear and nonlinear QSAR models, respectively. The obtained models with five descriptors show strong predictive ability. The linear model fits the training set with R2 = 0.71, while the SVM model achieves a higher value of R2 = 0.77. The validation results obtained from the test set indicate that the SVM model is comparable or superior to the MLR model, both in terms of prediction ability and robustness.
Funding: Supported by the National Natural Science Foundation of China (70171008)
Abstract: A class of estimators of the mean survival time with interval censored data is studied by the unbiased transformation method. The estimators are constructed from the observations so as to ensure unbiasedness, in the sense that the estimators in a certain class have the same expectation as the mean survival time. The estimators have good properties such as strong consistency (with the rate of O(n^(-1/2) (log log n)^(1/2))) and asymptotic normality. The application to linear regression is considered and simulation results are given.