This paper considers approaches and methods for reducing the influence of multicollinearity. Particular attention is paid to the use of shrinkage estimators for this purpose. Two classes of regression models are investigated: the first corresponds to systems with negative feedback, and the second to systems without feedback. In the first case the use of shrinkage estimators, especially the Principal Component estimator, is inappropriate; in the second case it is possible, given the right choice of the regularization parameter or of the number of principal components included in the regression model. This is substantiated by a study of the distribution of a random variable defined in terms of b and β, where b is the LS estimate and β is the true coefficient, since the form of this distribution is the basic characteristic of the specified classes. For this study, a regression approximation of the distribution of the event, based on the Edgeworth series, was developed. Alternative approaches to the multicollinearity issue are also examined, including an application of the known Inequality Constrained Least Squares method and the Dual estimator method proposed by the author. It is shown that with a priori information the Euclidean distance between the estimates and the true coefficients can be significantly reduced.
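To make the idea of a shrinkage estimator with a regularization parameter concrete, here is a minimal sketch of the ordinary ridge estimator for two nearly collinear predictors. It is not the paper's procedure: the function names, the toy data, and the value k = 0.5 are assumptions chosen for illustration, and the model has no intercept.

```python
import math


def solve_2x2(a11, a12, a22, b1, b2):
    """Solve the symmetric system [[a11, a12], [a12, a22]] x = [b1, b2]."""
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def ridge_2d(x1, x2, y, k=0.0):
    """Ridge estimate (X'X + kI)^(-1) X'y for two predictors, no intercept.

    k is the regularization parameter; k = 0 gives the LS estimate.
    """
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    return solve_2x2(s11, s12, s22, t1, t2)


if __name__ == "__main__":
    # Hypothetical data: x2 is almost a copy of x1, so X'X is near-singular.
    x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
    x2 = [1.01, 1.99, 3.02, 3.98, 5.0]
    y = [2.3, 3.8, 6.1, 7.7, 10.2]
    ls = ridge_2d(x1, x2, y, k=0.0)
    rr = ridge_2d(x1, x2, y, k=0.5)
    print("LS estimate:   ", ls, "norm", math.hypot(*ls))
    print("ridge estimate:", rr, "norm", math.hypot(*rr))
```

For any k > 0 the ridge estimate has a smaller Euclidean norm than the LS estimate. Whether that shrinkage moves the estimate closer to the true coefficients depends on the model class, which is the question the abstract addresses.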
Power converters are essential components in modern life, being widely used in industry, automation, transportation, and household appliances. In many critical applications, their failure can lead not only to financial losses due to operational downtime but also to serious risks to human safety. The capacitors forming the output filter, typically aluminum electrolytic capacitors (AECs), are among the most critical and susceptible components in power converters. The electrolyte in AECs often evaporates over time, causing the internal resistance to rise and the capacitance to drop, ultimately leading to component failure. Detecting this fault conventionally requires measuring the current in the capacitor, which makes the method invasive and frequently impractical because of spatial constraints or operational limitations imposed by integrating a current sensor in the capacitor branch. This article proposes an online non-invasive fault diagnosis technique for estimating the Equivalent Series Resistance (ESR) and Capacitance (C) values of the capacitor, employing a combination of signal processing techniques (SPT) and machine learning (ML) algorithms. The solution relies solely on the converter's input and output signals, which makes it non-invasive. The ML algorithm used was linear regression, applied to 27 attributes, 21 of which were generated through feature engineering to enhance the model's performance. The proposed solution achieves an R² score greater than 0.99 in the estimation of both ESR and C.
Gastric cancer is the third leading cause of cancer-related mortality and remains a major global health issue [1]. Annually, approximately 479,000 individuals in China are diagnosed with gastric cancer, accounting for almost 45% of all new cases worldwide [2].
This study aimed to create a model for calculating the total reference crop evapotranspiration (ETo) in Mersin Province from May 2015 to 2020 and to generate maps using spatial analysis. Among citrus crops, lemons play a significant role in Mersin's agriculture, and because lemons are sensitive to temperature, ETo is essential for them. The ETo value calculated using the Penman-Monteith (PM) method (EToPM) was observed to increase over the years. A model was developed using data from 36 Automated Weather Observing Systems (AWOS) in Mersin, Turkiye, which is located in a semi-arid climate zone. The model was created using Multiple Linear Regression (MLR) and artificial neural network (ANN) methods. The station climate data were divided into training and test datasets both separately and collectively, and ETo values were estimated with different combinations using three scenarios and six model constructs. The dataset was divided into training (2015-2018) and testing (2019-2020) periods. ANN1 and MLR1 are analyses of individual AWOS, while the other models are analyses of all AWOS together. The statistical performance analysis compared the R², Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE) values. The results indicated that the ANN1 (0.9997, 0.0105, 0.2718%, and 0.0162, respectively) and ANN2 (0.9958, 0.0678, 1.5341%, and 0.0864, respectively) models predicted ETo successfully for both individual and pooled AWOS. The models were evaluated visually using the Inverse Distance Weighting (IDW) interpolation method, and maps of plant water consumption were generated. The relationships between both models and years in the monthly total ETo maps allowed for a clearer comparison.
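The four performance statistics named above can be computed as in the following sketch. It uses the common textbook definitions and assumes R² means the coefficient of determination; the abstract does not say whether the study used that or the squared correlation coefficient. The function name is hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, MAE, MAPE (in percent) and RMSE for paired sequences.

    Assumes no observed value is zero (MAPE divides by the observations).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(r) for r in residuals) / n,
        "MAPE": 100.0 * sum(abs(r / o) for r, o in zip(residuals, observed)) / n,
        "RMSE": math.sqrt(ss_res / n),
    }


if __name__ == "__main__":
    # Made-up numbers, only to show the call.
    print(regression_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 3.0, 3.0]))
```

MAE and RMSE are in the units of the predicted quantity, while MAPE is a percentage, which matches how the four values are reported for ANN1 and ANN2.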
Starting from the main assumptions of the Rouse Formula, this paper analyzes the applicability of the Rouse distribution in the coastal region. Based on the classical Rouse Formula, its linear form, and the transport characteristics of offshore sediment, ln(z/h), ln c_a, c_a, u, ln u and z/h were taken as the independent variables. The multiple linear regression method was used to analyze the influence of the independent variables on the vertical distribution of sediment concentration. Using a significance test, the factor with less influence on sediment concentration among the six variables (ln u) was eliminated. The correlation coefficient between the calculated and measured sediment concentrations indicates that the adopted variables can reflect the vertical distribution of the concentration of fine sediment near shore under complex dynamic conditions.
The rock matrix bulk modulus or its inverse, the compressive coefficient, is an important input parameter for fluid substitution by the Biot-Gassmann equation in reservoir prediction. However, it is not easy to estimate the bulk modulus accurately using conventional methods. In this paper, we present a new linear regression equation for calculating the parameter. To obtain this equation, we first derive a simplified Gassmann equation under the reasonable assumption that the compressive coefficient of the saturated pore fluid is much greater than that of the rock matrix; second, we use the Eshelby-Walsh relation to replace the equivalent modulus of a dry rock in the Gassmann equation. Results from the rock physics analysis of rock samples from a carbonate area show that the rock matrix compressive coefficients calculated from water-saturated and dry rock samples using the linear regression method are very close (they differ by less than 1%). This indicates that the new method is accurate and reliable.
Because traditional methods and technical means cannot meet the needs of quantitative research on urban land use patterns, quantitative research methods for urban land use patterns are established using the GIS (geographic information system) technique combined with related theories and models. Taking the city of Nanjing as an example, a spatial database of urban land use and other environmental and socio-economic data is constructed. A multiple linear regression model is developed to determine the statistically significant factors affecting residential land use distributions. To explain the spatial variations of urban land use patterns, geographically weighted regression (GWR) is employed to establish spatial associations between these significant factors and the distribution of urban residential land use. The results demonstrate that GWR provides an effective approach to exploring urban land use spatial patterns and also provides useful spatial information for planning residential development and other types of urban land use.
Few studies compare regression methods and machine learning (ML) methods in terms of their ability to interpret and describe how the components of a bituminous mixture affect mechanistic performance. At the same time, artificial intelligence (AI)-driven approaches are becoming more popular in analysing asphalt mixtures, yet comparisons of regression and ML models for interpreting mechanistic performance remain limited. This study therefore compares AI and statistical approaches for predicting bituminous mixture properties such as stiffness, fatigue resistance, and tensile strength. Important input features include bitumen content, crumb rubber content, and air void content. The research uses a random forest model (RFM), a linear regression model (LRM), and a polynomial regression model (PRM). RFM and PRM achieved an R² as high as 0.94, with a mean absolute error (MAE) of less than 2.5, and are therefore good predictive models. Interestingly, RFM works best in one-third of instances, particularly when dealing with outliers, whereas traditional statistical models work better in two-thirds of instances. The results highlight AI's value in bituminous mixture optimisation, where RFM showed good prediction accuracy. In 30% of the cases, AI models outperformed the conventional statistical approaches. At the same time, the analyses show that model performance varies significantly with scenarios and that even if AI models capture complex nonlinear relationships, they must not override DOE principles.
AIM: To evaluate long-term visual field (VF) prediction using K-means clustering in patients with primary open angle glaucoma (POAG). METHODS: Patients who had undergone at least 10 24-2 VF tests were included in this study. Using 52 total deviation values (TDVs) from the first 10 VF tests of the training dataset, VF points were clustered into several regions using the hierarchical ordered partitioning and collapsing hybrid (HOPACH) and K-means clustering. Based on the clustering results, a linear regression analysis was applied to each clustered region of the testing dataset to predict the TDVs of the 10th VF test. Three to nine VF tests were used to predict the 10th VF test, and the prediction errors (root mean square error, RMSE) of each clustering method and of pointwise linear regression (PLR) were compared. RESULTS: The training group consisted of 228 patients (mean age, 54.20±14.38 y; 123 males and 105 females), and the testing group included 81 patients (mean age, 54.88±15.22 y; 43 males and 38 females). All subjects were diagnosed with POAG. Fifty-two VF points were clustered into 11 and nine regions using HOPACH and K-means clustering, respectively. K-means clustering had a lower prediction error than PLR when n=1:3 and 1:4 (both P≤0.003). The prediction errors of K-means clustering were lower than those of HOPACH in all sections (n=1:4 to 1:9; all P≤0.011), except for n=1:3 (P=0.680). PLR outperformed K-means clustering only when n=1:8 and 1:9 (both P≤0.020). CONCLUSION: K-means clustering can predict long-term VF test results more accurately in patients with POAG when VF data are limited.
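The prediction step described in METHODS can be sketched as a straight-line fit over the first n tests, extrapolated to the 10th. This is an illustration under two loud assumptions that are not stated in the abstract: tests are treated as equally spaced (test number stands in for time), and a clustered region is summarised by the mean TDV of its points. All function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_test(series, n_used, target_test=10):
    """Extrapolate a TDV series to `target_test` using its first `n_used` values.

    `series[0]` is the value at test 1; equal spacing is an assumption.
    """
    xs = list(range(1, n_used + 1))
    slope, intercept = fit_line(xs, series[:n_used])
    return intercept + slope * target_test


def cluster_means(tests, clusters):
    """Mean TDV per cluster for each test (assumed regional summary).

    tests:    list of VF tests, each a list of TDVs indexed by VF point
    clusters: list of clusters, each a list of VF point indices
    """
    return [[sum(test[i] for i in idx) / len(idx) for test in tests]
            for idx in clusters]


if __name__ == "__main__":
    # Made-up series for one VF point, declining by 0.5 dB per test.
    point_series = [-1.0, -1.5, -2.0, -2.5, -3.0]
    print(predict_test(point_series, n_used=3))
```

Applying `predict_test` to each of the 52 point series corresponds to pointwise linear regression, while applying it to the output of `cluster_means` gives one prediction per region. Averaging over a region reduces noise when only a few tests are available, which is consistent with the clustering approach doing better for small n.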
Using stepwise multivariate linear regression (SMLR), the quantitative structure-activity relationships (QSAR) of two isomeric series of taxol and its derivatives were studied. The molar refractivity of the C3′ substituent of the C13 side chain was found to correlate significantly with activity. We deduce that structural changes in the C3′ substituents may be critical to the anticancer function. This finding should be useful for the design and synthesis of taxol-like compounds with improved activities.
Rainfall is an important factor in estimating the event mean concentration (EMC), which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads can also be calculated from rainfall, catchment area and runoff coefficient. In this study, runoff quantity and quality data gathered during 28 months of monitoring at road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as a function of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models reflect the high uncertainties of NPSs; accurate estimates of EMCs and loads may require water quality sampling, or long-term monitoring to gather more data for the development of estimation models.
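The relation between load, rainfall, catchment area and runoff coefficient mentioned above can be illustrated with the commonly used mass-balance form, load = EMC × runoff volume, with runoff volume = rainfall depth × area × runoff coefficient. This is a generic sketch, not the regression equations developed in the study; the function names, units and numbers are assumptions.

```python
def runoff_volume_m3(rainfall_mm, area_m2, runoff_coefficient):
    """Event runoff volume in cubic metres.

    rainfall_mm is the total event rainfall depth; the runoff coefficient is
    the dimensionless fraction of rainfall that becomes runoff.
    """
    return (rainfall_mm / 1000.0) * area_m2 * runoff_coefficient


def pollutant_load_g(emc_mg_per_l, rainfall_mm, area_m2, runoff_coefficient):
    """Event pollutant load in grams (1 mg/L equals 1 g/m^3)."""
    volume = runoff_volume_m3(rainfall_mm, area_m2, runoff_coefficient)
    return emc_mg_per_l * volume


if __name__ == "__main__":
    # Hypothetical paved site: 10 mm event, 1000 m^2, coefficient 0.9, EMC 50 mg/L.
    print(pollutant_load_g(50.0, 10.0, 1000.0, 0.9))
```

Because load scales directly with rainfall depth in this relation, it is unsurprising that total event rainfall emerges as a candidate predictor of pollutant load.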
Based on the theory of parameter estimation, this paper gives a selection method that we consider well justified in the sense of good properties of the parameter estimates. We also offer a method for calculating the selection statistic, together with an applied example.
The construction of the background value in the original multi-variable grey model (MGM(1,m)) is improved by addressing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors in the observational data of all variables, and it is combined with multiple linear regression in a combined prediction model in order to improve simulation and prediction accuracy. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed for an example. The results show that the model performs well in both simulation and prediction.
Many properties of fruit are influenced by plant nutrition. Fruit firmness is one of the most important fruit characteristics and determines the post-harvest life of the fruit. In recent decades, artificial intelligence systems have been employed to develop predictive models for many agricultural processes. In the present study, the predictive capabilities of multiple linear regression (MLR) and artificial neural networks (ANNs) are evaluated for estimating fruit firmness at six months from four input datasets: individual nutrient concentrations (nitrogen (N), potassium (K), calcium (Ca) and magnesium (Mg)) alone (P1), a combination of nutrient concentrations (P2), nutrient concentration ratios alone (P3), and a combination of nutrient concentrations and nutrient concentration ratios (P4). The results showed that the MLR model estimated fruit firmness more accurately than the ANN model for three datasets (P1, P2 and P4). However, with P3 (N/Ca ratio) as the input dataset, the ANN model predicted fruit firmness better than the MLR model. The correlation coefficient and root mean squared error (RMSE) between the measured data and the data estimated by the ANN model were 0.850 and 0.539, respectively. Generally, the ANN model showed greater potential in determining the relationship between six-month fruit firmness and nutrient concentration.
The multiple linear regression (MLR) method was applied to quantify the effects of the net heat flux (NHF), the net freshwater flux (NFF) and the wind stress on the mixed layer depth (MLD) of the South China Sea (SCS), based on the simple ocean data assimilation (SODA) dataset. The spatio-temporal distributions of the MLD, the buoyancy flux (combining the NHF and the NFF) and the wind stress of the SCS were presented. Then, using an oceanic vertical mixing model, the MLD after a certain time was simulated under the same initial conditions but various combinations of boundary conditions (the three factors). Applying the MLR method to the results, regression equations that model the relationship between the simulated MLD and the three factors were calculated. The equations indicate that when the NHF was negative, it was the primary driver of mixed layer deepening; when the NHF was positive, the wind stress played a more important role than the NHF, while the NFF had the least effect. When the NHF was positive, the relative quantitative effects of the wind stress, the NHF, and the NFF were about 10, 6 and 2. These conclusions were applied to explain the spatio-temporal distributions of the MLD in the SCS and thus shown to be valid.
In agricultural systems, regular monitoring of Soil Organic Matter (SOM) dynamics is essential. This task is costly and time-consuming with the conventional method, especially in a highly fragmented area with intensive agricultural activity, such as the area of Sidi Bennour. The study area is located in the Doukkala irrigated perimeter in Morocco. Satellite data can provide an alternative and fill this gap at low cost. Models that predict SOM from a satellite image, whether linear or nonlinear, have attracted considerable interest. This study aims to compare SOM prediction using Multiple Linear Regression (MLR) and Artificial Neural Networks (ANN). A total of 368 points were collected at a depth of 0-30 cm and analyzed in the laboratory. An image at 15 m resolution (MSPAN) was produced from a 30 m resolution (MS) Landsat-8 image using pansharpening with the panchromatic band (15 m). The results show that the MLR models predicted SOM with (training/validation) R² values of 0.62/0.63 and 0.64/0.65 and RMSE values of 0.23/0.22 and 0.22/0.21 for the MS and MSPAN images, respectively. In contrast, the ANN models predicted SOM with R² values of 0.65/0.66 and 0.69/0.71 and RMSE values of 0.22/0.10 and 0.21/0.18 for the MS and MSPAN images, respectively. Image pansharpening improved the prediction accuracy by 2.60% and 4.30% and reduced the estimation error by 0.80% and 1.30% for the MLR and ANN models, respectively.
The mechanical properties of TiAl alloy prepared by directional solidification were predicted with a machine learning model. The composition, input power, and pulling speed were designated as input variables, as representative factors influencing mechanical properties, and multiple linear regression analysis was conducted on data collected from the literature. In this study, the R² value of the prediction was 0.7159 for tensile strength, 0.8459 for elongation, 0.7573 for nanoindentation hardness, and 0.9674 for interlamellar spacing. Because the R² value for elongation was higher than that for tensile strength, it was confirmed that elongation is more closely related to the input variables (composition, input power, pulling speed) than tensile strength is. Adding elongation as an input variable for predicting tensile strength increased the R² value further. The tensile test prediction results were divided into four groups: the group with the lowest residual value (predicted value minus actual value) was designated group A, and the group with the largest residual value was designated group D. Comparing groups A and D, more overpredictions occurred in group A, while more underpredictions occurred in group D. Using the residuals and R² values, the reasons for accurate predictions were studied, and through this the relationship between the mechanical properties and the microstructure was investigated quantitatively.
In this paper, a quantitative structure-activity relationship (QSAR) study was performed to predict the acute toxicity of aromatic amines. A set of 56 compounds was randomly divided into a training set of 46 compounds and a test set of 10 compounds. Electronic and topological descriptors computed by the Scigress package and Dragon software were used as predictor variables. Multiple linear regression (MLR) and support vector machine (SVM) were used to build the linear and nonlinear QSAR models, respectively. The resulting five-descriptor models show strong predictive ability. The linear model fits the training set with R² = 0.71, and the SVM model gives a higher value of R² = 0.77. The validation results obtained from the test set indicate that the SVM model is comparable or superior to the MLR model in terms of both prediction ability and robustness.
A class of estimators of the mean survival time with interval censored data is studied by the unbiased transformation method. The estimators are constructed from the observations so as to ensure unbiasedness, in the sense that the estimators in a certain class have the same expectation as the mean survival time. The estimators have good properties such as strong consistency (with rate O(n^(-1/2)(log log n)^(1/2))) and asymptotic normality. The application to linear regression is considered and simulation results are reported.
Understanding the spatial-temporal dynamics of crop nitrogen (N) use efficiency (NUE) and its relationship with explanatory environmental variables can support land-use management and policymaking. Nevertheless, the application of statistical models for evaluating the explanatory variables of space-time variation in crop NUE is still under-researched. In this study, stepwise multiple linear regression (SMLR) and Random Forest (RF) were used to evaluate the spatial and temporal variation of NUE indicators (i.e., partial factor productivity of N (PFPN) and partial nutrient balance of N (PNBN)) at county scale in Northeast China (Heilongjiang, Liaoning and Jilin provinces) from 1990 to 2015. Explanatory variables included agricultural management practices, topography, climate, economy, soil and crop types. Results revealed that PFPN was higher in the northern parts and lower in the center of Northeast China, and that PNBN increased from the southern to the northern parts during the 1990–2015 period. The NUE indicators decreased with time in most counties during the study period. The model efficiency coefficients of the SMLR and RF models were 0.44 and 0.84 for PFPN, and 0.67 and 0.89 for PNBN, respectively. Compared with the SMLR model, the RF model assigned higher relative importance to soil and climatic covariates and lower relative importance to crop covariates. The planting area index of vegetables and beans, soil clay content, saturated water content, enhanced vegetation index in November and December, soil bulk density, and annual minimum temperature were the main explanatory variables for both NUE indicators. This is the first study to show the quantitative relative importance of explanatory variables for NUE at county level in Northeast China using RF and SMLR. It gives reference measurements for improving crop NUE, which is one of the most effective means of managing N for sustainable development, ensuring food security, alleviating environmental degradation and increasing farmers' profitability.
Funding (gastric cancer study): Supported by the Natural Science Foundation of Shanghai (23ZR1463600), the Shanghai Pudong New Area Health Commission Research Project (PW2021A-69), and the Research Project of the Clinical Research Center of Shanghai Health Medical University (22MC2022002).
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 40739907 and 40774064) and the National Science and Technology Major Project (Grant No. 2008ZX05025-003)
Abstract: The rock matrix bulk modulus or its inverse, the compressive coefficient, is an important input parameter for fluid substitution by the Biot-Gassmann equation in reservoir prediction. However, it is not easy to accurately estimate the bulk modulus by using conventional methods. In this paper, we present a new linear regression equation for calculating the parameter. To obtain this equation, we first derive a simplified Gassmann equation under the reasonable assumption that the compressive coefficient of the saturated pore fluid is much greater than that of the rock matrix, and, second, we use the Eshelby-Walsh relation to replace the equivalent modulus of a dry rock in the Gassmann equation. Results from the rock physics analysis of rock samples from a carbonate area show that rock matrix compressive coefficients calculated with water-saturated and dry rock samples using the linear regression method are very close (their error is less than 1%). This means the new method is accurate and reliable.
Funding: The National Natural Science Foundation of China (No. 51378099)
Abstract: As traditional methods and technical means cannot meet the quantitative research needs of urban land use patterns, quantitative research methods for the urban land use pattern are established via the GIS (geographic information system) technique combined with related theories and models. Taking the city of Nanjing as an example, a spatial database of urban land use and other environmental and socio-economic data is constructed. A multiple linear regression model is developed to determine the statistically significant factors affecting the residential land use distributions. To explain the spatial variations of urban land use patterns, geographically weighted regression (GWR) is employed to establish spatial associations between these significant factors and the distribution of urban residential land use. The results demonstrate that GWR can provide an effective approach to the exploration of urban land use spatial patterns and also provide useful spatial information for planning residential development and other types of urban land use.
Funding: sustained them with this research (including Eng. Giuseppe Colicchio) and the European Commission for its financial contribution to the LIFE SILENT project “Sustainable Innovations for Long-life Environmental Noise Technologies” (LIFE22-ENV-IT-LIFE-SILENT/101114310. Acronym: LIFE22-ENV-ITLIFE SILENT) and the LIFE SNEAK Project “Optimised Surfaces Against Noise and Vibrations Produced by Tramway Track and Road Traffic” (LIFE20 ENV/IT/000181. Acronym: LIFE SNEAK).
Abstract: Few studies compare regression methods and machine learning (ML)-type methods in terms of their ability to interpret and describe how the components of a bituminous mixture affect mechanistic performance. At the same time, artificial intelligence (AI)-driven approaches are becoming more popular in analysing asphalt mixtures. Consequently, a comparison of AI and statistical approaches is presented in this study for predicting bituminous mixture properties such as stiffness, fatigue resistance, and tensile strength. Some of the important input features are bitumen content, crumb rubber content, and air void content. The research uses a random forest model (RFM), a linear regression model (LRM), and a polynomial regression model (PRM). RFM and PRM achieved an R^(2) as high as 0.94, with mean absolute error (MAE) less than 2.5, and are, therefore, good predictive models. Interestingly, RFM works best in one-third of instances, particularly when dealing with outliers, whereas traditional statistical models work better in two-thirds of instances. The results highlight AI's value in bituminous mixture optimisation, where RFM showed good prediction accuracy. In 30% of the cases, AI models outperformed the conventional statistical approaches. At the same time, analyses show that model performance varies significantly with scenarios and that even if AI models capture complex nonlinear relationships, they must not override DOE principles.
Funding: Supported by the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), the Ministry of Health & Welfare, Republic of Korea (No. RS-2020-KH088726); the Patient-Centered Clinical Research Coordinating Center (PACEN), the Ministry of Health and Welfare, Republic of Korea (No. HC19C0276); and the National Research Foundation of Korea (NRF), the Korea Government (MSIT) (No. RS-2023-00247504).
Abstract: AIM: To evaluate long-term visual field (VF) prediction using K-means clustering in patients with primary open angle glaucoma (POAG). METHODS: Patients who underwent ≥10 24-2 VF tests were included in this study. Using 52 total deviation values (TDVs) from the first 10 VF tests of the training dataset, VF points were clustered into several regions using the hierarchical ordered partitioning and collapsing hybrid (HOPACH) and K-means clustering. Based on the clustering results, a linear regression analysis was applied to each clustered region of the testing dataset to predict the TDVs of the 10th VF test. Three to nine VF tests were used to predict the 10th VF test, and the prediction errors (root mean square error, RMSE) of each clustering method and pointwise linear regression (PLR) were compared. RESULTS: The training group consisted of 228 patients (mean age, 54.20±14.38 y; 123 males and 105 females), and the testing group included 81 patients (mean age, 54.88±15.22 y; 43 males and 38 females). All subjects were diagnosed with POAG. Fifty-two VF points were clustered into 11 and nine regions using HOPACH and K-means clustering, respectively. K-means clustering had a lower prediction error than PLR when n=1:3 and 1:4 (both P≤0.003). The prediction errors of K-means clustering were lower than those of HOPACH in all sections (n=1:4 to 1:9; all P≤0.011), except for n=1:3 (P=0.680). PLR outperformed K-means clustering only when n=1:8 and 1:9 (both P≤0.020). CONCLUSION: K-means clustering can predict long-term VF test results more accurately in patients with POAG with limited VF data.
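The pointwise linear regression (PLR) baseline mentioned in this abstract can be illustrated with a short sketch: a straight line is fitted to the TDV history of a single VF point and extrapolated to the 10th test. The function names and the sample series are hypothetical, and the sketch does not reproduce the study's clustering step or its exact fitting procedure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for paired samples."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_tdv(history, target_test=10):
    """Extrapolate one VF point's TDV to a later test.

    `history` holds the TDVs of tests 1..n in order (assumed equally
    spaced here, which is a simplification of real visit schedules).
    """
    test_numbers = list(range(1, len(history) + 1))
    slope, intercept = fit_line(test_numbers, history)
    return slope * target_test + intercept


# Made-up TDV series (dB) for one VF point over the first four tests.
print(predict_tdv([-1.0, -1.5, -2.0, -2.5]))
```

A region-wise variant would, under one plausible reading, apply the same fit to a summary of the points in each cluster; how the study pooled points within a region is not stated in the abstract.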
Abstract: Using the method of stepwise multivariate linear regression (SMLR), the quantitative structure-activity relationships (QSAR) of two isomeric series of taxol and its derivatives have been studied. It was found that the molar refractivity of the C3′ substituent of the C13 side chain has a significant correlation with its activity. We deduce that structural changes in the C3′ substituents may be critical to the anticancer function. This finding would be useful for the design and synthesis of taxol-like compounds with improved activities.
Funding: Provided by the Korean Ministry of Environment and Eco Star Project
Abstract: Rainfall is an important factor in estimating the event mean concentration (EMC), which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads could also be calculated using rainfall, catchment area and runoff coefficient. In this study, runoff quantity and quality data gathered from 28 months of monitoring conducted on road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as a function of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models are indicators of the high uncertainties of NPSs; accurate estimates of EMCs and loads may have to be obtained by means of water quality sampling, or long-term monitoring may be needed to gather more data that can be used for the development of estimation models.
Funding: Supported by the Natural Science Foundation of Anhui Education Committee
Abstract: In this paper, based on the theory of parameter estimation, we give a selection method that we consider very reasonable in the sense of the good properties of the parameter estimation. Moreover, we offer a method for calculating the selection statistic and an applied example.
Funding: Supported by the National Natural Science Foundation of China (71071077), the Ministry of Education Key Project of National Educational Science Planning (DFA090215), the China Postdoctoral Science Foundation (20100481137), and the Funding of Jiangsu Innovation Program for Graduate Education (CXZZ11-0226)
Abstract: The construction method of the background value in the original multi-variable grey model (MGM(1,m)) is improved by addressing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors of the observational data of all variables, and a combined prediction model together with multiple linear regression is established in order to improve the simulation and prediction accuracy of the combined model. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed in an example. The results show that the model performs well in simulation and prediction.
Abstract: Many properties of fruit are influenced by plant nutrition. Fruit firmness is one of the most important fruit characteristics and determines the post-harvest life of the fruit. In recent decades, artificial intelligence systems have been employed to develop predictive models for estimating and predicting many agricultural processes. In the present study, the predictive capabilities of multiple linear regression (MLR) and artificial neural networks (ANNs) are evaluated for estimating fruit firmness at six months, using each of the nutrient concentrations (nitrogen (N), potassium (K), calcium (Ca) and magnesium (Mg)) alone (P1), a combination of nutrient concentrations (P2), nutrient concentration ratios alone (P3), and a combination of nutrient concentrations and nutrient concentration ratios (P4). The results showed that the MLR model estimated fruit firmness more accurately than the ANN model in three datasets (P1, P2 and P4). However, using P3 (N/Ca ratio) as the input dataset in the ANN model improved the prediction of fruit firmness relative to the MLR model. The correlation coefficient and root mean squared error (RMSE) between the measured data and the data estimated by the ANN model were 0.850 and 0.539, respectively. Generally, the ANN model showed greater potential in determining the relationship between 6-month fruit firmness and nutrient concentrations.
Funding: The National Natural Science Foundation of China under contract No. 11174235; the Science and Technology Development Project of Shaanxi Province of China under contract No. 2010KJXX-02; the Program for New Century Excellent Talents in University of China under contract No. NCET-08-0455; the Science and Technology Innovation Foundation of Northwestern Polytechnical University of China; the Doctorate Foundation of Northwestern Polytechnical University of China under contract No. CX201226.
Abstract: The multiple linear regression (MLR) method was applied to quantify the effects of the net heat flux (NHF), the net freshwater flux (NFF) and the wind stress on the mixed layer depth (MLD) of the South China Sea (SCS) based on the simple ocean data assimilation (SODA) dataset. The spatio-temporal distributions of the MLD, the buoyancy flux (combining the NHF and the NFF) and the wind stress of the SCS were presented. Then, using an oceanic vertical mixing model, the MLD after a certain time under the same initial conditions but various pairs of boundary conditions (the three factors) was simulated. Applying the MLR method to the results, regression equations modeling the relationship between the simulated MLD and the three factors were calculated. The equations indicate that when the NHF was negative, it was the primary driver of the mixed layer deepening; and when the NHF was positive, the wind stress played a more important role than the NHF, while the NFF had the least effect. When the NHF was positive, the relative quantitative effects of the wind stress, the NHF, and the NFF were about 10, 6 and 2. The above conclusions were applied to explain the spatio-temporal distributions of the MLD in the SCS and thus proved to be valid.
Abstract: In agricultural systems, the regular monitoring of Soil Organic Matter (SOM) dynamics is essential. This task is costly and time-consuming when using the conventional method, especially in a very fragmented area with intensive agricultural activity, such as the area of Sidi Bennour. The study area is located in the Doukkala irrigated perimeter in Morocco. Satellite data can provide an alternative and fill this gap at a low cost. Models to predict SOM from a satellite image, whether linear or nonlinear, have attracted considerable interest. This study aims to compare SOM prediction using Multiple Linear Regression (MLR) and Artificial Neural Networks (ANN). A total of 368 points were collected at a depth of 0-30 cm and analyzed in the laboratory. An image at 15 m resolution (MSPAN) was produced from a 30 m resolution (MS) Landsat-8 image using image pansharpening processing and the panchromatic band (15 m). The results obtained show that the MLR models predicted the SOM with (training/validation) R^(2) values of 0.62/0.63 and 0.64/0.65 and RMSE values of 0.23/0.22 and 0.22/0.21 for the MS and MSPAN images, respectively. In contrast, the ANN models predicted SOM with R^(2) values of 0.65/0.66 and 0.69/0.71 and RMSE values of 0.22/0.10 and 0.21/0.18 for the MS and MSPAN images, respectively. Image pansharpening improved the prediction accuracy by 2.60% and 4.30% and reduced the estimation error by 0.80% and 1.30% for the MLR and ANN models, respectively.
Funding: Financially supported by the National Natural Science Foundation of China (Nos. 51671072 and 51471062).
Abstract: The mechanical properties of TiAl alloy prepared by directional solidification were predicted through a machine learning algorithm model. The composition, input power, and pulling speed were designated as input variables, as representative factors influencing mechanical properties, and multiple linear regression analysis was conducted on data collected from the literature. In this study, the R^(2) value of the prediction result was 0.7159 for tensile strength, 0.8459 for elongation, 0.7573 for nanoindentation hardness, and 0.9674 for interlamellar spacing. As the R^(2) value of the elongation obtained through the analysis was higher than that of the tensile strength, it was confirmed that the elongation had a closer relationship with the input variables (composition, input power, pulling speed) than the tensile strength. When the elongation was added as an input variable for predicting the tensile strength, the R^(2) value increased further. The tensile test prediction results were divided into four groups: the group with the lowest residual value (predicted value minus actual value) was designated as group A, and the group with the largest residual value was designated as group D. When comparing the values of group A and group D, more overpredictions occurred in group A, while more underpredictions occurred in group D. Using the residuals and R^(2) values, the causes of good prediction were studied, and through this, the relationship between the mechanical properties and the microstructure was quantitatively investigated.
Funding: Supported by the Ministry of Environmental Protection of China (No. 2011467037)
Abstract: In this paper, a quantitative structure-activity relationship (QSAR) study was performed for the prediction of the acute toxicity of aromatic amines. A set of 56 compounds was randomly divided into a training set of 46 compounds and a test set of 10 compounds. The electronic and topological descriptors computed by the Scigress package and Dragon software were used as predictor variables. Multiple linear regression (MLR) and support vector machine (SVM) were utilized to build the linear and nonlinear QSAR models, respectively. The obtained models with five descriptors show strong predictive ability. The linear model fits the training set with R^(2) = 0.71, while the SVM model attains a higher value of R^(2) = 0.77. The validation results obtained from the test set indicate that the SVM model is comparable or superior to the MLR model, both in terms of prediction ability and robustness.
Funding: Supported by the National Natural Science Foundation of China (70171008)
Abstract: A class of estimators of the mean survival time with interval censored data is studied by the unbiased transformation method. The estimators are constructed based on the observations to ensure unbiasedness in the sense that the estimators in a certain class have the same expectation as the mean survival time. The estimators have good properties such as strong consistency (with the rate of O(n^(-1/2) (log log n)^(1/2))) and asymptotic normality. The application to linear regression is considered and simulation results are reported.
Funding: The China Scholarship Council (CSC) (201903250115), the National Natural Science Foundation of China (31972515), and the China Agriculture Research System of MOF and MARA (CARS-09-P31).
Abstract: Understanding the spatial-temporal dynamics of crop nitrogen (N) use efficiency (NUE) and the relationship with explanatory environmental variables can support land-use management and policymaking. Nevertheless, the application of statistical models for evaluating the explanatory variables of space-time variation in crop NUE is still under-researched. In this study, stepwise multiple linear regression (SMLR) and Random Forest (RF) were used to evaluate the spatial and temporal variation of NUE indicators (i.e., partial factor productivity of N (PFPN); partial nutrient balance of N (PNBN)) at county scale in Northeast China (Heilongjiang, Liaoning and Jilin provinces) from 1990 to 2015. Explanatory variables included agricultural management practices, topography, climate, economy, soil and crop types. Results revealed that the PFPN was higher in the northern parts and lower in the center of Northeast China, and PNBN increased from southern to northern parts during the 1990–2015 period. The NUE indicators decreased with time in most counties during the study period. The model efficiency coefficients of the SMLR and RF models were 0.44 and 0.84 for PFPN, and 0.67 and 0.89 for PNBN, respectively. The RF model had higher relative importance of soil and climatic covariates and lower relative importance of crop covariates compared to the SMLR model. The planting area index of vegetables and beans, soil clay content, saturated water content, enhanced vegetation index in November and December, soil bulk density, and annual minimum temperature were the main explanatory variables for both NUE indicators. This is the first study to show the quantitative relative importance of explanatory variables for NUE at a county level in Northeast China using RF and SMLR. This novel study gives reference measurements for improving crop NUE, which is one of the most effective means of managing N for sustainable development, ensuring food security, alleviating environmental degradation and increasing farmers' profitability.