Piles are long, slender structural elements used to transfer the loads from the superstructure through weak strata onto stiffer soils or rocks. For driven piles, the impact of the piling hammer induces compression and tension stresses in the piles. Hence, an important design consideration is to check that the strength of the pile is sufficient to resist the stresses caused by the impact of the pile hammer. Due to its complexity, pile drivability lacks a precise analytical solution with regard to the phenomena involved. In situations where measured data or numerical hypothetical results are available, neural networks stand out in mapping the nonlinear interactions and relationships between the system’s predictors and dependent responses. In addition, unlike most computational tools, no mathematical relationship assumption between the dependent and independent variables has to be made. Nevertheless, neural networks have been criticized for their long trial-and-error training process since the optimal configuration is not known a priori. This paper investigates the use of a fairly simple nonparametric regression algorithm known as multivariate adaptive regression splines (MARS), as an alternative to neural networks, to approximate the relationship between the inputs and dependent response, and to mathematically interpret the relationship between the various parameters. In this paper, back-propagation neural network (BPNN) and MARS models are developed for assessing pile drivability in relation to the prediction of the maximum compressive stresses (MCS), maximum tensile stresses (MTS), and blows per foot (BPF). A database of more than four thousand piles is utilized for model development and for comparing the performance of the BPNN and MARS predictions.
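To make the MARS idea in the abstract above concrete: a MARS model is a weighted sum of hinge basis functions of the form max(0, x - t) and max(0, t - x), where t is a knot. The following is a minimal sketch of evaluating such a model for one input; the intercept, coefficients, and knot are hypothetical values chosen for illustration and are not taken from the paper.

```python
def hinge(x, knot, direction=+1):
    """Return max(0, x - knot) if direction is +1, else max(0, knot - x)."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS-style model at a scalar input x.

    terms is a list of (coefficient, knot, direction) tuples.
    """
    return intercept + sum(c * hinge(x, t, d) for c, t, d in terms)


# Hypothetical model with a single knot at x = 2:
# slope +1.5 above the knot, and a mirrored term active only below it.
TERMS = [(1.5, 2.0, +1), (-0.5, 2.0, -1)]

if __name__ == "__main__":
    for x in (0.0, 2.0, 4.0):
        print(x, mars_predict(x, 10.0, TERMS))
```

Because each term is piecewise linear, the fitted model can be written out as an explicit equation, which is the interpretability advantage over neural networks that the abstract refers to.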
This study aims to extend the multivariate adaptive regression splines (MARS)-Monte Carlo simulation (MCS) method for reliability analysis of slopes in spatially variable soils. This approach is used to explore the influences of the multiscale spatial variability of soil properties on the probability of failure (P_f) of the slopes. In the proposed approach, the relationship between the factor of safety and the soil strength parameters characterized with spatial variability is approximated by the MARS, with the aid of Karhunen-Loeve expansion. MCS is subsequently performed on the established MARS model to evaluate P_f. Finally, a nominally homogeneous cohesive-frictional slope and a heterogeneous cohesive slope, which are both characterized with different spatial variabilities, are utilized to illustrate the proposed approach. Results showed that the proposed approach can estimate the P_f of the slopes efficiently in spatially variable soils with sufficient accuracy. Moreover, the approach is relatively robust to the influence of different statistics of soil properties, thereby making it an effective and practical tool for addressing slope reliability problems concerning time-consuming deterministic stability models with low levels of P_f. Furthermore, disregarding the multiscale spatial variability of soil properties can overestimate or underestimate the P_f. Although the difference is small in general, the multiscale spatial variability of the soil properties must still be considered in the reliability analysis of heterogeneous slopes, especially for those highly related to cost-effective and accurate designs.
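The surrogate-plus-sampling step described above can be sketched as follows: once a cheap surrogate for the factor of safety (FS) is available, P_f is estimated as the fraction of random input samples for which FS < 1. This is only an illustration under stated assumptions. The function `surrogate_fs`, its coefficients, and the input distributions are hypothetical, and the sketch ignores spatial variability and the Karhunen-Loeve expansion entirely.

```python
import random


def surrogate_fs(cohesion, friction_deg):
    """Hypothetical hinge-based surrogate for the factor of safety (not a fitted model)."""
    return (0.8
            + 0.03 * max(0.0, cohesion - 5.0)
            + 0.02 * max(0.0, friction_deg - 20.0))


def estimate_pf(n_samples, seed=0):
    """Estimate P_f = P(FS < 1) by sampling inputs and evaluating the surrogate."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        c = rng.lognormvariate(2.3, 0.3)   # cohesion, assumed lognormal
        phi = rng.gauss(30.0, 3.0)         # friction angle in degrees, assumed normal
        if surrogate_fs(c, phi) < 1.0:
            failures += 1
    return failures / n_samples


if __name__ == "__main__":
    print(estimate_pf(100_000))
```

The design point is that each surrogate evaluation costs almost nothing, so the large sample sizes needed for small P_f values remain affordable even when the underlying deterministic stability model is slow.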
The assessment of in situ permeability of rock mass is challenging for large-scale projects such as reservoirs created by dams, where water tightness issues are of prime importance. The in situ permeability is strongly related to the frequency and distribution of discontinuities in the rock mass and quantified by rock quality designation (RQD). This paper analyzes the data of hydraulic conductivity and discontinuities sampled at different depths during the borehole investigations in the limestone and sandstone formations for the construction of hydraulic structures in Oman. Cores recovered from boreholes provide RQD data, and in situ Lugeon tests elucidate the permeability. A modern technique of multivariate adaptive regression splines (MARS) assisted in correlating permeability and RQD along with the depth. In situ permeability shows a declining trend with increasing RQD, and the depth of investigation is within 50 m. This type of relationship can be developed based on detailed initial investigations at the site where the hydraulic conductivity of discontinuous rocks is required to be delineated. The relationship can approximate the permeability by only measuring the RQD in later investigations on the same site, thus saving the time and cost of the site investigations. The applicability of the relationship developed in this study to another location requires a lithological similarity of the rock mass that can be verified through preliminary investigation at the site.
In high mountainous areas, the development and distribution of alpine permafrost is greatly affected by macro- and micro-topographic factors. The effects of latitude, altitude, slope, and aspect on the distribution of permafrost were studied to understand the distribution patterns of permafrost in Wenquan on the Qinghai-Tibet Plateau. Cluster and correlation analyses were performed based on 30 m Global Digital Elevation Model (GDEM) data and field data obtained using geophysical exploration and borehole drilling methods. A Multivariate Adaptive Regression Spline (MARS) model was developed to simulate the spatial distribution of permafrost over the studied area. The model was validated by comparison with 201 geophysical exploration sites and with two other models, i.e., a binary logistic regression model and the Mean Annual Ground Temperature (MAGT) model. The MARS model provides a better simulation than the other two models. Besides the control effect of elevation on permafrost distribution, the MARS model also takes into account the impact of direct solar radiation on permafrost distribution.
This paper presents an approach to approximate optimization in structural design, which combines a global response surface (GRS) based on multivariate adaptive regression splines (MARS) with a move-limit strategy (MLS). MARS is an adaptive regression process that is well suited to multidimensional problems. It adopts a modified recursive partitioning strategy to simplify high-dimensional problems into smaller, highly accurate models. MLS is employed in the space of design variables to move and resize the search sub-regions. The quality of the approximation functions and the convergence history of the optimization process are reflected in MLS. The disadvantages of the conventional response surface method (RSM), specifically for highly nonlinear high-dimensional problems, are avoided. The GRS/MARS with MLS is applied to a high-dimensional test function and an engineering problem to demonstrate its feasibility and convergence, and is compared with quadratic response surface (QRS) models in terms of computational efficiency and accuracy.
The purpose of this article is to provide an overview of adaptive regression modeling and demonstrate its use in conducting nonlinear analyses of interrupted time series (ITS) data. Adaptive regression modeling is based on heuristic search over alternative models for data controlled by likelihood cross-validation (LCV) scores with larger scores indicating better models. Extended linear mixed models are used for correlated data like ITS data. Power transforms of predictor variables are used to account for nonlinearity. The use of adaptive regression modeling for assessing ITS effects is demonstrated using data on annual proportions of major birth defects in children fathered by male Air Force veterans of the Vietnam War over a 59-year period. The interruption for this ITS is conception after versus before the start of a participant’s first tour in the Vietnam War. Whether the ITS effect is related to dioxin exposure is also addressed. Dioxin is a highly toxic contaminant of the herbicide Agent Orange used in the Vietnam War. The core findings of the reported analyses are that a substantial adverse ITS interruption effect is identified and that this adverse effect can reasonably be attributed to participants having a high dioxin exposure level. Moreover, these results indicate that adaptive regression modeling can identify nonlinear ITS effects in general situations that can lead to consequential insights into nonlinear relationships over time, possibly varying with other available predictors.
Accurate prediction of the compressive strength of concrete is one of the key issues in the concrete industry. In this paper, a method for predicting the compressive strength of fly ash-slag concrete based on multivariate adaptive regression splines (MARS) is proposed, and the model analysis process is determined by analyzing the principle of this algorithm. Based on the Concrete Compressive Strength dataset of UCI, the MARS model for compressive strength prediction was constructed with cement content, blast furnace slag powder content, fly ash content, water content, reducing agent content, coarse aggregate content, fine aggregate content, and age as independent variables. Its predictions were compared with those of an artificial neural network (BP), random forest (RF), support vector machine (SVM), extreme learning machine (ELM), and multiple nonlinear regression (MnLR). The MARS and RF models had clear advantages in prediction accuracy and model stability, and the overall performance of the MARS model was slightly better than that of the RF model. Finally, the explicit expression of the MARS model for compressive strength is given, which provides an effective method for predicting the compressive strength of fly ash-slag concrete.
The design and construction of underground structures are significantly affected by the distribution of geological formations. Prediction of the geological interfaces using limited data has been a difficult task. A multivariate adaptive regression spline (MARS) method capable of modeling nonlinearities automatically was used in this study to spatially predict the elevations of geological interfaces. Borehole data from two sites in Singapore were used to evaluate the capability of the MARS method for predicting geological interfaces. By comparing the predicted values with the borehole data, it is shown that the MARS method has a mean root mean square error of 4.4 m for the predicted elevations of the Kallang Formation–Old Alluvium interface. In addition, the MARS method is able to produce reasonable prediction intervals in the sense that the percentage of testing data covered by 95% prediction intervals was close to the associated confidence level, 95%. More importantly, the prediction interval evaluated by the MARS method had a non-constant width that appropriately reflected the data density and geological complexity.
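The two checks mentioned above, root mean square error and the share of test data covered by the prediction intervals, are simple to compute. The sketch below shows one way to do so; the function names and the sample elevations are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    if len(observed) != len(predicted):
        raise ValueError("inputs must have the same length")
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def interval_coverage(observed, lower, upper):
    """Fraction of observed values lying inside their [lower, upper] intervals."""
    if not (len(observed) == len(lower) == len(upper)):
        raise ValueError("all inputs must have the same length")
    hits = sum(1 for y, lo, hi in zip(observed, lower, upper) if lo <= y <= hi)
    return hits / len(observed)


if __name__ == "__main__":
    # Hypothetical interface elevations (m) at four test boreholes.
    obs = [-12.0, -8.5, -15.2, -20.1]
    pred = [-11.0, -9.0, -14.0, -26.0]
    lo = [p - 4.0 for p in pred]
    hi = [p + 4.0 for p in pred]
    print(rmse(obs, pred), interval_coverage(obs, lo, hi))
```

For well-calibrated 95% intervals, the coverage on a sufficiently large test set should be close to 0.95, which is the criterion the abstract applies.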
We propose a method for spatial downscaling of Landsat 8-derived LST maps from 100 (30 m) resolution down to 2–4 m with the use of Multivariate Adaptive Regression Splines (MARS) models coupled with very high resolution auxiliary data derived from hyperspectral aerial imagery and large-scale topographic maps. We applied the method to four Landsat 8 scenes, two collected in summer and two in winter, for three British towns collectively representing a variety of urban form. We used several spectral indices as well as fractional coverage of water and paved surfaces as LST predictors, and applied a novel method for the correction of temporal mismatch between spectral indices derived from aerial and satellite imagery captured at different dates, allowing for the application of the downscaling method for multiple dates without the need for repeating the aerial survey. Our results suggest that the method performed well for the summer dates, achieving RMSE of 1.40–1.83 K prior to and 0.76–1.21 K after correction for residuals. We conclude that the MARS models, by addressing the non-linear relationship of LST at coarse and fine spatial resolutions, can be successfully applied to produce high resolution LST maps suitable for studies of urban thermal environment at local scales.
Regression models for survival time data involve estimation of the hazard rate as a function of predictor variables and associated slope parameters. An adaptive approach is formulated for such hazard regression modeling. The hazard rate is modeled using fractional polynomials, that is, linear combinations of products of power transforms of time together with other available predictors. These fractional polynomial models are restricted to generating positive-valued hazard rates and decreasing survival times. Exponentially distributed survival times are a special case. Parameters are estimated using maximum likelihood estimation allowing for right censored survival times. Models are evaluated and compared using likelihood cross-validation (LCV) scores. LCV scores and tolerance parameters are used to control an adaptive search through alternative fractional polynomial hazard rate models to identify effective models for the underlying survival time data. These methods are demonstrated using two different survival time data sets including survival times for lung cancer patients and for multiple myeloma patients. For the lung cancer data, the hazard rate depends distinctly on time. However, controlling for cell type provides a distinct improvement while the hazard rate depends only on cell type and no longer on time. Furthermore, Cox regression is unable to identify a cell type effect. For the multiple myeloma data, the hazard rate also depends distinctly on time. Moreover, consideration of hemoglobin at diagnosis provides a distinct improvement, the hazard rate still depends distinctly on time, and hemoglobin distinctly moderates the effect of time on the hazard rate. These results indicate that adaptive hazard rate modeling can provide unique insights into survival time data.
Recurrent event time data and more general multiple event time data are commonly analyzed using extensions of Cox regression, or proportional hazards regression, as used with single event time data. These methods treat covariates, either time-invariant or time-varying, as having multiplicative effects while general dependence on time is left un-estimated. An adaptive approach is formulated for analyzing multiple event time data. Conditional hazard rates are modeled in terms of dependence on both time and covariates using fractional polynomials restricted so that the conditional hazard rates are positive-valued and so that excess time probability functions (generalizing survival functions for single event times) are decreasing. Maximum likelihood is used to estimate parameters adjusting for right censored event times. Likelihood cross-validation (LCV) scores are used to compare models. Adaptive searches through alternate conditional hazard rate models are controlled by LCV scores combined with tolerance parameters. These searches identify effective models for the underlying multiple event time data. Conditional hazard regression is demonstrated using data on times between tumor recurrence for bladder cancer patients. Analyses of theory-based models for these data using extensions of Cox regression provide conflicting results on effects to treatment group and the initial number of tumors. On the other hand, fractional polynomial analyses of these theory-based models provide consistent results identifying significant effects to treatment group and initial number of tumors using both model-based and robust empirical tests. Adaptive analyses further identify distinct moderation by group of the effect of tumor order and an additive effect to group after controlling for nonlinear effects to initial number of tumors and tumor order. Results of example analyses indicate that adaptive conditional hazard rate modeling can generate useful insights into multiple event time data.
This paper is devoted to identifying the biomarkers of rat liver regeneration via adaptive logistic regression. By combining the adaptive elastic net penalty with the logistic regression loss, adaptive logistic regression is proposed to adaptively identify the important genes in groups. Furthermore, by improving the pathwise coordinate descent algorithm, a fast solving algorithm is developed for computing the regularized paths of the adaptive logistic regression. The results from the experiments performed on the microarray data of rat liver regeneration are provided to illustrate the effectiveness of the proposed method and verify the biological rationality of the selected biomarkers.
A research study collected intensive longitudinal data from cancer patients on a daily basis as well as non-intensive longitudinal survey data on a monthly basis. Although the daily data need separate analysis, those data can also be utilized to generate predictors of monthly outcomes. Alternatives for generating daily data predictors of monthly outcomes are addressed in this work. Analyses are reported of depression measured by the Patient Health Questionnaire 8 as the monthly survey outcome. Daily measures include numbers of opioid medications taken, numbers of pain flares, least pain levels, and worst pain levels. Predictors are averages of recent non-missing values for each daily measure recorded on or prior to survey dates for depression values. Weights for recent non-missing values are based on days between measurement of a recent value and a survey date. Five alternative averages are considered: averages with unit weights, averages with reciprocal weights, weighted averages with reciprocal weights, averages with exponential weights, and weighted averages with exponential weights. Adaptive regression methods based on likelihood cross-validation (LCV) scores are used to generate fractional polynomial models for possible nonlinear dependence of depression on each average. For all four daily measures, the best LCV score over averages of all types is generated using the average of recent non-missing values with reciprocal weights. Generated models are nonlinear and monotonic. Results indicate that an appropriate choice would be to assume three recent non-missing values and use the average with reciprocal weights of the first three recent non-missing values.
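The averaging schemes listed above can be illustrated with a short sketch. The abstract does not give the formulas, so everything specific here is an assumption: the reciprocal weight is taken as 1/(d + 1) and the exponential weight as exp(-d), where d is the number of days before the survey date, and the `normalize` flag is one guess at the difference between an "average with weights" (divide by the number of values) and a "weighted average" (divide by the sum of weights).

```python
import math


def weight(days_before, kind):
    """Weight for a value recorded `days_before` days before the survey date."""
    if kind == "unit":
        return 1.0
    if kind == "reciprocal":
        return 1.0 / (days_before + 1)  # +1 avoids division by zero (assumption)
    if kind == "exponential":
        return math.exp(-days_before)   # decay rate of 1 per day (assumption)
    raise ValueError(f"unknown weight kind: {kind}")


def recent_average(values, days_before, kind="unit", normalize=True, k=3):
    """Average the k most recent non-missing values (None marks a missing value)."""
    pairs = sorted((d, v) for v, d in zip(values, days_before) if v is not None)[:k]
    if not pairs:
        return None
    weights = [weight(d, kind) for d, _ in pairs]
    total = sum(w * v for w, (_, v) in zip(weights, pairs))
    denom = sum(weights) if normalize else len(pairs)
    return total / denom


if __name__ == "__main__":
    # Hypothetical worst-pain scores; the value from one day before is missing.
    scores = [4, None, 6, 8]
    days = [0, 1, 2, 3]
    for kind in ("unit", "reciprocal", "exponential"):
        print(kind, recent_average(scores, days, kind=kind))
```

With k=3, the sketch mirrors the abstract's recommendation to use three recent non-missing values, with more recent values weighted more heavily under the reciprocal and exponential schemes.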
Adaptive fractional polynomial modeling of general correlated outcomes is formulated to address nonlinearity in means, variances/dispersions, and correlations. Means and variances/dispersions are modeled using generalized linear models in fixed effects/coefficients. Correlations are modeled using random effects/coefficients. Nonlinearity is addressed using power transforms of primary (untransformed) predictors. Parameter estimation is based on extended linear mixed modeling generalizing both generalized estimating equations and linear mixed modeling. Models are evaluated using likelihood cross-validation (LCV) scores and are generated adaptively using a heuristic search controlled by LCV scores. Cases covered include linear, Poisson, logistic, exponential, and discrete regression of correlated continuous, count/rate, dichotomous, positive continuous, and discrete numeric outcomes treated as normally, Poisson, Bernoulli, exponentially, and discrete numerically distributed, respectively. Example analyses are also generated for these five cases to compare adaptive random effects/coefficients modeling of correlated outcomes to previously developed adaptive modeling based on directly specified covariance structures. Adaptive random effects/coefficients modeling substantially outperforms direct covariance modeling in the linear, exponential, and discrete regression example analyses. It generates equivalent results in the logistic regression example analyses and it is substantially outperformed in the Poisson regression case. Random effects/coefficients modeling of correlated outcomes can provide substantial improvements in model selection compared to directly specified covariance modeling. However, directly specified covariance modeling can generate competitive or substantially better results in some cases while usually requiring less computation time.
Coastal wetlands are crucial for the ‘blue carbon sink’, significantly contributing to regulating climate change. This study utilized 160 soil samples, 35 remote sensing features, and 5 geo-climatic variables to accurately estimate the soil organic carbon stocks (SOCS) in the coastal wetlands of Tianjin and Hebei, China. To reduce data redundancy, simplify model complexity, and improve model interpretability, Pearson correlation analysis (PsCA), Boruta, and recursive feature elimination (RFE) were employed to optimize features. Combined with the optimized features, the soil organic carbon density (SOCD) prediction model was constructed by using multivariate adaptive regression splines (MARS), extreme gradient boosting (XGBoost), and random forest (RF) algorithms and applied to predict the spatial distribution of SOCD and estimate the SOCS of different wetland types in 2020. The results show that: 1) Different feature combinations have a significant influence on the model performance. Better prediction performance was attained by building a model using RFE-based feature combinations. RF has the best prediction accuracy (R^2 = 0.587, RMSE = 0.798 kg/m^2, MAE = 0.660 kg/m^2). 2) Optical features are more important than radar and geo-climatic features in the MARS, XGBoost, and RF algorithms. 3) The size of SOCS is related to SOCD and the area of each wetland type; aquaculture ponds have the highest SOCS, followed by marsh, salt pan, mudflat, and sand shore.
This study presents a hybrid framework to predict stability solutions of buried structures under active trapdoor conditions in natural clays with anisotropy and heterogeneity by combining physics-based and data-driven modeling. Finite-element limit analysis (FELA) with a newly developed anisotropic undrained shear (AUS) failure criterion is used to identify the underlying active failure mechanisms as well as to develop a numerical (physics-based) database of stability numbers for both planar and circular trapdoors. Practical considerations are given for natural clays to three linearly increasing shear strengths in compression, extension, and direct simple shear in the AUS material model. The obtained numerical solutions are compared and validated with published solutions in the literature. A multivariate adaptive regression splines (MARS) algorithm is further utilized to learn the numerical solutions to act as fast FELA data-driven surrogates for stability evaluation. The current MARS-based modeling provides both relative importance index and accurate design equations that can be used with confidence by practitioners.
Despite many studies on land degradation in the Highlands of Northern Ethiopia, quantitative information regarding long-term changes in land use/cover (LUC) is rare. Hence, this study aims to investigate the LUC changes in the Geba catchment (5142 km²), Northern Ethiopia, over 80 years (1935–2014). Aerial photographs (APs) of the 1930s and Google Earth (GE) images (2014) were used. The point-count technique was utilized by overlaying a grid on APs and GE images. The occurrence of cropland, forest, grassland, shrubland, bare land, built-up areas, and water bodies was counted to compute their fractions. A multivariate adaptive regression spline was applied to identify the explanatory factors of LUC and to create fractional maps of LUC. The results indicate significant changes of most types, except for forest and cropland. In the 1930s, shrubland (48%) was dominant, followed by cropland (39%). The fraction of cropland in 2014 (42%) remained approximately the same as in the 1930s, while shrubland significantly dropped to 37%. Forests shrank further from a meagre 6.3% in the 1930s to 2.3% in 2014. High overall accuracies (93% and 83%) and strong Kappa coefficients (89% and 72%) for point counts and fractional maps, respectively, indicate the validity of the techniques used for LUC mapping.
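Overall accuracy and the Kappa coefficient cited above are both derived from a confusion matrix of reference classes against mapped classes. The sketch below shows the standard Cohen's kappa calculation; the two-class matrix is made-up illustration data, not the study's results.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    Rows are reference classes and columns are mapped classes.
    """
    size = len(matrix)
    n = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / n
    # Chance agreement: sum over classes of (row total * column total) / n^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Hypothetical two-class example (e.g. cropland vs. shrubland), 100 points.
    confusion = [[40, 10],
                 [5, 45]]
    print(accuracy_and_kappa(confusion))
```

Kappa is lower than overall accuracy because it discounts the agreement that would be expected by chance given the class totals, which is why the abstract reports both figures.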
Computational modeling has gained much importance for the prediction of the properties of concrete. This paper shows how computational intelligence can be applied to the prediction of the compressive strength of self-compacting concrete (SCC). Three models, namely, Extreme Learning Machine (ELM), Adaptive Neuro Fuzzy Inference System (ANFIS), and Multivariate Adaptive Regression Splines (MARS), have been employed in the present study for the prediction of the compressive strength of self-compacting concrete. The contents of cement (c), sand (s), coarse aggregate (a), fly ash (f), water/powder (w/p) ratio, and superplasticizer (sp) dosage have been taken as inputs and the 28-day compressive strength (fck) as output for the ELM, ANFIS, and MARS models. A relatively large data set of 80 normalized records available in the literature has been taken for the study. A comparison is made between the results obtained from all the above-mentioned models, and the model that provides the best fit is established. The experimental results demonstrate that the proposed models are robust for the determination of the compressive strength of self-compacting concrete.
Steam cracking is the dominant technology for producing light olefins, which are believed to be the foundation of the chemical industry. Predictive models of the cracking process can boost production efficiency and profit margin. Rapid advancements in machine learning research have recently enabled data-driven solutions to usher in a new era of process modeling. Meanwhile, its practical application to steam cracking is still hindered by the trade-off between prediction accuracy and computational speed. This research presents a framework for data-driven intelligent modeling of the steam cracking process. Industrial data preparation and feature engineering techniques provide computational-ready datasets for the framework, and feedstock similarities are exploited using k-means clustering. We propose LArge-Residuals-Deletion Multivariate Adaptive Regression Spline (LARD-MARS), a modeling approach that explicitly generates output formulas and eliminates potentially outlying instances. The framework is validated further by the presentation of clustering results, the explanation of variable importance, and the testing and comparison of model performance.
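The general idea of deleting large-residual instances and refitting can be sketched as below. This is not the LARD-MARS procedure itself, whose details are not given in the abstract: an ordinary least-squares line stands in for the MARS model, and the threshold `k`, the RMS-based cutoff, and the stopping rule are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def fit_with_residual_deletion(xs, ys, k=2.0, max_rounds=5):
    """Refit after dropping points whose |residual| exceeds k times the RMS residual."""
    xs, ys = list(xs), list(ys)
    for _ in range(max_rounds):
        a, b = fit_line(xs, ys)
        res = [y - (a + b * x) for x, y in zip(xs, ys)]
        rms = (sum(r * r for r in res) / len(res)) ** 0.5
        # Small absolute tolerance so floating-point noise is never treated as an outlier.
        keep = [i for i, r in enumerate(res) if abs(r) <= k * rms + 1e-12]
        if len(keep) == len(xs) or len(keep) < 3:
            break
        xs, ys = [xs[i] for i in keep], [ys[i] for i in keep]
    a, b = fit_line(xs, ys)
    return a, b, len(xs)


if __name__ == "__main__":
    # Hypothetical data on the line y = 1 + 2x with one corrupted instance.
    xs = list(range(10))
    ys = [1 + 2 * x for x in xs]
    ys[5] = 100
    print(fit_with_residual_deletion(xs, ys))
```

Deleting instances by residual size makes the final fit less sensitive to corrupted records, at the cost of possibly discarding valid but unusual operating conditions, so the threshold deserves care in practice.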
Abstract: Piles are long, slender structural elements used to transfer the loads from the superstructure through weak strata onto stiffer soils or rocks. For driven piles, the impact of the piling hammer induces compression and tension stresses in the piles. Hence, an important design consideration is to check that the strength of the pile is sufficient to resist the stresses caused by the impact of the pile hammer. Due to its complexity, pile drivability lacks a precise analytical solution with regard to the phenomena involved. In situations where measured data or numerical hypothetical results are available, neural networks stand out in mapping the nonlinear interactions and relationships between the system's predictors and dependent responses. In addition, unlike most computational tools, no mathematical relationship between the dependent and independent variables has to be assumed. Nevertheless, neural networks have been criticized for their long trial-and-error training process, since the optimal configuration is not known a priori. This paper investigates the use of a fairly simple nonparametric regression algorithm known as multivariate adaptive regression splines (MARS), as an alternative to neural networks, to approximate the relationship between the inputs and dependent response, and to mathematically interpret the relationship between the various parameters. In this paper, back-propagation neural network (BPNN) and MARS models are developed for assessing pile drivability in relation to the prediction of the maximum compressive stresses (MCS), maximum tensile stresses (MTS), and blows per foot (BPF). A database of more than four thousand piles is utilized for model development and for comparing the performance of the BPNN and MARS predictions.
Funding: Supported by The Hong Kong Polytechnic University through the project RU3Y; the Research Grant Council through the project PolyU 5128/13E; National Natural Science Foundation of China (Grant No. 51778313); Cooperative Innovation Center of Engineering Construction and Safety in Shandong Blue Economic Zone.
Abstract: This study aims to extend the multivariate adaptive regression splines (MARS)-Monte Carlo simulation (MCS) method for reliability analysis of slopes in spatially variable soils. This approach is used to explore the influences of the multiscale spatial variability of soil properties on the probability of failure (P_f) of the slopes. In the proposed approach, the relationship between the factor of safety and the soil strength parameters characterized with spatial variability is approximated by the MARS, with the aid of Karhunen-Loeve expansion. MCS is subsequently performed on the established MARS model to evaluate P_f. Finally, a nominally homogeneous cohesive-frictional slope and a heterogeneous cohesive slope, both characterized with different spatial variabilities, are utilized to illustrate the proposed approach. Results showed that the proposed approach can estimate the P_f of the slopes efficiently in spatially variable soils with sufficient accuracy. Moreover, the approach is relatively robust to the influence of different statistics of soil properties, thereby making it an effective and practical tool for addressing slope reliability problems concerning time-consuming deterministic stability models with low levels of P_f. Furthermore, disregarding the multiscale spatial variability of soil properties can overestimate or underestimate the P_f. Although the difference is small in general, the multiscale spatial variability of the soil properties must still be considered in the reliability analysis of heterogeneous slopes, especially for those highly related to cost-effective and accurate designs.
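The MCS step described in this abstract (sample the uncertain inputs, evaluate the fitted surrogate, and count the cases with a factor of safety below 1) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: `toy_surrogate`, its coefficients, the single cohesion input, and its normal distribution are all hypothetical, and the Karhunen-Loeve discretization of the random field is omitted.

```python
import random


def estimate_pf(surrogate, sampler, n_samples=20000, seed=0):
    """Estimate P_f as the fraction of samples whose factor of safety is < 1."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    failures = 0
    for _ in range(n_samples):
        if surrogate(sampler(rng)) < 1.0:
            failures += 1
    return failures / n_samples


def toy_surrogate(c):
    """Hypothetical MARS-style surrogate for the factor of safety (one input)."""
    return 1.2 + 0.03 * max(0.0, c - 20.0) - 0.05 * max(0.0, 20.0 - c)


def sample_cohesion(rng):
    """Hypothetical input distribution: cohesion ~ Normal(mean=20, sd=4)."""
    return rng.gauss(20.0, 4.0)


pf = estimate_pf(toy_surrogate, sample_cohesion)
print(f"estimated P_f = {pf:.4f}")
```

In this toy setup the surrogate drops below 1 only when the cohesion is less than 16, i.e. one standard deviation below the mean, so the estimate should land near 0.16. Because each surrogate evaluation is cheap, the sample size can be raised substantially when the target P_f is small.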
Funding: The authors are indebted to Sohar University and the University of Buraimi, Oman, for supporting this study.
Abstract: The assessment of in situ permeability of rock mass is challenging for large-scale projects such as reservoirs created by dams, where water tightness issues are of prime importance. The in situ permeability is strongly related to the frequency and distribution of discontinuities in the rock mass, as quantified by rock quality designation (RQD). This paper analyzes the data of hydraulic conductivity and discontinuities sampled at different depths during the borehole investigations in the limestone and sandstone formations for the construction of hydraulic structures in Oman. Cores recovered from boreholes provide RQD data, and in situ Lugeon tests elucidate the permeability. A modern technique of multivariate adaptive regression splines (MARS) assisted in correlating permeability and RQD along with the depth. In situ permeability shows a declining trend with increasing RQD, and the depth of investigation is within 50 m. This type of relationship can be developed based on detailed initial investigations at the site where the hydraulic conductivity of discontinuous rocks is required to be delineated. The relationship can approximate the permeability by only measuring the RQD in later investigations on the same site, thus saving the time and cost of the site investigations. The applicability of the relationship developed in this study to another location requires a lithological similarity of the rock mass that can be verified through preliminary investigation at the site.
Funding: Supported financially by the Special Basic Research Program of China (Grant No. 2008FY110200) and partially by the Open Programme of State Key Laboratory (No. SKLFSE201009).
Abstract: In high mountainous areas, the development and distribution of alpine permafrost is greatly affected by macro- and micro-topographic factors. The effects of latitude, altitude, slope, and aspect on the distribution of permafrost were studied to understand the distribution patterns of permafrost in Wenquan on the Qinghai-Tibet Plateau. Cluster and correlation analysis were performed based on 30 m Global Digital Elevation Model (GDEM) data and field data obtained using geophysical exploration and borehole drilling methods. A Multivariate Adaptive Regression Spline (MARS) model was developed to simulate permafrost spatial distribution over the studied area. The model was validated by comparison with 201 geophysical exploration sites, as well as with two other models, i.e., a binary logistic regression model and the Mean Annual Ground Temperature (MAGT) model. The MARS model provides a better simulation than the other two models. Besides the control effect of elevation on permafrost distribution, the MARS model also takes into account the impact of direct solar radiation on permafrost distribution.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 50775084) and the National High-technology Research and Development Program of China (Grant No. 2006AA04Z121).
Abstract: This paper presents an approach to approximate optimization in structural design that combines the global response surface (GRS) based on multivariate adaptive regression splines (MARS) with a Move-Limit strategy (MLS). MARS is an adaptive regression process suited to multidimensional problems. It adopts a modified recursive partitioning strategy to simplify high-dimensional problems into smaller, highly accurate models. MLS is employed in the space of design variables for moving and resizing the search sub-regions. The quality of the approximation functions and the convergence history of the optimization process are reflected in MLS. The disadvantages of the conventional response surface method (RSM) are avoided, specifically for highly nonlinear high-dimensional problems. The GRS/MARS with MLS is applied to a high-dimensional test function and an engineering problem to demonstrate its feasibility and convergence, and is compared with quadratic response surface (QRS) models in terms of computational efficiency and accuracy.
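For readers unfamiliar with the model form, a fitted MARS model is commonly written as an intercept plus a weighted sum of basis functions, each basis function being a product of one or more hinge terms max(0, x - t) or max(0, t - x). The Python sketch below only evaluates such a model; it does not perform the adaptive forward/backward term selection, and the function names, term layout, knots, and coefficients are hypothetical choices made for illustration rather than anything taken from the paper.

```python
def hinge(x, knot, direction=1):
    """Hinge basis: max(0, x - knot) for direction=+1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate a MARS-style model at the point x (a sequence of inputs).

    terms is a list of (coefficient, factors) pairs, where factors is a list
    of (variable_index, knot, direction) triples multiplied together.
    """
    y = intercept
    for coef, factors in terms:
        basis = 1.0
        for var, knot, direction in factors:
            basis *= hinge(x[var], knot, direction)
        y += coef * basis
    return y


# Hypothetical two-input model with one main-effect term and one interaction.
example_terms = [
    (2.0, [(0, 3.0, 1)]),                  # 2.0 * max(0, x0 - 3)
    (-0.5, [(0, 3.0, -1), (1, 1.0, 1)]),   # -0.5 * max(0, 3 - x0) * max(0, x1 - 1)
]

print(mars_predict([5.0, 2.0], 1.0, example_terms))
print(mars_predict([1.0, 4.0], 1.0, example_terms))
```

Because every term is an explicit product of piecewise-linear pieces, the fitted model can be read directly as a formula, which is the property several of the abstracts in this listing rely on when they report explicit design equations.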
Abstract: The purpose of this article is to provide an overview of adaptive regression modeling and demonstrate its use in conducting nonlinear analyses of interrupted time series (ITS) data. Adaptive regression modeling is based on heuristic search over alternative models for data controlled by likelihood cross-validation (LCV) scores, with larger scores indicating better models. Extended linear mixed models are used for correlated data like ITS data. Power transforms of predictor variables are used to account for nonlinearity. The use of adaptive regression modeling for assessing ITS effects is demonstrated using data on annual proportions of major birth defects in children fathered by male Air Force veterans of the Vietnam War over a 59-year period. The interruption for this ITS is conception after versus before the start of a participant’s first tour in the Vietnam War. Whether the ITS effect is related to dioxin exposure is also addressed. Dioxin is a highly toxic contaminant of the herbicide Agent Orange used in the Vietnam War. The core findings of the reported analyses are that a substantial adverse ITS interruption effect is identified and that this adverse effect can reasonably be attributed to participants having a high dioxin exposure level. Moreover, these results indicate that adaptive regression modeling can identify nonlinear ITS effects in general situations that can lead to consequential insights into nonlinear relationships over time, possibly varying with other available predictors.
Abstract: Accurate prediction of the compressive strength of concrete is one of the key issues in the concrete industry. In this paper, a prediction method for fly ash-slag concrete compressive strength based on multiple adaptive regression splines (MARS) is proposed, and the model analysis process is determined by analyzing the principle of this algorithm. Based on the Concrete Compressive Strength dataset of UCI, the MARS model for compressive strength prediction was constructed with cement content, blast furnace slag powder content, fly ash content, water content, reducing agent content, coarse aggregate content, fine aggregate content and age as independent variables. Its predictions were compared with those of artificial neural network (BP), random forest (RF), support vector machine (SVM), extreme learning machine (ELM), and multiple nonlinear regression (MnLR) models. The MARS and RF models had clear advantages in prediction accuracy and model stability, and the overall performance of the MARS model was slightly better than that of the RF model. Finally, the explicit expression of the MARS model for compressive strength is given, which provides an effective method for predicting the compressive strength of fly ash-slag concrete.
Funding: Supported by the Singapore Ministry of National Development and the National Research Foundation, Prime Minister’s Office under the Land and Liveability National Innovation Challenge (L2 NIC) Research Programme (Award No. L2NICCFP2-2015-1).
Abstract: The design and construction of underground structures are significantly affected by the distribution of geological formations. Prediction of the geological interfaces using limited data has been a difficult task. A multivariate adaptive regression spline (MARS) method capable of modeling nonlinearities automatically was used in this study to spatially predict the elevations of geological interfaces. Borehole data from two sites in Singapore were used to evaluate the capability of the MARS method for predicting geological interfaces. By comparing the predicted values with the borehole data, it is shown that the MARS method has a mean root mean square error of 4.4 m for the predicted elevations of the Kallang Formation–Old Alluvium interface. In addition, the MARS method is able to produce reasonable prediction intervals, in the sense that the percentage of testing data covered by 95% prediction intervals was close to the associated confidence level, 95%. More importantly, the prediction interval evaluated by the MARS method had a non-constant width that appropriately reflected the data density and geological complexity.
Funding: This research (Grant Number NE/J015067/1) is part of the Fragments, Functions and Flows in Urban Ecosystem Services (F3UES) project, within the larger Biodiversity and Ecosystem Service Sustainability (BESS) framework. BESS is a six-year programme (2011-2017) funded by the UK Natural Environment Research Council (NERC) and the Biotechnology and Biological Sciences Research Council (BBSRC) as part of the UK’s Living with Environmental Change (LWEC) programme.
Abstract: We propose a method for spatial downscaling of Landsat 8-derived LST maps from 100 (30 m) resolution down to 2–4 m with the use of Multiple Adaptive Regression Splines (MARS) models coupled with very high resolution auxiliary data derived from hyperspectral aerial imagery and large-scale topographic maps. We applied the method to four Landsat 8 scenes, two collected in summer and two in winter, for three British towns collectively representing a variety of urban form. We used several spectral indices as well as fractional coverage of water and paved surfaces as LST predictors, and applied a novel method for the correction of temporal mismatch between spectral indices derived from aerial and satellite imagery captured at different dates, allowing for the application of the downscaling method for multiple dates without the need for repeating the aerial survey. Our results suggest that the method performed well for the summer dates, achieving RMSE of 1.40–1.83 K prior to and 0.76–1.21 K after correction for residuals. We conclude that the MARS models, by addressing the non-linear relationship of LST at coarse and fine spatial resolutions, can be successfully applied to produce high resolution LST maps suitable for studies of urban thermal environment at local scales.
Abstract: Regression models for survival time data involve estimation of the hazard rate as a function of predictor variables and associated slope parameters. An adaptive approach is formulated for such hazard regression modeling. The hazard rate is modeled using fractional polynomials, that is, linear combinations of products of power transforms of time together with other available predictors. These fractional polynomial models are restricted to generating positive-valued hazard rates and decreasing survival times. Exponentially distributed survival times are a special case. Parameters are estimated using maximum likelihood estimation allowing for right censored survival times. Models are evaluated and compared using likelihood cross-validation (LCV) scores. LCV scores and tolerance parameters are used to control an adaptive search through alternative fractional polynomial hazard rate models to identify effective models for the underlying survival time data. These methods are demonstrated using two different survival time data sets including survival times for lung cancer patients and for multiple myeloma patients. For the lung cancer data, the hazard rate depends distinctly on time. However, controlling for cell type provides a distinct improvement while the hazard rate depends only on cell type and no longer on time. Furthermore, Cox regression is unable to identify a cell type effect. For the multiple myeloma data, the hazard rate also depends distinctly on time. Moreover, consideration of hemoglobin at diagnosis provides a distinct improvement, the hazard rate still depends distinctly on time, and hemoglobin distinctly moderates the effect of time on the hazard rate. These results indicate that adaptive hazard rate modeling can provide unique insights into survival time data.
Abstract: Recurrent event time data and more general multiple event time data are commonly analyzed using extensions of Cox regression, or proportional hazards regression, as used with single event time data. These methods treat covariates, either time-invariant or time-varying, as having multiplicative effects while general dependence on time is left un-estimated. An adaptive approach is formulated for analyzing multiple event time data. Conditional hazard rates are modeled in terms of dependence on both time and covariates using fractional polynomials restricted so that the conditional hazard rates are positive-valued and so that excess time probability functions (generalizing survival functions for single event times) are decreasing. Maximum likelihood is used to estimate parameters adjusting for right censored event times. Likelihood cross-validation (LCV) scores are used to compare models. Adaptive searches through alternate conditional hazard rate models are controlled by LCV scores combined with tolerance parameters. These searches identify effective models for the underlying multiple event time data. Conditional hazard regression is demonstrated using data on times between tumor recurrence for bladder cancer patients. Analyses of theory-based models for these data using extensions of Cox regression provide conflicting results on effects of treatment group and the initial number of tumors. On the other hand, fractional polynomial analyses of these theory-based models provide consistent results identifying significant effects of treatment group and initial number of tumors using both model-based and robust empirical tests. Adaptive analyses further identify distinct moderation by group of the effect of tumor order and an additive effect of group after controlling for nonlinear effects of initial number of tumors and tumor order. Results of example analyses indicate that adaptive conditional hazard rate modeling can generate useful insights into multiple event time data.
Funding: Supported by National Nature Science Foundation of China (No. 61203293); Key Scientific and Technological Project of Henan Province (No. 122102210131); Program for Science and Technology Innovation Talents in Universities of Henan Province (No. 13HASTIT040); Foundation of Henan Educational Committee (No. 13A120524); Henan Normal University Doctoral Topics (No. qd14156); Henan Higher School Funding Scheme for Young Teachers (No. 2012GGJS-063).
Abstract: This paper is devoted to identifying the biomarkers of rat liver regeneration via adaptive logistic regression. By combining the adaptive elastic net penalty with the logistic regression loss, adaptive logistic regression is proposed to adaptively identify the important genes in groups. Furthermore, by improving the pathwise coordinate descent algorithm, a fast solving algorithm is developed for computing the regularized paths of the adaptive logistic regression. The results from the experiments performed on the microarray data of rat liver regeneration are provided to illustrate the effectiveness of the proposed method and verify the biological rationality of the selected biomarkers.
Abstract: A research study collected intensive longitudinal data from cancer patients on a daily basis as well as non-intensive longitudinal survey data on a monthly basis. Although the daily data need separate analysis, those data can also be utilized to generate predictors of monthly outcomes. Alternatives for generating daily data predictors of monthly outcomes are addressed in this work. Analyses are reported of depression measured by the Patient Health Questionnaire 8 as the monthly survey outcome. Daily measures include numbers of opioid medications taken, numbers of pain flares, least pain levels, and worst pain levels. Predictors are averages of recent non-missing values for each daily measure recorded on or prior to survey dates for depression values. Weights for recent non-missing values are based on days between measurement of a recent value and a survey date. Five alternative averages are considered: averages with unit weights, averages with reciprocal weights, weighted averages with reciprocal weights, averages with exponential weights, and weighted averages with exponential weights. Adaptive regression methods based on likelihood cross-validation (LCV) scores are used to generate fractional polynomial models for possible nonlinear dependence of depression on each average. For all four daily measures, the best LCV score over averages of all types is generated using the average of recent non-missing values with reciprocal weights. Generated models are nonlinear and monotonic. Results indicate that an appropriate choice would be to assume three recent non-missing values and use the average with reciprocal weights of the first three recent non-missing values.
Abstract: Adaptive fractional polynomial modeling of general correlated outcomes is formulated to address nonlinearity in means, variances/dispersions, and correlations. Means and variances/dispersions are modeled using generalized linear models in fixed effects/coefficients. Correlations are modeled using random effects/coefficients. Nonlinearity is addressed using power transforms of primary (untransformed) predictors. Parameter estimation is based on extended linear mixed modeling generalizing both generalized estimating equations and linear mixed modeling. Models are evaluated using likelihood cross-validation (LCV) scores and are generated adaptively using a heuristic search controlled by LCV scores. Cases covered include linear, Poisson, logistic, exponential, and discrete regression of correlated continuous, count/rate, dichotomous, positive continuous, and discrete numeric outcomes treated as normally, Poisson, Bernoulli, exponentially, and discrete numerically distributed, respectively. Example analyses are also generated for these five cases to compare adaptive random effects/coefficients modeling of correlated outcomes to previously developed adaptive modeling based on directly specified covariance structures. Adaptive random effects/coefficients modeling substantially outperforms direct covariance modeling in the linear, exponential, and discrete regression example analyses. It generates equivalent results in the logistic regression example analyses and it is substantially outperformed in the Poisson regression case. Random effects/coefficients modeling of correlated outcomes can provide substantial improvements in model selection compared to directly specified covariance modeling. However, directly specified covariance modeling can generate competitive or substantially better results in some cases while usually requiring less computation time.
Funding: Under the auspices of National Natural Science Foundation of China (No. 42101393, 41901375, 52274166); Hebei Natural Science Foundation (No. D2022209005, D2023209008); Central Guided Local Science and Technology Development Fund Project of Hebei Province (No. 236Z3305G, 246Z4201G); Key Research and Development Program of Science and Technology Plan of Tangshan, China (No. 22150221J).
Abstract: Coastal wetlands are crucial for the 'blue carbon sink', significantly contributing to regulating climate change. This study utilized 160 soil samples, 35 remote sensing features, and 5 geo-climatic data to accurately estimate the soil organic carbon stocks (SOCS) in the coastal wetlands of Tianjin and Hebei, China. To reduce data redundancy, simplify model complexity, and improve model interpretability, Pearson correlation analysis (PsCA), Boruta, and recursive feature elimination (RFE) were employed to optimize features. Combined with the optimized features, the soil organic carbon density (SOCD) prediction model was constructed by using multivariate adaptive regression splines (MARS), extreme gradient boosting (XGBoost), and random forest (RF) algorithms and applied to predict the spatial distribution of SOCD and estimate the SOCS of different wetland types in 2020. The results show that: 1) different feature combinations have a significant influence on the model performance. Better prediction performance was attained by building a model using RFE-based feature combinations. RF has the best prediction accuracy (R² = 0.587, RMSE = 0.798 kg/m², MAE = 0.660 kg/m²). 2) Optical features are more important than radar and geo-climatic features in the MARS, XGBoost, and RF algorithms. 3) The size of SOCS is related to SOCD and the area of each wetland type; aquaculture pond has the highest SOCS, followed by marsh, salt pan, mudflat, and sand shore.
Funding: The funding support provided by National Natural Science Foundation of China (Grant No. 42177121) and Thammasat University Research Unit in Structural and Foundation Engineering.
Abstract: This study presents a hybrid framework to predict stability solutions of buried structures under active trapdoor conditions in natural clays with anisotropy and heterogeneity by combining physics-based and data-driven modeling. Finite-element limit analysis (FELA) with a newly developed anisotropic undrained shear (AUS) failure criterion is used to identify the underlying active failure mechanisms as well as to develop a numerical (physics-based) database of stability numbers for both planar and circular trapdoors. Practical considerations are given for natural clays to three linearly increasing shear strengths in compression, extension, and direct simple shear in the AUS material model. The obtained numerical solutions are compared and validated with published solutions in the literature. A multivariate adaptive regression splines (MARS) algorithm is further utilized to learn the numerical solutions to act as fast FELA data-driven surrogates for stability evaluation. The current MARS-based modeling provides both relative importance index and accurate design equations that can be used with confidence by practitioners.
Funding: Supported by a scholarship of the Special Research Fund (BOF) obtained from Ghent University, Belgium; partially covered by the RIP-MU (VLIR, Belgium) project.
Abstract: Despite many studies on land degradation in the Highlands of Northern Ethiopia, quantitative information regarding long-term changes in land use/cover (LUC) is rare. Hence, this study aims to investigate the LUC changes in the Geba catchment (5142 km²), Northern Ethiopia, over 80 years (1935–2014). Aerial photographs (APs) of the 1930s and Google Earth (GE) images (2014) were used. The point-count technique was utilized by overlaying a grid on APs and GE images. The occurrence of cropland, forest, grassland, shrubland, bare land, built-up areas and water body was counted to compute their fractions. A multivariate adaptive regression spline was applied to identify the explanatory factors of LUC and to create fractional maps of LUC. The results indicate significant changes of most types, except for forest and cropland. In the 1930s, shrubland (48%) was dominant, followed by cropland (39%). The fraction of cropland in 2014 (42%) remained approximately the same as in the 1930s, while shrubland significantly dropped to 37%. Forests shrank further from a meagre 6.3% in the 1930s to 2.3% in 2014. High overall accuracies (93% and 83%) and strong Kappa coefficients (89% and 72%) for point counts and fractional maps respectively indicate the validity of the techniques used for LUC mapping.
Abstract: In the present scenario, computational modeling has gained much importance for the prediction of the properties of concrete. This paper shows how computational intelligence can be applied to the prediction of the compressive strength of Self Compacting Concrete (SCC). Three models, namely, Extreme Learning Machine (ELM), Adaptive Neuro Fuzzy Inference System (ANFIS) and Multi Adaptive Regression Spline (MARS), have been employed in the present study for the prediction of the compressive strength of self compacting concrete. The contents of cement (c), sand (s), coarse aggregate (a), fly ash (f), water/powder (w/p) ratio and superplasticizer (sp) dosage have been taken as inputs and 28 days compressive strength (fck) as output for the ELM, ANFIS and MARS models. A relatively large set of data including 80 normalized data available in the literature has been taken for the study. A comparison is made between the results obtained from all the above-mentioned models, and the model which provides the best fit is established. The experimental results demonstrate that the proposed models are robust for determination of the compressive strength of self-compacting concrete.
Funding: Supported by the National Key Research and Development Program of China (2021YFB4000500, 2021YFB4000501, and 2021YFB4000502).
Abstract: Steam cracking is the dominant technology for producing light olefins, which are believed to be the foundation of the chemical industry. Predictive models of the cracking process can boost production efficiency and profit margin. Rapid advancements in machine learning research have recently enabled data-driven solutions to usher in a new era of process modeling. Meanwhile, its practical application to steam cracking is still hindered by the trade-off between prediction accuracy and computational speed. This research presents a framework for data-driven intelligent modeling of the steam cracking process. Industrial data preparation and feature engineering techniques provide computational-ready datasets for the framework, and feedstock similarities are exploited using k-means clustering. We propose LArge-Residuals-Deletion Multivariate Adaptive Regression Spline (LARD-MARS), a modeling approach that explicitly generates output formulas and eliminates potentially outlying instances. The framework is validated further by the presentation of clustering results, the explanation of variable importance, and the testing and comparison of model performance.