Background: Cotton is one of the most important commercial crops after food crops, especially in countries like India, where it is grown extensively under rainfed conditions. Its use in multiple industries, such as the textile, medicine, and automobile industries, gives it great commercial importance. The crop's performance is strongly influenced by prevailing weather dynamics. As the climate changes, assessing how weather changes affect crop performance is essential. Among the various techniques available, crop models are the most effective and widely used tools for predicting yields. Results: This study compares statistical and machine learning models to assess their ability to predict cotton yield across the major producing districts of Karnataka, India, using a long-term dataset spanning 1990 to 2023 that includes yield and weather factors. The artificial neural networks (ANNs) performed best, with acceptable yield deviations within ±10% during both the vegetative stage (F1) and the mid stage (F2) of cotton. The model evaluation metrics, namely root mean square error (RMSE), normalized root mean square error (nRMSE), and modelling efficiency (EF), were also within the acceptance limits in most districts. Furthermore, the tested ANN model was used to assess the importance of the dominant weather factors influencing crop yield in each district. Specifically, morning relative humidity as an individual parameter and its interaction with maximum and minimum temperature had a major influence on cotton yield in most of the districts where yield was predicted. These differences highlighted the differential interactions of weather factors in each district for cotton yield formation, and the individual response of each weather factor under different soils and management conditions across the major cotton-growing districts of Karnataka. Conclusions: Compared with statistical models, machine learning models such as ANNs were more efficient in forecasting cotton yield because of their ability to consider the interactive effects of weather factors on yield formation at different growth stages. This highlights the suitability of ANNs for yield forecasting under rainfed conditions and for studying the relative impacts of weather factors on yield. Thus, the study aims to provide valuable insights to support stakeholders in planning effective crop management strategies and formulating relevant policies.
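The evaluation metrics named in this abstract (RMSE, nRMSE, EF and the ±10% yield deviation) can be illustrated with a minimal sketch. The abstract does not give formulas, so the definitions below are assumptions based on common usage: nRMSE is taken as RMSE expressed as a percentage of the observed mean, and EF as the Nash-Sutcliffe form. The function names and the yield values are hypothetical and for illustration only.

```python
import math


def yield_metrics(observed, predicted):
    """Return (RMSE, nRMSE in % of the observed mean, modelling efficiency EF).

    Assumed definitions (not stated in the abstract):
      nRMSE = 100 * RMSE / mean(observed)
      EF    = 1 - sum((o - p)^2) / sum((o - mean(observed))^2)
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / n
    sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    rmse = math.sqrt(sq_err / n)
    nrmse = 100.0 * rmse / mean_obs
    ef = 1.0 - sq_err / sum((o - mean_obs) ** 2 for o in observed)
    return rmse, nrmse, ef


def percent_deviation(observed, predicted):
    """Signed deviation of each prediction from the observation, in percent."""
    return [100.0 * (p - o) / o for o, p in zip(observed, predicted)]


# Hypothetical district yields (kg/ha), not taken from the study.
obs = [400.0, 500.0, 600.0]
pred = [420.0, 480.0, 630.0]

rmse, nrmse, ef = yield_metrics(obs, pred)
deviations = percent_deviation(obs, pred)
within_limit = all(abs(d) <= 10.0 for d in deviations)
```

An EF close to 1 means the model explains most of the variance in observed yield, while an EF at or below 0 means it does no better than predicting the observed mean.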
Background: Survival from birth to slaughter is an important economic trait in commercial pig production. Increasing survival can improve both economic efficiency and animal welfare. The aim of this study is to explore the impact of genotyping strategies and statistical models on the accuracy of genomic prediction for survival in pigs during the total growing period from birth to slaughter. Results: We simulated pig populations with different direct and maternal heritabilities and used a linear mixed model, a logit model, and a probit model to predict genomic breeding values of pig survival based on individual survival records with binary outcomes (0, 1). The results show that when only live animals have genotype data, unbiased genomic predictions can be achieved by using variances estimated from a pedigree-based model. Models using genomic information achieved up to 59.2% higher accuracy of estimated breeding values than the pedigree-based model, depending on the genotyping scenario. The scenario of genotyping all individuals, both dead and alive, obtained the highest accuracy. When an equal number of individuals (80%) were genotyped, genotyping a random sample of individuals achieved higher accuracy than genotyping only live individuals. The linear, logit and probit models achieved similar accuracy. Conclusions: Genomic prediction of pig survival is feasible when only live pigs have genotypes, but genomic information from dead individuals can increase the accuracy of genomic prediction by 2.06% to 6.04%.
The water resources of the Nadhour-Sisseb-El Alem Basin in Tunisia are subject to semi-arid and arid climatic conditions. This induces excessive pumping of groundwater, which causes drops in water level of about 1-2 m/a. These unfavorable conditions require interventions to rationalize integrated management in decision making. The aim of this study is to determine a water recharge index (WRI), delineate the potential groundwater recharge area and estimate the potential groundwater recharge rate based on the integration of statistical models derived from remote sensing imagery, GIS digital data (e.g., lithology, soil, runoff), measured artificial recharge data, fuzzy set theory and multi-criteria decision making (MCDM) using the analytical hierarchy process (AHP). Eight factors affecting potential groundwater recharge were determined, namely lithology, soil, slope, topography, land cover/use, runoff, drainage and lineaments. The WRI ranges between 1.2 and 3.1 and is classified into five classes: poor, weak, moderate, good and very good sites of potential groundwater recharge. The very good and good classes occupied 27% and 44% of the study area, respectively. The potential groundwater recharge rate was 43% of total precipitation. According to the results of the study, river beds are favorable sites for groundwater recharge.
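As a rough illustration of how AHP can turn pairwise comparisons into factor weights for an index such as the WRI, the sketch below uses the row geometric-mean method, a common approximation to the principal-eigenvector weights. This is an assumption: the abstract does not say which AHP variant, comparison matrix or scoring scale the authors used. The three-factor matrix and the scores are hypothetical (the study itself uses eight factors).

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] is how much more important factor i is than factor j
    (so pairwise[j][i] should be its reciprocal).
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def weighted_index(scores, weights):
    """Weighted sum of per-factor scores for one map cell."""
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical, perfectly consistent comparison of three factors
# (for example lithology, slope, drainage); values are illustrative only.
matrix = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(matrix)

# Hypothetical suitability scores of one cell for the three factors.
cell_index = weighted_index([3.0, 2.0, 1.0], weights)
```

For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the eigenvector weights; for real judgments a consistency check would normally accompany the weights.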
The objective of this study is to analyze the sensitivity of statistical models to sample size. The study, carried out in Ivory Coast, is based on annual maximum daily rainfall data collected from 26 stations. The methodological approach is based on the statistical modeling of maximum daily rainfall. Fits were made for several sample sizes and several return periods (2, 5, 10, 20, 50 and 100 years). The main results show that the 30-year series (1931-1960; 1961-1990; 1991-2020) are best fitted by the Gumbel (26.92%-53.85%) and Inverse Gamma (26.92%-46.15%) models. The 60-year series (1931-1990; 1961-2020) are best fitted by the Inverse Gamma (30.77%), Gamma (15.38%-46.15%) and Gumbel (15.38%-42.31%) models. For the full 1931-2020 record (90 years), the Gumbel model clearly dominates (50%) over the Gamma (34.62%) and Inverse Gamma (15.38%) models. Overall, the Gumbel model is the most dominant, particularly in wet periods. The data for periods with normal and dry trends were better fitted by the Gamma and Inverse Gamma models.
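To make the link between a fitted Gumbel model and the return periods listed above concrete, here is a minimal sketch that fits a Gumbel distribution by the method of moments and evaluates the T-year return level. The fitting method is an assumption (the abstract does not state how the distributions were fitted), and the rainfall sample is hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(sample):
    """Method-of-moments estimates (location, scale) of a Gumbel distribution."""
    mean = statistics.mean(sample)
    std = statistics.stdev(sample)
    scale = std * math.sqrt(6.0) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def return_level(loc, scale, period_years):
    """Rainfall depth exceeded on average once every `period_years` years."""
    p_non_exceed = 1.0 - 1.0 / period_years
    return loc - scale * math.log(-math.log(p_non_exceed))


# Hypothetical annual maximum daily rainfall (mm), not data from the study.
annual_max_mm = [62.0, 85.5, 71.2, 103.4, 90.1, 77.8, 120.6, 68.3, 95.0, 82.7]

loc, scale = fit_gumbel_moments(annual_max_mm)
levels = {T: return_level(loc, scale, T) for T in (2, 5, 10, 20, 50, 100)}
```

Refitting on sub-periods of different lengths (for example 30, 60 and 90 years) and comparing the resulting return levels is one simple way to see the sample-size sensitivity the abstract describes.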
The usability of an interface is a fundamental issue to elucidate. Many researchers have argued that many usability results and recommendations lack empirical and experimental data. In this research, the usability of web pages is evaluated using several carefully selected statistical models. University web pages were chosen as subjects for this work for ease of comparison and of data collection. A series of experiments was conducted to investigate the usability and design of university web pages. Prototype web pages were developed according to structured methodologies of web page design and usability. The university web pages were evaluated together with the prototype web pages using a questionnaire designed according to Human Computer Interaction (HCI) heuristics. Nine respondent (user) variables and 14 web page variables (items) were studied. Stringent statistical analysis was adopted to extract the required information from the data acquired, followed by an augmented interpretation of the statistical results. The analysis of variance (ANOVA) procedure showed significant differences among the university web pages for most of the 23 items studied. The Duncan Multiple Range Test (DMRT) showed that the prototype performed significantly better in usability for most of the items. The correlation analysis showed significant positive and negative correlations between many items. The regression analysis revealed that the most significant factors (items) contributing to the best model of university web page design and usability were multimedia in the web pages, the organisation and design of the web page icons (alone), and graphics attractiveness. The results revealed some limitations of some heuristics used in conventional interface systems design and suggested some additional heuristics for web page design and usability.
Statistical models that use historical data on crop yields and weather to calibrate relatively simple regression equations have been widely applied in previous studies, and have provided a common alternative to process-based models, which require extensive input data on cultivar, management, and soil conditions. However, very few studies have systematically reviewed previous statistical models for identifying climate contributions to crop yields. This paper introduces three main statistical methods, i.e., the time-series model, the cross-section model and the panel model, which have been used to identify such contributions in the field of agrometeorology. Generally, the spatial scale of research using statistical models can be categorized into two types: site scale and regional scale (e.g., global, national, provincial and county scale). Four issues exist in identifying the response sensitivity of crop yields to climate change with statistical models: the extent of the spatial and temporal scale, non-climatic trend removal, collinearity among climate variables, and the non-consideration of adaptations. Respective solutions to these four issues are put forward in the final section, on perspectives for the future of statistical models.
Based on a review and comparison of the main statistical analysis models for estimating variety-environment cell means in regional crop trials, a new statistical model, the LR-PCA composite model, was proposed, and the predictive precision of these models was compared by cross validation on an example dataset. Results showed that the order of model precision was LR-PCA model > AMMI model > PCA model > Treatment Means (TM) model > Linear Regression (LR) model > Additive Main Effects ANOVA model. The precision gain factor of the LR-PCA model was 1.55, an increase of 8.4% compared with AMMI.
QTL mapping for seven quality traits was conducted using 254 recombinant inbred lines (RIL) derived from a japonica-japonica rice cross of Xiushui 79/C Bao. The seven traits investigated were grain length (GL), grain length to width ratio (LWR), chalk grain rate (CGR), chalkiness degree (CD), gelatinization temperature (GT), amylose content (AC) and gel consistency (GC) of head rice. The three mapping methods employed were composite interval mapping in QTLMapper 2.0 software based on a mixed linear model (MCIM), inclusive composite interval mapping in QTL IciMapping 3.0 software based on a stepwise regression linear model (ICIM), and multiple interval mapping with regression forward selection in Windows QTL Cartographer 2.5 based on multiple regression analysis (MIMR). Results showed that five QTLs with additive effects (A-QTLs) were detected by all three methods, two by two methods, and 23 by only one method. Five A-QTLs were detected by MCIM, nine by ICIM and 28 by MIMR. The contribution rates of single A-QTLs ranged from 0.89% to 38.07%. None of the QTLs with epistatic effects (E-QTLs) detected by MIMR were detected by the other two methods. Fourteen pairs of E-QTLs were detected by both MCIM and ICIM, and 142 pairs of E-QTLs were detected by only one method. Twenty-five pairs of E-QTLs were detected by MCIM, 141 pairs by ICIM and four pairs by MIMR. The contribution rates of single pairs of E-QTLs ranged from 2.60% to 23.78%. In the Xiu-Bao RIL population, epistatic effects played a major role in the variation of GL and CD, and additive effects were dominant in the variation of LWR, while epistatic and additive effects were equally important in the variation of CGR, AC, GT and GC. QTLs detected by two or more methods simultaneously were highly reliable, and could be applied to improve the quality traits in japonica hybrid rice.
Semi-rigid liquid crystal polymers are a class of liquid crystal polymers distinct from the long rigid-rod liquid crystal polymers to which the well-known Onsager and Flory theories apply. In this paper, three statistical models for semi-rigid nematic polymers are addressed: the elastically jointed rod model, the worm-like chain model, and the non-homogeneous chain model. The nematic-isotropic transition temperature is examined. The pseudo-second transition temperature is expressed analytically. Comparisons with experiments were made, and agreement was found.
Non-metallic inclusions are critical for the fatigue failure of clean steels in service; large and hard inclusions are especially detrimental. Since it is not possible to measure all the inclusions in large-volume clean steels, statistical models have been developed to evaluate inclusions, aiming to predict the maximum inclusion size in the large volume from inclusion data derived from limited observations on small-volume specimens. Different statistical models were reviewed together with their supporting theories. In particular, the block maxima and the threshold types of models were discussed through a thorough comparison, as they are both widely used and based on extreme value theory. The predicted results are used not only to distinguish the different cleanliness levels of steels, but also to help estimate fatigue strength. Finally, future research is proposed to focus on tackling the present difficulties encountered by statistical models, including the credibility of the obtained results and the robustness of the models in applications.
Following the basic principles of modern multivariate statistical analysis theory, description, prediction and control models relating the chemical compositions and mechanical properties of steels are introduced. As an example, the overall flowchart of the composition and structure/property description, prediction and control models for the chemical composition and mechanical properties of 20 and A_2 steels is presented.
This paper reviews differences between the deterministic (sharp and diffuse) and statistical models of the interphase region between two phases. In the literature this region is usually referred to as the (macroscopic) interface. Therein, the mesoscopic interface, which is defined at the molecular level and agitated by thermal fluctuations, is found with nonzero probability. For this reason, in this work the interphase region is called the mesoscopic intermittency/transition region. The first part of the present work gives the rationale for introducing the statistical model of the mesoscopic intermittency region. It is argued that the classical (deterministic) sharp and diffuse models do not explain the experimental and numerical results presented in the literature. It is then shown that a statistical model of the mesoscopic intermittency region (SMIR) combines the existing sharp and diffuse models into a single coherent framework and explains published experimental and numerical results. In the second part of the paper, the SMIR is used for the first time to predict equilibrium and nonequilibrium two-phase flow in numerical simulation. To this end, a two-dimensional rising gas bubble is studied; the numerical results obtained are used as a basis to discuss differences between the deterministic and statistical models, showing that the statistical description has the potential to account for physical phenomena not previously considered in computer simulations.
Dengue fever (DF), caused by the dengue virus and transmitted by the Aedes mosquito vector, is a dangerous infectious disease with the potential to become a global epidemic. Vietnam, particularly Ba Ria-Vung Tau (BRVT) province, faces a high risk of DF. This study aims to determine the relationship between the search volume for DF on Google Trends and DF cases in BRVT province, and thereby to construct a model for early prediction of local DF outbreak risk. Using Poisson regression (adjusted by quasi-Poisson), considering the lagged effect of the Google Trends Index (GTI) search volume on DF cases, and removing the autocorrelation (AC) of DF cases with appropriate transformations, seven forecast models were examined on a weekly dataset of DF cases and GTI search volume for the phrase "sốt xuất huyết" (dengue fever) in BRVT province from January 2019 to August 2023 (243 weeks). The model selected is the one with the lowest dispersion index. The correlation coefficient (95% confidence interval) and dispersion index of the seven models are: Basis TSR, 0.71 (0.63-0.76) and 74.2; Basis TSR + AC: Lag(Residuals, 1), 0.79 (0.73-0.83) and 48.6; Basis TSR + AC: Lag(SXH, 1), 0.89 (0.87-0.92) and 37.3; Basis TSR + AC: Lag(log(SXH+1), 1), 0.98 (0.97-0.99) and 7.2; TSR Lag(GTI, 2) + AC: Lag(log(SXH+1), 2), 0.96 (0.95-0.97) and 14.3; TSR Lag(GTI, 3) + AC: Lag(log(SXH+1), 3), 0.93 (0.91-0.94) and 25.7; and TSR Lag(GTI, 0) + AC: Lag(log(SXH+1), 1), 0.98 (0.97-0.99) and 6.8. Therefore, the last of these models, which has the lowest dispersion index, is selected as the most suitable. Testing the accuracy of the selected model using the ROC curve with the Youden criterion gives an AUC of 0.982 at the 75% threshold and 0.984 at the 95% threshold, indicating very good predictive ability. In summary, the results show the potential for applying this model in Vietnam, especially in BRVT, to enhance the effectiveness of epidemic prevention measures and protect public health.
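Two ingredients of the model comparison above, lagging a weekly series and computing a dispersion index, can be sketched in a few lines. The abstract does not define its dispersion index; the sketch assumes the usual quasi-Poisson estimate (Pearson chi-square divided by the residual degrees of freedom). The function names and the numbers are hypothetical, and the fitted values would in practice come from a fitted regression rather than being typed in.

```python
def lag(series, k):
    """Shift a weekly series by k weeks; the first k weeks have no lagged value."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        return list(series)
    return [None] * k + list(series[:-k])


def dispersion_index(observed, fitted, n_params):
    """Pearson chi-square over residual degrees of freedom (assumed definition).

    Values near 1 are consistent with Poisson variance; much larger values
    indicate overdispersion.
    """
    pearson = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return pearson / (len(observed) - n_params)


# Hypothetical weekly case counts and fitted means from a two-parameter model.
cases = [1, 5, 9, 3]
fitted_means = [2.0, 4.0, 8.0, 4.0]

lagged_cases = lag(cases, 1)
phi = dispersion_index(cases, fitted_means, n_params=2)
```

Under this definition, choosing the model with the lowest dispersion index amounts to choosing the one whose residual variance is closest to what a Poisson model would predict.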
In this work, four empirical models of statistical thickness, namely the Harkins and Jura, Halsey, Carbon Black and Jaroniec models, were compared in order to determine the textural properties (external surface and micropore surface) of a clay concrete without molasses and of clay concretes stabilized with 8%, 12% and 16% molasses. The results show that Halsey's model can be used to obtain the external surfaces. However, it does not allow the micropore surface to be obtained, and is therefore not suitable for either simple clay concrete (without molasses) or clay concretes stabilized with molasses. The Carbon Black, Jaroniec and Harkins and Jura models can be used for both clay concrete and stabilized clay concrete. Of these, the Carbon Black model is the most relevant for clay concrete and the Harkins and Jura model for molasses-stabilized clay concrete. These last two models augur well for future research.
Accurate Global Horizontal Irradiance (GHI) forecasting has become vital for successfully integrating solar energy into the electrical grid because of the expanding demand for green power and the worldwide shift towards green energy resources. Particularly given the implications of aggressive GHG emission targets, accurate GHI forecasting has become vital for developing, designing, and operationally managing solar energy systems. This research presents the core concepts of modelling and performance analysis for various forecasting models, namely ARIMA (Autoregressive Integrated Moving Average), Elman NN (Elman Neural Network), RBFN (Radial Basis Function Neural Network), SVM (Support Vector Machine), LSTM (Long Short-Term Memory), Persistent, BPN (Back Propagation Neural Network), MLP (Multilayer Perceptron Neural Network), RF (Random Forest), and XGBoost (eXtreme Gradient Boosting), applied to multi-seasonal forecasting of GHI. Data from the India region were used to evaluate the models' performance and forecasting ability. The models were applied to seasonal GHI forecasting in winter, spring, summer, monsoon, and autumn. Performance was assessed through evaluation metrics, namely Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared (R²), coded in Python. The performance analysis indicated that Random Forest and eXtreme Gradient Boosting are the superior and competing models, giving the most accurate forecasts in all seasons compared with the other forecasting models. For winter, XGBoost is the best forecasting model, with MAE: 1.6325, RMSE: 4.8338, and R²: 0.9998. For spring, XGBoost is the best forecasting model, with MAE: 2.599599, RMSE: 5.58539, and R²: 0.999784. For summer, RF is the best forecasting model, with MAE: 1.03843, RMSE: 2.116325, and R²: 0.999967. For the monsoon season, RF is the best forecasting model, with MAE: 0.892385, RMSE: 2.417587, and R²: 0.999942. For autumn, RF is the best forecasting model, with MAE: 0.810462, RMSE: 1.928215, and R²: 0.999958. Taking seasonal variations and computing constraints into account, the findings enable energy system operators to make informed recommendations for choosing the most effective forecasting models.
In this study, the gradients of Total Electron Content (TEC) for a midlatitude region are estimated and grouped with respect to the distance between neighboring stations, time periods within a day, and satellite directions. Annual medians of these gradients for quiet days are computed as templates. The metric distances (L2N) and Symmetric Kullback-Leibler Distances (SKLD) are obtained between the templates and the daily gradient series. The grouped histograms are fitted to the prospective Probability Density Functions (PDF). The method is applied to the Slant Total Electron Content (STEC) estimates from the Turkish National Permanent GPS Network (TNPGN-Active) for 2015. The highest gradients are observed in the east-west axis, with a maximum of 25 mm/km during a geomagnetic storm. The maximum differences from the gradient templates occur for neighboring stations within 100-130 km of each other, during night hours, and for regions bordering the Black Sea and the Mediterranean in the northeast and southeast of Turkey. The empirical PDFs of the station-pair gradients are predominantly Weibull-distributed. The mean values of the Weibull PDFs in all station groups are between 1.2 and 1.8 mm/km, with an increase during noon and afternoon hours. The standard deviations of the gradient PDFs generally increase during night hours. The algorithm will form a basis for quantifying the stochastic variations of the spatial rate of change of TEC trends in midlatitude regions, thus supplementing reliable and accurate regional monitoring of ionospheric variability.
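The two distance measures named in this abstract can be sketched for a pair of histograms as follows. This is an illustration under assumptions: the abstract does not specify normalisation or how empty bins are handled, so the sketch normalises each histogram to unit sum and pads bins with a small constant `eps` to avoid division by zero. The function names and the bin counts are hypothetical.

```python
import math


def normalise(hist, eps=1e-12):
    """Turn bin counts into probabilities; eps (assumed) guards against empty bins."""
    padded = [h + eps for h in hist]
    total = sum(padded)
    return [h / total for h in padded]


def l2_distance(a, b):
    """Euclidean (L2 norm) distance between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def symmetric_kl(p_hist, q_hist):
    """Symmetric Kullback-Leibler distance: KL(p||q) + KL(q||p)."""
    p = normalise(p_hist)
    q = normalise(q_hist)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical two-bin histograms: a quiet-day template and one daily histogram.
template = [1, 1]
daily = [1, 3]

skld = symmetric_kl(template, daily)
l2n = l2_distance([0.0, 0.0], [3.0, 4.0])
```

The symmetric form is convenient for comparing a daily distribution with a template because, unlike the plain Kullback-Leibler divergence, it does not depend on which of the two is treated as the reference.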
Numerical models are crucial for quantifying the ocean-atmosphere interactions associated with the El Niño-Southern Oscillation (ENSO) phenomenon in the tropical Pacific. Current coupled models often exhibit significant biases and inter-model differences in simulating ENSO, underscoring the need for alternative modeling approaches. The Regional Ocean Modeling System (ROMS) is a sophisticated ocean model widely used for regional studies and has been coupled with various atmospheric models. However, its application in simulating ENSO processes on a basin scale in the tropical Pacific has not been explored. For the first time, this study presents the development of a basin-scale hybrid coupled model (HCM) for the tropical Pacific, integrating ROMS with a statistical atmospheric model that captures the interannual relationships between sea surface temperature (SST) and wind stress anomalies. The HCM is evaluated for its capability to simulate the annual mean, seasonal, and interannual variations of the oceanic state in the tropical Pacific. Results demonstrate that the model effectively reproduces the ENSO cycle, with a dominant oscillation period of approximately two years. The ROMS-based HCM developed here offers an efficient and robust tool for investigating climate variability in the tropical Pacific.
Accurate assessment of coal brittleness is crucial in the design of coal seam drilling and underground coal mining operations. This study proposes a method for evaluating the brittleness of gas-bearing coal based on a statistical damage constitutive model and energy evolution mechanisms. First, by integrating the principle of effective stress and the Hoek-Brown criterion, a statistical damage constitutive model for gas-bearing coal is established, and its accuracy and applicability are verified through triaxial compression tests under different gas pressures. Subsequently, using the energy evolution mechanism, two energy characteristic parameters (elastic energy proportion and dissipated energy proportion) are analyzed. Based on the damage stress thresholds, the damage evolution characteristics of gas-bearing coal are explored. Finally, by integrating the energy characteristic parameters with the damage parameters, a novel brittleness index is proposed. The results demonstrate that the theoretical curves derived from the statistical damage constitutive model closely align with the test curves, accurately reflecting the stress-strain characteristics of gas-bearing coal and revealing the stress drop and softening characteristics of coal in the post-peak stage. The shape parameter and scale parameter represent the brittleness and macroscopic strength of the coal, respectively. As gas pressure increases from 1 to 5 MPa, the shape parameter and the scale parameter decrease by 22.18% and 60.45%, respectively, indicating a reduction in both the brittleness and the strength of the coal. Parameters such as the maximum damage rate and the peak elastic energy storage limit correlate positively with coal brittleness. The brittleness index effectively captures the brittleness characteristics and reveals a decrease in brittleness and an increase in sensitivity to plastic deformation under higher gas pressure conditions.
Soil erosion is a crucial geo-environmental hazard worldwide that affects water quality and agriculture, decreases reservoir storage capacity due to sedimentation, and increases the danger of flooding and landslides. This study therefore uses geospatial modeling to produce soil erosion susceptibility maps (SESM) for the Hangu region, Khyber Pakhtunkhwa (KPK), Pakistan. The Hangu region, located in the Kohat Plateau of KPK, is particularly susceptible to soil erosion due to its unique geomorphological and climatic characteristics. Moreover, the region is characterized by a combination of steep slopes, variable rainfall patterns, diverse land use, and distinct soil types, all of which contribute to the complexity and severity of soil erosion processes. These factors necessitate a detailed and region-specific study to develop effective soil conservation strategies. In this research, we detected and mapped 1013 soil erosion points and prepared 12 predisposing factors of soil erosion (elevation, aspect, slope, Normalized Difference Vegetation Index (NDVI), drainage network, curvature, Land Use Land Cover (LULC), rainfall, lithology, contour, soil texture, and road network) on a GIS platform. GIS-based statistical models, namely the weight of evidence (WOE) and frequency ratio (FR), were then applied to produce the SESM for the study area. The SESM was reclassified into four classes, i.e., low, medium, high, and very high zones. The WOE results show that 16.39%, 33.02%, 29.27%, and 21.30% of the area is covered by the low, medium, high, and very high zones, respectively. In contrast, the FR results show that 16.50%, 24.33%, 35.55%, and 23.59% of the area is occupied by the low, medium, high, and very high classes. Furthermore, the reliability of the applied models was evaluated using the Area Under Curve (AUC) technique. The validation results showed that the success rate curve (SRC) and prediction rate curve (PRC) for WOE are 82% and 86%, respectively, while the SRC and PRC for FR are 85% and 96%, respectively. The validation results indicate that the FR model performs better and is more reliable than the WOE model.
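The frequency ratio (FR) used above is commonly computed, for each class of a predisposing factor, as the share of erosion points falling in that class divided by the share of the study area occupied by that class. The sketch below follows that common definition, which is an assumption here since the abstract does not spell out the formula. The class names, point counts and areas are hypothetical.

```python
def frequency_ratio(points_in_class, area_in_class):
    """Frequency ratio per class: (share of erosion points) / (share of area).

    FR > 1 means erosion points are over-represented in the class,
    FR < 1 means they are under-represented.
    """
    total_points = sum(points_in_class.values())
    total_area = sum(area_in_class.values())
    return {
        name: (points_in_class[name] / total_points) / (area_in_class[name] / total_area)
        for name in area_in_class
    }


# Hypothetical slope classes: erosion point counts and class areas (km^2).
points = {"gentle": 100, "moderate": 300, "steep": 600}
areas = {"gentle": 50.0, "moderate": 30.0, "steep": 20.0}

fr = frequency_ratio(points, areas)
```

Summing the FR values of the classes a map cell falls into, across all factors, is one usual way of turning these ratios into a susceptibility score that can then be reclassified into zones.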
Forecasting the movement of stock market is a long-time attractive topic. This paper implements different statistical learning models to predict the movement of S&P 500 index. The S&P 500 index is influenced b...Forecasting the movement of stock market is a long-time attractive topic. This paper implements different statistical learning models to predict the movement of S&P 500 index. The S&P 500 index is influenced by other important financial indexes across the world such as commodity price and financial technical indicators. This paper systematically investigated four supervised learning models, including Logistic Regression, Gaussian Discriminant Analysis (GDA), Naive Bayes and Support Vector Machine (SVM) in the forecast of S&P 500 index. After several experiments of optimization in features and models, especially the SVM kernel selection and feature selection for different models, this paper concludes that a SVM model with a Radial Basis Function (RBF) kernel can achieve an accuracy rate of 62.51% for the future market trend of the S&P 500 index.展开更多
Funding: Funded by the India Meteorological Department, New Delhi, India, under the Forecasting Agricultural output using Space, Agrometeorology and Land based observations (FASAL) project, fund number No. ASC/FASAL/KT-11/01/HQ-2010.
Abstract: Background: Cotton is one of the most important commercial crops after food crops, especially in countries like India, where it is grown extensively under rainfed conditions. Because of its use in multiple industries, such as the textile, medicine, and automobile industries, it has great commercial importance. The crop's performance is greatly influenced by prevailing weather dynamics. As the climate changes, assessing how weather changes affect crop performance is essential. Among the various techniques available, crop models are the most effective and widely used tools for predicting yields. Results: This study compares statistical and machine learning models to assess their ability to predict cotton yield across the major producing districts of Karnataka, India, using a long-term dataset spanning 1990 to 2023 that includes yield and weather factors. The artificial neural networks (ANNs) performed best, with acceptable yield deviations within ±10% during both the vegetative stage (F1) and the mid stage (F2) for cotton. The model evaluation metrics, namely root mean square error (RMSE), normalized root mean square error (nRMSE), and modelling efficiency (EF), were also within the acceptance limits in most districts. Furthermore, the tested ANN model was used to assess the importance of the dominant weather factors influencing crop yield in each district. Specifically, morning relative humidity as an individual parameter, and its interaction with maximum and minimum temperature, had a major influence on cotton yield in most of the districts for which yield was predicted. These differences highlighted the differential interactions of weather factors in cotton yield formation in each district, and the individual response of each weather factor under different soils and management conditions across the major cotton-growing districts of Karnataka. Conclusions: Compared with statistical models, machine learning models such as ANNs proved more efficient in forecasting cotton yield because they can account for the interactive effects of weather factors on yield formation at different growth stages. This highlights the suitability of ANNs for yield forecasting in rainfed conditions and for studying the relative impacts of weather factors on yield. The study thus aims to provide valuable insights to support stakeholders in planning effective crop management strategies and formulating relevant policies.
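The evaluation metrics named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the abstract does not give formulas, so the commonly used definitions are assumed (nRMSE as RMSE divided by the observed mean, in percent, and EF in its Nash-Sutcliffe form). All function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted yields."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def nrmse_percent(observed, predicted):
    """RMSE normalized by the observed mean, in percent (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_obs


def modelling_efficiency(observed, predicted):
    """EF = 1 - SS_residual / SS_total (assumed Nash-Sutcliffe form)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def percent_deviation(observed, predicted):
    """Per-year deviation of predicted from observed yield, in percent."""
    return [100.0 * (p - o) / o for o, p in zip(observed, predicted)]
```

With these definitions, a forecast would meet the ±10% criterion mentioned above when every value returned by `percent_deviation` lies between -10 and 10.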
Funding: Funded by the "Genetic improvement of pig survival" project of the Danish Pig Levy Foundation (Aarhus, Denmark). The China Scholarship Council (CSC) provided a scholarship to the first author.
Abstract: Background: Survival from birth to slaughter is an important economic trait in commercial pig production. Increasing survival can improve both economic efficiency and animal welfare. The aim of this study is to explore the impact of genotyping strategies and statistical models on the accuracy of genomic prediction for survival in pigs during the total growing period from birth to slaughter. Results: We simulated pig populations with different direct and maternal heritabilities and used a linear mixed model, a logit model, and a probit model to predict genomic breeding values of pig survival based on individual survival records with binary outcomes (0, 1). The results show that when only live animals have genotype data, unbiased genomic predictions can be achieved by using variances estimated from a pedigree-based model. Models using genomic information achieved up to 59.2% higher accuracy of estimated breeding values than the pedigree-based model, depending on the genotyping scenario. The scenario of genotyping all individuals, both dead and alive, obtained the highest accuracy. When an equal proportion of individuals (80%) was genotyped, genotyping a random sample of individuals achieved higher accuracy than genotyping only live individuals. The linear, logit, and probit models achieved similar accuracy. Conclusions: Genomic prediction of pig survival is feasible when only live pigs have genotypes, but genomic information from dead individuals can increase the accuracy of genomic prediction by 2.06% to 6.04%.
Abstract: The Nadhour-Sisseb-El Alem Basin in Tunisia has semi-arid to arid climatic conditions. These conditions induce excessive pumping of groundwater, which lowers the water level by about 1-2 m/a. Such unfavorable conditions call for interventions that rationalize integrated management in decision making. The aim of this study is to determine a water recharge index (WRI), delineate the potential groundwater recharge area, and estimate the potential groundwater recharge rate by integrating statistical models derived from remote sensing imagery, GIS digital data (e.g., lithology, soil, runoff), measured artificial recharge data, fuzzy set theory, and multi-criteria decision making (MCDM) using the analytical hierarchy process (AHP). Eight factors affecting potential groundwater recharge were identified, namely lithology, soil, slope, topography, land cover/use, runoff, drainage, and lineaments. The WRI ranges between 1.2 and 3.1 and is classified into five classes of potential groundwater recharge area: poor, weak, moderate, good, and very good. The very good and good classes occupy 27% and 44% of the study area, respectively. The potential groundwater recharge rate was 43% of total precipitation. According to the results of the study, river beds are favorable sites for groundwater recharge.
Abstract: The objective of this study is to analyze the sensitivity of statistical models to sample size. The study, carried out in Ivory Coast, is based on annual maximum daily rainfall data collected from 26 stations. The methodological approach is based on the statistical modeling of maximum daily rainfall. Adjustments were made for several sample sizes and several return periods (2, 5, 10, 20, 50 and 100 years). The main results show that the 30-year series (1931-1960; 1961-1990; 1991-2020) are better fitted by the Gumbel (26.92% - 53.85%) and Inverse Gamma (26.92% - 46.15%) models. The 60-year series (1931-1990; 1961-2020) are better fitted by the Inverse Gamma (30.77%), Gamma (15.38% - 46.15%) and Gumbel (15.38% - 42.31%) models. For the full 1931-2020 record (90 years), the Gumbel model clearly dominates (50%) over the Gamma (34.62%) and Inverse Gamma (15.38%) models. The Gumbel is the most dominant model overall, and more particularly in wet periods. The data for periods with normal and dry trends were better fitted by the Gamma and Inverse Gamma models.
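As an illustration of how a Gumbel fit yields rainfall values for the return periods listed above, the sketch below estimates the Gumbel parameters by the method of moments and evaluates the quantile for a given return period. The abstract does not state which estimation method the authors used, so the method of moments is an assumption, and the function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(annual_maxima):
    """Method-of-moments estimates (location, scale) for a Gumbel distribution."""
    mean = statistics.fmean(annual_maxima)
    std = statistics.stdev(annual_maxima)  # sample standard deviation
    scale = std * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def gumbel_return_level(location, scale, return_period):
    """Value exceeded on average once every `return_period` years."""
    non_exceedance = 1.0 - 1.0 / return_period
    return location - scale * math.log(-math.log(non_exceedance))
```

Calling `gumbel_return_level` with return periods of 2, 5, 10, 20, 50 and 100 years on the fitted parameters gives an increasing sequence of design rainfall values, which can then be compared across sample sizes.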
Abstract: The usability of an interface is a fundamental issue to elucidate. Many researchers have argued that many usability results and recommendations lack empirical and experimental data. In this research, the usability of web pages is evaluated using several carefully selected statistical models. University web pages were chosen as subjects for this work for ease of comparison and ease of data collection. A series of experiments was conducted to investigate the usability and design of university web pages. Prototype web pages were developed according to structured methodologies of web page design and usability. The university web pages were evaluated together with the prototype web pages using a questionnaire designed according to Human Computer Interaction (HCI) heuristics. Nine respondent (user) variables and 14 web page variables (items) were studied. Stringent statistical analysis was adopted to extract the required information from the data acquired, followed by an augmented interpretation of the statistical results. The analysis of variance (ANOVA) procedure showed significant differences among the university web pages for most of the 23 items studied. The Duncan Multiple Range Test (DMRT) showed that the prototype performed significantly better in usability for most of the items. The correlation analysis showed significant positive and negative correlations between many items. The regression analysis revealed that the most significant factors (items) contributing to the best model of university web page design and usability were: multimedia in the web pages, the organisation and design of the web page icons (alone), and graphics attractiveness. The results revealed limitations of some heuristics used in conventional interface systems design and led to proposals for additional heuristics in web page design and usability.
Funding: National Natural Science Foundation of China, No. 41001057; The Science and Technology Strategic Pilot of the Chinese Academy of Sciences, No. XDA05090308 and No. XDA05090310; Project supported by the State Key Laboratory of Earth Surface Processes and Resource Ecology, No. 2011-KF-06.
Abstract: Statistical models that use historical data on crop yields and weather to calibrate relatively simple regression equations have been widely and extensively applied in previous studies, and have provided a common alternative to process-based models, which require extensive input data on cultivar, management, and soil conditions. However, very few studies have systematically reviewed the statistical models used for identifying climate contributions to crop yields. This paper introduces the three main statistical methods, i.e., the time-series model, the cross-section model and the panel model, which have been used to identify such contributions in the field of agrometeorology. Generally, the spatial scale of research using statistical models can be categorized into two types: site scale and regional scale (e.g., global, national, provincial and county scale). Four issues arise in identifying the response sensitivity of crop yields to climate change with statistical models: the extent of the spatial and temporal scale, the removal of non-climatic trends, collinearity among climate variables, and the non-consideration of adaptations. Finally, resolutions for these four issues are put forward in the section on perspectives on the future of statistical models.
Abstract: Based on a review and comparison of the main statistical analysis models for estimating variety-environment cell means in regional crop trials, a new statistical model, the LR-PCA composite model, was proposed, and the predictive precision of these models was compared by cross validation on an example data set. Results showed that the order of model precision was LR-PCA model > AMMI model > PCA model > Treatment Means (TM) model > Linear Regression (LR) model > Additive Main Effects ANOVA model. The precision gain factor of the LR-PCA model was 1.55, an increase of 8.4% compared with AMMI.
Funding: Supported by the National High Technology Research and Development Program of China (Grant No. 2010AA101301), the Program of Introducing International Advanced Agricultural Science and Technology in China (Grant No. 2006-G8[4]-31-1), and the Program of Science-Technology Basis and Conditional Platform in China (Grant No. 505005).
Abstract: QTL mapping for seven quality traits was conducted using 254 recombinant inbred lines (RIL) derived from a japonica-japonica rice cross of Xiushui 79/C Bao. The seven traits investigated were grain length (GL), grain length to width ratio (LWR), chalk grain rate (CGR), chalkiness degree (CD), gelatinization temperature (GT), amylose content (AC) and gel consistency (GC) of head rice. The three mapping methods employed were composite interval mapping in QTLMapper 2.0 software based on a mixed linear model (MCIM), inclusive composite interval mapping in QTL IciMapping 3.0 software based on a stepwise regression linear model (ICIM), and multiple interval mapping with regression forward selection in Windows QTL Cartographer 2.5 based on multiple regression analysis (MIMR). Results showed that five QTLs with additive effect (A-QTLs) were detected by all three methods simultaneously, two by two methods simultaneously, and 23 by only one method. Five A-QTLs were detected by MCIM, nine by ICIM and 28 by MIMR. The contribution rates of single A-QTLs ranged from 0.89% to 38.07%. None of the QTLs with epistatic effect (E-QTLs) detected by MIMR were detected by the other two methods. Fourteen pairs of E-QTLs were detected by both MCIM and ICIM, and 142 pairs of E-QTLs were detected by only one method. Twenty-five pairs of E-QTLs were detected by MCIM, 141 pairs by ICIM and four pairs by MIMR. The contribution rates of single pairs of E-QTLs ranged from 2.60% to 23.78%. In the Xiu-Bao RIL population, epistatic effect played a major role in the variation of GL and CD, additive effect was dominant in the variation of LWR, and epistatic and additive effects were equally important in the variation of CGR, AC, GT and GC. QTLs detected by two or more methods simultaneously were highly reliable and could be applied to improve the quality traits of japonica hybrid rice.
Funding: The work was supported by the Foundation of the State Education Committee of China.
Abstract: Semi-rigid liquid crystal polymers are a class of liquid crystal polymers distinct from the long rigid rod liquid crystal polymers to which the well-known Onsager and Flory theories apply. In this paper, three statistical models for semi-rigid nematic polymers are addressed: the elastically jointed rod model, the worm-like chain model, and the non-homogeneous chain model. The nematic-isotropic transition temperature was examined, and the pseudo-second transition temperature is expressed analytically. Comparisons with experiments were made, and agreement was found.
Funding: Support from the National Natural Science Foundation of China (No. 51831002) and the Fundamental Research Funds for the Central Universities (No. FRF-TP-18-002C2).
Abstract: Non-metallic inclusions are critical for the fatigue failure of clean steels in service; large and hard inclusions are especially detrimental. Since it is not possible to measure all the inclusions in large volumes of clean steel, statistical models have been developed to evaluate inclusions. These models aim to predict the maximum inclusion size in a large volume from inclusion data derived from limited observations on small-volume specimens. Different statistical models are reviewed together with their supporting theories. In particular, the block maxima and threshold types of models are compared thoroughly, as both are widely used and based on extreme value theory. The predicted results are used not only to distinguish the cleanliness levels of steels but also to help estimate fatigue strength. Finally, future research is proposed to focus on the difficulties that statistical models currently face, including the credibility of the results obtained and the robustness of the models in applications.
Abstract: Following the basic principles of modern multivariate statistical analysis theory, a description model, a prediction model and a control model relating the chemical compositions and mechanical properties of steels are introduced. As an example, the overall flowchart of the description, prediction and control models for the chemical composition and mechanical properties of 20 and A_2 steel is presented.
Funding: This work was supported by the National Science Center, Poland (Narodowe Centrum Nauki, Polska) in the project "Statistical modeling of turbulent two-fluid flows with interfaces" (Grant No. 2016/21/B/ST8/01010, ID: 334165).
Abstract: This paper reviews the differences between the deterministic (sharp and diffuse) and statistical models of the interphase region between two phases. In the literature this region is usually referred to as the (macroscopic) interface. Within it, the mesoscopic interface, which is defined at the molecular level and agitated by thermal fluctuations, is found with nonzero probability. For this reason, in this work, the interphase region is called the mesoscopic intermittency/transition region. The first part of the present work gives the rationale for introducing the statistical model of the mesoscopic intermittency region. It is argued that the classical (deterministic) sharp and diffuse models do not explain the experimental and numerical results presented in the literature. It is then shown that a statistical model of the mesoscopic intermittency region (SMIR) combines the existing sharp and diffuse models into a single coherent framework and explains published experimental and numerical results. In the second part of the paper, the SMIR is used for the first time to predict equilibrium and nonequilibrium two-phase flow in a numerical simulation. To this end, a two-dimensional rising gas bubble is studied; the numerical results obtained are used as a basis for discussing the differences between the deterministic and statistical models, showing that the statistical description has the potential to account for physical phenomena not previously considered in computer simulations.
Abstract: Dengue fever (DF), caused by the dengue virus and transmitted by the Aedes mosquito vector, is a dangerous infectious disease with the potential to become a global epidemic. Vietnam, particularly Ba Ria-Vung Tau (BRVT) province, faces a high risk of DF. This study aims to determine the relationship between the search volume for DF on Google Trends and DF cases in BRVT province, and thereby to construct a model for early prediction of the local DF outbreak risk. Seven forecast models were surveyed using Poisson regression (adjusted by quasi-Poisson), considering the lagged effect of the Google Trends Index (GTI) search volume on DF cases and removing the autocorrelation (AC) of DF cases with appropriate transformations. The models were based on weekly data on DF cases and GTI search volume for the phrase "sốt xuất huyết" (dengue fever) in BRVT province from January 2019 to August 2023 (243 weeks). The model selected is the one with the lowest dispersion index. The correlation coefficient (95% confidence interval) and dispersion index of the seven models are as follows: Basis TSR, 0.71 (0.63-0.76) and 74.2; Basis TSR + AC:Lag(Residuals,1), 0.79 (0.73-0.83) and 48.6; Basis TSR + AC:Lag(SXH,1), 0.89 (0.87-0.92) and 37.3; Basis TSR + AC:Lag(log(SXH+1),1), 0.98 (0.97-0.99) and 7.2; TSR Lag(GTI,2) + AC:Lag(log(SXH+1),2), 0.96 (0.95-0.97) and 14.3; TSR Lag(GTI,3) + AC:Lag(log(SXH+1),3), 0.93 (0.91-0.94) and 25.7; TSR Lag(GTI,0) + AC:Lag(log(SXH+1),1), 0.98 (0.97-0.99) and 6.8. The last model is therefore selected as the most suitable. When the accuracy of the selected model was tested using the ROC curve with the Youden criterion, the AUC was 0.982 at the 75% threshold and 0.984 at the 95% threshold, indicating very good predictive ability. In summary, the results show the potential for applying this model in Vietnam, especially in BRVT, to enhance the effectiveness of epidemic prevention measures and protect public health.
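Two building blocks of this kind of analysis can be sketched in a few lines: pairing a lagged predictor with the weekly case counts, and computing a dispersion index for a fitted count model. The abstract does not define its dispersion index, so the usual quasi-Poisson estimate (Pearson chi-square divided by residual degrees of freedom) is assumed here. The function names are hypothetical, and the fitted means would come from a regression that is not shown.

```python
def lag_pairs(predictor, target, lag):
    """Pair the predictor at week t - lag with the target at week t."""
    usable = len(predictor) - lag
    return list(zip(predictor[:usable], target[lag:]))


def dispersion_index(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    Assumed definition: values far above 1 indicate overdispersion
    relative to a Poisson model. `fitted` holds the model's fitted means.
    """
    pearson = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return pearson / (len(observed) - n_params)
```

Under this definition, comparing candidate models by dispersion index amounts to asking which model leaves the least unexplained variance relative to its fitted means.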
Abstract: In this work, four empirical models of statistical thickness, namely the Harkins and Jura, Halsey, Carbon Black and Jaroniec models, were compared in order to determine the textural properties (external surface and surface of micropores) of a clay concrete without molasses and of clay concretes stabilized with 8%, 12% and 16% molasses. The results show that Halsey's model can be used to obtain the external surfaces. However, it does not allow the surface of the micropores to be obtained, and it is suitable neither for simple clay concrete (without molasses) nor for clay concretes stabilized with molasses. The Carbon Black, Jaroniec, and Harkins and Jura models can be used for clay concrete and stabilized clay concrete. Of these, the Carbon Black model is the most relevant for clay concrete, and the Harkins and Jura model for molasses-stabilized clay concrete. These last two models are promising for future research.
Abstract: Accurate Global Horizontal Irradiance (GHI) forecasting has become vital for successfully integrating solar energy into the electrical grid because of the expanding demand for green power and the worldwide shift towards green energy resources. Particularly in view of aggressive GHG emission targets, accurate GHI forecasting has become vital for developing, designing, and operationally managing solar energy systems. This research presents the core concepts of modelling and performance analysis for various forecasting models, namely ARIMA (Autoregressive Integrated Moving Average), Elman NN (Elman Neural Network), RBFN (Radial Basis Function Neural Network), SVM (Support Vector Machine), LSTM (Long Short-Term Memory), Persistent, BPN (Back Propagation Neural Network), MLP (Multilayer Perceptron Neural Network), RF (Random Forest), and XGBoost (eXtreme Gradient Boosting), applied to multi-seasonal forecasting of GHI. Data for the India region were used to evaluate the models' performance and forecasting ability. The forecasting models were applied to seasonal GHI forecasting in winter, spring, summer, monsoon, and autumn. Performance was substantiated through evaluation metrics, namely Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared (R^(2)), coded in Python. The performance analysis indicated that Random Forest and eXtreme Gradient Boosting are the superior, competing models, giving the most accurate forecasts in all seasons compared with the other forecasting models. For winter, XGBoost is the best forecasting model with MAE: 1.6325, RMSE: 4.8338, and R^(2): 0.9998. For spring, XGBoost is the best forecasting model with MAE: 2.599599, RMSE: 5.58539, and R^(2): 0.999784. For summer, RF is the best forecasting model with MAE: 1.03843, RMSE: 2.116325, and R^(2): 0.999967. For the monsoon season, RF is the best forecasting model with MAE: 0.892385, RMSE: 2.417587, and R^(2): 0.999942. For autumn, RF is the best forecasting model with MAE: 0.810462, RMSE: 1.928215, and R^(2): 0.999958. Based on seasonal variations and computing constraints, the findings enable energy system operators to make helpful recommendations for choosing the most effective forecasting models.
Funding: Supported by TUBITAK projects 112E568, 114E092, and 115E915. The TNPGN-Active RINEX data set is available to the IONOLAB group for the TUBITAK 109E055 project.
Abstract: In this study, the gradients of Total Electron Content (TEC) for a midlatitude region are estimated and grouped with respect to the distance between neighboring stations, time periods within a day, and satellite directions. Annual medians of these gradients for quiet days are computed as templates. The metric distances (L2N) and Symmetric Kullback-Leibler Distances (SKLD) are obtained between the templates and the daily gradient series. The grouped histograms are fitted to prospective Probability Density Functions (PDF). The method is applied to the Slant Total Electron Content (STEC) estimates from the Turkish National Permanent GPS Network (TNPGN-Active) for 2015. The highest gradients are observed along the east-west axis, with a maximum of 25 mm/km during a geomagnetic storm. The maximum differences from the gradient templates occur for neighboring stations within 100-130 km of each other, during night hours, and for regions bordering the Black Sea and the Mediterranean in the northeast and southeast of Turkey. The empirical PDFs of the station-pair gradients are predominantly Weibull-distributed. The mean values of the Weibull PDFs in all station groups are between 1.2 and 1.8 mm/km, with an increase during noon and afternoon hours. The standard deviations of the gradient PDFs generally increase during night hours. The algorithm will form a basis for quantifying the stochastic variations of the spatial rate of change of TEC trends in midlatitude regions, thus supplementing reliable and accurate regional monitoring of ionospheric variability.
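The two distances named in this abstract can be illustrated with a short sketch. The abstract does not give the exact definitions used, so the standard forms are assumed: the Euclidean (L2) norm of the difference between two series, and the sum of the two directed Kullback-Leibler divergences between normalized histograms. The function names and the small floor `eps` that guards against empty bins are hypothetical choices.

```python
import math


def l2_distance(series_a, series_b):
    """Euclidean distance between two equally sampled series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(series_a, series_b)))


def _normalize(counts, eps=1e-12):
    """Turn histogram counts into probabilities, flooring empty bins at eps."""
    total = sum(counts)
    return [max(c / total, eps) for c in counts]


def _kl(p, q):
    """Directed Kullback-Leibler divergence KL(p || q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def symmetric_kl_distance(counts_p, counts_q):
    """Assumed SKLD form: KL(p || q) + KL(q || p) on normalized histograms."""
    p = _normalize(counts_p)
    q = _normalize(counts_q)
    return _kl(p, q) + _kl(q, p)
```

In this reading, a daily gradient histogram that matches the quiet-day template gives a distance of zero, and larger values flag days that depart from the template.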
Funding: Supported by the Laoshan Laboratory (No. LSKJ 202202404), the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDB 42000000), the National Natural Science Foundation of China (NSFC) (No. 42030410), the Startup Foundation for Introducing Talent of NUIST, and the Jiangsu Innovation Research Group (No. JSSCTD 202346).
Abstract: Numerical models are crucial for quantifying the ocean-atmosphere interactions associated with the El Niño-Southern Oscillation (ENSO) phenomenon in the tropical Pacific. Current coupled models often exhibit significant biases and inter-model differences in simulating ENSO, underscoring the need for alternative modeling approaches. The Regional Ocean Modeling System (ROMS) is a sophisticated ocean model widely used for regional studies and has been coupled with various atmospheric models. However, its application in simulating ENSO processes on a basin scale in the tropical Pacific has not been explored. For the first time, this study presents the development of a basin-scale hybrid coupled model (HCM) for the tropical Pacific, integrating ROMS with a statistical atmospheric model that captures the interannual relationships between sea surface temperature (SST) and wind stress anomalies. The HCM is evaluated for its capability to simulate the annual mean, seasonal, and interannual variations of the oceanic state in the tropical Pacific. Results demonstrate that the model effectively reproduces the ENSO cycle, with a dominant oscillation period of approximately two years. The ROMS-based HCM developed here offers an efficient and robust tool for investigating climate variability in the tropical Pacific.
Funding: Project (52274096) supported by the National Natural Science Foundation of China; Project (WS2023A03) supported by the State Key Laboratory Cultivation Base for Gas Geology and Gas Control, China.
Abstract: Accurate assessment of coal brittleness is crucial in the design of coal seam drilling and underground coal mining operations. This study proposes a method for evaluating the brittleness of gas-bearing coal based on a statistical damage constitutive model and energy evolution mechanisms. First, by integrating the principle of effective stress and the Hoek-Brown criterion, a statistical damage constitutive model for gas-bearing coal is established, and its accuracy and applicability are verified through triaxial compression tests under different gas pressures. Next, using the energy evolution mechanism, two energy characteristic parameters (elastic energy proportion and dissipated energy proportion) are analyzed. Based on the damage stress thresholds, the damage evolution characteristics of gas-bearing coal are explored. Finally, by integrating the energy characteristic parameters with the damage parameters, a novel brittleness index is proposed. The results demonstrate that the theoretical curves derived from the statistical damage constitutive model closely align with the test curves, accurately reflecting the stress-strain characteristics of gas-bearing coal and revealing the stress drop and softening characteristics of coal in the post-peak stage. The shape parameter and scale parameter represent the brittleness and macroscopic strength of the coal, respectively. As gas pressure increases from 1 to 5 MPa, the shape parameter and the scale parameter decrease by 22.18% and 60.45%, respectively, indicating a reduction in both the brittleness and the strength of the coal. Parameters such as the maximum damage rate and the peak elastic energy storage limit correlate positively with coal brittleness. The brittleness index effectively captures the brittleness characteristics and reveals a decrease in brittleness and an increase in sensitivity to plastic deformation under higher gas pressure.
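To make the role of the shape and scale parameters concrete, the sketch below evaluates a Weibull-type damage variable of the form commonly used in statistical damage constitutive models, D = 1 - exp(-(F/F0)^m). The abstract does not state the authors' formula, so this form, the symbols `shape_m` and `scale_f0`, and the function name are all assumptions made for illustration.

```python
import math


def weibull_damage(strength_f, shape_m, scale_f0):
    """Assumed damage variable D = 1 - exp(-(F / F0) ** m).

    strength_f: micro-element strength measure F (non-negative)
    shape_m:    Weibull shape parameter m (assumed to control brittleness)
    scale_f0:   Weibull scale parameter F0 (assumed to control strength)
    """
    return 1.0 - math.exp(-((strength_f / scale_f0) ** shape_m))
```

In this form, a larger shape parameter makes the damage curve rise more abruptly around F = F0, which is consistent with the abstract's reading of the shape parameter as a brittleness indicator.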
Funding: The authors extend their appreciation to Researchers Supporting Project number (RSP2024R390), King Saud University, Riyadh, Saudi Arabia.
Abstract: Soil erosion is a crucial geo-environmental hazard worldwide that affects water quality and agriculture, decreases reservoir storage capacity through sedimentation, and increases the danger of flooding and landslides. This study therefore uses geospatial modeling to produce soil erosion susceptibility maps (SESM) for the Hangu region, Khyber Pakhtunkhwa (KPK), Pakistan. The Hangu region, located in the Kohat Plateau of KPK, is particularly susceptible to soil erosion because of its unique geomorphological and climatic characteristics. Moreover, the region is characterized by a combination of steep slopes, variable rainfall patterns, diverse land use, and distinct soil types, all of which contribute to the complexity and severity of soil erosion processes. These factors necessitate a detailed, region-specific study to develop effective soil conservation strategies. In this research, we detected and mapped 1013 soil erosion points and prepared 12 predisposing factors of soil erosion (elevation, aspect, slope, Normalized Difference Vegetation Index (NDVI), drainage network, curvature, Land Use Land Cover (LULC), rainfall, lithology, contour, soil texture, and road network) using a GIS platform. GIS-based statistical models, namely the weight of evidence (WOE) and frequency ratio (FR), were then applied to produce the SESM for the study area. The SESM was reclassified into four classes, i.e., low, medium, high, and very high zones. The WOE results show that 16.39%, 33.02%, 29.27%, and 21.30% of the area are covered by low, medium, high, and very high zones, respectively. In contrast, the FR results show that 16.50%, 24.33%, 35.55%, and 23.59% of the area are occupied by the low, medium, high, and very high classes. Furthermore, the reliability of the applied models was evaluated using the Area Under Curve (AUC) technique. The validation results showed that the success rate curve (SRC) and predicted rate curve (PRC) for WOE are 82% and 86%, respectively, while the SRC and PRC for FR are 85% and 96%, respectively. The validation results indicate that the FR model performs better and is more reliable than the WOE model.
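The frequency ratio (FR) idea can be illustrated for a single predisposing factor. The sketch assumes the usual definition, in which the FR of a class is the share of erosion points falling in that class divided by the share of the study area that the class occupies. The class names and numbers in the usage note are made up for illustration and are not taken from the study.

```python
def frequency_ratio(erosion_counts, class_areas):
    """Frequency ratio per class of one predisposing factor.

    erosion_counts: mapping from class name to number of erosion points
    class_areas:    mapping from class name to area (any consistent unit)
    FR > 1 suggests the class is more erosion-prone than average.
    """
    total_points = sum(erosion_counts.values())
    total_area = sum(class_areas.values())
    return {
        name: (erosion_counts[name] / total_points) / (class_areas[name] / total_area)
        for name in class_areas
    }
```

For example, with hypothetical slope classes `{"gentle": 10, "steep": 30}` erosion points over areas `{"gentle": 60, "steep": 40}`, the steep class receives an FR of about 1.875 and the gentle class about 0.417. A susceptibility map is then typically built by summing the FR values of all factors at each location.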
Abstract: Forecasting the movement of the stock market has long been an attractive topic. This paper implements different statistical learning models to predict the movement of the S&P 500 index. The S&P 500 index is influenced by other important financial indicators across the world, such as commodity prices and financial technical indicators. This paper systematically investigates four supervised learning models, namely Logistic Regression, Gaussian Discriminant Analysis (GDA), Naive Bayes and Support Vector Machine (SVM), for forecasting the S&P 500 index. After several optimization experiments on features and models, especially SVM kernel selection and feature selection for the different models, this paper concludes that an SVM model with a Radial Basis Function (RBF) kernel can achieve an accuracy of 62.51% in predicting the future market trend of the S&P 500 index.
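For readers unfamiliar with the RBF kernel mentioned here, the sketch below shows the standard kernel function, K(x, y) = exp(-gamma * ||x - y||^2), together with the accuracy measure for up/down predictions. It is an illustration only, not the paper's implementation: the `gamma` parameter and the function names are assumptions, and no SVM training is performed.

```python
import math


def rbf_kernel(x, y, gamma):
    """Radial Basis Function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def directional_accuracy(actual, predicted):
    """Fraction of days on which the predicted up/down label matches the actual one."""
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)
```

The kernel equals 1 for identical feature vectors and decays towards 0 as they move apart, so `gamma` controls how local the resulting decision boundary is.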