Background: Cotton is one of the most important commercial crops after food crops, especially in countries like India, where it is grown extensively under rainfed conditions. Because it is used in multiple industries, such as the textile, medicine, and automobile industries, it has great commercial importance. The crop's performance is strongly influenced by prevailing weather dynamics. As the climate changes, assessing how weather changes affect crop performance is essential. Among the various techniques available, crop models are the most effective and widely used tools for predicting yields. Results: This study compares statistical and machine learning models to assess their ability to predict cotton yield across the major producing districts of Karnataka, India, using a long-term dataset spanning 1990 to 2023 that includes yield and weather factors. Artificial neural networks (ANNs) performed best, with acceptable yield deviations within ±10% during both the vegetative stage (F1) and mid stage (F2) of cotton. The model evaluation metrics, such as root mean square error (RMSE), normalized root mean square error (nRMSE), and modelling efficiency (EF), were also within acceptance limits in most districts. Furthermore, the tested ANN model was used to assess the importance of the dominant weather factors influencing crop yield in each district. Specifically, morning relative humidity, both as an individual parameter and in interaction with maximum and minimum temperature, had a major influence on cotton yield in most of the districts where yield was predicted. These differences reflect the district-specific interactions of weather factors in cotton yield formation and the individual response of each weather factor under the different soil and management conditions across the major cotton-growing districts of Karnataka. Conclusions: Compared with statistical models, machine learning models such as ANNs showed higher efficiency in forecasting cotton yield because of their ability to consider the interactive effects of weather factors on yield formation at different growth stages. This highlights the suitability of ANNs for yield forecasting under rainfed conditions and for studying the relative impacts of weather factors on yield. The study thus provides valuable insights to support stakeholders in planning effective crop management strategies and formulating relevant policies.
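As a quick illustration of the evaluation metrics named above, the following minimal Python sketch computes RMSE, nRMSE, and modelling efficiency (EF); the district-level yields are invented for the example:

import numpy as np

def rmse(obs, pred):
    # Root mean square error
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return np.sqrt(np.mean((obs - pred) ** 2))

def nrmse(obs, pred):
    # RMSE normalized by the mean of the observations, in percent
    return 100.0 * rmse(obs, pred) / np.mean(np.asarray(obs, float))

def modelling_efficiency(obs, pred):
    # Nash-Sutcliffe-type modelling efficiency; 1 indicates a perfect fit
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

observed = [412, 530, 488, 603, 455]   # hypothetical yields, kg/ha
predicted = [430, 512, 495, 580, 470]  # hypothetical ANN outputs
print(rmse(observed, predicted), nrmse(observed, predicted), modelling_efficiency(observed, predicted))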
Background: Survival from birth to slaughter is an important economic trait in commercial pig production. Increasing survival can improve both economic efficiency and animal welfare. The aim of this study is to explore the impact of genotyping strategies and statistical models on the accuracy of genomic prediction for survival in pigs over the total growing period from birth to slaughter. Results: We simulated pig populations with different direct and maternal heritabilities and used a linear mixed model, a logit model, and a probit model to predict genomic breeding values of pig survival based on individual survival records with binary outcomes (0, 1). The results show that when only alive animals have genotype data, unbiased genomic predictions can be achieved by using variances estimated from a pedigree-based model. Models using genomic information achieved up to 59.2% higher accuracy of estimated breeding values than the pedigree-based model, depending on the genotyping scenario. The scenario of genotyping all individuals, both dead and alive, obtained the highest accuracy. When an equal number of individuals (80%) were genotyped, genotyping a random sample of individuals achieved higher accuracy than genotyping only alive individuals. The linear, logit, and probit models achieved similar accuracy. Conclusions: Genomic prediction of pig survival is feasible when only alive pigs have genotypes, but genomic information from dead individuals can increase the accuracy of genomic prediction by 2.06% to 6.04%.
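A minimal sketch of the three link functions compared above, fitted to simulated binary survival records with statsmodels (illustrative only, not the authors' simulation pipeline):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)                       # stand-in covariate for illustration
p = 1.0 / (1.0 + np.exp(-(0.5 + 1.2 * x)))   # true survival probability
y = rng.binomial(1, p)                       # binary survival record (0 = dead, 1 = alive)

X = sm.add_constant(x)
linear = sm.OLS(y, X).fit()                  # linear (probability) model
logit = sm.GLM(y, X, family=sm.families.Binomial()).fit()  # logit link by default
probit = sm.GLM(y, X, family=sm.families.Binomial(link=sm.families.links.Probit())).fit()
print(linear.params, logit.params, probit.params)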
The Nadhour-Sisseb-El Alem Basin in Tunisia lies under semi-arid and arid climatic conditions. These conditions induce excessive pumping of groundwater, which creates drops in the water level of about 1-2 m/a. Such unfavorable conditions require interventions to rationalize integrated management in decision making. The aim of this study is to determine a water recharge index (WRI), delineate the potential groundwater recharge area, and estimate the potential groundwater recharge rate, based on the integration of statistical models derived from remote sensing imagery, GIS digital data (e.g., lithology, soil, runoff), measured artificial recharge data, fuzzy set theory, and multi-criteria decision making (MCDM) using the analytical hierarchy process (AHP). Eight factors affecting potential groundwater recharge were determined, namely lithology, soil, slope, topography, land cover/use, runoff, drainage, and lineaments. The WRI ranges between 1.2 and 3.1 and is classified into five classes (poor, weak, moderate, good, and very good) of potential groundwater recharge area. The very good and good classes occupied 27% and 44% of the study area, respectively. The potential groundwater recharge rate was 43% of total precipitation. According to the results of the study, river beds are favorable sites for groundwater recharge.
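The AHP-based overlay described above boils down to a weighted sum of reclassified factor rasters; a small sketch follows, with invented weights and scores (not the paper's values):

import numpy as np

# Hypothetical normalized AHP weights for the eight factors
weights = {
    "lithology": 0.22, "soil": 0.15, "slope": 0.14, "topography": 0.10,
    "land_cover_use": 0.10, "runoff": 0.11, "drainage": 0.09, "lineaments": 0.09,
}
assert abs(sum(weights.values()) - 1.0) < 1e-9

# Each factor raster reclassified to suitability scores 1..4 (toy 3x3 grids)
rng = np.random.default_rng(1)
rasters = {name: rng.integers(1, 5, size=(3, 3)).astype(float) for name in weights}

# Water recharge index (WRI) = weighted sum of the factor rasters
wri = sum(w * rasters[name] for name, w in weights.items())
print(wri.round(2))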
The objective of this study is to analyze the sensitivity of statistical models to sample size. The study, carried out in Ivory Coast, is based on annual maximum daily rainfall data collected from 26 stations. The methodological approach rests on the statistical modeling of maximum daily rainfall. Fits were made for several sample sizes and several return periods (2, 5, 10, 20, 50 and 100 years). The main results show that the 30-year series (1931-1960; 1961-1990; 1991-2020) are best fitted by the Gumbel (26.92%-53.85%) and Inverse Gamma (26.92%-46.15%) distributions. The 60-year series (1931-1990; 1961-2020) are best fitted by the Inverse Gamma (30.77%), Gamma (15.38%-46.15%), and Gumbel (15.38%-42.31%) distributions. The full 1931-2020 record (90 years) shows a notable dominance of the Gumbel model (50%) over the Gamma (34.62%) and Inverse Gamma (15.38%) models. Overall, the Gumbel is the dominant model, particularly in wet periods, while the data for periods with normal and dry trends were better fitted by the Gamma and Inverse Gamma.
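As an illustration of the kind of fit involved, the sketch below fits a Gumbel distribution to synthetic annual maximum daily rainfall with SciPy and reads off return levels for the same return periods:

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
annual_max = stats.gumbel_r.rvs(loc=80, scale=25, size=60, random_state=rng)  # mm, synthetic

loc, scale = stats.gumbel_r.fit(annual_max)
for T in (2, 5, 10, 20, 50, 100):  # return periods in years
    level = stats.gumbel_r.ppf(1.0 - 1.0 / T, loc, scale)
    print(f"{T:3d}-year return level: {level:6.1f} mm")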
The usability of an interface is a fundamental issue to elucidate. Many researchers have argued that usability results and recommendations often lack empirical and experimental data. In this research, the usability of web pages is evaluated using several carefully selected statistical models. University web pages were chosen as subjects for ease of comparison and of data collection. A series of experiments was conducted to investigate the usability and design of the university web pages. Prototype web pages were developed according to structured methodologies of web-page design and usability. The university web pages were evaluated, together with the prototype web pages, using a questionnaire designed according to Human-Computer Interaction (HCI) heuristics. Nine respondent (user) variables and 14 web-page variables (items) were studied. Stringent statistical analysis was applied to the acquired data, followed by careful interpretation of the statistical results. The analysis of variance (ANOVA) procedure showed significant differences among the university web pages for most of the 23 items studied. The Duncan Multiple Range Test (DMRT) showed that the prototype performed significantly better on most of the items. The correlation analysis showed significant positive and negative correlations between many items. The regression analysis revealed that the most significant factors (items) contributing to the best model of university web-page design and usability were: multimedia in the web pages, the organisation and design of the web-page icons (alone), and the attractiveness of the graphics. The results exposed some limitations of heuristics used in conventional interface-system design and proposed additional heuristics for web-page design and usability.
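For concreteness, a toy version of the ANOVA step, comparing one questionnaire item (scored 1-5) across three hypothetical university sites with SciPy:

from scipy import stats

# Hypothetical scores for one item, e.g. "graphics attractiveness"
site_a = [4, 5, 4, 3, 4, 5]
site_b = [2, 3, 2, 3, 2, 3]
site_c = [4, 4, 5, 5, 4, 4]

f_stat, p_value = stats.f_oneway(site_a, site_b, site_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # a small p suggests the sites differ on this item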
Statistical models that use historical data on crop yields and weather to calibrate relatively simple regression equations have been widely and extensively applied in previous studies, and have provided a common alternative to process-based models, which require extensive input data on cultivar, management, and soil conditions. However, very few studies have systematically reviewed the statistical models previously used to identify climate contributions to crop yields. This paper introduces the three main statistical approaches, i.e., time-series models, cross-section models, and panel models, which have been used to address such issues in agrometeorology. Research using statistical models generally falls into two spatial scales: site scale and regional scale (e.g., global, national, provincial, and county scales). Four issues arise in identifying the response sensitivity of crop yields to climate change with statistical models: the extent of the spatial and temporal scales, removal of non-climatic trends, collinearity among climate variables, and the non-consideration of adaptations. Resolutions for these four issues are put forward in the final section on perspectives for the future of statistical models.
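One widely used safeguard against the non-climatic-trend issue noted above is first-difference regression; a minimal sketch on synthetic data (illustrative only):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
years = np.arange(1980, 2020)
temp = 24.0 + 0.03 * (years - 1980) + rng.normal(0, 0.5, years.size)   # deg C
crop = 2.0 + 0.04 * (years - 1980) - 0.15 * (temp - 24.0) + rng.normal(0, 0.1, years.size)  # t/ha

# First differences remove slow-moving technology/management trends
d_yield, d_temp = np.diff(crop), np.diff(temp)
fit = sm.OLS(d_yield, sm.add_constant(d_temp)).fit()
print(fit.params)  # slope approximates yield sensitivity to 1 deg C of warming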
Based on a review and comparison of the main statistical analysis models for estimating variety-environment cell means in regional crop trials, a new statistical model, the LR-PCA composite model, is proposed, and the predictive precision of these models is compared by cross-validation on an example dataset. Results showed that the order of model precision was LR-PCA model > AMMI model > PCA model > Treatment Means (TM) model > Linear Regression (LR) model > Additive Main Effects ANOVA model. The precision gain factor of the LR-PCA model was 1.55, an 8.4% increase over AMMI.
QTL mapping for seven quality traits was conducted using 254 recombinant inbred lines (RILs) derived from the japonica-japonica rice cross Xiushui 79/C Bao. The seven traits investigated were grain length (GL), grain length-to-width ratio (LWR), chalk grain rate (CGR), chalkiness degree (CD), gelatinization temperature (GT), amylose content (AC), and gel consistency (GC) of head rice. The three mapping methods employed were composite interval mapping based on a mixed linear model in QTLMapper 2.0 (MCIM), inclusive composite interval mapping based on a stepwise regression linear model in QTL IciMapping 3.0 (ICIM), and multiple interval mapping with forward regression selection based on multiple regression analysis in Windows QTL Cartographer 2.5 (MIMR). Results showed that five QTLs with additive effects (A-QTLs) were detected by all three methods simultaneously, two by two methods simultaneously, and 23 by only one method. Five A-QTLs were detected by MCIM, nine by ICIM, and 28 by MIMR. The contribution rates of single A-QTLs ranged from 0.89% to 38.07%. None of the QTLs with epistatic effects (E-QTLs) detected by MIMR were detected by the other two methods. Fourteen pairs of E-QTLs were detected by both MCIM and ICIM, and 142 pairs were detected by only one method. Twenty-five pairs of E-QTLs were detected by MCIM, 141 pairs by ICIM, and four pairs by MIMR. The contribution rates of single pairs of E-QTLs ranged from 2.60% to 23.78%. In the Xiu-Bao RIL population, epistatic effects played the major role in the variation of GL and CD, additive effects were dominant in the variation of LWR, and epistatic and additive effects were of equal importance in the variation of CGR, AC, GT, and GC. QTLs detected by two or more methods simultaneously are highly reliable and could be applied to improve quality traits in japonica hybrid rice.
Semi-rigid liquid crystal polymers are a class of liquid crystal polymers distinct from the long rigid-rod liquid crystal polymers to which the well-known Onsager and Flory theories apply. In this paper, three statistical models for the semi-rigid nematic polymer are addressed: the elastically jointed rod model, the worm-like chain model, and the non-homogeneous chain model. The nematic-isotropic transition temperature was examined, and the pseudo-second transition temperature is expressed analytically. Comparisons with experiments were made and good agreement was found.
Non-metallic inclusions are critical to the fatigue failure of clean steels in service; large and hard inclusions are especially detrimental. Since it is not possible to measure all the inclusions in large volumes of clean steel, statistical models have been developed to evaluate inclusions, aiming to predict the maximum inclusion size in a large volume from inclusion data derived from limited observations on small-volume specimens. Different statistical models are reviewed together with their supporting theories. In particular, the block-maxima and threshold types of models are discussed through a thorough comparison, as both are widely used and based on extreme value theory. The predicted results are used not only to distinguish the cleanliness levels of steels but also to help estimate fatigue strength. Finally, future research is proposed to focus on tackling the present difficulties encountered by statistical models, including ensuring sufficient credibility of the obtained results and the robustness of the models in applications.
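A rough sketch of the two extreme-value routes named above, applied to synthetic inclusion sizes: block maxima fitted with a generalized extreme value (GEV) distribution, and peaks over threshold fitted with a generalized Pareto distribution:

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sizes = stats.lognorm.rvs(s=0.5, scale=10.0, size=5000, random_state=rng)  # um, synthetic

# Block maxima: the largest inclusion in each of 100 small inspection "blocks"
block_maxima = sizes.reshape(100, 50).max(axis=1)
c, loc, scale = stats.genextreme.fit(block_maxima)

# Peaks over threshold: excesses above a high quantile
u = np.quantile(sizes, 0.95)
gpd_params = stats.genpareto.fit(sizes[sizes > u] - u)

# Characteristic maximum extrapolated to a much larger inspected volume
print(stats.genextreme.ppf(1.0 - 1.0 / 1000.0, c, loc, scale))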
Following the basic principles of modern multivariate statistical analysis theory, the description, prediction, and control models relating the chemical compositions and mechanical properties of steels are introduced. As an example, the complete flowchart of the composition and structure/properties description, prediction, and control model for the chemical composition and mechanical properties of 20 and A_2 steels is presented.
This paper reviews the differences between the deterministic (sharp and diffuse) and statistical models of the interphase region between two phases. In the literature this region is usually referred to as the (macroscopic) interface. Within it, the mesoscopic interface, which is defined at the molecular level and agitated by thermal fluctuations, is found with nonzero probability. For this reason, in this work the interphase region is called the mesoscopic intermittency/transition region. The first part of the present work gives the rationale for introducing the statistical model of the mesoscopic intermittency region. It is argued that the classical (deterministic) sharp and diffuse models do not explain the experimental and numerical results presented in the literature. It is then shown that a statistical model of the mesoscopic intermittency region (SMIR) combines the existing sharp and diffuse models into a single coherent framework and explains published experimental and numerical results. In the second part of the paper, the SMIR is used for the first time to predict equilibrium and nonequilibrium two-phase flow in numerical simulation. To this end, a two-dimensional rising gas bubble is studied; the numerical results obtained are used as a basis for discussing the differences between the deterministic and statistical models, showing that the statistical description has the potential to account for physical phenomena not previously considered in computer simulations.
Climate change has significantly increased the frequency and severity of extreme weather events, a trend recognized under United Nations Sustainable Development Goal 13: Climate Action. This study forecasts hurricane activity in the Yucatan Peninsula, Mexico, for the period 2025–2034 using advanced computational models, including Convolutional Neural Networks (CNNs), Long Short-Term Memory networks (LSTMs), Autoregressive Integrated Moving Average (ARIMA) models, and Linear Regression (LR). Historical hurricane data were extracted from the HURDAT2 database maintained by the National Hurricane Center (NHC) and spatially analyzed in QGIS to assess storm trajectories and wind intensities. The data were processed in Python, and each model was trained to predict hurricane frequency within three wind-speed categories: <50 knots, 50–100 knots, and >100 knots. Results reveal divergent performance among the models. The CNN exhibited high variability for low-speed events, peaking at 4.21 events in 2027 and dropping to 1.27 by 2034. In contrast, LSTM and ARIMA maintained stable forecasts: LSTM fluctuated between 2.7 and 3.0, and ARIMA ranged from 1.5 to 1.8. For the 50–100 knot range, the CNN reached an anomalous high of 8.14 events in 2032, while LSTM and ARIMA remained within narrower bands (1.85–2.01 and 1.32–1.99, respectively). At the >100 knot level, ARIMA showed a rising trend from 0.21 in 2025 to 0.57 in 2034, suggesting a potential increase in high-intensity cyclones. These findings emphasize the need for adaptive forecasting systems that account for nonlinear behavior under climate change conditions. The model outputs offer valuable insights for risk management, contingency planning, and infrastructure resilience in the hurricane-prone Yucatan Peninsula.
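A minimal statsmodels sketch of the ARIMA component, forecasting ten years of annual event counts from a synthetic series (the order (1, 0, 1) is chosen purely for illustration):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
counts = np.clip(2.0 + 0.02 * np.arange(50) + rng.normal(0, 0.8, 50), 0, None)  # synthetic annual counts

fit = ARIMA(counts, order=(1, 0, 1)).fit()
print(fit.forecast(steps=10))  # expected event counts for the next ten years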
Dengue fever (DF), caused by the Dengue virus and transmitted by the Aedes mosquito vector, is a dangerous infectious disease with the potential to become a global epidemic. Vietnam, and particularly Ba Ria-Vung Tau (BRVT) province, faces a high risk of DF. This study aims to determine the relationship between the Google Trends search volume for DF and DF cases in BRVT province, and thereby to construct a model to predict the risk of early local DF outbreaks. Using Poisson regression (adjusted as quasi-Poisson), accounting for the lagged effect of the Google Trends Index (GTI) search volume on DF cases, and removing the autocorrelation (AC) of DF cases through appropriate transformations, seven forecast models were assessed on a dataset of weekly DF cases and GTI search volume for the phrase "sốt xuất huyết" (dengue fever) in BRVT province from January 2019 to August 2023 (243 weeks). The selected model is the one with the lowest dispersion index. The correlation coefficient (95% confidence interval) and dispersion index of each of the seven models are: Basis TSR, 0.71 (0.63-0.76) and 74.2; Basis TSR + AC:Lag(Residuals,1), 0.79 (0.73-0.83) and 48.6; Basis TSR + AC:Lag(SXH,1), 0.89 (0.87-0.92) and 37.3; Basis TSR + AC:Lag(log(SXH+1),1), 0.98 (0.97-0.99) and 7.2; TSR Lag(GTI,2) + AC:Lag(log(SXH+1),2), 0.96 (0.95-0.97) and 14.3; TSR Lag(GTI,3) + AC:Lag(log(SXH+1),3), 0.93 (0.91-0.94) and 25.7; and TSR Lag(GTI,0) + AC:Lag(log(SXH+1),1), 0.98 (0.97-0.99) and 6.8. The last model, with the lowest dispersion index, was therefore selected as the most suitable. Testing its accuracy using the ROC curve with the Youden criterion gives an AUC of 0.982 at the 75% threshold and 0.984 at the 95% threshold, indicating very good predictive ability. In summary, the results show the potential of applying this model in Vietnam, especially in BRVT, to enhance the effectiveness of epidemic prevention measures and protect public health.
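A sketch of the core modelling idea (Poisson regression of weekly cases on a lagged GTI term, with the dispersion estimated as in quasi-Poisson); the data and the two-week lag are invented for illustration:

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(11)
weeks = 243
gti = np.clip(rng.normal(50, 15, weeks), 0, 100)            # synthetic weekly search index
cases = rng.poisson(np.exp(1.0 + 0.02 * np.roll(gti, 2)))   # cases driven by 2-week-lagged GTI

df = pd.DataFrame({"cases": cases, "gti_lag2": pd.Series(gti).shift(2)}).dropna()
X = sm.add_constant(df["gti_lag2"])
fit = sm.GLM(df["cases"], X, family=sm.families.Poisson()).fit(scale="X2")  # Pearson-based scale
print(fit.params, fit.scale)  # scale > 1 signals overdispersion (quasi-Poisson territory)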
Forecasting the movement of the stock market is a long-standing and attractive topic. This paper implements different statistical learning models to predict the movement of the S&P 500 index. The S&P 500 index is influenced by other important financial indexes around the world, as well as by commodity prices and financial technical indicators. The paper systematically investigates four supervised learning models, namely Logistic Regression, Gaussian Discriminant Analysis (GDA), Naive Bayes, and Support Vector Machine (SVM), for forecasting the S&P 500 index. After several optimization experiments on features and models, especially SVM kernel selection and per-model feature selection, the paper concludes that an SVM with a Radial Basis Function (RBF) kernel achieves an accuracy of 62.51% on the future market trend of the S&P 500 index.
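A compact scikit-learn sketch of the winning configuration, an RBF-kernel SVM on standardized features (the synthetic features below merely stand in for index, commodity, and technical-indicator inputs):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))  # stand-ins for lagged returns, oil price, FX, RSI, ...
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1.0, 1000) > 0).astype(int)  # up/down label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.3f}")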
High efficiency video coding (HEVC) uses half the bitrate of H.264/advanced video coding (AVC) when encoding the same sequence at similar quality. Because of its advanced hierarchical structures of coding units (CUs), prediction units (PUs), and transform units (TUs), HEVC adapts better when encoding full high definition (HD) and ultra high definition (UHD) videos. This coding efficiency comes at a cost: the complexity of HEVC increases sharply compared with H.264/AVC, mainly due to the quad-tree structure used to split pictures. In this study, the probability distribution of the rate-distortion optimization (RDO) cost is analyzed, and an early-termination method based on these probability distributions is proposed to decrease the complexity of HEVC. Experiments show that the coding time is reduced by 44.9% for HEVC intra coding, at the cost of a 0.61% increase in the Bjøntegaard delta rate (BD-rate), on average.
The rapid increase in river water temperature (WTR) observed globally in recent decades, and the projections for the coming decades under climate change scenarios, make water temperature prediction essential for assessing changes in aquatic biota. Statistical models for stream temperature prediction have been widely used because they are computationally simple, involve few parameters, and are relatively accurate. However, these models have not been evaluated for Peruvian Andean rivers. This work evaluates the main water temperature statistical models from the literature and fits them with data recorded in the Ichu River experimental watershed, Huancavelica, Peru. Three well-known models were reviewed: the Stefan & Preud'homme linear regression model and the Mohseni & Stefan 3- and 4-parameter logistic regression models. Ichu River water temperatures were simulated using the SWAT (Soil and Water Assessment Tool) hydrometeorological model, which defaults to the Stefan & Preud'homme model. Modifications and coefficient adjustments of the evaluated models were implemented in the SWAT code using the Latin hypercube sampling technique. The evaluated models showed poor performance in predicting the water temperature of the Ichu River, with NSE (Nash-Sutcliffe Efficiency) values ranging from -2.6 to 0.49, while the modified models reached NSE values of 0.72 in all three cases. The findings suggest that the statistical models reported in the literature should be validated before use in Andean rivers.
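For reference, the Mohseni & Stefan logistic relation in its commonly cited four-parameter form; the parameter values below are illustrative, not the Ichu River calibration, and setting mu = 0 recovers the three-parameter form:

import numpy as np

def mohseni_stefan(ta, alpha, beta, gamma, mu=0.0):
    # Ts = mu + (alpha - mu) / (1 + exp(gamma * (beta - ta)))
    # alpha: upper bound of stream temperature; mu: lower bound;
    # beta: air temperature at the inflection point; gamma: steepest slope
    return mu + (alpha - mu) / (1.0 + np.exp(gamma * (beta - ta)))

air_temp = np.array([-2.0, 5.0, 12.0, 20.0, 28.0])  # deg C
print(mohseni_stefan(air_temp, alpha=24.0, beta=13.0, gamma=0.18, mu=1.0))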
Lexicalized reordering models are very important components of phrase-based translation systems. By examining the reordering relationships between adjacent phrases, conventional methods learn these models from a word-aligned bilingual corpus while ignoring the effect of the number of adjacent bilingual phrases. In this paper, we propose a method that takes the number of adjacent phrases into account for better estimation of reordering models. Instead of just checking whether there is one phrase adjacent to a given phrase, our method first uses a compact structure named the reordering graph to represent all phrase segmentations of a parallel sentence; the effect of the number of adjacent phrases can then be quantified in a forward-backward fashion and finally incorporated into the estimation of the reordering models. Experimental results on the NIST Chinese-English and WMT French-Spanish datasets show that our approach significantly outperforms the baseline method.
Coral reef limestone (CRL) constitutes a distinctive marine carbonate formation with complex mechanical properties. This study investigates the multiscale damage and fracture mechanisms of CRL through integrated experimental testing, digital core technology, and theoretical modelling. Two CRL types with contrasting mesostructures were characterized across three scales. Macroscopically, CRL-I and CRL-II exhibited mean compressive strengths of 8.46 and 5.17 MPa, respectively. Mesoscopically, CRL-I featured small-scale, highly interconnected pores, whilst CRL-II developed larger stratified pores with diminished connectivity. Microscopically, both CRL matrices demonstrated remarkable similarity in mineral composition and mechanical properties. A novel voxel-average-based digital core scaling methodology was developed to facilitate numerical simulation of cross-scale damage processes, revealing network-progressive failure in CRL-I versus directional-brittle failure in CRL-II. Furthermore, a damage statistical constitutive model based on digital core technology and mesoscopic homogenisation theory established quantitative relationships between microelement strength distribution and macroscopic mechanical behavior. These findings illuminate the fundamental mechanisms through which mesoscopic structure governs the macroscopic mechanical properties of CRL.
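For orientation, damage statistical constitutive models of this kind commonly assume that microelement strength follows a Weibull distribution; a generic textbook form (not necessarily the exact formulation calibrated in this study) is

D = 1 - \exp\!\left[-\left(\frac{F}{F_0}\right)^{m}\right], \qquad \sigma = E\,\varepsilon\,(1 - D),

where F is the microelement strength measure, F_0 and m are the Weibull scale and shape parameters, E is the elastic modulus, and D is the damage variable relating nominal stress \sigma to strain \varepsilon.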
In this work, four empirical models of statistical thickness, namely the Harkins and Jura, Halsey, Carbon Black, and Jaroniec models, were compared in order to determine the textural properties (external surface and micropore surface) of a clay concrete without molasses and of clay concretes stabilized with 8%, 12%, and 16% molasses. The results show that the Halsey model can be used to obtain the external surfaces; however, it does not yield the micropore surface and is not suitable for plain clay concrete (without molasses) or for clay concretes stabilized with molasses. The Carbon Black, Jaroniec, and Harkins and Jura models can be used for both clay concrete and stabilized clay concrete; the Carbon Black model is the most relevant for clay concrete, and the Harkins and Jura model for molasses-stabilized clay concrete. These last two models augur well for future research.
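For concreteness, the Harkins-Jura and Halsey statistical-thickness equations in their commonly used forms (thickness t in angstroms as a function of relative pressure P/P0), offered as a sketch of the t-plot inputs behind the comparison above:

import numpy as np

def t_harkins_jura(p_rel):
    # t [A] = sqrt(13.99 / (0.034 - log10(P/P0)))
    return np.sqrt(13.99 / (0.034 - np.log10(p_rel)))

def t_halsey(p_rel):
    # t [A] = 3.54 * (5 / (-ln(P/P0)))**(1/3)
    return 3.54 * (5.0 / -np.log(p_rel)) ** (1.0 / 3.0)

p_rel = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
print(t_harkins_jura(p_rel))
print(t_halsey(p_rel))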