The purpose of this paper is to obtain the expression for the variance of the sample mean difference under Student's distributive model. In 2007, after some decades, the study of the mean difference variance was resumed by Campobasso [1]. Using Nair's [2] and Lomnicki's [3] general results, he obtained the variance of the sample mean difference for several distributive models (Laplace's, triangular, power, logit, Pareto's and Gumbel's models). In addition, he extended the results already known for other distributive models (normal, rectangular and exponential).
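As a small illustration of the statistic whose variance is studied above, the sketch below computes a sample mean difference. It assumes the without-repetition form (the average of |x_i - x_j| over all pairs i < j); the abstract does not say which form the paper uses, and the function name is hypothetical.

```python
from itertools import combinations


def sample_mean_difference(xs):
    """Gini's sample mean difference (without repetition).

    Assumption: the estimator is the average absolute difference over all
    unordered pairs, i.e. 2 / (n (n - 1)) * sum_{i<j} |x_i - x_j|.
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(abs(a - b) for a, b in combinations(xs, 2))
    return 2.0 * total / (n * (n - 1))
```

For the three values 1, 2 and 4, the pairwise absolute differences are 1, 3 and 2, so the statistic is their average.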
Random sample partition (RSP) is a newly developed big data representation and management model for big data approximate computation problems. Academic research and practical applications have confirmed that RSP is an efficient solution for big data processing and analysis. However, a challenge in implementing RSP is determining an appropriate sample size for RSP data blocks. While a large sample size increases the burden of big data computation, a small one leads to insufficient distribution information in the RSP data blocks. To address this problem, this paper presents a novel density estimation-based method (DEM) to determine the optimal sample size for RSP data blocks. First, a theoretical sample size is calculated from the multivariate Dvoretzky-Kiefer-Wolfowitz (DKW) inequality by using the fixed-point iteration (FPI) method. Second, a practical sample size is determined by minimizing the validation error of a kernel density estimator (KDE) constructed on RSP data blocks for an increasing sample size. Finally, a series of experiments is conducted to validate the feasibility, rationality, and effectiveness of DEM. Experimental results show that (1) the iteration function of the FPI method is convergent when calculating the theoretical sample size from the multivariate DKW inequality; (2) the KDE constructed on RSP data blocks with the sample size determined by DEM yields a good approximation of the probability density function (p.d.f.); and (3) DEM provides more accurate sample sizes than existing sample size determination methods from the perspective of p.d.f. estimation. This demonstrates that DEM is a viable approach to the sample size determination problem for big data RSP implementation.
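The first step described above (a theoretical sample size from a DKW-type bound via fixed-point iteration) can be sketched as follows. This is only an illustration under a stated assumption: it uses a multivariate DKW bound of the form d(n + 1)exp(-2 n eps^2) <= delta, which may differ from the exact inequality and iteration function used in the paper. The function name and parameters are hypothetical.

```python
import math


def dkw_sample_size(eps, delta, d=1, tol=1e-9, max_iter=1000):
    """Smallest n (approximately) with d * (n + 1) * exp(-2 n eps^2) <= delta.

    Assumption: the multivariate DKW bound has the form above. Setting the
    bound equal to delta gives n = ln(d (n + 1) / delta) / (2 eps^2), which
    is solved here by fixed-point iteration.
    """
    if not (0 < eps < 1 and 0 < delta < 1 and d >= 1):
        raise ValueError("require 0 < eps < 1, 0 < delta < 1, d >= 1")
    # Start from the classical univariate DKW sample size.
    n = math.log(2.0 / delta) / (2.0 * eps ** 2)
    for _ in range(max_iter):
        n_next = math.log(d * (n + 1.0) / delta) / (2.0 * eps ** 2)
        if abs(n_next - n) < tol:
            return math.ceil(n_next)
        n = n_next
    raise RuntimeError("fixed-point iteration did not converge")
```

With this bound the iteration map is increasing and has a single fixed point, so the iterates climb towards it from the univariate starting value; the returned integer can be checked by substituting it back into the bound.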
During environment testing, the estimation of random vibration signals (RVS) is an important technique for airborne platform safety and reliability. However, the available methods, including the extreme value envelope method (EVEM), the statistical tolerances method (STM) and the improved statistical tolerance method (ISTM), require large samples and a typical probability distribution. Moreover, the frequency-varying characteristic of RVS is usually not taken into account. The gray bootstrap method (GBM) is proposed to solve the problem of estimating frequency-varying RVS with small samples. First, the estimated indexes are obtained, including the estimated interval, the estimated uncertainty, the estimated value, the estimated error and the estimated reliability. In addition, GBM is applied to estimating the single flight testing of a certain aircraft. Finally, in order to evaluate the estimation performance, GBM is compared with the bootstrap method (BM) and the gray method (GM) in testing analysis. The results show that GBM is superior for estimating dynamic signals with small samples, and the estimated reliability is shown to be 100% at the given confidence level.
In survey data analysis, it is common to encounter substantial variance relative to bias. In this study, we present a strategy to tackle this issue by introducing slightly biased variance estimators. These estimators incorporate a constant c in the range 0 to 1, which is determined by minimizing the mean squared error (MSE) of c × (variance estimator). This research builds upon Kourouklis (2012, A new estimator of the variance based on minimizing mean squared error. The American Statistician, 66(4), 234–236) and extends that work to survey sampling. Extensive simulation studies illustrate the superior performance of the adjusted variance estimators compared with standard variance estimators, particularly in terms of MSE. These findings underscore the efficacy of the proposed approach in enhancing the precision of variance estimation in survey data analysis.
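To make the idea of an MSE-minimizing constant c concrete, here is a minimal sketch for the classical i.i.d. case, not the survey-sampling estimators of the paper. It assumes an unbiased sample variance S² with Var(S²) = σ⁴(2/(n - 1) + κ/n), where κ is the excess kurtosis. Minimizing MSE(cS²) = c²Var(S²) + (c - 1)²σ⁴ over c gives c = 1 / (1 + Var(S²)/σ⁴). The function name is hypothetical.

```python
def mse_optimal_shrinkage(n, excess_kurtosis=0.0):
    """Constant c in (0, 1) minimizing the MSE of c * S^2 for i.i.d. data.

    Assumptions: S^2 is the unbiased sample variance of n i.i.d. values and
    the excess kurtosis of the population is known (0 for a normal model).
    """
    if n < 2:
        raise ValueError("need n >= 2")
    # Relative variance of S^2, i.e. Var(S^2) / sigma^4.
    rel_var = 2.0 / (n - 1) + excess_kurtosis / n
    return 1.0 / (1.0 + rel_var)
```

Under normality this reduces to c = (n - 1)/(n + 1), the familiar result that dividing the sum of squares by n + 1 minimizes the MSE.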
Objective: To reveal the distribution characteristics and demographic factors of traditional Chinese medicine (TCM) constitution among elderly individuals in China. Methods: Elderly individuals from seven regions in China were selected as samples using a multistage cluster random sampling method. The basic information questionnaire and the Constitution in Chinese Medicine Questionnaire (Elderly Edition) were used. Descriptive statistical analysis, chi-squared tests, and binary logistic regression analysis were used. Results: The single balanced constitution (BC) accounted for 23.9%. Among the major TCM constitution types, BC (43.2%) accounted for the largest proportion, and unbalanced constitutions ranged from 0.9% to 15.7%. East China region (odds ratio [OR] = 2.097; 95% confidence interval [CI], 1.912 to 2.301), married status (OR = 1.341; 95% CI, 1.235 to 1.457), and managers (OR = 1.254; 95% CI, 1.044 to 1.505) were significantly associated with BC. Age > 70 years was associated with qi-deficiency constitution and blood stasis constitution (BSC). Female sex was significantly associated with yang-deficiency constitution (OR = 1.646; 95% CI, 1.52 to 1.782). Southwest region was significantly associated with phlegm-dampness constitution (OR = 1.809; 95% CI, 1.569 to 2.086). North China region was significantly associated with inherited special constitution (OR = 2.521; 95% CI, 1.569 to 4.05). South China region (OR = 2.741; 95% CI, 1.997 to 3.763), Central China region (OR = 8.889; 95% CI, 6.676 to 11.835), senior middle school education (OR = 2.442; 95% CI, 1.932 to 3.088), and managers (OR = 1.804; 95% CI, 1.21 to 2.69) were significantly associated with BSC. Conclusions: This study defined the distribution characteristics and demographic factors of TCM constitution in the elderly population. Adjusting and improving unbalanced constitutions, which are correlated with diseases, can help promote healthy aging through the scientific management of these demographic factors.
On the basis of the principles of simple random sampling, a statistical model of the rate of disfigurement (RD) is put forward and described in detail. According to the definition of simple random sampling for attribute data in GIS, the mean and variance of the RD are derived as the characteristic values of the statistical model, in order to explain the feasibility of measuring the accuracy of attribute data in GIS with the RD. Moreover, on the basis of the mean and variance of the RD, a quality assessment method for the attribute data of vector maps during data collection is discussed. The RD spread graph is also drawn to see whether the quality of the attribute data is under control. The RD model can judge the quality of attribute data comprehensively, unlike other measurement coefficients, which address only classification accuracy.
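The abstract does not give the RD formulas, so the following is only a sketch under an assumption: that the RD behaves like a sample proportion of defective attribute records drawn by simple random sampling without replacement from N records. Under that reading, the standard unbiased variance estimate is (1 - n/N) p(1 - p)/(n - 1). The function name and arguments are hypothetical.

```python
def rd_estimate(defective, n, population_size):
    """Sample rate of defective records and its estimated variance.

    Assumption: simple random sampling without replacement of n records
    from a finite population, with the RD treated as a sample proportion.
    """
    if not (1 < n <= population_size) or not (0 <= defective <= n):
        raise ValueError("inconsistent sample description")
    p = defective / n
    finite_correction = 1.0 - n / population_size
    variance = finite_correction * p * (1.0 - p) / (n - 1)
    return p, variance
```

A control chart in the spirit of the RD spread graph could then plot p for each batch against limits such as p̄ ± 3·sqrt(variance), although the limits used in the paper are not stated in the abstract.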
In general, the accuracy of a mean estimator can be improved by stratified random sampling. In this paper, we provide an idea, different from empirical methods, that the accuracy can be improved further through the bootstrap resampling method under some conditions. The determination of sample size by the bootstrap method is also discussed, and a simulation is carried out to verify the accuracy of the proposed method. The simulation results show that the sample size based on bootstrapping is smaller than that based on the central limit theorem.
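For readers unfamiliar with bootstrap resampling, the sketch below estimates the standard error of a sample mean by resampling with replacement. It is a generic illustration for a single sample or stratum, not the paper's procedure; the function name, the number of replicates and the fixed seed are assumptions made for reproducibility.

```python
import random
import statistics


def bootstrap_se_mean(xs, n_boot=2000, seed=0):
    """Bootstrap estimate of the standard error of the mean of xs."""
    if len(xs) < 2:
        raise ValueError("need at least two observations")
    rng = random.Random(seed)  # fixed seed so repeated calls agree
    n = len(xs)
    # Each replicate resamples n values with replacement and records the mean.
    means = [statistics.fmean(rng.choices(xs, k=n)) for _ in range(n_boot)]
    return statistics.stdev(means)
```

A sample size rule can then be built by increasing n until the bootstrap standard error falls below a target, in contrast to the central limit theorem formula n = (zσ/E)².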
This paper analyzes the methodology for applying stratified random sampling with optimum allocation to a research subject that concerns the rural population and shows large differences among the three strata into which this population can be classified. Using the mean altitude of settlements as the criterion, the rural population of Evros Prefecture (Greece) was classified into three strata (mountainous, semi-mountainous and flat) in order to estimate the mean consumption of forest fuelwood for heating and cooking in the households of these strata. The analysis of this methodology includes: (1) the determination of the total sample size for the entire rural population and its allocation to the strata; (2) the investigation of the effectiveness of stratification using analysis of variance (one-way ANOVA); (3) the conduct of the sampling survey through face-to-face interviews in selected households; and (4) the checking of the questionnaire forms and the analysis of the data using the Statistical Package for the Social Sciences, SPSS for Windows. All data for the analysis of this methodology and its practical application were taken from the pilot sampling carried out in each stratum. No related paper was found in the literature review.
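Step (1) above, allocating a total sample to strata by optimum allocation, can be sketched as below. The sketch assumes equal sampling cost per unit in every stratum (Neyman allocation), so that n_h is proportional to N_h·S_h; the stratum sizes and standard deviations in the usage note are made-up numbers, not figures from the Evros study.

```python
def neyman_allocation(n_total, stratum_sizes, stratum_sds):
    """Optimum (Neyman) allocation: n_h = n * N_h * S_h / sum_k(N_k * S_k).

    Assumption: equal cost per sampled unit in every stratum. Returns real
    numbers; in practice they are rounded and capped at the stratum sizes.
    """
    if len(stratum_sizes) != len(stratum_sds):
        raise ValueError("sizes and standard deviations must align")
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one stratum must have positive N_h * S_h")
    return [n_total * w / total for w in weights]
```

For example, with hypothetical strata of 500, 300 and 200 households and pilot standard deviations of 10, 20 and 30, a sample of 100 is split roughly as 29.4, 35.3 and 35.3, so the smaller but more variable strata receive more interviews than proportional allocation would give them.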
In this paper, we propose a software component under Windows that generates pseudo-random numbers using RDS (Refined Descriptive Sampling) as required by the simulation. RDS is regarded in the literature as the best sampling method. To validate the proposed component, it is applied to approximating integrals. The simulation results from RDS using the "RDSRnd" generator were compared with those obtained using the "Rnd" generator included in the Pascal programming language under Windows. The best results are given by the proposed software component.
In this paper, auxiliary information is used to determine an estimator of the finite population total using nonparametric regression under stratified random sampling. To achieve this, a model-based approach is adopted, using local polynomial regression estimation to predict the nonsampled values of the survey variable y. The performance of the proposed estimator is investigated against some design-based and model-based regression estimators. The simulation experiments show that the resulting estimator exhibits good properties. Generally, good confidence intervals are seen for the nonparametric regression estimators, and the proposed estimator leads to relatively smaller values of RE than the other estimators.
Srivastava and Jhajj [16] proposed a class of estimators for the population variance using multiple auxiliary variables in simple random sampling, utilizing the means and variances of the auxiliary variables. In this paper, we adapt this class and, motivated by Searle [13], suggest a more general class of estimators for the population variance in simple random sampling. The expressions for the mean square error of the proposed class are derived in general form. Besides obtaining the minimized MSE of the proposed and adapted classes, it is shown that the adapted class is a special case of the proposed class. Moreover, these theoretical findings are supported by an empirical study on original data.
In this paper, the problem of nonparametric estimation of the finite population quantile function using the multiplicative bias correction technique is considered. A robust estimator of the finite population quantile function based on multiplicative bias correction is derived with the aid of a superpopulation model. Most studies have concentrated on kernel smoothers in the estimation of regression functions. This technique has also been applied to various methods of nonparametric estimation of the finite population quantile already under review. A major problem with the use of nonparametric kernel-based regression over a finite interval, such as in the estimation of finite population quantities, is bias at boundary points. By correcting the boundary problems associated with previous model-based estimators, the multiplicative bias corrected estimator produced better results in estimating the finite population quantile function. Furthermore, the asymptotic behavior of the proposed estimator is presented. It is observed that the estimator is asymptotically unbiased and statistically consistent when certain conditions are satisfied. The simulation results show that the suggested estimator performs quite well in terms of relative bias, mean squared error, and relative root mean error. As a result, the multiplicative bias corrected estimator is strongly suggested for survey sampling estimation of the finite population quantile function.
This research aims to develop a model to enhance the diagnosis of lymphatic diseases by using the random forest ensemble machine-learning method trained with a simple sampling scheme. The study was carried out in two major phases: feature selection and classification. In the first stage, a number of discriminative features out of 18 were selected using PSO and several feature selection techniques to reduce the feature dimension. In the second stage, we applied the random forest ensemble classification scheme to diagnose lymphatic diseases. In experiments with the selected features, we used the original and resampled distributions of the dataset to train the random forest classifier. Experimental results demonstrate that the proposed method achieves a remarkable improvement in classification accuracy.
Prompt fission neutron spectra (PFNS) have a significant role in nuclear science and technology. In this study, the PFNS for ²³⁹Pu are evaluated using both differential and integral experimental data. A method that leverages integral criticality benchmark experiments to constrain the PFNS data is introduced. The measured central values of the PFNS are perturbed by constructing a covariance matrix. The PFNS are sampled using two types of covariance matrices, either generated with an assumed correlation matrix and incorporating experimental uncertainties or derived directly from experimental reports. The joint Monte Carlo transport code is employed to perform transport simulations on five criticality benchmark assemblies using the perturbed PFNS data. Extensive simulations result in an optimized PFNS that shows improved agreement with the integral criticality benchmark experiments. This study introduces a novel approach for optimizing differential experimental data through integral experiments, particularly when a covariance matrix is not provided.
Conventional soil maps (CSMs) often have multiple soil types within a single polygon, which hinders the ability of machine learning to accurately predict soils. Soil disaggregation approaches are commonly used to improve the spatial and attribute precision of CSMs. The approach disaggregation and harmonization of soil map units through resampled classification trees (DSMART) is popular but computationally intensive, as it generates and assigns synthetic samples to soil series based on the areal coverage information of CSMs. Alternatively, the approach pure polygon disaggregation (PPD) assigns soil series based solely on the proportions of soil series in pure polygons in CSMs. This study compared these two disaggregation approaches by applying them to a CSM of Middlesex County, Ontario, Canada. Four different sampling methods were used: two sampling designs, simple random sampling (SRS) and conditional Latin hypercube sampling (cLHS), with two sample sizes (83100 and 19420 samples per sampling plan), both based on an area-weighted approach. Two machine learning algorithms (MLAs), C5.0 decision tree (C5.0) and random forest (RF), were applied to the disaggregation approaches to compare the disaggregation accuracy. The accuracy assessment used a set of 500 validation points obtained from the Middlesex County soil survey report. With the larger sample size, the MLA C5.0 (Kappa index = 0.58–0.63) performed better than RF (Kappa index = 0.53–0.54), and PPD with C5.0 was the best-performing approach (Kappa index = 0.63). With the smaller sample size, cLHS (Kappa index = 0.41–0.48) and SRS (Kappa index = 0.40–0.47) produced similar accuracy. PPD had lower processing capacity and time demands (1.62–5.93 h) and yielded maps with lower uncertainty than DSMART (2.75–194.2 h). For CSMs predominantly composed of pure polygons, using PPD for soil series disaggregation is a more efficient and rational choice. However, DSMART is the preferable approach for disaggregating soil series that lack pure polygon representations in the CSMs.
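The accuracy figures above are reported as a Kappa index. Assuming this refers to Cohen's kappa computed between the soil series at the validation points and the predicted soil series (the abstract does not spell this out), a minimal sketch is given below; the function name and the labels in the usage note are hypothetical.

```python
from collections import Counter


def cohen_kappa(reference, predicted):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two label sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(reference)
    # Observed agreement: share of points where both labels coincide.
    p_o = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement from the marginal label frequencies.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    p_e = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)
```

With reference labels a, a, b, b and predictions a, b, b, b, the observed agreement is 0.75 and the chance agreement is 0.5, so kappa is 0.5.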
The curve relating the fatigue crack growth rate to the stress strength factor amplitude represents an important fatigue property in designing damage tolerance limits and predicting the life of metallic component parts. To make more reasonable use of the testing data, samples from the population were stratified as suggested by the stratified random sample model (SRAM). The data in each stratum corresponded to the same experimental conditions. A suitable weight was assigned to each stratified sample according to the actual working states of the pressure vessel, so that the estimate of the fatigue crack growth rate equation was more accurate in practice. An empirical study shows that the SRAM estimate using fatigue crack growth rate data from different stoves is clearly better than the estimate from the simple random sample model.
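To address the large measurement error, limited image information and poor real-time performance of binocular vision ranging, a binocular ranging method based on ORB (oriented FAST and rotated BRIEF) features is proposed. Video frames are median-filtered, ORB features are extracted from the images, and the Hamming distance that gives the best matching performance is selected experimentally. RANSAC (random sample consensus) model estimation is applied to the filtered matches to remove mismatches; the model relationship between disparity and true distance is analyzed, and an optimal ranging model is constructed and verified on an experimental platform. The results show that, compared with other binocular ranging methods, the proposed method is accurate, fast and robust, and can display the distance of features in the image in real time.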
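The relationship between disparity and distance mentioned above is usually introduced through the ideal rectified pinhole stereo model, Z = f·B/d. The sketch below implements only that textbook baseline; the paper's fitted "optimal ranging model" is not described in the abstract and may differ. The function and parameter names are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance Z = f * B / d for an ideal rectified stereo pair.

    Assumptions: focal length and disparity are in pixels, the baseline is
    in metres, and the two cameras are rectified so that matched features
    lie on the same image row.
    """
    if focal_px <= 0 or baseline_m <= 0:
        raise ValueError("focal length and baseline must be positive")
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite distance")
    return focal_px * baseline_m / disparity_px
```

Because Z is inversely proportional to d, a one-pixel matching error matters far more for distant features, which is one reason for removing mismatches with RANSAC before ranging.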
Funding (random sample partition study): This paper was supported by the National Natural Science Foundation of China (Grant No. 61972261), the Natural Science Foundation of Guangdong Province (No. 2023A1515011667), the Key Basic Research Foundation of Shenzhen (No. JCYJ20220818100205012) and the Basic Research Foundation of Shenzhen (No. JCYJ20210324093609026).
Funding (random vibration signal study): Supported by the Aviation Science Foundation of China (No. 20100251006) and the Technological Foundation Project (No. J132012C001).
Funding (TCM constitution study): Supported by the National Key R&D Program of China (2020YFC2003102).
Funding (rate of disfigurement study): Project supported by the National Natural Science Foundation of China (No. 40171078), a fund from Hong Kong Polytechnic University (No. 1.34.9709) and the Research Grants Council of Hong Kong SAR (No. 3ZB40).
Funding (bootstrap sample size study): The Science Research Start-up Foundation for Young Teachers of Southwest Jiaotong University (No. 2007Q091).
文摘In general the accuracy of mean estimator can be improved by stratified random sampling. In this paper, we provide an idea different from empirical methods that the accuracy can be more improved through bootstrap resampling method under some conditions. The determination of sample size by bootstrap method is also discussed, and a simulation is made to verify the accuracy of the proposed method. The simulation results show that the sample size based on bootstrapping is smaller than that based on central limit theorem.
文摘In this paper, analysis of methodology was realized for the application of stratified random sampling with optimum allocation in the case of a subject of research which concerns the rural population and presents high differentiations among the three strata in which this population could be classified. The rural population of Evros Prefecture (Greece) with criterion the mean altitude of settlements was classified in three strata, the mountainous, semi-mountainous and fiat population for the estimation of mean consumption of forest fuelwood for covering of heating and cooking needs in households of these three strata. The analysis of this methodology includes: (1) the determination of total size of sample for entire the rural population and its allocation to the various strata; (2) the investigation of effectiveness of stratification with the technique of analysis of variance (One-Way ANOVA); (3) the conduct of sampling research with the realization of face-to-face interviews in selected households and (4) the control of forms of the questionnaire and the analysis of data by using the statistical package for social sciences, SPSS for Windows. All data for the analysis of this methodology and its practical application were taken by the pilot sampling which was realized in each stratum. Relative paper was not found by the review of literature.
Abstract: In this paper, we propose a software component for Windows that generates pseudo-random numbers using RDS (Refined Descriptive Sampling), as required by simulation. RDS is regarded in the literature as the best sampling method. To validate the proposed component, it is applied to the approximation of integrals. The simulation results from RDS using the "RDSRnd" generator were compared with those obtained using the "Rnd" generator included in the Pascal programming language under Windows. The proposed software component gives the best results.
Abstract: In this paper, auxiliary information is used to determine an estimator of the finite population total using nonparametric regression under stratified random sampling. To achieve this, a model-based approach is adopted, using local polynomial regression estimation to predict the nonsampled values of the survey variable y. The performance of the proposed estimator is investigated against some design-based and model-based regression estimators. The simulation experiments show that the resulting estimator exhibits good properties. Generally, good confidence intervals are seen for the nonparametric regression estimators, and use of the proposed estimator leads to relatively smaller values of RE compared to other estimators.
Abstract: Srivastava and Jhajj [16] proposed a class of estimators for the population variance using multiple auxiliary variables in simple random sampling, utilizing the means and variances of the auxiliary variables. In this paper, we adapt this class and, motivated by Searle [13], suggest a more generalized class of estimators for the population variance in simple random sampling. Expressions for the mean square error of the proposed class are derived in general form. Besides obtaining the minimized MSE of the proposed and adapted classes, we show that the adapted class is a special case of the proposed class. Moreover, these theoretical findings are supported by an empirical study of original data.
Abstract: In this paper, the problem of nonparametric estimation of the finite population quantile function using a multiplicative bias correction technique is considered. A robust estimator of the finite population quantile function based on multiplicative bias correction is derived with the aid of a superpopulation model. Most studies have concentrated on kernel smoothers in the estimation of regression functions. This technique has also been applied to various methods of nonparametric estimation of the finite population quantile already under review. A major problem with the use of nonparametric kernel-based regression over a finite interval, such as the estimation of finite population quantities, is bias at boundary points. By correcting the boundary problems associated with previous model-based estimators, the multiplicative bias corrected estimator produced better results in estimating the finite population quantile function. Furthermore, the asymptotic behavior of the proposed estimators is presented. It is observed that the estimator is asymptotically unbiased and statistically consistent when certain conditions are satisfied. The simulation results show that the suggested estimator performs well in terms of relative bias, mean squared error, and relative root mean error. As a result, the multiplicative bias corrected estimator is strongly suggested for survey sampling estimation of the finite population quantile function.
Abstract: This research aims to develop a model that enhances the diagnosis of lymphatic diseases by using the random forest ensemble machine-learning method trained with a simple sampling scheme. The study was carried out in two major phases: feature selection and classification. In the first stage, a number of discriminative features out of 18 were selected using PSO and several feature selection techniques to reduce the feature dimension. In the second stage, we applied the random forest ensemble classification scheme to diagnose lymphatic diseases. In experiments with the selected features, we used the original and resampled distributions of the dataset to train the random forest classifier. Experimental results demonstrate that the proposed method achieves a remarkable improvement in classification accuracy.
Funding: Supported by the National Natural Science Foundation of China (No. 12347126).
Abstract: Prompt fission neutron spectra (PFNS) have a significant role in nuclear science and technology. In this study, the PFNS for ²³⁹Pu are evaluated using both differential and integral experimental data. A method that leverages integral criticality benchmark experiments to constrain the PFNS data is introduced. The measured central values of the PFNS are perturbed by constructing a covariance matrix. The PFNS are sampled using two types of covariance matrices, either generated with an assumed correlation matrix and incorporating experimental uncertainties or derived directly from experimental reports. The joint Monte Carlo transport code is employed to perform transport simulations on five criticality benchmark assemblies by utilizing perturbed PFNS data. Extensive simulations result in an optimized PFNS that shows improved agreement with the integral criticality benchmark experiments. This study introduces a novel approach for optimizing differential experimental data through integral experiments, particularly when a covariance matrix is not provided.
Funding: The Ontario Ministry of Agriculture, Food and Rural Affairs, Canada, supported this project by providing updated soil information on Ontario and Middlesex County. Supported by the Natural Sciences and Engineering Research Council of Canada (No. RGPIN-2014-4100).
Abstract: Conventional soil maps (CSMs) often have multiple soil types within a single polygon, which hinders the ability of machine learning to accurately predict soils. Soil disaggregation approaches are commonly used to improve the spatial and attribute precision of CSMs. The approach disaggregation and harmonization of soil map units through resampled classification trees (DSMART) is popular but computationally intensive, as it generates and assigns synthetic samples to soil series based on the areal coverage information of CSMs. Alternatively, the disaggregation approach pure polygon disaggregation (PPD) assigns soil series based solely on the proportions of soil series in pure polygons in CSMs. This study compared these two disaggregation approaches by applying them to a CSM of Middlesex County, Ontario, Canada. Four different sampling methods were used: two sampling designs, simple random sampling (SRS) and conditional Latin hypercube sampling (cLHS), with two sample sizes (83100 and 19420 samples per sampling plan), both based on an area-weighted approach. Two machine learning algorithms (MLAs), C5.0 decision tree (C5.0) and random forest (RF), were applied to the disaggregation approaches to compare the disaggregation accuracy. The accuracy assessment utilized a set of 500 validation points obtained from the Middlesex County soil survey report. The MLA C5.0 (Kappa index = 0.58–0.63) showed better performance than RF (Kappa index = 0.53–0.54) based on the larger sample size, and PPD with C5.0 based on the larger sample size was the best-performing (Kappa index = 0.63) approach. Based on the smaller sample size, both cLHS (Kappa index = 0.41–0.48) and SRS (Kappa index = 0.40–0.47) produced similar accuracy results. The disaggregation approach PPD exhibited lower processing capacity and time demands (1.62–5.93 h) while yielding maps with lower uncertainty as compared to DSMART (2.75–194.2 h). For CSMs predominantly composed of pure polygons, utilizing PPD for soil series disaggregation is a more efficient and rational choice. However, DSMART is the preferable approach for disaggregating soil series that lack pure polygon representations in the CSMs.
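The Kappa index reported in this abstract is presumably Cohen's kappa, which compares observed agreement with the agreement expected by chance. The sketch below computes it from a square confusion matrix of validation points; the function name and the small matrix in the usage note are hypothetical and unrelated to the study's data.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: reference, columns: predicted)."""
    size = len(confusion)
    if any(len(row) != size for row in confusion):
        raise ValueError("confusion matrix must be square")
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of points on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(size)) for j in range(size)]
    # Chance agreement: product of the marginal proportions, summed over classes.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)
```

For the hypothetical matrix `[[20, 5], [10, 15]]`, observed agreement is 0.7 and chance agreement is 0.5, giving a kappa of 0.4.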
Abstract: The curve relating the fatigue crack growth rate to the stress intensity factor amplitude represents an important fatigue property in designing damage tolerance limits and predicting the life of metallic component parts. To make more reasonable use of testing data, samples from the population were stratified as suggested by the stratified random sample model (SRAM). The data in each stratum corresponded to the same experimental conditions. A suitable weight was assigned to each stratified sample according to the actual working states of the pressure vessel, so that the estimate of the fatigue crack growth rate equation was more accurate in practice. An empirical study shows that the SRAM estimation using fatigue crack growth rate data from different stoves is clearly better than the estimation from the simple random sample model.
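Abstract: To address the large measurement error, limited image information, and poor real-time performance of binocular vision ranging, a binocular ranging method based on ORB (Oriented FAST and Rotated BRIEF) features is proposed. Video frames are median-filtered, ORB features are extracted from the images, and the Hamming distance that gives the best matching performance is selected experimentally. RANSAC (random sample consensus) model estimation is applied to the filtered matches to remove mismatches. The model relationship between disparity and true distance is analyzed, and an optimal ranging model is constructed and verified on an experimental platform. The results show that, compared with other binocular ranging methods, the proposed method offers accurate ranging, fast running speed, and strong robustness, and can display the distance information of features in the image in real time.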
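For reference, the textbook relationship between disparity and distance for an ideal rectified stereo pair is Z = f·B/d, with focal length f in pixels, baseline B, and disparity d in pixels. The sketch below implements only this idealized pinhole relation; the ranging model fitted in the paper may differ, and the function and parameter names are assumptions.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Distance in metres to a matched feature for an ideal rectified stereo pair."""
    if disparity_px <= 0:
        # Zero or negative disparity means the point is at infinity or the match is wrong.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px
```

With a hypothetical focal length of 700 px and a 0.12 m baseline, a feature matched at 35 px disparity lies about 2.4 m away. Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant features, which is why removing mismatches before ranging is important.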