Conventional soil maps (CSMs) often have multiple soil types within a single polygon, which hinders the ability of machine learning to accurately predict soils. Soil disaggregation approaches are commonly used to improve the spatial and attribute precision of CSMs. The approach disaggregation and harmonization of soil map units through resampled classification trees (DSMART) is popular but computationally intensive, as it generates and assigns synthetic samples to soil series based on the areal coverage information of CSMs. Alternatively, the disaggregation approach pure polygon disaggregation (PPD) assigns soil series based solely on the proportions of soil series in pure polygons in CSMs. This study compared these two disaggregation approaches by applying them to a CSM of Middlesex County, Ontario, Canada. Four different sampling methods were used: two sampling designs, simple random sampling (SRS) and conditional Latin hypercube sampling (cLHS), with two sample sizes (83100 and 19420 samples per sampling plan), both based on an area-weighted approach. Two machine learning algorithms (MLAs), C5.0 decision tree (C5.0) and random forest (RF), were applied to the disaggregation approaches to compare the disaggregation accuracy. The accuracy assessment utilized a set of 500 validation points obtained from the Middlesex County soil survey report. The MLA C5.0 (Kappa index = 0.58–0.63) showed better performance than RF (Kappa index = 0.53–0.54) based on the larger sample size, and PPD with C5.0 based on the larger sample size was the best-performing (Kappa index = 0.63) approach. Based on the smaller sample size, both cLHS (Kappa index = 0.41–0.48) and SRS (Kappa index = 0.40–0.47) produced similar accuracy results. The disaggregation approach PPD exhibited lower processing capacity and time demands (1.62–5.93 h) while yielding maps with lower uncertainty as compared to DSMART (2.75–194.2 h). For CSMs predominantly composed of pure polygons, utilizing PPD for soil series disaggregation is a more efficient and rational choice. However, DSMART is the preferable approach for disaggregating soil series that lack pure polygon representations in the CSMs.
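Objective: To investigate the effect of the randomization test on the type I error rate and power when the sample size is re-estimated in an adaptive design with an internal pilot study (IPS). Methods: Monte Carlo simulation was used to mimic IPS sample size re-estimation with small samples; the final data were analyzed with both the randomization test and the t-test, and the effects of the two tests on type I error and power were compared. Results: The recalculated second-stage sample size fluctuated considerably. The t-test did not control the type I error well, whereas the randomization test controlled it well, with a slight loss of power. Conclusion: In clinical trials with small samples, applying a randomization test after blinded sample size re-estimation in an internal pilot study keeps the type I error from inflating while still meeting the power requirement.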
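To make the randomization test named in this abstract concrete, here is a minimal sketch of a two-sample Monte Carlo randomization (permutation) test on the difference in means. It is an illustration only and does not reproduce the study's IPS sample size re-estimation procedure; the function name, the number of permutations, and the seed are assumptions introduced for the example.

```python
import random


def randomization_test(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo randomization test for a difference in means.

    Hypothetical helper for illustration; not taken from the abstract.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)
    n_y = len(y)
    observed = abs(sum(x) / n_x - sum(y) / n_y)

    hits = 0
    for _ in range(n_perm):
        # Re-assign group labels at random by shuffling the pooled data.
        rng.shuffle(pooled)
        stat = abs(sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / n_y)
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            hits += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Because the reference distribution is built from the observed data rather than from a normality assumption, this kind of test is often considered when samples are small, which is the setting the abstract describes.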
Abstract: The purpose of this paper is to obtain the expression for the variance of the sample mean difference under Student's distributive model. In 2007, after some decades, the study of the variance of the mean difference was resumed by Campobasso [1]. Using Nair's [2] and Lomnicki's [3] general results, he obtained the variance of the sample mean difference for several distributive models (Laplace's, triangular, power, logit, Pareto's and Gumbel's models). In addition, he extended existing knowledge by comparing these results with those already known for other distributive models (normal, rectangular and exponential).
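For readers unfamiliar with the statistic, the sample mean difference (Gini's mean difference) is commonly defined as the average absolute difference over all pairs of distinct observations. The sketch below assumes that without-repetition definition; the function name is hypothetical and the code is not taken from the paper.

```python
from itertools import combinations


def sample_mean_difference(xs):
    """Gini's mean difference: mean of |x_i - x_j| over all pairs with i != j."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    # Each unordered pair appears twice among the n * (n - 1) ordered pairs.
    total = sum(abs(a - b) for a, b in combinations(xs, 2))
    return 2.0 * total / (n * (n - 1))
```

For the three observations 1, 2, 3 the pairwise absolute differences are 1, 2 and 1, so the statistic equals 2 × 4 / 6 = 4/3.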
Funding: Supported by the Aviation Science Foundation of China (No. 20100251006) and the Technological Foundation Project (No. J132012C001).
Abstract: During environment testing, the estimation of random vibration signals (RVS) is an important technique for the safety and reliability of airborne platforms. However, the available methods, including the extreme value envelope method (EVEM), the statistical tolerances method (STM) and the improved statistical tolerance method (ISTM), require large samples and a typical probability distribution. Moreover, the frequency-varying characteristic of RVS is usually not taken into account. The gray bootstrap method (GBM) is proposed to solve the problem of estimating frequency-varying RVS with small samples. First, the estimated indexes are obtained, including the estimated interval, the estimated uncertainty, the estimated value, the estimated error and the estimated reliability. In addition, GBM is applied to estimating the single flight testing of a certain aircraft. Finally, to evaluate the estimation performance, GBM is compared with the bootstrap method (BM) and the gray method (GM) in testing analysis. The results show that GBM is superior for estimating dynamic signals with small samples, and the estimated reliability is shown to be 100% at the given confidence level.
Abstract: In survey data analysis, it is common to encounter substantial variance relative to bias. In this study, we present a strategy to tackle this issue by introducing slightly biased variance estimators. These estimators incorporate a constant c within the range of 0 to 1, which is determined by minimizing the mean squared error (MSE) of c × (variance estimator). This research builds upon the foundation laid by Kourouklis (2012, A new estimator of the variance based on minimizing mean squared error. The American Statistician, 66(4), 234–236) and extends that work into the domain of survey sampling. Extensive simulation studies are conducted to illustrate the superior performance of the adjusted variance estimators compared with standard variance estimators, particularly in terms of MSE. These findings underscore the efficacy of the proposed approach in enhancing the precision of variance estimation in survey data analysis.
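The idea of scaling a variance estimator by a constant c can be illustrated with a standard calculation. If V is an unbiased estimator of θ, then MSE(cV) = c²·Var(V) + (c − 1)²·θ², which is minimized at c* = θ² / (θ² + Var(V)). The sketch below evaluates this generic formula and, as an assumed special case, the i.i.d. normal setting where Var(S²) = 2σ⁴ / (n − 1). It is not the survey-sampling estimator of the paper, and the function names are hypothetical.

```python
def optimal_shrinkage(theta, var_v):
    """MSE-minimizing c for c * V when V is unbiased for theta with variance var_v."""
    return theta ** 2 / (theta ** 2 + var_v)


def normal_sample_variance_shrinkage(n, sigma2=1.0):
    """Optimal c for c * S^2 under i.i.d. normal data (an assumed special case)."""
    if n < 2:
        raise ValueError("need n >= 2")
    # For normal data the unbiased sample variance has Var(S^2) = 2 * sigma^4 / (n - 1).
    var_s2 = 2.0 * sigma2 ** 2 / (n - 1)
    return optimal_shrinkage(sigma2, var_s2)
```

In the normal case the result simplifies to (n − 1) / (n + 1), which lies between 0 and 1 and approaches 1 as n grows, consistent with the range of c stated in the abstract.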
Funding: Supported by the National Science Foundation of China (11901236, 12261036); the Scientific Research Fund of Hunan Provincial Education Department (21A0328); the Provincial Natural Science Foundation of Hunan (2022JJ30469); the Young Core Teacher Foundation of Hunan Province ([2020]43); and the Provincial Postgraduate Innovation Foundation of Hunan (CX20221113).
Abstract: The weighted exponential distribution WED(α, λ), with shape parameter α and scale parameter λ, possesses some good properties and can provide a good fit to survival time data compared with other distributions such as the gamma, Weibull, or generalized exponential distribution. In this article, we prove the existence and uniqueness of the maximum likelihood estimator (MLE) of the parameters of WED(α, λ) in simple random sampling (SRS) and provide explicit expressions for the Fisher information number in SRS. Moreover, we also prove the existence and uniqueness of the MLE of the parameters of WED(α, λ) in ranked set sampling (RSS) and provide explicit expressions for the Fisher information number in RSS. Simulation studies show that the MLEs in RSS can be real competitors for those in SRS.
Funding: Supported by the National Key R&D Program of China (2020YFC2003102).
Abstract: Objective: To reveal the distribution characteristics and demographic factors of traditional Chinese medicine (TCM) constitution among elderly individuals in China. Methods: Elderly individuals from seven regions in China were selected as samples using a multistage cluster random sampling method. The basic information questionnaire and the Constitution in Chinese Medicine Questionnaire (Elderly Edition) were used. Descriptive statistical analysis, chi-squared tests, and binary logistic regression analysis were used. Results: The single balanced constitution (BC) accounted for 23.9%. The results for the major TCM constitution types showed that BC (43.2%) accounted for the largest proportion and unbalanced constitutions ranged from 0.9% to 15.7%. East China region (odds ratio [OR] = 2.097; 95% confidence interval [CI], 1.912 to 2.301), married status (OR = 1.341; 95% CI, 1.235 to 1.457), and managers (OR = 1.254; 95% CI, 1.044 to 1.505) were significantly associated with BC. Age > 70 years was associated with qi-deficiency constitution and blood stasis constitution (BSC). Female sex was significantly associated with yang-deficiency constitution (OR = 1.646; 95% CI, 1.52 to 1.782). Southwest region was significantly associated with phlegm-dampness constitution (OR = 1.809; 95% CI, 1.569 to 2.086). North China region was significantly associated with inherited special constitution (OR = 2.521; 95% CI, 1.569 to 4.05). South China region (OR = 2.741; 95% CI, 1.997 to 3.763), Central China region (OR = 8.889; 95% CI, 6.676 to 11.835), senior middle school education (OR = 2.442; 95% CI, 1.932 to 3.088), and managers (OR = 1.804; 95% CI, 1.21 to 2.69) were significantly associated with BSC. Conclusions: This study defined the distribution characteristics and demographic factors of TCM constitution in the elderly population. Adjusting and improving unbalanced constitutions, which are correlated with diseases, can help promote healthy aging through the scientific management of these demographic factors.
Funding: Project supported by the National Natural Science Foundation of China (No. 40171078), a fund from Hong Kong Polytechnic University (No. 1.34.9709) and the Research Grants Council of Hong Kong SAR (No. 3 ZB40).
Abstract: On the basis of the principles of simple random sampling, the statistical model of the rate of disfigurement (RD) is put forward and described in detail. According to the definition of simple random sampling for attribute data in GIS, the mean and variance of the RD are derived as the characteristic values of the statistical model, in order to explain the feasibility of measuring the accuracy of attribute data in GIS with the RD. Moreover, on the basis of the mean and variance of the RD, a quality assessment method for the attribute data of vector maps during data collection is discussed. The RD spread graph is also drawn to see whether the quality of the attribute data is under control. The RD model can synthetically judge the quality of attribute data, which distinguishes it from other measurement coefficients that only address the accuracy of classification.
Funding: The Science Research Start-up Foundation for Young Teachers of Southwest Jiaotong University (No. 2007Q091).
Abstract: In general, the accuracy of a mean estimator can be improved by stratified random sampling. In this paper, we present an idea, different from empirical methods, that the accuracy can be improved further through the bootstrap resampling method under some conditions. The determination of sample size by the bootstrap method is also discussed, and a simulation is carried out to verify the accuracy of the proposed method. The simulation results show that the sample size based on bootstrapping is smaller than that based on the central limit theorem.
Abstract: This paper analyzes the methodology for applying stratified random sampling with optimum allocation to a research subject that concerns a rural population and shows large differences among the three strata into which this population can be classified. The rural population of Evros Prefecture (Greece) was classified, using the mean altitude of settlements as the criterion, into three strata (mountainous, semi-mountainous and flat) in order to estimate the mean consumption of forest fuelwood for heating and cooking needs in the households of these strata. The analysis of this methodology includes: (1) the determination of the total sample size for the entire rural population and its allocation to the various strata; (2) the investigation of the effectiveness of stratification by analysis of variance (one-way ANOVA); (3) the conduct of the sampling survey through face-to-face interviews in selected households; and (4) the checking of the questionnaire forms and the analysis of the data using the Statistical Package for the Social Sciences (SPSS for Windows). All data for the analysis of this methodology and its practical application were taken from the pilot sampling carried out in each stratum. No related paper was found in the literature review.
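Step (1) above, allocating a total sample across strata, can be sketched with the textbook Neyman form of optimum allocation, n_h = n · N_h·S_h / Σ N_k·S_k. This assumes equal per-unit sampling costs in every stratum, which the abstract does not state; the function name and the example figures are hypothetical and are not data from the study.

```python
def neyman_allocation(n_total, stratum_sizes, stratum_sds):
    """Allocate n_total across strata in proportion to N_h * S_h.

    Assumes equal sampling cost per unit in every stratum (an assumption made
    for this sketch). Returns unrounded allocations; rounding is left to the caller.
    """
    if len(stratum_sizes) != len(stratum_sds):
        raise ValueError("sizes and standard deviations must have equal length")
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one stratum needs positive size and spread")
    return [n_total * w / total for w in weights]


# Hypothetical strata: mountainous, semi-mountainous, flat.
allocation = neyman_allocation(60, [100, 200, 300], [3.0, 1.5, 1.0])
```

In practice the stratum standard deviations S_h are unknown and are typically estimated from a pilot sample, which matches the role of the pilot sampling mentioned in the abstract.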
Abstract: In this paper, we propose a software component under Windows that generates pseudo-random numbers using RDS (Refined Descriptive Sampling), as required by simulation. RDS is regarded in the literature as the best sampling method. To validate the proposed component, it is applied to the approximation of integrals. The simulation results from RDS using the "RDSRnd" generator were compared with those obtained using the "Rnd" generator included in the Pascal programming language under Windows. The best results are given by the proposed software component.
Abstract: In this paper, auxiliary information is used to determine an estimator of the finite population total using nonparametric regression under stratified random sampling. To achieve this, a model-based approach is adopted, using local polynomial regression estimation to predict the nonsampled values of the survey variable y. The performance of the proposed estimator is investigated against some design-based and model-based regression estimators. The simulation experiments show that the resulting estimator exhibits good properties. Generally, good confidence intervals are seen for the nonparametric regression estimators, and use of the proposed estimator leads to relatively smaller values of RE compared with other estimators.
Abstract: Srivastava and Jhajj [16] proposed a class of estimators for estimating the population variance using multiple auxiliary variables in simple random sampling, utilizing the means and variances of the auxiliary variables. In this paper, we adapt this class and, motivated by Searle [13], suggest a more generalized class of estimators for estimating the population variance in simple random sampling. The expressions for the mean square error (MSE) of the proposed class are derived in general form. Besides obtaining the minimized MSE of the proposed and adapted classes, it is shown that the adapted class is a special case of the proposed class. Moreover, these theoretical findings are supported by an empirical study of original data.
Abstract: In this paper, the problem of nonparametric estimation of the finite population quantile function using the multiplicative bias correction technique is considered. A robust estimator of the finite population quantile function based on multiplicative bias correction is derived with the aid of a super population model. Most studies have concentrated on kernel smoothers in the estimation of regression functions. This technique has also been applied to various methods of nonparametric estimation of the finite population quantile already under review. A major problem with the use of nonparametric kernel-based regression over a finite interval, such as in the estimation of finite population quantities, is bias at boundary points. By correcting the boundary problems associated with previous model-based estimators, the multiplicative bias corrected estimator produced better results in estimating the finite population quantile function. Furthermore, the asymptotic behavior of the proposed estimators is presented. It is observed that the estimator is asymptotically unbiased and statistically consistent when certain conditions are satisfied. The simulation results show that the suggested estimator performs quite well in terms of relative bias, mean squared error, and relative root mean error. As a result, the multiplicative bias corrected estimator is strongly suggested for survey sampling estimation of the finite population quantile function.
Abstract: This research aims to develop a model that improves the diagnosis of lymphatic diseases using a random forest ensemble machine-learning method trained with a simple sampling scheme. The study was carried out in two major phases: feature selection and classification. In the first phase, a number of discriminative features out of 18 were selected using PSO and several other feature selection techniques to reduce the feature dimensionality. In the second phase, we applied the random forest ensemble classification scheme to diagnose lymphatic diseases. In experiments with the selected features, we trained the random forest classifier on both the original and the resampled distributions of the dataset. Experimental results demonstrate that the proposed method achieves a remarkable improvement in classification accuracy.
Funding: Supported by the National Natural Science Foundation of China (No. 12347126).
Abstract: Prompt fission neutron spectra (PFNS) have a significant role in nuclear science and technology. In this study, the PFNS for ²³⁹Pu are evaluated using both differential and integral experimental data. A method that leverages integral criticality benchmark experiments to constrain the PFNS data is introduced. The measured central values of the PFNS are perturbed by constructing a covariance matrix. The PFNS are sampled using two types of covariance matrices, either generated with an assumed correlation matrix and incorporating experimental uncertainties or derived directly from experimental reports. The joint Monte Carlo transport code is employed to perform transport simulations on five criticality benchmark assemblies by utilizing perturbed PFNS data. Extensive simulations result in an optimized PFNS that shows improved agreement with the integral criticality benchmark experiments. This study introduces a novel approach for optimizing differential experimental data through integral experiments, particularly when a covariance matrix is not provided.
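The sampling step described in this abstract, perturbing measured central values with a covariance matrix built from an assumed correlation matrix and experimental uncertainties, can be sketched as follows. Everything here is an assumption for illustration: the function names, the three-bin spectrum, the uncertainties, and the uniform correlation of 0.5 are hypothetical and not taken from the study. Multivariate normal sampling through a Cholesky factor is one common choice and may differ from the authors' procedure; renormalization of the sampled spectrum is omitted.

```python
import math
import random


def covariance_from_correlation(sigmas, corr):
    """Build cov[i][j] = sigma_i * corr[i][j] * sigma_j."""
    n = len(sigmas)
    return [[sigmas[i] * corr[i][j] * sigmas[j] for j in range(n)] for i in range(n)]


def cholesky(matrix):
    """Return lower-triangular L with L L^T = matrix (positive definite input)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_spectrum(central, cov, rng):
    """Draw one perturbed spectrum: central + L z, with z standard normal."""
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in central]
    return [
        c + sum(lower[i][k] * z[k] for k in range(i + 1))
        for i, c in enumerate(central)
    ]


# Hypothetical three-bin example (illustrative values only).
central = [0.30, 0.50, 0.20]
sigmas = [0.03, 0.02, 0.01]
corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
cov = covariance_from_correlation(sigmas, corr)
sample = sample_spectrum(central, cov, random.Random(0))
```

Each call to `sample_spectrum` yields one perturbed spectrum that could then be passed to a transport simulation; repeating the draw many times gives the ensemble from which a better-agreeing spectrum is selected.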
Funding: The Ontario Ministry of Agriculture, Food and Rural Affairs, Canada, supported this project by providing updated soil information on Ontario and Middlesex County. Supported by the Natural Sciences and Engineering Research Council of Canada (No. RGPIN-2014-4100).
Abstract: Conventional soil maps (CSMs) often have multiple soil types within a single polygon, which hinders the ability of machine learning to accurately predict soils. Soil disaggregation approaches are commonly used to improve the spatial and attribute precision of CSMs. The disaggregation and harmonization of soil map units through resampled classification trees (DSMART) approach is popular but computationally intensive, as it generates and assigns synthetic samples to soil series based on the areal coverage information of CSMs. Alternatively, the pure polygon disaggregation (PPD) approach assigns soil series based solely on the proportions of soil series in pure polygons in CSMs. This study compared these two disaggregation approaches by applying them to a CSM of Middlesex County, Ontario, Canada. Four sampling plans were used, combining two sampling designs, simple random sampling (SRS) and conditional Latin hypercube sampling (cLHS), with two sample sizes (83100 and 19420 samples per sampling plan), both based on an area-weighted approach. Two machine learning algorithms (MLAs), C5.0 decision tree (C5.0) and random forest (RF), were applied to the disaggregation approaches to compare the disaggregation accuracy. The accuracy assessment utilized a set of 500 validation points obtained from the Middlesex County soil survey report. The MLA C5.0 (Kappa index = 0.58–0.63) showed better performance than RF (Kappa index = 0.53–0.54) based on the larger sample size, and PPD with C5.0 based on the larger sample size was the best-performing approach (Kappa index = 0.63). Based on the smaller sample size, cLHS (Kappa index = 0.41–0.48) and SRS (Kappa index = 0.40–0.47) produced similar accuracy results. The disaggregation approach PPD exhibited lower processing capacity and time demands (1.62–5.93 h) while yielding maps with lower uncertainty as compared to DSMART (2.75–194.2 h). For CSMs predominantly composed of pure polygons, utilizing PPD for soil series disaggregation is a more efficient and rational choice. However, DSMART is the preferable approach for disaggregating soil series that lack pure polygon representations in the CSMs.
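The Kappa index used for the accuracy assessment is commonly computed as Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The abstract does not state the exact formula, so the following is a minimal sketch under that assumption; the function name and the four-point example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(observed, predicted):
    """Cohen's kappa, (p_o - p_e) / (1 - p_e), for two label sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(observed)
    # Observed agreement: fraction of points where the labels match.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    # Chance agreement from the marginal label frequencies.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_e = sum(obs_counts[label] * pred_counts[label] for label in obs_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical soil series labels at four validation points.
observed = ["A", "A", "B", "B"]
predicted = ["A", "B", "B", "B"]
kappa = cohen_kappa(observed, predicted)  # p_o = 0.75, p_e = 0.5, kappa = 0.5
```

Because kappa discounts agreement that would occur by chance, it is a stricter measure than raw accuracy when a few soil series dominate the validation set.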
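Abstract: Objective: To investigate the effect of the randomization test on type I error and power when the sample size is adjusted in an adaptive design with an internal pilot study (IPS). Methods: Monte Carlo simulation was used to model IPS sample size adjustment with small samples. The final data were analyzed with both the randomization test and the t-test, and the two were compared with respect to type I error and power. Results: The recalculated second-stage sample size fluctuated considerably. The t-test did not control type I error well, whereas the randomization test controlled it well, with a slight reduction in power. Conclusion: In clinical trials with small sample sizes, the randomization test applied after blinded sample size adjustment in an internal pilot study protects the type I error from inflation while keeping power adequate.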
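A randomization test of the kind compared with the t-test here can be sketched as a Monte Carlo re-randomization of group labels. This is a generic two-sample version for a difference in means, not the study's simulation code; the function name, the number of resamples, and the example data are assumptions.

```python
import random


def randomization_test(group_a, group_b, n_resamples=5000, seed=0):
    """Two-sided Monte Carlo randomization test for a difference in means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return sum(a) / len(a) - sum(b) / len(b)

    observed = abs(mean_diff(pooled))
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # re-randomize the group labels
        # Small tolerance guards against floating-point ties.
        if abs(mean_diff(pooled)) >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_resamples + 1)


# Hypothetical small-sample data for two treatment arms.
treatment = [10.0, 11.0, 12.0, 13.0, 14.0]
control = [0.0, 1.0, 2.0, 3.0, 4.0]
p_value = randomization_test(treatment, control)
```

The p-value is the share of re-randomized label assignments whose mean difference is at least as extreme as the observed one, so the test relies on the randomization itself and not on a normality assumption.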