We study the asymptotics of the chi-square statistic under type II error. By the contraction principle, the large deviations and moderate deviations are obtained, and the rate function of the moderate deviations can be calculated explicitly and is a squared (quadratic) function.
Modeling non-coding background sequences appropriately is important for the detection of regulatory elements in DNA sequences. Based on the chi-square test, this paper explains why a higher-order Markov chain model should be chosen and how the proper order can be selected automatically. The chi-square test is first run on synthetic data sets to show that it can efficiently find the proper order of the Markov chain. Using the chi-square test, distinct higher-order context dependences are then found in ten sets of sequences of the yeast S. cerevisiae taken from other literature. A higher-order Markov chain is therefore more suitable for modeling the non-coding background sequences than an independent model.
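The order-selection idea above can be illustrated with a minimal sketch that tests order 0 (independent bases) against order 1 by applying a Pearson chi-square test to the table of adjacent-base counts. This is not the paper's procedure, which is not spelled out in the abstract; the function name `transition_chi_square` and the toy sequence are assumptions made for illustration.

```python
from collections import Counter

def transition_chi_square(seq, alphabet="ACGT"):
    """Pearson chi-square statistic for independence of adjacent symbols.

    Rows are the previous symbol, columns the next symbol. Under an
    order-0 (independent) model the statistic is approximately
    chi-square distributed with (rows - 1) * (cols - 1) degrees of freedom.
    """
    pairs = Counter(zip(seq, seq[1:]))  # counts of (previous, next) pairs
    row = {a: sum(pairs[(a, b)] for b in alphabet) for a in alphabet}
    col = {b: sum(pairs[(a, b)] for a in alphabet) for b in alphabet}
    n = sum(row.values())  # number of transitions within the alphabet
    stat = 0.0
    for a in alphabet:
        for b in alphabet:
            expected = row[a] * col[b] / n
            if expected > 0:  # skip cells in empty rows or columns
                stat += (pairs[(a, b)] - expected) ** 2 / expected
    rows_used = sum(1 for a in alphabet if row[a] > 0)
    cols_used = sum(1 for b in alphabet if col[b] > 0)
    return stat, (rows_used - 1) * (cols_used - 1)

# Hypothetical toy input: a strictly periodic sequence, which is
# strongly first-order dependent.
stat, dof = transition_chi_square("ACGT" * 50)
```

For 9 degrees of freedom the 0.05 critical value is about 16.92, so a statistic far above it suggests rejecting the independent model in favor of at least first-order dependence. Testing order k against order k+1 would use the same construction with k-mers as rows.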
It is well known that smart thermostats (STs) have become key devices in the implementation of smart homes; thus, they are considered primary elements for the control of electrical energy consumption in households. Energy consumption is drastically affected when end users select unsuitable STs or do not use them correctly. Furthermore, Mexico will face serious electrical energy challenges in the future that could be considerably alleviated if end users operated STs correctly. Hence, it is important to carry out an in-depth study of thermostats that focuses on the social aspects influencing their technological use and performance. This paper proposes the use of signal detection theory (SDT), fuzzy detection theory (FDT), and the chi-square (CS) test to understand the perceptions and beliefs of end users about the use of STs in Mexico. It reports these perceptions and beliefs for the selected thermostats, presents an in-depth discussion of the cognitive perceptions and beliefs of end users, and shows why end users' expectations of STs are not met. It also promotes the technological and social development of STs so that they are more readily accepted in complex electrical grids such as smart grids.
In large-sample studies where distributions may be skewed and not readily transformed to symmetry, it may be of greater interest to compare different distributions in terms of percentiles rather than means. For example, it may be more informative to compare two or more populations with respect to their within-population distributions by testing the hypothesis that their corresponding 10th, 50th, and 90th percentiles are equal. As a generalization of the median test, the proposed test statistic is asymptotically distributed as chi-square, with degrees of freedom dependent upon the number of percentiles tested and the constraints of the null hypothesis. Results from simulation studies are used to validate the nominal 0.05 significance level under the null hypothesis, as well as asymptotic power properties that are suitable for testing equality of percentile profiles against selected profile discrepancies for a variety of underlying distributions. A pragmatic example is provided to illustrate the comparison of the percentile profiles for four body mass index distributions.
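One simple way to picture a percentile-based generalization of the median test is sketched below: pool the samples, cut at the pooled 10th, 50th, and 90th percentiles, and apply a Pearson chi-square test to the groups-by-bins table. This is an assumption-laden illustration, not the authors' statistic (whose degrees of freedom depend on the constraints of the null hypothesis); the function name and the nearest-rank percentile rule are choices made here.

```python
import bisect

def percentile_profile_chi_square(groups, percentiles=(10, 50, 90)):
    """Pearson chi-square on counts of each group between pooled percentiles.

    `percentiles` are assumed to be integers in (0, 100). With a single
    percentile of 50 this reduces to a median-test-style 2-column table.
    """
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # Nearest-rank pooled percentiles, using integer arithmetic for the rank.
    cuts = [pooled[max(0, (p * n + 99) // 100 - 1)] for p in percentiles]
    table = []
    for g in groups:
        counts = [0] * (len(cuts) + 1)
        for x in g:
            # Bin index = number of cut points strictly below x.
            counts[bisect.bisect_left(cuts, x)] += 1
        table.append(counts)
    row = [sum(r) for r in table]
    col = [sum(r[j] for r in table) for j in range(len(cuts) + 1)]
    stat = 0.0
    for i, r in enumerate(table):
        for j, observed in enumerate(r):
            expected = row[i] * col[j] / n
            if expected > 0:  # skip empty bins
                stat += (observed - expected) ** 2 / expected
    cols_used = sum(1 for c in col if c > 0)
    return stat, (len(groups) - 1) * (cols_used - 1)

# Hypothetical toy data: one group shifted far to the right of the other.
stat, dof = percentile_profile_chi_square([list(range(1, 21)), list(range(21, 41))])
```

With two groups and three percentiles the table has 3 degrees of freedom, for which the 0.05 critical value is about 7.81. Identical groups give a statistic of zero, while the shifted toy groups above give a much larger value.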
Zero-inflated distributions are common in statistical problems where there is interest in testing homogeneity of two or more independent groups. Often, the underlying distribution that has an inflated number of zero-valued observations is asymmetric, and its functional form may not be known or easily characterized. In this case, comparisons of the groups in terms of their respective percentiles may be appropriate as these estimates are nonparametric and more robust to outliers and other irregularities. The median test is often used to compare distributions with similar but asymmetric shapes but may be uninformative when there are excess zeros or dissimilar shapes. For zero-inflated distributions, it is useful to compare the distributions with respect to their proportion of zeros, coupled with the comparison of percentile profiles for the observed non-zero values. A simple chi-square test for simultaneous testing of these two components is proposed, applicable to both continuous and discrete data. Results of simulation studies are reported to summarize empirical power under several scenarios. We give recommendations for the minimum sample size which is necessary to achieve suitable test performance in specific examples.
A new six-parameter continuous distribution called the Generalized Kumaraswamy Generalized Power Gompertz (GKGPG) distribution is proposed in this study, and graphical illustrations of its probability density function and cumulative distribution function are presented. The statistical features of the GKGPG distribution are systematically derived and studied. The model parameters are estimated by the method of maximum likelihood, both in the absence of censoring and under right censoring. The test statistic for right-censored data, the criteria test for the GKGPG distribution, the estimated matrices Ŵ, Ĉ, and Ĝ, and the criteria test Y_n^2, alongside the quadratic form of the test statistic, are derived. Mean simulated values of the maximum likelihood estimates and their corresponding mean squared errors are presented and confirmed to agree closely with the true parameter values. Simulated levels of significance for the Y_n^2(γ) test for the GKGPG model were recorded against their theoretical values. We conclude that the null hypothesis that the simulated samples are fitted by the GKGPG distribution is widely validated for the different levels of significance considered. From the summary of the results for a dataset on the strength of a specific type of braided cord, it is observed that the proposed GKGPG model fits the data set at a significance level ε = 0.05.
In this paper, a Singular Value Decomposition (SVD) formula and a modified chi-square solution are provided, and the modified chi-square is combined with an FT-IR instrument to control a biochemical reaction process. Using the modified chi-square technique, the unknown concentrations of reactants and products in test samples withdrawn from the process are determined. The technique avoids the need for the spectral data to conform to Beer's Law, and the best spectral range is determined automatically.
The application of frequency distribution statistics to data provides an objective means to assess the nature of the data distribution and the viability of numerical models that are used to visualize and interpret data. Two commonly used tools are kernel density estimation and the reduced chi-squared statistic used in combination with a weighted mean. Due to the wide applicability of these tools, we present a Java-based computer application called KDX to facilitate the visualization of data and the utilization of these numerical tools.
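The pairing of a weighted mean with the reduced chi-squared statistic can be sketched as follows, using the standard inverse-variance formulas. This is a Python illustration only and is not taken from KDX, which is a Java application; the function name and the sample numbers are assumptions.

```python
def weighted_mean_and_reduced_chi2(values, sigmas):
    """Inverse-variance weighted mean, its 1-sigma uncertainty, and reduced chi-squared.

    `values` are measurements and `sigmas` their 1-sigma uncertainties.
    The reduced chi-squared divides by n - 1 because one parameter
    (the mean) is estimated from the data.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    weight_sum = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / weight_sum
    mean_sigma = (1.0 / weight_sum) ** 0.5
    chi2 = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    return mean, mean_sigma, chi2 / (len(values) - 1)

# Hypothetical measurements with equal uncertainties.
mean, mean_sigma, reduced = weighted_mean_and_reduced_chi2([9.0, 10.0, 11.0], [1.0, 1.0, 1.0])
```

A reduced chi-squared near 1 indicates scatter consistent with the stated uncertainties; values well above 1 suggest excess scatter or underestimated uncertainties, and values well below 1 suggest overestimated uncertainties.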
Diabetes mellitus is a metabolic disease that is ranked among the top 10 causes of death by the World Health Organization. During the last few years, an alarming increase has been observed worldwide, with a 70% rise in the disease since 2000 and an 80% rise in male deaths. If untreated, it results in complications of many vital organs of the human body, which may lead to fatality. Early detection of diabetes is therefore of significant importance for starting timely treatment. This study introduces a methodology for the classification of diabetic and normal people using an ensemble machine learning model and feature fusion of chi-square and principal component analysis. An ensemble model, the logistic tree classifier (LTC), is proposed, which incorporates logistic regression and an extra tree classifier through a soft voting mechanism. Experiments are also performed using several well-known machine learning algorithms to analyze their performance, including logistic regression, extra tree classifier, AdaBoost, Gaussian naive Bayes, decision tree, random forest, and k-nearest neighbor. In addition, several experiments are carried out using principal component analysis (PCA) and chi-square (Chi-2) features to analyze the influence of feature selection on the performance of machine learning classifiers. Results indicate that Chi-2 features show higher performance than both PCA features and the original features. However, the highest accuracy is obtained when the proposed ensemble model LTC is used with the proposed feature fusion framework, which achieves an accuracy score of 0.85, the highest among the available approaches for diabetes prediction. In addition, a statistical t-test confirms the statistical significance of the proposed approach over other approaches.
This study introduces a new classifier tailored to address the limitations inherent in conventional classifiers such as K-nearest neighbor (KNN), random forest (RF), decision tree (DT), and support vector machine (SVM) for arrhythmia detection. The proposed classifier uses the chi-square distance as its primary metric, providing a specialized and original approach to precise arrhythmia detection. To optimize feature selection and refine the classifier's performance, particle swarm optimization (PSO) is integrated, with the chi-square distance as the fitness function. This integration enhances the classifier's capabilities, resulting in a substantial improvement in accuracy for arrhythmia detection. Experimental results demonstrate the efficacy of the proposed method, which achieves an accuracy of 98% with PSO, compared with 89% without any prior optimization. The classifier outperforms machine learning (ML) and deep learning (DL) techniques, underscoring its reliability in arrhythmia classification. The promising results make it an effective method to support both academic and medical communities, offering an advanced and precise solution for arrhythmia detection in electrocardiogram (ECG) data.
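To make the role of the chi-square distance concrete, here is a minimal nearest-neighbor sketch built on one common form of that distance for non-negative feature vectors. The abstract does not give the paper's exact definition or its PSO-based feature selection, so the 0.5 factor, the function names, and the toy feature vectors and labels are all assumptions.

```python
def chi_square_distance(x, y):
    """One common chi-square distance: 0.5 * sum((x_i - y_i)^2 / (x_i + y_i)).

    Assumes non-negative features; terms with x_i + y_i == 0 contribute zero.
    """
    total = 0.0
    for a, b in zip(x, y):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total

def predict_1nn(train, query):
    """Return the label of the training example closest to `query`.

    `train` is a list of (feature_vector, label) pairs.
    """
    _, label = min(train, key=lambda item: chi_square_distance(item[0], query))
    return label

# Hypothetical toy training set with two labeled feature vectors.
train = [([4.0, 1.0, 0.0], "normal"), ([0.0, 1.0, 4.0], "arrhythmia")]
label = predict_1nn(train, [3.0, 2.0, 0.0])
```

Because each squared difference is divided by the sum of the two feature values, differences in small-valued features weigh more than the same absolute difference in large-valued ones, which is the main contrast with Euclidean distance.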
X-ray pulsar-based navigation is a revolutionary technology capable of providing the required navigation information in the solar system. As an important pulsar-based navigation technique, the Significance Enhancement of Pulse-profile with Orbit-dynamics (SEPO) can estimate orbital elements by using the distortion of the pulse profile. Based on the SEPO technique, we propose a grouping bi-chi-squared technique and use observations from the Rossi X-ray Timing Explorer (RXTE) to carry out experimental verification. The pulsar timing is refined to determine the spin parameters of the Crab pulsar (PSR B0531+21) during the observation periods, and a short-term dynamic model for the RXTE satellite is established by considering the effects of diverse perturbations. Experimental results suggest that the position and velocity errors are 16.3 km (3σ) and 13.3 m/s (3σ) with two adjacent observations separated by one day (exposure time of 1.5 ks), outperforming the POLAR experiment, whose results are 19.2 km (3σ) and 432 m/s (3σ). The approach is particularly applicable to the estimation of orbital elements using high-flux observations.
We describe two new derivations of the chi-square distribution. The first derivation uses induction and requires only a single integral to calculate. The second derivation uses the Laplace transform and requires minimal assumptions. The new derivations are compared with established derivations, such as those by convolution, moment generating function, and Bayesian inference. Chi-square testing has seen many applications in physics and other fields. We describe a unique version of the chi-square test in which both the variance and the location are tested, and apply it to environmental data. The chi-square test is used to judge whether a laboratory method is capable of detecting gross alpha and beta radioactivity in drinking water for regulatory monitoring to protect the health of the population. A case of failure of the chi-square test and its amelioration are described. The chi-square test is compared to and supplemented by the t-test.
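For orientation, the established moment-generating-function route mentioned above can be summarized as follows. This is the standard textbook argument and not either of the paper's two new derivations; the symbols Z_i, X, and k are notation introduced here.

```latex
% Z_1, ..., Z_k independent standard normal; X = Z_1^2 + ... + Z_k^2
\begin{align*}
M_{Z^2}(t) &= \mathbb{E}\,e^{tZ^2}
  = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-(1-2t)z^2/2}\,dz
  = (1-2t)^{-1/2}, \qquad t < \tfrac{1}{2}, \\
M_X(t) &= \prod_{i=1}^{k} M_{Z_i^2}(t) = (1-2t)^{-k/2}, \\
f_k(x) &= \frac{x^{k/2-1}\, e^{-x/2}}{2^{k/2}\,\Gamma(k/2)}, \qquad x > 0.
\end{align*}
```

The last line follows because (1-2t)^{-k/2} is the moment generating function of a gamma distribution with shape k/2 and scale 2, whose density is the chi-square density with k degrees of freedom.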
The purpose of this study is to analyze the application of ultrasonic non-destructive testing technology in bridge engineering. Based on a literature review and an analysis of bridge inspection materials, the principle of ultrasonic non-destructive testing and its suitability for bridge engineering are elaborated. The entire ultrasonic non-destructive testing process is then studied, from the preparation work before inspection through to damage assessment, and a technical system of ultrasonic non-destructive testing that covers the entire process is formed for bridge engineering. It is hoped that this article can provide a technical reference for relevant units in China and promote the high-quality development of China's bridge engineering from a macro perspective.
Funding for the study of chi-square statistic asymptotics under type II error: the National Natural Science Foundation of China (10571139).
Funding for the diabetes classification study: supported by the Florida Center for Advanced Analytics and Data Science, funded by Ernesto.Net (under the Algorithms for Good Grant).
Funding for the X-ray pulsar-based navigation study: the National Natural Science Foundation of China (No. 62103313), the Space Optoelectronic Measurement & Perception Laboratory of BICE, China (No. LabSOMP-2020-06), and the Central Universities, China (No. JC2007).