Funding: The work of Jiayan Zhu is partially supported by seeding project funding (2019ZZX026), scientific research project funding for talent recruitment, and start-up funding for scientific research of Hubei University of Chinese Medicine. The work of Zhengbang Li is partially supported by the self-determined research funds of Central China Normal University from the colleges' basic research of MOE (CCNU18QN031).
Abstract: This article proposes a maximum test for a sequence of quadratic-form score test statistics in the logistic regression model, which can be applied in genetics and medicine. Theoretical properties of the maximum test are derived. Extensive simulation studies are conducted to assess the power robustness of the maximum test compared with two existing tests. We also apply the maximum test to a real dataset in an association analysis of multiple gene variables.
Abstract: The traditional method for creating a gene score to predict a given outcome is to use the most statistically significant single nucleotide polymorphisms (SNPs) among all SNPs tested. This approach has several disadvantages, such as excluding SNPs that do not have strong single effects when tested on their own but do have strong joint effects when tested together with other SNPs. The interpretation of results from the traditional gene score may also lack biological insight, since the functional unit of interest is often the gene, not the single SNP. In this paper we present a new gene scoring method that overcomes these problems by generating a gene score for each gene and a total gene score across all available genes. First, we calculate a gene score for each gene; second, we test the association between this gene score and the outcome of interest (i.e., the trait). Only gene scores that remain significantly associated with the outcome after multiple testing correction for the number of gene tests (not SNPs) enter the total gene score calculation. This method controls false positive results caused by multiple tests within genes and between genes separately, and has the advantage of identifying multi-locus genetic effects, compared with the Bonferroni correction, false discovery rate (FDR), and permutation tests applied to all SNPs. Another main feature of this method is that we select the SNPs that have different effects within a gene by adjustment in multiple regressions, and then combine the information from the selected SNPs within a gene to create a gene score. A simulation study has been conducted to evaluate the finite-sample performance of the proposed method.
Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 11271193 and 11571073, and the Natural Science Foundation of Jiangsu Province under Grant No. BK20141326.
Abstract: Count data with excess zeros, encountered in many applications, often exhibit extra variation. Therefore, the zero-inflated Poisson (ZIP) model may fail to fit such data. In this paper, a zero-inflated double Poisson (ZIDP) model, which is a generalization of the ZIP model, is studied, and score tests for the significance of dispersion and zero-inflation in the ZIDP model are developed. This work also develops homogeneity tests for the dispersion and/or zero-inflation parameter, and the corresponding score test statistics are obtained. A numerical example is given to illustrate the methodology, and the properties of the score test statistics are investigated through Monte Carlo simulations.
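The abstract above does not give the ZIDP statistics themselves. As a rough illustration of what a score test for zero-inflation looks like, the sketch below implements the simpler, well-known intercept-only case for the plain Poisson model (the score test usually attributed to van den Broek, 1995), not the ZIDP tests of the paper. The function name `zip_score_statistic` and the toy data are assumptions made for illustration only.

```python
import math


def zip_score_statistic(counts):
    """Score statistic for zero-inflation in an intercept-only Poisson model.

    This is the simpler Poisson (ZIP) special case, shown as a stand-in;
    it is not the ZIDP statistic developed in the paper.
    Under the null hypothesis of no zero-inflation, the statistic is
    asymptotically chi-square distributed with 1 degree of freedom.
    """
    n = len(counts)
    mean_count = sum(counts) / n          # Poisson MLE of the rate under the null
    p0 = math.exp(-mean_count)            # fitted probability of a zero
    n_zero = sum(1 for y in counts if y == 0)
    # Squared gap between observed and expected zeros ...
    numerator = (n_zero - n * p0) ** 2
    # ... scaled by its variance, corrected for estimating the rate.
    # Note: the denominator is zero when every count is zero.
    denominator = n * p0 * (1 - p0) - n * mean_count * p0 ** 2
    return numerator / denominator


# Hypothetical toy data: six zeros out of ten observations with mean 1.
toy_counts = [0, 0, 0, 0, 0, 0, 1, 2, 3, 4]
statistic = zip_score_statistic(toy_counts)
```

The appeal of the score test, here as in the paper, is that only the null model (ordinary Poisson) has to be fitted; the statistic is then compared with the chi-square critical value 3.84 at the 5% level.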
Abstract: Cardiovascular disease (CVD) is the leading cause of morbidity and mortality among patients with diabetes mellitus, who have a risk of cardiovascular mortality two to four times that of people without diabetes. An individualised approach to cardiovascular risk estimation and management is needed. Over the past decades, many risk scores have been developed to predict CVD. However, few have been externally validated in a diabetic population, and limited studies have examined the impact of applying a prediction model in clinical practice. Currently, guidelines are focused on testing for CVD in symptomatic patients. Atypical symptoms or silent ischemia are more common in the diabetic population, and with additional markers of vascular disease such as erectile dysfunction and autonomic neuropathy, these guidelines can be difficult to interpret. We propose an algorithm incorporating cardiovascular risk scores in combination with typical and atypical signs and symptoms to alert clinicians to consider further investigation with provocative testing. The modalities for investigation of CVD are discussed.
Abstract: To improve the fitting accuracy of college students' test scores, this paper proposes a two-component mixed generalized normal distribution, uses the maximum likelihood estimation method and the Expectation Conditional Maximization (ECM) algorithm to estimate the parameters and conduct numerical simulation, and performs a fitting analysis on the test scores of Linear Algebra and Advanced Mathematics at F University. The empirical results show that the two-component mixed generalized normal distribution fits college students' test data better than the commonly used two-component mixed normal distribution, and has good application value.
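To make the model in the abstract above concrete, here is a minimal sketch of a two-component mixed generalized normal density. It assumes the common parameterization with location `mu`, scale `alpha`, and shape `beta`; the paper may use a different one. The function names and the parameter values are hypothetical, and the ECM fitting step is not shown.

```python
import math


def generalized_normal_pdf(x, mu, alpha, beta):
    """Generalized normal density with location mu, scale alpha, shape beta.

    Assumed parameterization:
    f(x) = beta / (2 * alpha * Gamma(1 / beta)) * exp(-(|x - mu| / alpha) ** beta).
    beta = 2 recovers the normal distribution with sigma = alpha / sqrt(2).
    """
    coeff = beta / (2 * alpha * math.gamma(1 / beta))
    return coeff * math.exp(-((abs(x - mu) / alpha) ** beta))


def mixture_pdf(x, weight, comp1, comp2):
    """Two-component mixture: weight * f1(x) + (1 - weight) * f2(x).

    comp1 and comp2 are (mu, alpha, beta) tuples.
    """
    return (weight * generalized_normal_pdf(x, *comp1)
            + (1 - weight) * generalized_normal_pdf(x, *comp2))


# Hypothetical parameters: a lower-scoring and a higher-scoring group.
density_at_70 = mixture_pdf(70.0, 0.4, (60.0, 10.0, 1.5), (85.0, 6.0, 2.5))
```

The extra shape parameter `beta` lets each component have heavier or lighter tails than a normal component, which is the flexibility the abstract credits for the better fit.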
Abstract: Objective: To improve the accuracy of detecting fetal chromosomal aneuploidy by non-invasive prenatal testing (NIPT) using next-generation sequencing data from pregnant women's cell-free DNA. Methods: We proposed the multi-Z method, which uses 21 z-scores for each autosomal chromosome to detect aneuploidy of that chromosome, whereas the conventional NIPT method uses only one z-score. To do this, the mapped read numbers of a given chromosome were normalized by those of each of the other 21 chromosomes. The average and standard deviation (SD) used for calculating the z-scores of each sample were obtained from the normalized values between all autosomal chromosomes of control samples. In this way, multiple z-scores can be calculated against the 21 autosomal chromosomes other than the chromosome itself. Results: The multi-Z method showed 100% sensitivity and specificity for 187 samples sequenced to 3 M reads, while the conventional NIPT method showed 95.1% specificity. Similarly, for 216 samples sequenced to 1 M reads, the multi-Z method showed 100% sensitivity and 95.6% specificity, and the conventional NIPT method showed 75.1% specificity. Conclusion: The multi-Z method showed higher accuracy and more robust results than the conventional method, even at low read coverage.
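The normalization described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `multi_z_scores`, the dictionary layout, and the toy read counts are hypothetical, and the abstract does not say how the 21 z-scores are combined into a final call, so no decision rule is shown.

```python
from statistics import mean, stdev


def multi_z_scores(sample_counts, control_counts, target):
    """Return one z-score per reference chromosome for the target chromosome.

    sample_counts: dict mapping chromosome name -> mapped read count (test sample)
    control_counts: list of such dicts, one per control sample
    target: name of the chromosome being tested for aneuploidy
    """
    z_scores = {}
    for reference in sample_counts:
        if reference == target:
            continue  # a chromosome is never normalized by itself
        # Ratio target/reference in every control gives the null mean and SD.
        control_ratios = [c[target] / c[reference] for c in control_counts]
        mu = mean(control_ratios)
        sd = stdev(control_ratios)
        sample_ratio = sample_counts[target] / sample_counts[reference]
        z_scores[reference] = (sample_ratio - mu) / sd
    return z_scores


# Hypothetical toy data with three chromosomes instead of 22 autosomes.
controls = [
    {"chr1": 100, "chr2": 50, "chr21": 10},
    {"chr1": 100, "chr2": 50, "chr21": 12},
    {"chr1": 100, "chr2": 50, "chr21": 8},
]
sample = {"chr1": 100, "chr2": 50, "chr21": 16}
z = multi_z_scores(sample, controls, "chr21")
```

With all 22 autosomes in the input, the returned dictionary holds the 21 z-scores the abstract refers to, one per reference chromosome.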
Funding: The project was supported by NNSFC (19631040), NSSFC (04BTJ002), and the grant for post-doctoral fellows in SELF.
Abstract: This paper discusses two tests for varying dispersion of binomial data in the framework of nonlinear logistic models with random effects, which are widely used in analyzing longitudinal binomial data. The first is the individual test, with power calculation, for varying dispersion through testing the randomness of cluster effects, which extends Dean (1992) and Commenges et al. (1994). The second is the composite test for varying dispersion through simultaneously testing the randomness of cluster effects and the equality of random-effect means. The score test statistics are constructed and expressed in simple, easy-to-use matrix formulas. The authors illustrate their test methods using the insecticide data of Giltinan, Capizzi & Malani (1988).
Funding: Supported by the Natural Science Foundation of China (11401240, 11471135) and the self-determined research funds of CCNU from the colleges' basic research of MOE (CCNU15A05038, CCNU15ZD011).
Abstract: We propose the maximin efficiency robust test (MERT) for multiple nuisance parameters, building on the theory of the maximin efficiency robust test for a single nuisance parameter, and investigate some theoretical properties of this robust test. We further explore, in an intuitive way, some theoretical properties of the power of the MERT for multiple nuisance parameters in a specified scenario. We also present a meaningful example from statistical genetics to which the MERT for multiple nuisance parameters can be applied. Extensive simulation studies are conducted to assess the robustness of the MERT for multiple nuisance parameters.
Funding: Supported by SSFC (04BTJ002), the National Natural Science Foundation of China (10371016), and the Post-Doctoral Grant of Southeast University.
Abstract: This paper presents an approach for estimating the power of the score test, based on an asymptotic approximation to the power of the score test under contiguous alternatives. The method is applied to the problem of power calculations for the score test of heteroscedasticity in the European rabbit data (Ratkowsky, 1983). Simulation studies are presented which indicate that the asymptotic approximation to the finite-sample situation is good over a wide range of parameter configurations.
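The abstract does not give the approximation in detail. A standard result of this kind is that, under contiguous (local) alternatives, a score statistic with one degree of freedom is asymptotically noncentral chi-square, so its power can be computed from the normal distribution. The sketch below shows only that generic calculation; the function name and the assumption of a single tested parameter are illustrative, and the noncentrality value would have to come from the specific model, which is not reproduced here.

```python
from statistics import NormalDist


def score_test_power_1df(noncentrality, alpha=0.05):
    """Asymptotic power of a 1-degree-of-freedom score test.

    Assumes the statistic is asymptotically noncentral chi-square with
    1 degree of freedom and the given noncentrality parameter, i.e. it
    behaves like (Z + sqrt(noncentrality)) ** 2 with Z standard normal.
    """
    std = NormalDist()
    crit = std.inv_cdf(1 - alpha / 2)     # two-sided normal critical value
    shift = noncentrality ** 0.5
    # P(|Z + shift| > crit) = P(Z > crit - shift) + P(Z < -crit - shift)
    return (1 - std.cdf(crit - shift)) + std.cdf(-crit - shift)


# With zero noncentrality the power equals the significance level.
size = score_test_power_1df(0.0)
# A noncentrality near 7.85 is the usual value for about 80% power at alpha = 0.05.
power = score_test_power_1df(7.849)
```

Simulation studies such as those mentioned in the abstract check how closely this kind of asymptotic value matches the rejection rate observed in finite samples.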
Abstract: Chaos theory has taught us that a system which has both nonlinearity and random input will most likely produce irregular data. If random errors are irregular data, then the random error process will give rise to nonlinearity (Kantz and Schreiber (1997)). Tsai (1986) introduced a composite test for autocorrelation and heteroscedasticity in linear models with AR(1) errors. Liu (2003) introduced a composite test for correlation and heteroscedasticity in nonlinear models with DBL(p, 0, 1) errors. Therefore, the important problems in regression models are the detection of bilinearity, correlation, and heteroscedasticity. In this article, the authors discuss the more general case of nonlinear models with DBL(p, q, 1) random errors using the score test. Several statistics for testing bilinearity, correlation, and heteroscedasticity are obtained and expressed in simple matrix formulas. The results for regression models with linear errors are extended to those with bilinear errors. A simulation study is carried out to investigate the powers of the test statistics. All results of this article extend and develop the results of Tsai (1986), Wei et al. (1995), and Liu et al. (2003).