Abstract: DNA microarray technology is a highly effective technique for studying gene expression patterns in cells; its main current challenge is how to analyze the large volume of gene expression data it generates. To address this, the paper uses a mixed-effects model to analyze gene expression data. For data selection, 1176 genes were chosen from a white mouse gene expression dataset covering two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After the gene chip data were preprocessed, they were imported into the model, preliminary results were calculated, and permutation tests were performed, with GSEA used to biologically validate the preliminary results. The final dataset consists of 20 groups of gene expression data from pneumococcal infection; functionally related genes are categorized by the similarity of their expression profiles, which facilitates the study of genes with unknown functions.
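The permutation-testing step mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's mixed-effects model or GSEA pipeline: it tests a single gene by comparing mean expression between two conditions, and the function name, the expression values, and the number of permutations are all hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference in means.

    Hypothetical helper for illustration; labels are shuffled n_perm times
    and the p-value uses the usual +1 correction.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance so floating-point rounding does not hide ties.
        if diff >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Made-up log-expression values for one gene under the two conditions.
infected = [8.1, 7.9, 8.4, 8.0, 8.3]
control = [6.2, 6.0, 6.5, 6.1, 6.4]
p_value = permutation_p_value(infected, control)
print(f"permutation p-value: {p_value:.4f}")
```

In a real analysis the same shuffling would be applied to the model's test statistic for each of the genes, followed by a multiple-testing correction.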
Abstract: MODIS (Moderate Resolution Imaging Spectroradiometer) is a key instrument aboard the Terra (EOS AM) and Aqua (EOS PM) satellites. Linear spectral mixture models are applied to MODIS data for the sub-pixel classification of land cover. Shaoxing County in Zhejiang Province, China, was chosen as the study site and early rice as the study crop. The land-cover proportions derived from MODIS pixels using linear spectral mixture models were compared with an unsupervised classification derived from TM data acquired on the same day. The comparison implies that MODIS data could be used as a satellite data source for estimating rice cultivation area, and possibly for rice growth monitoring and yield forecasting, at the regional scale.
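A linear spectral mixture model treats each pixel's reflectance as a weighted sum of pure "endmember" spectra, with the weights read as sub-pixel cover fractions. The sketch below covers only the simplest case, two endmembers with fractions that sum to one, solved by least squares in closed form. The endmember spectra and band values are invented for illustration and are not taken from the paper.

```python
def unmix_two_endmembers(pixel, endmember_a, endmember_b):
    """Least-squares cover fractions for pixel = f*a + (1 - f)*b.

    Hypothetical helper: returns (fraction_a, fraction_b), with the
    fraction clipped to the physically meaningful range [0, 1].
    """
    diff = [a - b for a, b in zip(endmember_a, endmember_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("endmember spectra must differ")
    # Project (pixel - b) onto (a - b) to get the least-squares fraction.
    frac_a = sum((p - b) * d for p, b, d in zip(pixel, endmember_b, diff)) / denom
    frac_a = min(1.0, max(0.0, frac_a))
    return frac_a, 1.0 - frac_a


# Invented three-band reflectance spectra (assumptions, not MODIS values).
rice = [0.05, 0.08, 0.40]
other = [0.10, 0.15, 0.20]
mixed_pixel = [0.3 * r + 0.7 * o for r, o in zip(rice, other)]

rice_fraction, other_fraction = unmix_two_endmembers(mixed_pixel, rice, other)
print(f"rice fraction: {rice_fraction:.2f}, other: {other_fraction:.2f}")
```

With more than two land-cover classes the same idea becomes a constrained least-squares problem per pixel, which is usually handed to a numerical solver.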
Funding: The authors acknowledge the support of the Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) (SML2020SP007). The paper is supported by the National Natural Science Foundation of China (Nos. 61772280 and 62072249).
Abstract: Outlier detection is a key research area in data mining because it identifies data that are inconsistent with the rest of a data set. It aims to find abnormal data within a large data set and has been applied in many fields, including fraud detection, network intrusion detection, disaster prediction, medical diagnosis, public security, and image processing. Although outlier detection is widely used in real systems, its effectiveness is challenged by high dimensionality and redundant data attributes, which lead to detection errors and complicated calculations. The prevalence of mixed data is a further open issue for outlier detection algorithms. This paper studies an outlier detection method for mixed data based on neighborhood combinatorial entropy, which improves detection performance by reducing data dimensionality with an attribute reduction algorithm. The significance of each attribute is determined, and the less influential attributes are removed based on neighborhood combinatorial entropy. Outlier detection is then carried out with the local outlier factor algorithm. The proposed method can be applied effectively to numerical and mixed multidimensional data. The experimental part of the paper compares outlier detection before and after attribute reduction, and the comparative analysis reports the improvement in detection accuracy obtained by removing the less influential attributes in numerical and mixed multidimensional data.
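The local outlier factor (LOF) step named in this abstract can be sketched in plain Python. This is a simplified illustration on numerical data only: it assumes no duplicate points, takes exactly k neighbors without special handling of distance ties, and omits the paper's entropy-based attribute reduction. The function name and sample points are hypothetical.

```python
import math


def local_outlier_factors(points, k):
    """Return the LOF score of every point (scores near 1 indicate inliers)."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    def local_reachability_density(i):
        # reach-dist(i, j) = max(k-distance(j), d(i, j))
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        return len(reach) / sum(reach)

    density = [local_reachability_density(i) for i in range(n)]
    # LOF = average neighbor density divided by the point's own density.
    return [
        sum(density[j] for j in neighbors[i]) / (k * density[i])
        for i in range(n)
    ]


# Five evenly spaced points and one isolated point (made-up data).
sample = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (20, 0)]
scores = local_outlier_factors(sample, k=2)
most_anomalous = max(range(len(sample)), key=lambda i: scores[i])
print(sample[most_anomalous], round(scores[most_anomalous], 2))
```

Attribute reduction would run before this step, so that the distances are computed only over the attributes judged significant.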
Abstract: This paper studies the nonlinear mixed problem for a class of symmetric hyperbolic systems in higher-dimensional spaces, with discontinuous data and a boundary condition satisfying the dissipative condition. It establishes a local existence theorem by the method of a priori estimates and obtains the structure of the singularities of the solutions of such problems.
Funding: The Natural Science Foundation of China (10371042, 10671038)
Abstract: In this article, robust generalized estimating equations are used to analyze partial linear mixed models for longitudinal data. The authors approximate the nonparametric function by a regression spline. Under some regularity conditions, the asymptotic properties of the estimators are obtained. To avoid computing high-dimensional integrals, a robust Monte Carlo Newton-Raphson algorithm is used. Simulations are carried out to study the performance, robustness, and efficiency of the proposed robust estimators. Finally, two real longitudinal data sets are analyzed.
Funding: Funded by the National College Students' Innovation and Entrepreneurship Training Program (No. 202410456025) and supported by the China Center of the Serbian Academy of Sciences and Arts and the Hong Kong Institute of Humanities and Natural Sciences and Technology.
Abstract: Driven by both market demand and policy, the drone insurance industry is facing new development opportunities. This study explores a hybrid data integration method that combines public datasets on drones and small manned aircraft, applies severity scaling, and uses simulation tests to ensure that the method is reproducible. A two-part hybrid model separates the frequency model from the severity model, and a hierarchical modeling method is used for each part to handle extreme losses. Monte Carlo simulation is performed on the fused data to calculate the net premium. A no-claim discount system is introduced, and the impact of operators' behavior on claim frequency is quantified, with comprehensive consideration given to the inclusion and quantification of risk factors. A Tweedie GLM for total loss modeling is constructed and analyzed, and the advantages and disadvantages of the different modeling methods are compared, with the aim of giving insurance companies a more comprehensive basis for decision-making. The report sets out to construct and evaluate a robust actuarial rate-making model for the rapidly developing drone insurance market and to support more accurate, fair, and market-competitive drone insurance products.
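The frequency-severity split and the Monte Carlo net premium described above can be sketched as follows. The distributional choices are assumptions made for illustration: Poisson claim counts and lognormal claim sizes with invented parameters. The abstract does not state which distributions the study uses, and the hierarchical layers, no-claim discount, and Tweedie GLM are not modeled here.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw a Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_net_premium(lam, mu, sigma, n_sims=20000, seed=0):
    """Average simulated annual loss per policy (the net, or pure, premium).

    lam is the assumed expected number of claims per policy-year; mu and
    sigma are the assumed lognormal parameters of a single claim amount.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        n_claims = sample_poisson(rng, lam)
        total += sum(rng.lognormvariate(mu, sigma) for _ in range(n_claims))
    return total / n_sims


# Invented parameters: 0.1 claims per year, median claim of exp(7) currency units.
premium = simulate_net_premium(lam=0.1, mu=7.0, sigma=0.5)
analytic = 0.1 * math.exp(7.0 + 0.5 ** 2 / 2)  # E[N] * E[X] for this toy model
print(f"simulated: {premium:.1f}, analytic: {analytic:.1f}")
```

For this toy model the simulated value can be checked against the closed form E[N]·E[X]. Simulation becomes useful once discounts, deductibles, or heavy-tailed layers remove that closed form.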
Funding: Supported by the Commission of Science, Technology and Industry for National Defense (No. C192005C001)
Abstract: The data envelopment analysis (DEA) model is widely used to evaluate the relative efficiency of producers. It is an objective decision method based on multiple indexes. However, the two basic models most frequently used at present, the C²R model and the C²GS² model, have limitations when used alone, so their evaluations are often unsatisfactory. To solve this problem, a mixed DEA model is built and used to evaluate the business efficiency of listed companies. The paper explains how to use the mixed DEA model and verifies its feasibility.
Abstract: For left-truncated and right-censored dependent data, estimators of the higher derivatives of the density function and the hazard rate function are constructed by the kernel smoothing method. When the observed data exhibit α-mixing dependence, local properties including strong consistency and the law of the iterated logarithm are presented. Moreover, when the mode estimator is defined as the random variable that maximizes the kernel density estimator, the asymptotic normality of the mode estimator is established.
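The mode estimator described here, the maximizer of a kernel density estimate, can be illustrated with a minimal sketch for fully observed data. It assumes independent, complete observations, a Gaussian kernel, and a fixed bandwidth; it does not implement the truncation and censoring adjustments or the α-mixing setting treated in the paper. The data and bandwidth are made up.

```python
import math


def gaussian_kde(x, data, bandwidth):
    """Gaussian kernel density estimate at x."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / norm


def kde_mode(data, bandwidth, grid_size=1001):
    """Approximate the mode as the grid point that maximizes the KDE."""
    lo, hi = min(data), max(data)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    return max(grid, key=lambda x: gaussian_kde(x, data, bandwidth))


# Made-up sample, symmetric around 5.
observations = [4.0, 4.5, 5.0, 5.0, 5.5, 6.0]
mode_estimate = kde_mode(observations, bandwidth=0.5)
print(f"estimated mode: {mode_estimate:.3f}")
```

A grid search keeps the example short; a derivative-based optimizer started from the best grid point would refine the estimate.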