In recent years, functional data have been widely used in finance, medicine, biology, and other fields. Existing clustering methods can solve problems in finite-dimensional spaces, but they are difficult to apply directly to functional data. In this paper, we propose a new unsupervised clustering algorithm based on adaptive weights. Without requiring initialization parameters, we use entropy-type penalty terms and a fuzzy partition matrix to find the optimal number of clusters. At the same time, we introduce a measure based on adaptive weights to reflect the difference in information content between different clustering metrics. Simulation experiments show that the proposed algorithm achieves higher purity than some existing algorithms.
Human life would be impossible without clean air. Consistent advancements in practically every aspect of contemporary human life have harmed air quality. Everyday industrial, transportation, and household activities release dangerous contaminants into our surroundings. This study investigated two years' worth of air quality data from two Indian cities, with a focus on outlier detection. Studies on air pollution have used numerous methodologies, typically treating the various gases as a vector whose components are the gas concentration values for each observation performed. In our technique, we use curves to represent the monthly average of daily gas emissions. The approach, which is based on functional depth, was used to find outliers in the gas emissions of Delhi and Kolkata, and the outcomes were compared with those from the traditional method. In the evaluation and comparison of these models' performances, the functional approach performed well.
In this paper, we consider the clustering of bivariate functional data where each random surface consists of a set of curves recorded repeatedly for each subject. A k-centres surface clustering method based on marginal functional principal component analysis is proposed for bivariate functional data, and a novel clustering criterion is presented in which both the random surface and its partial derivatives in two directions are considered. In addition, we consider two other clustering methods: k-centres surface clustering based on product functional principal component analysis and on double functional principal component analysis. Simulation results indicate that the proposed methods perform well in terms of both the correct classification rate and the adjusted Rand index. The approaches are further illustrated through an empirical analysis of human mortality data.
Objective: Humans are exposed to complex mixtures of environmental chemicals and other factors that can affect their health. Analysis of these mixture exposures presents several key challenges for environmental epidemiology and risk assessment, including high dimensionality, correlated exposures, and subtle individual effects. Methods: We proposed a novel statistical approach, the generalized functional linear model (GFLM), to analyze the health effects of exposure mixtures. The GFLM treats the effect of mixture exposures as a smooth function by reordering exposures based on specific mechanisms and capturing internal correlations to provide meaningful estimation and interpretation. Its robustness and efficiency were evaluated under various scenarios through extensive simulation studies. Results: We applied the GFLM to two datasets from the National Health and Nutrition Examination Survey (NHANES). In the first application, we examined the effects of 37 nutrients on BMI (2011–2016 cycles). The GFLM identified a significant mixture effect, with fiber and fat emerging as the nutrients with the greatest negative and positive effects on BMI, respectively. In the second application, we investigated the association between four per- and polyfluoroalkyl substances (PFAS) and gout risk (2007–2018 cycles). Unlike traditional methods, the GFLM indicated no significant association, demonstrating its robustness to multicollinearity. Conclusion: The GFLM framework is a powerful tool for mixture exposure analysis, offering improved handling of correlated exposures and interpretable results. It demonstrates robust performance across various scenarios and real-world applications, advancing our understanding of complex environmental exposures and their health impacts in environmental epidemiology and toxicology.
We consider semiparametric partially linear regression models with mean function X^Tβ + g(z), where X and z are functional data. New estimators of β and g(z) are presented and some asymptotic results are given. The strong convergence rates of the proposed estimators are obtained. In our estimation, the number of observations per subject is completely flexible. A simulation study is conducted to investigate the finite-sample performance of the proposed estimators.
We propose a new functional single-index model, called the dynamic single-index model for functional data (DSIM), to efficiently model non-linear and dynamic relationships between a functional predictor and a functional response. The proposed model naturally allows for some curvature not captured by the ordinary functional linear model. Using the proposed two-step estimating algorithm, we develop estimates for both the link function and the regression coefficient function, and then provide predictions of new response trajectories. Besides the asymptotic properties of the estimates of the unknown functions, we also establish the consistency of the predictions of new response trajectories under mild conditions. Finally, we show through extensive simulation studies and a real data example that the proposed DSIM can substantially outperform existing functional regression methods in most settings.
We propose a methodology for testing two-sample means in high-dimensional functional data that requires no decaying pattern on the eigenvalues of the functional data. To the best of our knowledge, we are the first to consider and address such a problem. Specifically, we devise a confidence region for the mean curve difference between two samples, which directly establishes a rigorous inferential procedure based on the multiplier bootstrap. In addition, the proposed test permits the functional observations in each sample to have mutually different distributions and arbitrary correlation structures. This distribution- and correlation-free property is desirable but leads to a more challenging scenario for theoretical development. Other desirable properties include allowance for highly unequal sample sizes, a data dimension that grows exponentially in the sample sizes, and consistent power behavior under fairly general alternatives. The proposed test is shown to converge uniformly to the prescribed significance level, and its finite-sample performance is evaluated via a simulation study and an application to electroencephalography data.
We propose a two-sample test for the mean functions of functional data when the number of bases is much larger than the sample size. The novel test is based on U-statistics, which avoids the need to estimate the covariance operator accurately in the high-dimensional situation. We further prove the asymptotic normality of our test statistic under both the null hypothesis and a local alternative hypothesis. An extensive simulation study shows that the proposed test works well in comparison with several other methods in the high-dimensional situation. An application to a data set of egg-laying trajectories of Mediterranean fruit flies demonstrates the applicability of the method.
Background: The accurate estimation of temporal patterns of influenza may help in utilizing hospital resources and guiding influenza surveillance. This paper proposes functional data analysis (FDA) to improve the prediction of temporal patterns of influenza. Methods: We illustrate FDA methods using weekly influenza-like illness (ILI) activity level data from the U.S. We propose to use Fourier basis functions to transform the discrete weekly data into smoothed functional ILI activities. Functional analysis of variance (FANOVA) is used to examine regional differences in temporal patterns and the impact of states' political orientation. Results: ILI activity has a very distinct peak at the beginning and end of the year. There are significant differences in the average level of ILI activity among geographic regions. However, the temporal patterns in terms of the peak and flat times are quite consistent across regions. The geographic and temporal patterns of ILI activity also depend on the political make-up of states. The states affiliated with Republicans had higher ILI activity than those affiliated with Democrats across the whole year. The influence of political party affiliation on the temporal pattern differs considerably among geographic regions. Conclusions: Functional data analysis can help reveal the temporal variability in average ILI levels, the rate of change in ILI levels, and the effect of geographical regions. Consideration should be given to wider application of FDA to generate more accurate estimates in public health and biomedical research.
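The Fourier-basis smoothing step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit without a roughness penalty, which may differ from the authors' procedure; the function names `fourier_design` and `fourier_smooth`, the number of harmonics, and the synthetic weekly series are all hypothetical and are not ILI data.

```python
import numpy as np

def fourier_design(t, period, n_harmonics):
    """Design matrix: a constant column plus sine/cosine pairs."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period
        cols.append(np.sin(w * t))
        cols.append(np.cos(w * t))
    return np.column_stack(cols)

def fourier_smooth(t, y, period, n_harmonics):
    """Least-squares Fourier fit; returns coefficients and fitted values."""
    B = fourier_design(t, period, n_harmonics)
    coef, *_ = np.linalg.lstsq(B, np.asarray(y, dtype=float), rcond=None)
    return coef, B @ coef

# Synthetic weekly series with one annual cycle (assumption, for illustration).
rng = np.random.default_rng(0)
weeks = np.arange(52)
signal = 3.0 + 2.0 * np.cos(2.0 * np.pi * weeks / 52.0)
observed = signal + rng.normal(scale=0.3, size=weeks.size)

coef, smooth = fourier_smooth(weeks, observed, period=52.0, n_harmonics=3)
```

Each weekly series is reduced to a short coefficient vector, and the fitted curve can be evaluated at any time point, which is what makes downstream functional methods such as FANOVA applicable.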
This paper deals with the conditional density estimator of a real response variable given a functional random variable (i.e., one that takes values in an infinite-dimensional space). Specifically, we focus on the functional index model, an approach that represents a good compromise between nonparametric and parametric models. Then, under general conditions and when the variables are independent, we give the quadratic error and asymptotic normality of the local linear estimator based on the single-index structure. Finally, we complement these theoretical advances with simulation studies showing both the practical performance of the local linear method and the good finite-sample behaviour of the estimator and of the Monte Carlo methods used to create functional pseudo-confidence areas.
Chlorophyll-a (Chl-a) concentration is a primary indicator for marine environmental monitoring. The spatio-temporal variations of sea surface Chl-a concentration in the Yellow Sea (YS) and the East China Sea (ECS) from 2001 to 2020 were investigated by reconstructing the MODIS Level 3 products with the data interpolating empirical orthogonal functions (DINEOF) method. The results reconstructed by interpolating the combined MODIS daily + 8-day datasets were found to be better than those obtained by interpolating daily or 8-day data alone. Chl-a concentration in the YS and the ECS reached its maximum in spring, when blooms occurred, decreased in summer and autumn, and increased in late autumn and early winter. By performing empirical orthogonal function (EOF) decomposition of the reconstructed data fields and correlation analysis with several potential environmental factors, we found that sea surface temperature (SST) plays a significant role in the seasonal variation of Chl-a, especially during spring and summer. The increase of SST in spring and the upper-layer nutrients mixed up during the preceding winter might favor the occurrence of spring blooms. High SST throughout the summer would strengthen the vertical stratification and prevent nutrient supply from deep water, resulting in low surface Chl-a concentrations. The sea surface Chl-a concentration in the YS was found to have decreased significantly from 2012 to 2020, which was possibly related to the Pacific Decadal Oscillation (PDO).
As data become increasingly complex, measuring dependence among variables is of great interest. However, most existing measures of dependence are limited to the Euclidean setting and cannot effectively characterize complex relationships. In this paper, we propose a novel method for constructing independence tests for random elements in Hilbert spaces, which include functional data as a special case. Our approach uses the distance covariance of random projections to build a test statistic that is computationally efficient and exhibits strong power performance. We prove the equivalence between testing for independence expressed on the original and the projected covariates, bridging the gap between measures for testing independence in Euclidean spaces and in Hilbert spaces. Implementation of the test involves calibration by permutation and combining several p-values from different projections using the false discovery rate method. Simulation studies and real data examples illustrate the finite-sample properties of the proposed method under a variety of scenarios.
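The core idea in the abstract above, distance covariance computed on random one-dimensional projections and calibrated by permutation, can be sketched as follows. This is an illustrative sketch only: it uses a single pair of Gaussian random directions on curves discretized on a grid, omits the false-discovery-rate combination over several projections, and every name (`centered_dist`, `dcov2`, `projection_pvalue`) and the toy data are assumptions rather than the authors' implementation.

```python
import numpy as np

def centered_dist(v):
    """Double-centered matrix of pairwise distances |v_i - v_j|."""
    d = np.abs(v[:, None] - v[None, :])
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())

def dcov2(u, v):
    """Squared sample distance covariance (V-statistic form)."""
    return float(np.mean(centered_dist(u) * centered_dist(v)))

def projection_pvalue(X, Y, rng, n_perm=99):
    """Permutation p-value from one pair of random unit directions."""
    a = rng.normal(size=X.shape[1])
    b = rng.normal(size=Y.shape[1])
    u = X @ (a / np.linalg.norm(a))
    v = Y @ (b / np.linalg.norm(b))
    A, B = centered_dist(u), centered_dist(v)
    stat = np.mean(A * B)
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(len(v))
        # Permuting rows and columns of B together re-pairs the samples.
        if np.mean(A * B[np.ix_(p, p)]) >= stat:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)

# Hypothetical toy curves on a grid: Y is fully determined by X.
rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 20)
scores = rng.normal(size=40)
X = np.outer(scores, np.sin(np.pi * grid))
Y = X.copy()
p_value = projection_pvalue(X, Y, rng)
```

Because each projection is univariate, the distance matrices need only absolute differences, which is where the computational saving over working with the full curves comes from.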
It is well known that nonparametric estimation of the regression function is highly sensitive to the presence of even a small proportion of outliers in the data. To handle atypical observations when the covariates of the nonparametric component are functional, robust estimates of the regression parameter and regression operator are introduced. The main purpose of the paper is to consider data-driven methods for selecting the number of neighbors in order to make the proposed procedures fully automatic. We use the k Nearest Neighbors (kNN) procedure to construct the kernel estimator of the proposed robust model. Under some regularity conditions, we state consistency results for kNN functional estimators that are uniform in the number of neighbors (UINN). Furthermore, a simulation study and an empirical application to real data on gasoline octane prediction are carried out to illustrate the higher predictive performance and the usefulness of the kNN approach.
Fuzzy clustering theory is widely used in data mining for full-face tunnel boring machines (TBMs). However, traditional fuzzy clustering algorithms based on an objective function have difficulty clustering functional data effectively. We propose a new fuzzy clustering algorithm, the FCM-ANN algorithm. The algorithm replaces the clustering prototype of the FCM algorithm with the predicted value of an artificial neural network. As a result, the algorithm not only supports clustering based on the traditional similarity criterion but can also cluster functional data effectively. In this paper, we first use the t-test as an evaluation index and apply the FCM-ANN algorithm to synthetic datasets for validity testing. The algorithm is then applied to TBM operation data and combined with cross-validation to predict the tunneling speed. The predicted results are evaluated by RMSE and R^2. From the experimental results on the synthetic datasets, we obtain the relationship among the membership threshold, the number of samples, the number of attributes, and the noise, so that the datasets can be adjusted effectively. Applying the FCM-ANN algorithm to the TBM operation data yields accurate predictions of the tunneling speed. The FCM-ANN algorithm improves on the traditional fuzzy clustering algorithm and can be used not only to predict the tunneling speed of TBMs but also for clustering or prediction of other functional data.
The estimation of conditional extreme quantiles in incomplete data frameworks is of growing interest. In particular, the estimation of the extreme value index under censoring has been the subject of many investigations when finite-dimensional covariate information is considered. In this paper, the estimation of the conditional extreme quantile of a heavy-tailed distribution is discussed when some functional random covariate information (i.e., valued in some infinite-dimensional space) is available and the scalar response variable is right-censored. A Weissman-type estimator of conditional extreme quantiles is proposed and its asymptotic normality is established under mild assumptions. A simulation study is conducted to assess the finite-sample behavior of the proposed estimator, and a comparison with two simple estimation strategies is provided.
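For orientation, the classical Weissman extrapolation on which "Weissman-type" estimators are modelled can be written as below. This is the standard unconditional, uncensored form with the Hill estimator as the tail index estimate, stated here as background; it is not the conditional, censoring-adjusted estimator proposed in the paper. The notation is an assumption: $X_{1,n} \le \dots \le X_{n,n}$ are the order statistics of a sample of size $n$, $k$ is the number of upper order statistics used, and $\hat{q}(\alpha)$ estimates the quantile exceeded with small probability $\alpha$.

```latex
\hat{\gamma}_k = \frac{1}{k}\sum_{i=1}^{k}\log X_{n-i+1,n} - \log X_{n-k,n},
\qquad
\hat{q}(\alpha) = X_{n-k,n}\left(\frac{k}{n\alpha}\right)^{\hat{\gamma}_k}.
```

The anchor $X_{n-k,n}$ is an intermediate quantile that can be estimated empirically, and the factor $(k/(n\alpha))^{\hat{\gamma}_k}$ extrapolates beyond the sample range using the heavy-tail assumption.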
For Hermite-Birkhoff interpolation of scattered multidimensional data by radial basis function (?), existence and characterization theorems and a variational principle are proved. Examples include (?)(r)=r^b, Duchon's thin-plate splines, Hardy's multiquadrics, and inverse multiquadrics.
This paper studies the problem of robust H∞ control of piecewise-linear chaotic systems with random data loss. The communication links between the plant and the controller are assumed to be imperfect (that is, data loss occurs intermittently, which appears typically in a network environment). The data loss is modelled as a random process which obeys a Bernoulli distribution. In the face of random data loss, a piecewise controller is designed to robustly stabilize the networked system in the sense of mean square and also achieve a prescribed H∞ disturbance attenuation performance based on a piecewise-quadratic Lyapunov function. The required H∞ controllers can be designed by solving a set of linear matrix inequalities (LMIs). Chua's system is provided to illustrate the usefulness and applicability of the developed theoretical results.
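The Bernoulli data-loss model and the notion of mean-square stability in the abstract above can be illustrated on a toy problem. The sketch below assumes a scalar linear plant x[k+1] = a·x[k] + b·γ[k]·u[k] with state feedback u[k] = −K·x[k] and arrival indicator γ[k] ~ Bernoulli(p); it is far simpler than the piecewise-linear chaotic systems and LMI-based H∞ design of the paper, and all names and numerical values are hypothetical.

```python
import numpy as np

def mean_square_factor(a, b, K, p):
    """One-step growth factor of E[x^2] under Bernoulli packet arrival.

    With probability p the control arrives and the closed loop is (a - b*K);
    with probability 1 - p it is lost and the open loop is a.
    """
    return p * (a - b * K) ** 2 + (1.0 - p) * a ** 2

def simulate(a, b, K, p, steps, rng, x0=1.0):
    """Simulate one trajectory with random control-packet loss."""
    x = x0
    for _ in range(steps):
        arrived = rng.random() < p       # Bernoulli(p) arrival indicator
        u = -K * x if arrived else 0.0   # lost packet: no control applied
        x = a * x + b * u
    return x

# Hypothetical unstable plant (a > 1) with a stabilizing gain.
a, b, K = 1.2, 1.0, 1.0
reliable = mean_square_factor(a, b, K, p=0.9)   # below 1: mean-square stable
lossy = mean_square_factor(a, b, K, p=0.2)      # above 1: mean-square unstable
x_end = simulate(a, b, K, p=0.9, steps=200, rng=np.random.default_rng(2))
```

For this scalar case, E[x²] is multiplied by the factor at each step, so mean-square stability holds exactly when the factor is below one; the matrix-valued analogue of this condition is what LMI formulations encode.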
The classification of functional data has drawn much attention in recent years. The main challenge is representing infinite-dimensional functional data by finite-dimensional features while utilizing those features to achieve better classification accuracy. In this paper, we propose a mean-variance-based (MV) feature weighting method for classifying functional data or functional curves. In the feature extraction stage, each sample curve is approximated by B-splines so that the features are carried by the coefficients of the spline basis. After that, a feature weighting approach based on statistical principles is introduced by jointly considering the between-class differences and within-class variations of the coefficients. We also introduce a scaling parameter to adjust the gap between the weights of features. The new feature weighting approach can adaptively enhance noteworthy local features while mitigating the impact of confusing features. Algorithms for feature-weighted K-nearest neighbor and support vector machine classifiers are both provided. Moreover, the new approach can be readily integrated into existing functional data classifiers, such as the generalized functional linear model and functional linear discriminant analysis, resulting in more accurate classification. The performance of the mean-variance-based classifiers is evaluated by simulation studies and real data. The results show that the new feature weighting approach significantly improves the classification accuracy for complex functional data.
For functional data, the most popular dimension reduction methods are functional sliced inverse regression (FSIR) and functional sliced average variance estimation (FSAVE). Both FSIR and FSAVE are based on the slicing approach to estimating the conditional expectation E[x(t)|y]. While slice-based methods are effective for scalar responses, they often perform poorly or even fail for multivariate responses and small sample sizes owing to the so-called "curse of dimensionality". To avoid this problem, this study proposes a projective resampling method that first projects the multivariate response onto a scalar response and then uses an SDR method for the univariate response to estimate the effective dimension reduction space (e.d.r. space). The proposed projective resampling method is insensitive to the number of slices and to the dimensionality of the response variable. In theory, the proposed resampling method can fully recover the effective dimension reduction space. Furthermore, this study investigates the performance of the proposed method through simulation studies and one real data analysis and compares it with other methods.
Funding (k-centres surface clustering of bivariate functional data): Supported by the National Natural Science Foundation of China (Grant No. 12261007) and the Natural Science Foundation of Guangxi Province (Grant No. 2020GXNSFAA297225).
Funding (generalized functional linear model for exposure mixtures): Supported in part by the Young Scientists Fund of the National Natural Science Foundation of China (Grant Nos. 82304253 and 82273709), the Foundation for Young Talents in Higher Education of Guangdong Province (Grant No. 2022KQNCX021), and the PhD Starting Project of Guangdong Medical University (Grant No. GDMUB2022054).
Funding (dynamic single-index model for functional data, DSIM): Supported by the National Natural Science Foundation of China (Grant No. 11271080).
Funding (two-sample mean test for high-dimensional functional data): Supported by the National Natural Science Foundation of China (Grant No. 11901313), the Fundamental Research Funds for the Central Universities, the Key Laboratory for Medical Data Analysis and Statistical Research of Tianjin, and the Key Laboratory of Pure Mathematics and Combinatorics.
基金Supported by the National Natural Science Foundation of China(Grant Nos.11671268 and 12271370)the Guangdong Basic and Applied Basic Research Foundation(Grant No.2020A1515010821)+1 种基金the Fundamental Research Funds for the Central Universities(Grant No.12619624)Supported by the Research Start-up Fund for new young Teachers of Capital University of Economics and Business(Grant No.00592254417068)。
Abstract: We propose a two-sample test for the mean functions of functional data when the number of bases is much larger than the sample size. The novel test is based on U-statistics, which avoids the need to estimate the covariance operator accurately in the high-dimensional situation. We further prove the asymptotic normality of our test statistic under both the null hypothesis and a local alternative hypothesis. An extensive simulation study shows that the proposed test works well in comparison with several other methods in the high-dimensional situation. An application to the egg-laying trajectories of a Mediterranean fruit fly data set demonstrates the applicability of the method.
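To illustrate the U-statistic idea in the abstract above, here is a minimal sketch. It assumes each curve has already been expanded in a basis, so every observation is a coefficient vector, and it uses a generic statistic that drops the diagonal terms of the inner-product sums. The function name `u_stat_mean_diff` and this exact form are illustrative assumptions, not necessarily the statistic used in the paper.

```python
import numpy as np

def u_stat_mean_diff(x, y):
    """Unbiased estimate of ||mu_x - mu_y||^2 from basis coefficients.

    x: (n1, p) array, y: (n2, p) array; rows are curves, columns are
    basis coefficients. Dropping the i == j terms removes the bias that
    the plug-in ||xbar - ybar||^2 picks up when p is much larger than n,
    and no covariance estimate is needed to form the statistic.
    """
    n1, n2 = x.shape[0], y.shape[0]
    gx, gy = x @ x.T, y @ y.T  # Gram matrices of inner products
    within_x = (gx.sum() - np.trace(gx)) / (n1 * (n1 - 1))
    within_y = (gy.sum() - np.trace(gy)) / (n2 * (n2 - 1))
    cross = (x @ y.T).sum() / (n1 * n2)
    return within_x + within_y - 2.0 * cross
```

When both samples share the same mean, this quantity fluctuates around zero even if the number of coefficients far exceeds the sample sizes, whereas the plug-in squared distance between sample means does not.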
Funding: The authors acknowledge the Canadian Institute for Health Research (CIHR), the Children's Hospital Research Institute of Manitoba (CHRIM) Foundation, and the Visual and Automated Disease Analytics (VADA) graduate training program of the Natural Sciences and Engineering Research Council of Canada (NSERC) for providing the funding opportunities to conduct this research.
Abstract: Background: Accurate estimation of the temporal patterns of influenza may help in utilizing hospital resources and guiding influenza surveillance. This paper proposes functional data analysis (FDA) to improve the prediction of temporal patterns of influenza. Methods: We illustrate FDA methods using the weekly Influenza-like Illness (ILI) activity level data from the U.S. We propose to use the Fourier basis to transform the discrete weekly data into smoothed functional ILI activities. Functional analysis of variance (FANOVA) is used to examine the regional differences in temporal patterns and the impact of each state's political orientation. Results: ILI activity has a very distinct peak at the beginning and end of the year. There are significant differences in the average level of ILI activity among geographic regions. However, the temporal patterns in terms of the peak and flat time are quite consistent across regions. The geographic and temporal patterns of ILI activity also depend on the political make-up of states. The states affiliated with Republicans had higher ILI activity than those affiliated with Democrats across the whole year. The influence of political party affiliation on the temporal pattern differs considerably among geographic regions. Conclusions: Functional data analysis can help to reveal the temporal variability in average ILI levels, the rate of change in ILI levels, and the effect of geographical regions. Consideration should be given to wider application of FDA to generate more accurate estimates in public health and biomedical research.
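The Fourier-basis smoothing step mentioned in the Methods can be sketched as an ordinary least-squares fit. This is only an illustration on synthetic values: the helper names, the 52-week period, and the choice of three harmonics are assumptions, not details taken from the paper.

```python
import numpy as np

def fourier_design(t, period, n_harmonics):
    """Columns: 1, sin(2*pi*k*t/period), cos(2*pi*k*t/period), k = 1..K."""
    cols = [np.ones_like(t, dtype=float)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k * t / period
        cols += [np.sin(w), np.cos(w)]
    return np.column_stack(cols)

def smooth_fourier(t, y, period, n_harmonics):
    """Least-squares Fourier fit; returns coefficients and a callable curve."""
    design = fourier_design(t, period, n_harmonics)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    def curve(s):
        s = np.asarray(s, dtype=float)
        return fourier_design(s, period, n_harmonics) @ coef

    return coef, curve

# Synthetic weekly series (illustrative values only, not ILI data).
weeks = np.arange(52, dtype=float)
rng = np.random.default_rng(0)
signal = 3.0 + 2.0 * np.cos(2.0 * np.pi * weeks / 52.0)
observed = signal + rng.normal(scale=0.3, size=weeks.size)
coef, ili_curve = smooth_fourier(weeks, observed, period=52.0, n_harmonics=3)
```

The fitted `ili_curve` can be evaluated at any time point, which is what turns the discrete weekly values into a smooth functional observation. A Fourier basis is a natural choice here because the yearly activity pattern is periodic.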
Abstract: This paper deals with the conditional density estimator of a real response variable given a functional random variable (i.e., one that takes values in an infinite-dimensional space). Specifically, we focus on the functional index model, an approach that represents a good compromise between nonparametric and parametric models. Under general conditions, and when the variables are independent, we then give the quadratic error and the asymptotic normality of the local linear estimator based on the single-index structure. Finally, we complete these theoretical advances with simulation studies showing both the practical performance of the local linear method and the good finite-sample behaviour of the estimator and of the Monte Carlo methods used to create functional pseudo-confidence areas.
Funding: Supported by the Fundamental Research Funds for the Central Universities (Nos. 202341017, 202313024).
Abstract: Chlorophyll-a (Chl-a) concentration is a primary indicator for marine environmental monitoring. The spatio-temporal variations of sea surface Chl-a concentration in the Yellow Sea (YS) and the East China Sea (ECS) in 2001-2020 were investigated by reconstructing the MODIS Level 3 products with the data interpolation empirical orthogonal function (DINEOF) method. The results reconstructed by interpolating the combined MODIS daily + 8-day datasets were found to be better than those obtained by interpolating daily or 8-day data alone. Chl-a concentration in the YS and the ECS reached its maximum in spring, when blooms occurred, decreased in summer and autumn, and increased in late autumn and early winter. By performing empirical orthogonal function (EOF) decomposition of the reconstructed data fields and correlation analysis with several potential environmental factors, we found that the sea surface temperature (SST) plays a significant role in the seasonal variation of Chl-a, especially during spring and summer. The increase of SST in spring and the upper-layer nutrients mixed up during the preceding winter might favor the occurrence of spring blooms. High SST throughout the summer would strengthen the vertical stratification and prevent nutrient supply from deep water, resulting in low surface Chl-a concentrations. The sea surface Chl-a concentration in the YS was found to have decreased significantly from 2012 to 2020, which was possibly related to the Pacific Decadal Oscillation (PDO).
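As a rough illustration of the EOF decomposition step, the sketch below applies a singular value decomposition to a gap-free (time, space) matrix. It is plain EOF analysis, not DINEOF, which additionally fills missing values iteratively. The function name `eof_decompose` and the array layout are assumptions made for the example.

```python
import numpy as np

def eof_decompose(field, n_modes):
    """EOF analysis of a (time, space) matrix with no missing values.

    Returns spatial patterns (n_modes, space), principal-component time
    series (time, n_modes), and the fraction of variance per mode.
    """
    # Remove the time mean at every grid point.
    anomalies = field - field.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return vt[:n_modes], u[:, :n_modes] * s[:n_modes], explained[:n_modes]
```

The leading principal-component time series can then be correlated with an environmental series such as SST, which is the kind of analysis the abstract describes.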
Funding: Supported by the National Science Foundation of China (11971433), the Zhejiang Gongshang University "Digital+" Disciplinary Construction Management Project (SZJ2022B004), the Institute for International People-to-People Exchange in Artificial Intelligence and Advanced Manufacturing (CCIPERGZN202439), and the Development Fund for Zhejiang College of Shanghai University of Finance and Economics (2023FZJJ15).
Abstract: As data become increasingly complex, measuring dependence among variables is of great interest. However, most existing measures of dependence are limited to the Euclidean setting and cannot effectively characterize complex relationships. In this paper, we propose a novel method for constructing independence tests for random elements in Hilbert spaces, which include functional data as a special case. Our approach uses the distance covariance of random projections to build a test statistic that is computationally efficient and exhibits strong power performance. We prove the equivalence between testing for independence on the original covariates and on the projected covariates, bridging the gap between independence tests in Euclidean spaces and in Hilbert spaces. Implementation of the test involves calibration by permutation and the combination of several p-values from different projections using the false discovery rate method. Simulation studies and real data examples illustrate the finite-sample properties of the proposed method under a variety of scenarios.
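The procedure outlined in this abstract (project, compute distance covariance, calibrate by permutation, combine p-values) can be sketched as follows. The Gaussian projection directions, the V-statistic form of the distance covariance, and the Benjamini-Hochberg-style combination rule are assumptions chosen for illustration; the paper's exact choices may differ.

```python
import numpy as np

def dcov_sq(a, b):
    """Squared sample distance covariance (V-statistic) of two 1-D samples."""
    da = np.abs(a[:, None] - a[None, :])
    db = np.abs(b[:, None] - b[None, :])
    # Double-center each pairwise distance matrix.
    da = da - da.mean(axis=0) - da.mean(axis=1, keepdims=True) + da.mean()
    db = db - db.mean(axis=0) - db.mean(axis=1, keepdims=True) + db.mean()
    return (da * db).mean()

def projection_test(x, y, n_proj=5, n_perm=199, seed=0):
    """x: (n, p), y: (n, q) discretised curves or basis coefficients.

    Returns a combined p-value over n_proj random projections.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    pvals = []
    for _ in range(n_proj):
        # Random directions (assumed Gaussian) reduce each curve to a scalar.
        px = x @ rng.normal(size=x.shape[1])
        py = y @ rng.normal(size=y.shape[1])
        observed = dcov_sq(px, py)
        # Permuting one sample breaks any dependence between the pairs.
        perm = np.array(
            [dcov_sq(px, py[rng.permutation(n)]) for _ in range(n_perm)]
        )
        pvals.append((1 + np.sum(perm >= observed)) / (n_perm + 1))
    pvals = np.sort(pvals)
    k = len(pvals)
    # Benjamini-Hochberg-style combination: min over i of (k / i) * p_(i).
    return float(min(1.0, np.min(k / np.arange(1, k + 1) * pvals)))
```

Working with scalar projections keeps each distance covariance computation cheap, which is the source of the computational efficiency the abstract refers to.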
Abstract: It is well known that the nonparametric estimation of the regression function is highly sensitive to the presence of even a small proportion of outliers in the data. To address the problem of atypical observations when the covariates of the nonparametric component are functional, robust estimates for the regression parameter and the regression operator are introduced. The main purpose of the paper is to consider data-driven methods of selecting the number of neighbors in order to make the proposed procedures fully automatic. We use the k Nearest Neighbors (kNN) procedure to construct the kernel estimator of the proposed robust model. Under some regularity conditions, we state consistency results for kNN functional estimators, which are uniform in the number of neighbors (UINN). Furthermore, a simulation study and an empirical application to real data on gasoline octane prediction are carried out to illustrate the higher predictive performance and the usefulness of the kNN approach.
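The kNN kernel idea can be illustrated with a small sketch in which the bandwidth at each new curve is set by the distance to its neighbors. Note that this is the classical (non-robust) local-mean version with a quadratic kernel and an L2 distance on a common grid, all of which are assumptions for illustration. The robust estimator of the paper replaces the weighted mean with a robust criterion.

```python
import numpy as np

def knn_functional_regression(curves, responses, new_curve, k, dt=1.0):
    """Kernel regression with a kNN bandwidth for curves on a common grid.

    curves: (n, m) array, responses: (n,), new_curve: (m,), 1 <= k < n.
    Assumes the distances to new_curve are distinct and not all zero.
    """
    # L2 distance between curves, approximated by a Riemann sum.
    dist = np.sqrt(np.sum((curves - new_curve) ** 2, axis=1) * dt)
    # Local bandwidth: distance to the (k+1)-th nearest curve, so the
    # k nearest curves receive positive weight.
    h = np.sort(dist)[k]
    u = dist / h
    weights = np.where(u < 1.0, 1.0 - u**2, 0.0)  # quadratic kernel
    return np.sum(weights * responses) / np.sum(weights)
```

Because the bandwidth adapts to the local density of curves, the only tuning parameter is the integer `k`, which is what makes data-driven selection of the number of neighbors practical.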
Funding: Supported by the National Key R&D Program of China (Grant Nos. 2018YFB1700704 and 2018YFB1702502), the Study on the Key Management and Privacy Preservation in VANET, and the Innovation Foundation of Science and Technology of Dalian (2018J12GX045).
Abstract: Fuzzy clustering theory is widely used in data mining for full-face tunnel boring machines (TBMs). However, the traditional fuzzy clustering algorithm based on an objective function has difficulty clustering functional data effectively. We propose a new fuzzy clustering algorithm, the FCM-ANN algorithm, which replaces the clustering prototype of the FCM algorithm with the predicted value of an artificial neural network. As a result, the algorithm not only supports clustering based on the traditional similarity criterion but can also cluster functional data effectively. In this paper, we first use the t-test as an evaluation index and apply the FCM-ANN algorithm to synthetic datasets for validity testing. The algorithm is then applied to TBM operation data and combined with cross-validation to predict the tunneling speed. The predicted results are evaluated by RMSE and R^2. From the experimental results on the synthetic datasets, we obtain the relationship among the membership threshold, the number of samples, the number of attributes, and the noise, so that the datasets can be adjusted effectively. Applying the FCM-ANN algorithm to the TBM operation data predicts the tunneling speed accurately. The FCM-ANN algorithm improves on the traditional fuzzy clustering algorithm and can be used not only for predicting the tunneling speed of TBMs but also for clustering or prediction of other functional data.
Abstract: The estimation of conditional extreme quantiles in incomplete data frameworks is of growing interest. In particular, the estimation of the extreme value index under censoring has been the subject of many investigations when finite-dimensional covariate information is considered. In this paper, the estimation of the conditional extreme quantile of a heavy-tailed distribution is discussed when some functional random covariate (i.e., valued in some infinite-dimensional space) information is available and the scalar response variable is right-censored. A Weissman-type estimator of conditional extreme quantiles is proposed and its asymptotic normality is established under mild assumptions. A simulation study is conducted to assess the finite-sample behavior of the proposed estimator, and a comparison with two simple estimation strategies is provided.
Abstract: For Hermite-Birkhoff interpolation of scattered multidimensional data by a radial basis function φ, existence and characterization theorems and a variational principle are proved. Examples include φ(r)=r^b, Duchon's thin-plate splines, Hardy's multiquadrics, and inverse multiquadrics.
Funding: Project partially supported by the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 60904004) and the Key Youth Science and Technology Foundation of University of Electronic Science and Technology of China (Grant No. L08010201JX0720).
Abstract: This paper studies the problem of robust H∞ control of piecewise-linear chaotic systems with random data loss. The communication links between the plant and the controller are assumed to be imperfect (that is, data loss occurs intermittently, as is typical in a network environment). The data loss is modelled as a random process that obeys a Bernoulli distribution. In the face of random data loss, a piecewise controller is designed, based on a piecewise-quadratic Lyapunov function, to robustly stabilize the networked system in the mean-square sense and to achieve a prescribed H∞ disturbance attenuation performance. The required H∞ controllers can be designed by solving a set of linear matrix inequalities (LMIs). Chua's system is used to illustrate the usefulness and applicability of the theoretical results.
Funding: Supported by the National Social Science Foundation of China (Grant No. 22BTJ035).
Abstract: The classification of functional data has drawn much attention in recent years. The main challenge is representing infinite-dimensional functional data by finite-dimensional features while utilizing those features to achieve better classification accuracy. In this paper, we propose a mean-variance-based (MV) feature weighting method for classifying functional data or functional curves. In the feature extraction stage, each sample curve is approximated by B-splines, so that the coefficients of the spline basis serve as features. After that, a feature weighting approach based on statistical principles is introduced by jointly considering the between-class differences and within-class variations of the coefficients. We also introduce a scaling parameter to adjust the gap between the weights of features. The new feature weighting approach can adaptively enhance noteworthy local features while mitigating the impact of confusing features. Algorithms for feature-weighted K-nearest neighbor and support vector machine classifiers are both provided. Moreover, the new approach can be readily integrated into existing functional data classifiers, such as the generalized functional linear model and functional linear discriminant analysis, resulting in more accurate classification. The performance of the mean-variance-based classifiers is evaluated by simulation studies and real data. The results show that the new feature weighting approach significantly improves the classification accuracy for complex functional data.
Funding: Supported by the National Social Science Foundation of China under Grant No. 20BTJ041.
Abstract: For functional data, the most popular dimension reduction methods are functional sliced inverse regression (FSIR) and functional sliced average variance estimation (FSAVE). Both FSIR and FSAVE use the slicing approach to estimate the conditional expectation E[x(t)|y]. While slice-based methods are effective for scalar responses, they often perform poorly or even fail for multivariate responses and small sample sizes because of the so-called "curse of dimensionality". To avoid this problem, this study proposes a projective resampling method that first projects the multivariate response onto a scalar response and then applies a sufficient dimension reduction (SDR) method for univariate responses to estimate the effective dimension reduction space (e.d.r. space). The proposed projective resampling method is insensitive to the number of slices and the dimensionality of the response variable. In theory, the proposed resampling method can fully recover the effective dimension reduction space. Furthermore, this study investigates the performance of the proposed method through simulation studies and one real data analysis, and compares it with other methods.