Effectively handling imbalanced datasets remains a fundamental challenge in computational modeling and machine learning, particularly when class overlap significantly deteriorates classification performance. Traditional oversampling methods often generate synthetic samples without considering density variations, leading to redundant or misleading instances that exacerbate class overlap in high-density regions. To address these limitations, we propose the Wasserstein Generative Adversarial Network with Variational Density Estimation (WGAN-VDE), a computationally efficient, density-aware adversarial resampling framework that enhances minority-class representation while strategically reducing class overlap. The originality of WGAN-VDE lies in its density-aware sample refinement, which ensures that synthetic samples are positioned in underrepresented regions, thereby improving class distinctiveness. By combining structured feature representation, targeted sample generation, and density-based selection mechanisms, the proposed framework generates well-separated and diverse synthetic samples, improving class separability and reducing redundancy. Experimental evaluation on 20 benchmark datasets demonstrates that this approach outperforms 11 state-of-the-art rebalancing techniques, achieving superior results on the F1-score, accuracy, G-mean, and AUC metrics. These results establish the proposed method as an effective and robust computational approach, suitable for diverse engineering and scientific applications involving imbalanced data classification and computational modeling.
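The density-based selection step described in this abstract can be illustrated with a minimal sketch: score each synthetic candidate by a kernel density estimate fitted on the majority class, and keep only the candidates that land in sparse regions, so new minority samples do not worsen class overlap. This is an illustrative NumPy implementation of the general idea only, not the authors' WGAN-VDE; the function names, the fixed Gaussian bandwidth, and the `keep_ratio` parameter are assumptions made for the example.

```python
import numpy as np

def gaussian_kde_density(points, queries, bandwidth):
    """Evaluate a Gaussian KDE fitted on `points` at each row of `queries`."""
    d = points.shape[1]
    diffs = queries[:, None, :] - points[None, :, :]          # (m, n, d)
    sq = np.sum(diffs**2, axis=2) / bandwidth**2
    norm = (2 * np.pi) ** (d / 2) * bandwidth**d * len(points)
    return np.exp(-0.5 * sq).sum(axis=1) / norm

def select_low_density(candidates, majority, bandwidth=0.5, keep_ratio=0.5):
    """Keep the synthetic candidates that fall in the sparsest majority-class
    regions, discarding those that would worsen class overlap."""
    dens = gaussian_kde_density(majority, candidates, bandwidth)
    k = max(1, int(keep_ratio * len(candidates)))
    keep = np.argsort(dens)[:k]                               # lowest density first
    return candidates[keep]

rng = np.random.default_rng(0)
majority = rng.normal(0.0, 1.0, size=(200, 2))
candidates = rng.normal(2.0, 1.0, size=(40, 2))
kept = select_low_density(candidates, majority, keep_ratio=0.5)
print(kept.shape)  # (20, 2)
```

In a full pipeline the candidates would come from a trained generator rather than a Gaussian draw; the selection rule is the part sketched here.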
In a crowd density estimation dataset, the annotation of crowd locations is an extremely laborious task, and these annotations are not used by the evaluation metrics. In this paper, we aim to reduce the annotation cost of crowd datasets and propose a crowd density estimation method based on weakly-supervised learning which, in the absence of crowd position supervision, uses only the number of pedestrians in the image as the supervised information. For this purpose, we design a new training method that exploits the correlation between global and local image features through incremental learning. Specifically, we design a parent-child network (PC-Net) whose branches focus on the global and local image respectively, and propose a linear feature calibration structure to train the PC-Net simultaneously: the child network learns feature transfer factors and feature bias weights and uses them to linearly calibrate the features extracted by the parent network, improving the convergence of the network by exploiting local features hidden in the crowd images. In addition, we use the pyramid vision transformer as the backbone of the PC-Net to extract crowd features at different levels, and design a global-local feature loss function (L2). We combine it with a crowd counting loss (LC) to enhance the sensitivity of the network to crowd features during training, which effectively improves the accuracy of crowd density estimation. The experimental results show that the PC-Net significantly reduces the gap between fully-supervised and weakly-supervised crowd density estimation, and outperforms the comparison methods on five datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50, UCF_QNRF, and JHU-CROWD++.
Monitoring sensors in complex engineering environments often record abnormal data, leading to significant positioning errors. To reduce the influence of abnormal arrival times, we introduce an innovative, outlier-robust localization method that integrates kernel density estimation (KDE) with a damping linear correction to enhance the precision of microseismic/acoustic emission (MS/AE) source positioning. Our approach systematically addresses abnormal arrival times through a three-step process: initial location by 4-arrival combinations, elimination of outliers based on three-dimensional KDE, and refinement using a linear correction with an adaptive damping factor. We validate our method through lead-breaking experiments, demonstrating over a 23% improvement in positioning accuracy with a maximum error of 9.12 mm (relative error of 15.80%), outperforming 4 existing methods. Simulations under various system errors, outlier scales, and outlier ratios substantiate our method's superior performance. Field blasting experiments also confirm its practical applicability, with an average positioning error of 11.71 m (relative error of 7.59%), compared with 23.56, 66.09, 16.95, and 28.52 m for the other methods. This research is significant because it enhances the robustness of MS/AE source localization in the presence of data anomalies, and it provides a practical solution for real-world engineering and safety monitoring applications.
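The middle step of the three-step process above, eliminating outlier candidate locations with a three-dimensional KDE, can be sketched as follows: score every candidate source position by the density of all candidates around it, drop the lowest-density fraction, and average the survivors. This is a minimal sketch of the general idea, not the paper's exact procedure; the bandwidth, the drop fraction, and the final averaging rule are assumptions for the example.

```python
import numpy as np

def kde_filter_locations(candidates, bandwidth=5.0, drop_frac=0.2):
    """Score each candidate source location by a 3-D Gaussian KDE built from
    all candidates, drop the lowest-density fraction as outliers, and return
    the mean of the surviving locations as the refined estimate."""
    diffs = candidates[:, None, :] - candidates[None, :, :]
    dens = np.exp(-0.5 * np.sum(diffs**2, axis=2) / bandwidth**2).sum(axis=1)
    n_keep = len(candidates) - int(drop_frac * len(candidates))
    keep = np.argsort(dens)[::-1][:n_keep]      # keep highest-density candidates
    return candidates[keep].mean(axis=0)

rng = np.random.default_rng(1)
good = rng.normal([10.0, 20.0, 5.0], 1.0, size=(16, 3))      # consistent picks
bad = np.array([[80.0, -40.0, 60.0], [-50.0, 90.0, -30.0]])  # outlier picks
estimate = kde_filter_locations(np.vstack([good, bad]), drop_frac=0.2)
print(np.round(estimate))
```

The outlier candidates sit far from every other point, so their KDE score is near zero and they are the first to be discarded.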
Controlled experiments are widely used in many applications to investigate the causal relationship between input factors and experimental outcomes. A completely randomised design is usually used to randomly assign treatment levels to experimental units. When covariates of the experimental units are available, the experimental design should achieve covariate balance among the treatment groups, so that statistical inference of the treatment effects is not confounded with any possible effects of the covariates. However, covariate imbalance often exists, because the experiment is carried out based on a single realisation of the complete randomisation, and it is more likely to occur and to worsen when the number of experimental units is small or moderate. In this paper, we introduce a new covariate balancing criterion, which measures the differences between kernel density estimates of the covariates of the treatment groups. To achieve covariate balance before the treatments are randomly assigned, we partition the experimental units by minimising the criterion, then randomly assign the treatment levels to the partitioned groups. Through numerical examples, we show that the proposed partition approach can improve the accuracy of the difference-in-means estimator and outperforms the complete randomisation and rerandomisation approaches.
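A balance criterion of the kind described above can be sketched in a few lines: estimate the covariate density of each treatment group with a Gaussian KDE and integrate the squared difference between the two estimates. This is an illustrative one-covariate sketch under assumed choices (fixed bandwidth, a uniform evaluation grid), not the paper's exact criterion or partitioning algorithm.

```python
import numpy as np

def kde_1d(x, grid, h):
    """Gaussian kernel density estimate of samples x on the given grid."""
    u = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * u**2).mean(axis=1) / (h * np.sqrt(2 * np.pi))

def kde_balance_criterion(group_a, group_b, h=0.3, n_grid=200):
    """Integrated squared difference between the covariate KDEs of two
    treatment groups; smaller values mean better covariate balance."""
    lo = min(group_a.min(), group_b.min()) - 3 * h
    hi = max(group_a.max(), group_b.max()) + 3 * h
    grid = np.linspace(lo, hi, n_grid)
    fa, fb = kde_1d(group_a, grid, h), kde_1d(group_b, grid, h)
    return np.sum((fa - fb) ** 2) * (grid[1] - grid[0])   # Riemann sum of ISE

rng = np.random.default_rng(2)
x = rng.normal(size=100)
balanced = kde_balance_criterion(x[::2], x[1::2])                  # arbitrary halves
skewed = kde_balance_criterion(np.sort(x)[:50], np.sort(x)[50:])   # sorted split
print(balanced < skewed)
```

A partitioning procedure would search over splits of the units to minimise this criterion before treatments are assigned.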
Let {Xn, n ≥ 1} be a strictly stationary sequence of random variables, which are either associated or negatively associated, and let f(·) be their common density. In this paper, the author proves a central limit theorem for a kernel estimate of f(·) under certain regularity conditions.
In this paper, we consider the limit distribution of the error density function estimator in first-order autoregressive models with negatively associated and positively associated random errors. Under mild regularity assumptions, asymptotic normality results for the residual density estimator are obtained when the autoregressive model is a stationary process and when it is an explosive process. To illustrate these results, simulations of confidence intervals and mean integrated squared errors are provided. They show that the residual density estimator can replace the density estimator built from the unobservable true errors.
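The object studied in this abstract, a density estimate built from AR(1) residuals in place of the unobservable errors, can be sketched concretely: fit the autoregressive coefficient by least squares, form residuals, and apply a Gaussian KDE to them. This is a minimal illustrative sketch with an assumed fixed bandwidth, not the paper's estimator or its asymptotic analysis.

```python
import numpy as np

def ar1_residual_kde(x, grid, h=0.25):
    """Least-squares AR(1) fit, then a Gaussian KDE of the residuals; the
    residual density estimate stands in for the unobservable error density."""
    phi = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])  # LS estimate of AR coefficient
    resid = x[1:] - phi * x[:-1]
    u = (grid[:, None] - resid[None, :]) / h
    return phi, np.exp(-0.5 * u**2).mean(axis=1) / (h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(9)
eps = rng.normal(size=3000)
x = np.zeros(3000)
for t in range(1, 3000):          # simulate a stationary AR(1) with phi = 0.6
    x[t] = 0.6 * x[t - 1] + eps[t]

grid = np.linspace(-3, 3, 7)
phi, fhat = ar1_residual_kde(x, grid)
print(round(phi, 2))
```

With a consistent estimate of the AR coefficient, the residuals track the true errors closely, which is why the residual-based KDE can stand in for the error-based one.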
This paper addresses the problem of predicting population density using cellular station data. As wireless communication devices are in widespread use, cellular station data has become integral to estimating population figures and studying their movement, and thus contributes significantly to urban planning. However, existing research grapples with issues in preprocessing base station data and in modeling population prediction. To address this, we propose methodologies for preprocessing cellular station data to eliminate irregular and redundant records. The preprocessing reveals a distinct cyclical characteristic and high-frequency variation in population shift. Further, we devise a multi-view enhancement model based on the Transformer (MVformer), targeting the improvement of the accuracy of long-horizon time-series population predictions. Comparative experiments conducted on the above-mentioned population dataset with four alternative Transformer-based models indicate that the proposed MVformer improves prediction accuracy by approximately 30% for both univariate and multivariate time-series prediction tasks, and that it performs commendably on population prediction tasks.
This cohort study was designed to explore the relationship between maternal dietary patterns (DPs) and bone health in Chinese lactating mothers and their infants. We recruited 150 lactating women at 1 month postpartum. The estimated bone mineral density (eBMD) of the subjects' calcanei and information on dietary intake were collected. After a 5-month follow-up, the eBMD of the mothers and their infants was measured again. Factor analysis was applied to determine maternal DPs. General linear models were used to evaluate the association between maternal DPs and maternal eBMD loss or infants' eBMD. With all potential covariates adjusted, Factor 2 (high intake of whole grains, tubers, mixed beans, soybeans and soybean products, seaweeds, and nuts) showed a positive association with the changes in maternal eBMD (β = 0.16, 95% CI: 0.005, 0.310). Factor 3 (high intake of soft drinks, fried foods, and puffed foods) was inversely correlated with the changes in maternal eBMD (β = -0.22, 95% CI: -0.44, 0.00). The changes in maternal eBMD were positively associated with 6-month infants' eBMD (β = 0.34, 95% CI: 0.017, 0.652). In conclusion, Factor 2 might contribute to the maintenance of eBMD in lactating women, while Factor 3 could exacerbate maternal eBMD loss. Additionally, the changes in maternal eBMD presented a positive correlation with 6-month infants' eBMD.
In real-world applications, datasets frequently contain outliers, which can hinder the generalization ability of machine learning models. Bayesian classifiers, a popular supervised learning method, rely on accurate probability density estimation for classifying continuous datasets. However, achieving precise density estimation on datasets containing outliers poses a significant challenge. This paper introduces a Bayesian classifier that utilizes optimized robust kernel density estimation to address this issue. Our proposed method enhances the accuracy of probability density estimation by mitigating the impact of outliers on the distribution estimated from the training sample. Unlike the conventional kernel density estimator, our robust estimator can be seen as a weighted sum of kernel mappings, one per sample. The kernel mapping performs an inner product in a Hilbert space, so the kernel density estimate can be regarded as the average of the samples' mappings in the Hilbert space under a reproducing kernel. M-estimation techniques are used to obtain a robust mean and to solve for the weights. Meanwhile, complete cross-validation is used as the objective function in the search for the optimal bandwidth, which strongly affects the estimator, and the Harris Hawks Optimization algorithm optimizes this objective to improve estimation accuracy. The experimental results show that it outperforms other optimization algorithms in convergence speed and objective function value during the bandwidth search. The optimal robust kernel density estimator achieves better fitting performance than the traditional kernel density estimator when the training data contain outliers, and the naïve Bayesian classifier with optimal robust kernel density estimation generalizes better in classification with outliers.
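The core idea of a weighted, outlier-resistant KDE can be illustrated with a simple reweighting loop: samples that land in low-density regions of the current estimate get their weights shrunk, so gross outliers contribute less to the final density. This is a hand-rolled heuristic sketch of the weighted-KDE idea only; the specific score function, the Huber-style clipping constant `c`, and the iteration count are assumptions of this example and are not the paper's M-estimation procedure.

```python
import numpy as np

def robust_kde_weights(x, h=0.5, n_iter=10, c=1.5):
    """Iteratively reweighted Gaussian KDE: samples that sit far from the
    bulk of the estimated density receive smaller weights (Huber-style
    clipping), shrinking the influence of outliers on the final estimate."""
    n = len(x)
    w = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        u = (x[:, None] - x[None, :]) / h
        dens = (np.exp(-0.5 * u**2) * w[None, :]).sum(axis=1) / (h * np.sqrt(2 * np.pi))
        score = 1.0 / np.maximum(dens, 1e-12)            # large for low-density points
        score = np.minimum(score, c * np.median(score))  # Huber-style clipping
        w = 1.0 / score
        w /= w.sum()                                     # renormalize to sum to 1
    return w

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0, 1, 50), [8.0, 9.0]])   # two gross outliers
w = robust_kde_weights(x)
print(w[-2:].sum() < 2 / len(x))  # outliers end up below the uniform weight
```

The resulting weights would then replace the uniform 1/n weights in the kernel density sum.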
The sixth-generation fighter has superior stealth performance, but for traditional kernel density estimation (KDE), precision requirements are difficult to satisfy when dealing with the fluctuation characteristics of a complex radar cross section (RCS). To solve this problem, this paper studies a KDE algorithm for the F/AXX stealth fighter. Considering the lack of accuracy of existing fixed-bandwidth algorithms, a novel adaptive kernel density estimation (AKDE) algorithm equipped with least-squares cross-validation and an integrated squared error criterion is proposed to optimize the bandwidth, and an adaptive RCS density estimate is then obtained from the optimized bandwidth. Finally, simulations verify that the estimation accuracy of the adaptive-bandwidth RCS density estimation algorithm is more than 50% higher than that of the traditional algorithm. Based on the proposed AKDE algorithm, the statistical characteristics of the considered fighter are acquired more accurately, and the significant advantages of the AKDE algorithm in estimating the cumulative distribution function of RCS below 1 m² are analyzed.
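Least-squares cross-validation, one of the bandwidth-selection tools named in this abstract, minimizes an unbiased estimate (up to a constant) of the integrated squared error of a Gaussian KDE. The sketch below implements the standard LSCV score and scans a bandwidth grid; the grid range and the synthetic data are assumptions of the example, and this is not the paper's full AKDE algorithm.

```python
import numpy as np

def lscv_score(x, h):
    """Least-squares cross-validation score for a Gaussian KDE bandwidth h:
    integral of fhat^2 (closed form for Gaussian kernels) minus twice the
    mean leave-one-out density at the sample points."""
    n = len(x)
    d = (x[:, None] - x[None, :]) / h
    # Integral of fhat^2: convolution of two Gaussian kernels.
    term1 = np.exp(-0.25 * d**2).sum() / (2 * n**2 * h * np.sqrt(np.pi))
    # Leave-one-out density: exclude the diagonal (self) terms.
    k = np.exp(-0.5 * d**2) / (h * np.sqrt(2 * np.pi))
    loo = (k.sum(axis=1) - np.diag(k)) / (n - 1)
    return term1 - 2 * loo.mean()

rng = np.random.default_rng(4)
x = rng.normal(size=300)
grid = np.linspace(0.05, 1.5, 30)
h_best = grid[np.argmin([lscv_score(x, h) for h in grid])]
print(round(h_best, 2))
```

A severely oversmoothed bandwidth scores worse than a moderate one, which is what drives the selection toward an adaptive, data-driven choice.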
One-class support vector machine (OCSVM) and support vector data description (SVDD) are the two main domain-based one-class (kernel) classifiers. To reveal their relationship with density estimation in the case of the Gaussian kernel, OCSVM and SVDD are first unified into the framework of kernel density estimation, and the essential relationship between them is made explicit. It is then proved that the density estimate induced by OCSVM or SVDD agrees with the true density and can also reduce the integrated squared error (ISE). Finally, experiments on several simulated datasets verify the revealed relationships.
In this paper we study a fractional stochastic heat equation on R^d (d ≥ 1) with additive noise, ∂_t u(t, x) = D_δ^α u(t, x) + b(u(t, x)) + Ẇ^H(t, x), where D_δ^α is a nonlocal fractional differential operator and Ẇ^H is a Gaussian colored noise. We show the existence and uniqueness of the mild solution of this equation. In addition, in the case of space dimension d = 1, we prove the existence of a density for this solution and establish lower and upper Gaussian bounds for the density by Malliavin calculus.
Mechanical properties are critical to the quality of hot-rolled steel pipe products. Accurately understanding the relationship between rolling parameters and mechanical properties is crucial for effective prediction and control. To address this, an industrial big data platform was developed to collect and process multi-source heterogeneous data from the entire production process, providing a complete dataset for mechanical property prediction. The adaptive bandwidth kernel density estimation (ABKDE) method was proposed to adjust the bandwidth dynamically based on data density. Combining long short-term memory neural networks with ABKDE provides robust prediction intervals for mechanical properties. The proposed method was deployed in a large-scale steel plant and demonstrated superior prediction interval performance compared to lower upper bound estimation, mean-variance estimation, and extreme learning machine ABKDE, achieving a prediction interval normalized average width of 0.37, a prediction interval coverage probability of 0.94, and the lowest coverage width-based criterion of 1.35. Notably, Shapley additive explanations significantly improved the proposed model's credibility by providing a clear analysis of feature impacts.
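The adaptive-bandwidth idea named in this abstract, widening the kernel where data are sparse and narrowing it where they are dense, can be sketched with the classic Abramson-style construction: a fixed-bandwidth pilot estimate sets a per-sample local bandwidth. This sketch illustrates the adaptive-bandwidth concept only; it is not the paper's ABKDE method, and the pilot rule, the sensitivity exponent `alpha`, and the test data are assumptions.

```python
import numpy as np

def adaptive_kde(x, grid, h0=None, alpha=0.5):
    """Abramson-style adaptive-bandwidth KDE: a fixed-bandwidth pilot estimate
    sets a local bandwidth per sample, narrower where data are dense."""
    n = len(x)
    if h0 is None:
        h0 = 1.06 * x.std() * n ** (-0.2)           # Silverman rule of thumb
    # Pilot density at the sample points.
    up = (x[:, None] - x[None, :]) / h0
    pilot = np.exp(-0.5 * up**2).mean(axis=1) / (h0 * np.sqrt(2 * np.pi))
    g = np.exp(np.mean(np.log(pilot)))              # geometric mean of pilot
    h_i = h0 * (pilot / g) ** (-alpha)              # per-sample bandwidths
    u = (grid[:, None] - x[None, :]) / h_i[None, :]
    return (np.exp(-0.5 * u**2) / (h_i[None, :] * np.sqrt(2 * np.pi))).mean(axis=1)

rng = np.random.default_rng(10)
x = np.concatenate([rng.normal(0, 0.5, 800), rng.normal(5, 2.0, 200)])
grid = np.linspace(-4, 14, 600)
f = adaptive_kde(x, grid)
print(round(float(np.sum(f) * (grid[1] - grid[0])), 3))
```

On this mixture, the sharp component keeps small bandwidths while the diffuse component gets wide ones, avoiding the fixed-bandwidth compromise.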
Urban air pollution causes great harm to physical and mental health, economic development, environmental protection, and other aspects of urban life. Predicting the changes and trends of air pollution can provide a scientific basis for governance and prevention efforts. In this paper, we propose an interval prediction method that exploits the spatio-temporal characteristics of PM2.5 signals from multiple stations. A K-nearest neighbor (KNN) algorithm interpolates the signals lost during collection, transmission, and storage to ensure the continuity of the data. A graph generative network (GGN) is used to process time-series meteorological data with complex structure, and the graph U-Nets framework is introduced into the GGN model to enhance its control over the graph generation process, which improves the efficiency and robustness of the model. In addition, sparse Bayesian regression is incorporated to mitigate the curse of dimensionality that afflicts traditional kernel density estimation (KDE) interval prediction; with the support of the sparsity strategy, sparse Bayesian regression kernel density estimation (SBR-KDE) is very efficient on high-dimensional, large-scale data. PM2.5 data from spring, summer, autumn, and winter collected at 34 air quality monitoring sites in Beijing verified the accuracy, generalization, and superiority of the proposed model for interval prediction.
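The KDE interval-prediction building block used here can be sketched in its basic form: fit a Gaussian KDE to forecast residuals, integrate it into a CDF, and read off the central (1 - alpha) interval. This is the plain KDE version only, a sketch under assumed choices (Silverman bandwidth, a uniform grid), not the paper's SBR-KDE.

```python
import numpy as np

def kde_interval(residuals, alpha=0.1, h=None, n_grid=2000):
    """Central (1 - alpha) prediction interval from a Gaussian KDE of
    forecast residuals, read off the estimated CDF on a fine grid."""
    if h is None:                                    # Silverman rule of thumb
        h = 1.06 * residuals.std() * len(residuals) ** (-0.2)
    grid = np.linspace(residuals.min() - 4 * h, residuals.max() + 4 * h, n_grid)
    u = (grid[:, None] - residuals[None, :]) / h
    pdf = np.exp(-0.5 * u**2).mean(axis=1) / (h * np.sqrt(2 * np.pi))
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]                                   # normalize to a CDF
    lo = grid[np.searchsorted(cdf, alpha / 2)]
    hi = grid[np.searchsorted(cdf, 1 - alpha / 2)]
    return lo, hi

rng = np.random.default_rng(5)
res = rng.normal(0, 2, 1000)
lo, hi = kde_interval(res, alpha=0.1)
print(lo < 0 < hi)   # the 90% interval brackets the residual center
```

A point forecast plus this interval of its residual distribution yields the prediction interval for the next observation.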
The development of digital construction management is an important initiative to promote the digital transformation of the construction industry, but attention to regional differences in the development level of digital construction management in China at the industry level is still relatively scarce. In this paper, a combination assignment method, Dagum's Gini coefficient, and the kernel density estimation method are used to explore the regional differences in China's digital construction management development level and their dynamic evolution trends. The study finds that the overall development level of China's construction industry is on the rise, but it is still relatively low. The overall Gini coefficient has increased, mainly owing to uneven development between regions: there are large development differences between the eastern region and the other three regions, and the interregional Gini coefficients for the Central-Northeastern and Central-Western pairs are growing at a higher rate.
Reliable, rapid, and accurate Remaining Useful Life (RUL) prognostics of an aircraft power supply and distribution system are essential for enhancing the reliability and stability of the system and for reducing life-cycle costs. To achieve such prognostics, the balance between accuracy and computational burden deserves particular attention. In addition, uncertainty is intrinsically present in the RUL prognostic process, and because of the limitations of uncertainty quantification, a point-wise prognostics strategy is not trustworthy. A Dual Adaptive Sliding-window Hybrid (DASH) RUL probabilistic prognostics strategy is proposed to tackle these deficiencies. The DASH strategy contains two adaptive mechanisms: an adaptive Long Short-Term Memory-Polynomial Regression (LSTM-PR) hybrid prognostics mechanism and an adaptive sliding-window Kernel Density Estimation (KDE) probabilistic prognostics mechanism. Owing to these dual adaptive mechanisms, the DASH strategy can balance accuracy against computational burden and obtain trustworthy probabilistic prognostics. The superiority of the DASH strategy is validated on a degradation dataset of aircraft electromagnetic contactors: in terms of probabilistic, point-wise, and integrated prognostics performance, the proposed strategy improves on the baseline methods and their variants by 66.89%, 81.73%, and 25.84% on average, respectively.
Data-driven tools such as principal component analysis (PCA) and independent component analysis (ICA) have been applied to different benchmarks as process monitoring methods. The difference between the two methods is that the components of PCA remain dependent, while ICA has no orthogonality constraint and its latent variables are independent. Process monitoring with PCA often assumes that the process data or principal components are Gaussian distributed. However, this constraint is not satisfied by several practical processes. To extend the use of PCA, a nonparametric method can be added to PCA to overcome this difficulty, and kernel density estimation (KDE) is a good choice. Although ICA is based on non-Gaussian distribution information, KDE can still help in the close monitoring of the data. The methods PCA, ICA, PCA with KDE (KPCA), and ICA with KDE (KICA) are demonstrated and compared by applying them to a practical industrial Spheripol craft polypropylene catalyzer reactor instead of a laboratory emulator.
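The PCA-with-KDE combination described above typically means setting the control limit of a monitoring statistic from a KDE of its training values instead of assuming a Gaussian or F distribution. The sketch below fits PCA, computes the T² statistic, and reads a 99% limit off a KDE-based CDF; it is an illustrative composition of the two standard pieces, not the paper's implementation, and the bandwidth rule, component count, and synthetic data are assumptions.

```python
import numpy as np

def pca_t2_kde_limit(X, n_comp=2, confidence=0.99):
    """Fit PCA on normal-operation data, compute each sample's T^2 statistic,
    and set its control limit from a Gaussian KDE of the T^2 values rather
    than a distributional assumption (useful for non-Gaussian processes)."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_comp]          # leading components
    scores = Xc @ vecs[:, order]
    t2 = np.sum(scores**2 / vals[order], axis=1)     # Hotelling-style T^2
    h = 1.06 * t2.std() * len(t2) ** (-0.2)          # Silverman bandwidth
    grid = np.linspace(0.0, t2.max() + 4 * h, 4000)
    cdf = np.cumsum(np.exp(-0.5 * ((grid[:, None] - t2[None, :]) / h) ** 2).sum(axis=1))
    return t2, grid[np.searchsorted(cdf / cdf[-1], confidence)]

rng = np.random.default_rng(6)
X = rng.lognormal(size=(1500, 5))                    # skewed, non-Gaussian process data
t2, limit = pca_t2_kde_limit(X)
print(np.mean(t2 > limit))
```

In monitoring, a new sample whose T² exceeds this limit would be flagged as a potential fault.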
Crowd density is an important factor in crowd stability. Previous crowd density estimation methods are highly dependent on the specific video scene. This paper presents a video-scene-invariant crowd density estimation method that uses Geographic Information Systems (GIS) to monitor crowd size over large areas. The proposed method maps crowd images to GIS; crowd density can then be estimated for each camera in GIS using an estimation model obtained from a single camera. Test results show that a model obtained from one camera in GIS can be adaptively applied to other cameras in outdoor video scenes. A real-time monitoring system for crowd size in large areas based on the scene-invariant model was successfully used at the Jiangsu Qinhuai Lantern Festival in 2012, where it provided early-warning information and a scientific basis for safety and security decision making.
In the process of large-scale, grid-connected wind power operations, it is important to establish an accurate probability distribution model for wind farm fluctuations. In this study, a wind power fluctuation modeling method is proposed based on the method of moving averages and an adaptive nonparametric kernel density estimation (NPKDE) method. First, the method of moving averages is used to smooth the fluctuations of the sampled wind power component, and the probabilistic characteristics of the model are then determined based on the NPKDE. Second, the model is improved adaptively and solved using constrained-order optimization. The simulation results show that this method has better accuracy and applicability than modeling methods based on traditional parameter estimation, and that it solves the local adaptation problem of traditional NPKDE.
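The two-stage idea above, smooth the power series with a moving average and model the distribution of what remains with a KDE, can be sketched directly. This is a generic illustration under assumed choices (window length, Silverman bandwidth, synthetic data), not the adaptive NPKDE of the paper.

```python
import numpy as np

def moving_average(x, window=12):
    """Centered moving average via convolution (same length, edge-padded)."""
    pad = window // 2
    xp = np.pad(x, pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(xp, kernel, mode="same")[pad:pad + len(x)]

rng = np.random.default_rng(7)
power = np.sin(np.linspace(0, 20, 500)) * 50 + 60 + rng.normal(0, 5, 500)
trend = moving_average(power, window=12)
fluct = power - trend            # fluctuation component to be modeled by KDE

h = 1.06 * fluct.std() * len(fluct) ** (-0.2)    # Silverman bandwidth
grid = np.linspace(fluct.min() - 3 * h, fluct.max() + 3 * h, 400)
pdf = np.exp(-0.5 * ((grid[:, None] - fluct[None, :]) / h) ** 2).mean(axis=1) \
      / (h * np.sqrt(2 * np.pi))
print(round(float(np.sum(pdf) * (grid[1] - grid[0])), 3))
```

The fitted fluctuation density could then feed, for example, probabilistic reserve sizing for the grid.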
A new algorithm for linear instantaneous independent component analysis is proposed based on maximizing a log-likelihood contrast function, which can be recast as a gradient equation. An iterative method is introduced to solve this equation efficiently. The unknown probability density functions, as well as their first and second derivatives, appearing in the gradient equation are estimated by the kernel density method. Computer simulations on artificially generated signals and gray-scale natural scene images confirm the efficiency and accuracy of the proposed algorithm.
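The kernel estimates of a density and its first two derivatives, the quantities this gradient equation needs, follow directly from differentiating the Gaussian kernel itself. The sketch below shows that piece in isolation, with an assumed fixed bandwidth; the surrounding ICA iteration is not reproduced.

```python
import numpy as np

def kde_with_derivatives(x, queries, h):
    """Gaussian KDE estimates of f, f', f'' at the query points, obtained by
    differentiating the kernel; the ratio f'/f is the score function that
    appears in the log-likelihood gradient of an ICA contrast."""
    u = (queries[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)     # standard normal kernel
    f = k.mean(axis=1) / h
    f1 = (-u * k).mean(axis=1) / h**2                # phi'(u) = -u * phi(u)
    f2 = ((u**2 - 1) * k).mean(axis=1) / h**3        # phi''(u) = (u^2 - 1) * phi(u)
    return f, f1, f2

rng = np.random.default_rng(8)
x = rng.normal(size=5000)
q = np.array([-1.0, 0.0, 1.0])
f, f1, f2 = kde_with_derivatives(x, q, h=0.3)
print(np.round(f, 2))
```

For symmetric unimodal data the estimates behave as expected: the derivative is positive left of the mode, near zero at it, and negative to the right.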
Funding: supported by the Ongoing Research Funding Program (ORF-2025-488), King Saud University, Riyadh, Saudi Arabia.
Funding: supported by the Humanities and Social Science Fund of the Ministry of Education of China (21YJAZH077).
Funding: supported by the National Key Research and Development Program for Young Scientists (No. 2021YFC2900400), the Postdoctoral Fellowship Program of the China Postdoctoral Science Foundation (CPSF) (No. GZB20230914), the National Natural Science Foundation of China (No. 52304123), the China Postdoctoral Science Foundation (No. 2023M730412), and the Chongqing Outstanding Youth Science Foundation Program (No. CSTB2023NSCQ-JQX0027).
Funding: Supported by the Division of Mathematical Sciences [grant number 1916467].
Abstract: Controlled experiments are widely used in many applications to investigate the causal relationship between input factors and experimental outcomes. A completely randomised design is usually used to randomly assign treatment levels to experimental units. When covariates of the experimental units are available, the experimental design should achieve covariate balance among the treatment groups, so that the statistical inference of the treatment effects is not confounded with any possible effects of covariates. However, covariate imbalance often exists, because the experiment is carried out based on a single realisation of the complete randomisation. It is more likely to occur, and to worsen, when the number of experimental units is small or moderate. In this paper, we introduce a new covariate balancing criterion, which measures the differences between kernel density estimates of the covariates of the treatment groups. To achieve covariate balance before the treatments are randomly assigned, we partition the experimental units by minimising the criterion, then randomly assign the treatment levels to the partitioned groups. Through numerical examples, we show that the proposed partition approach can improve the accuracy of the difference-in-mean estimator and outperforms the complete randomisation and rerandomisation approaches.
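A KDE-difference balance criterion of the kind described can be sketched for one covariate as the integrated squared difference between the two groups' density estimates, approximated on a grid. Bandwidth and grid size here are illustrative choices, not the paper's:

```python
import math

def kde(x, data, h):
    """1-D Gaussian kernel density estimate at x."""
    c = len(data) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-(x - d) ** 2 / (2 * h * h)) for d in data) / c

def kde_balance_criterion(group_a, group_b, h=0.5, n_grid=200):
    """Integrated squared difference between the two treatment groups'
    KDEs; smaller values mean better covariate balance."""
    lo = min(group_a + group_b) - 3 * h
    hi = max(group_a + group_b) + 3 * h
    step = (hi - lo) / n_grid
    return sum((kde(lo + i * step, group_a, h)
                - kde(lo + i * step, group_b, h)) ** 2
               for i in range(n_grid + 1)) * step
```

A partitioning procedure would then search over assignments of units to groups to minimise this criterion before randomising treatment labels.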
Abstract: Let {Xn, n ≥ 1} be a strictly stationary sequence of random variables, which are either associated or negatively associated, and let f(·) be their common density. In this paper, the author shows a central limit theorem for a kernel estimate of f(·) under certain regularity conditions.
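For reference, the kernel estimate in question and the standard form such a central limit theorem takes (the precise regularity and covariance-decay conditions for (negatively) associated sequences are those stated in the paper):

```latex
\hat f_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h_n}\right),
\qquad
\sqrt{n h_n}\,\Bigl(\hat f_n(x) - \mathbb{E}\,\hat f_n(x)\Bigr)
\ \xrightarrow{\;d\;}\ N\!\Bigl(0,\ f(x)\!\int\! K^2(u)\,du\Bigr).
```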
Funding: Supported by the National Natural Science Foundation of China (12131015, 12071422).
Abstract: In this paper, we consider the limit distribution of the error density function estimator in first-order autoregressive models with negatively associated and positively associated random errors. Under mild regularity assumptions, some asymptotic normality results for the residual density estimator are obtained when the autoregressive models are stationary processes and explosive processes. To illustrate these results, some simulations, such as confidence intervals and mean integrated square errors, are provided in this paper. They show that the residual density estimator can replace the density estimator that would be built from the unobserved errors.
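The residual-based estimator can be sketched as follows: fit the AR(1) coefficient, form residuals, and build a Gaussian KDE from them. The bandwidth and least-squares fit are illustrative simplifications of the paper's setting:

```python
import math, random

def fit_ar1(x):
    """Least-squares estimate of rho in x_t = rho * x_{t-1} + eps_t."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x[:-1])
    return num / den

def residual_density(x, h=0.3):
    """Gaussian KDE built from the AR(1) residuals; the abstract's point
    is that this feasible estimator can stand in for the infeasible one
    built from the true, unobserved errors."""
    rho = fit_ar1(x)
    res = [a - rho * b for a, b in zip(x[1:], x[:-1])]
    c = len(res) * h * math.sqrt(2 * math.pi)

    def f(e):
        return sum(math.exp(-(e - r) ** 2 / (2 * h * h)) for r in res) / c
    return f, rho
```

On a simulated stationary AR(1) path the fitted coefficient recovers the true one and the residual KDE concentrates around the error distribution's mode.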
Funding: Supported by the Guangdong Basic and Applied Basic Research Foundation under Grant No. 2024A1515012485, and in part by the Shenzhen Fundamental Research Program under Grant JCYJ20220810112354002.
Abstract: This paper addresses the problem of predicting population density leveraging cellular station data. As wireless communication devices are commonly used, cellular station data has become integral to estimating population figures and studying their movement, thereby making significant contributions to urban planning. However, existing research grapples with issues pertinent to preprocessing base station data and modeling population prediction. To address this, we propose methodologies for preprocessing cellular station data to eliminate irregular or redundant data. The preprocessing reveals a distinct cyclical characteristic and high-frequency variation in population shift. Further, we devise a multi-view enhancement model grounded on the Transformer (MVformer), targeting the improvement of the accuracy of extended time-series population predictions. Comparative experiments, conducted on the above-mentioned population dataset using four alternative Transformer-based models, indicate that our proposed MVformer model enhances prediction accuracy by approximately 30% for both univariate and multivariate time-series prediction tasks. The performance of this model in population prediction tasks exhibits commendable results.
Funding: The authors thank NSFC and CNS for funding the project, funded by the National Natural Science Foundation of China (NSFC, 82173500) and the “CNS-ZD Tizhi and Health Fund” (CNS-ZD2020-163).
Abstract: This cohort study was designed to explore the relationship between maternal dietary patterns (DPs) and bone health in Chinese lactating mothers and infants. We recruited 150 lactating women at 1 month postpartum. The estimated bone mineral density (eBMD) of the subjects' calcanei and information on dietary intake were collected. After a 5-month follow-up, the eBMD of the mothers and their infants was measured again. Factor analysis was applied to determine maternal DPs. General linear models were used to evaluate the association between maternal DPs and maternal eBMD loss or infants' eBMD. With all potential covariates adjusted, Factor 2 (high intake of whole grains, tubers, mixed beans, soybeans and soybean products, seaweeds, and nuts) showed a positive association with the changes in maternal eBMD (β = 0.16, 95% CI: 0.005, 0.310). Factor 3 (high intake of soft drinks, fried foods, and puffed foods) was inversely correlated with the changes in maternal eBMD (β = -0.22, 95% CI: -0.44, 0.00). The changes in maternal eBMD were positively associated with 6-month infants' eBMD (β = 0.34, 95% CI: 0.017, 0.652). In conclusion, Factor 2 might contribute to the maintenance of eBMD in lactating women, while Factor 3 could exacerbate maternal eBMD loss. Additionally, the changes in maternal eBMD presented a positive correlation with 6-month infants' eBMD.
Abstract: In real-world applications, datasets frequently contain outliers, which can hinder the generalization ability of machine learning models. Bayesian classifiers, a popular supervised learning method, rely on accurate probability density estimation for classifying continuous datasets. However, achieving precise density estimation with datasets containing outliers poses a significant challenge. This paper introduces a Bayesian classifier that utilizes optimized robust kernel density estimation to address this issue. Our proposed method enhances the accuracy of probability density estimation by mitigating the impact of outliers on the estimated distribution of the training sample. Unlike the conventional kernel density estimator, our robust estimator can be seen as a weighted kernel mapping summary for each sample. This kernel mapping performs the inner product in the Hilbert space, allowing the kernel density estimate to be viewed as the average of the samples' mappings in the Hilbert space under a reproducing kernel. M-estimation techniques are used to obtain accurate mean values and to solve for the weights. Meanwhile, complete cross-validation is used as the objective function in the search for the optimal bandwidth, which determines the estimator. Harris Hawks Optimization is applied to this objective function to improve the estimation accuracy. The experimental results show that it outperforms other optimization algorithms in convergence speed and objective function value during the bandwidth search. The optimal robust kernel density estimator achieves better fitting performance than the traditional kernel density estimator when the training data contain outliers, and the naïve Bayesian classifier with optimal robust kernel density estimation improves generalization in classification with outliers.
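The idea of a weighted, outlier-resistant KDE can be illustrated with a simplified, M-estimation-flavored iteration: samples sitting in low-density regions (likely outliers) are repeatedly down-weighted. This is a stand-in sketch, not the paper's exact Hilbert-space algorithm:

```python
import math

def robust_weighted_kde(data, h, n_iter=10):
    """Iteratively re-weighted Gaussian KDE: each sample's weight is
    proportional to its current weighted density, so isolated points
    shrink toward zero weight over the iterations."""
    n = len(data)
    w = [1.0 / n] * n
    for _ in range(n_iter):
        dens = [sum(wj * math.exp(-(xi - xj) ** 2 / (2 * h * h))
                    for wj, xj in zip(w, data)) for xi in data]
        total = sum(dens)
        w = [d / total for d in dens]  # renormalize weights

    def estimator(x):
        c = h * math.sqrt(2 * math.pi)
        return sum(wi * math.exp(-(x - xi) ** 2 / (2 * h * h))
                   for wi, xi in zip(w, data)) / c
    return estimator, w
```

A bandwidth search (here left fixed) would wrap this construction with a cross-validation objective, as the abstract describes.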
Funding: Supported by the National Natural Science Foundation of China (Nos. 61074090 and 60804025).
Abstract: The sixth-generation fighter has superior stealth performance, but traditional kernel density estimation (KDE) struggles to satisfy the precision requirements when dealing with the fluctuation characteristics of a complex radar cross section (RCS). To solve this problem, this paper studies a KDE algorithm for the F/AXX stealth fighter. Considering the limited accuracy of existing fixed-bandwidth algorithms, a novel adaptive kernel density estimation (AKDE) algorithm equipped with least-squares cross-validation and the integrated squared error criterion is proposed to optimize the bandwidth. Meanwhile, an adaptive RCS density estimate can be obtained from the optimized bandwidth. Finally, simulations verify that the estimation accuracy of the adaptive-bandwidth RCS density estimation algorithm is more than 50% higher than that of the traditional algorithm. Based on the proposed AKDE algorithm, the statistical characteristics of the considered fighter are acquired more accurately, and the significant advantages of the AKDE algorithm in estimating the cumulative distribution function of RCS below 1 m² are then analyzed.
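Least-squares cross-validation picks the bandwidth minimizing an estimate (up to a constant) of the integrated squared error. A 1-D Gaussian-kernel sketch of the criterion, not the paper's exact implementation:

```python
import math

def lscv_score(data, h):
    """Least-squares cross-validation score for a Gaussian KDE:
    int fhat^2 (closed form) minus twice the leave-one-out estimate
    of int fhat * f."""
    n = len(data)
    s2 = 2 * h * h          # variance of the convolved kernel K*K
    term1 = 0.0
    for xi in data:
        for xj in data:
            term1 += math.exp(-(xi - xj) ** 2 / (2 * s2))
    term1 /= n * n * math.sqrt(2 * math.pi * s2)
    term2 = 0.0
    for i, xi in enumerate(data):
        loo = sum(math.exp(-(xi - xj) ** 2 / (2 * h * h))
                  for j, xj in enumerate(data) if j != i)
        term2 += loo / ((n - 1) * h * math.sqrt(2 * math.pi))
    term2 *= 2.0 / n
    return term1 - term2

def select_bandwidth(data, candidates):
    """Pick the candidate bandwidth with the smallest LSCV score."""
    return min(candidates, key=lambda h: lscv_score(data, h))
```

In practice the candidate grid would be replaced by a continuous minimization, and the selected bandwidth would feed the RCS density and CDF estimates.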
Funding: Supported by the National Natural Science Foundation of China (60603029), the Natural Science Foundation of Jiangsu Province (BK2007074), and the Natural Science Foundation for Colleges and Universities in Jiangsu Province (06KJB520132).
Abstract: One-class support vector machine (OCSVM) and support vector data description (SVDD) are the two main domain-based one-class (kernel) classifiers. To reveal their relationship with density estimation in the case of the Gaussian kernel, OCSVM and SVDD are first unified into the framework of kernel density estimation, and the essential relationship between them is explicitly revealed. It is then proved that the density estimate induced by OCSVM or SVDD agrees with the true density and, moreover, reduces the integrated squared error (ISE). Finally, experiments on several simulated datasets verify the revealed relationships.
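The unifying observation can be illustrated directly: the OCSVM/SVDD decision value is a weighted Gaussian-kernel sum, and with uniform weights α_i = 1/n it coincides with the Parzen-window (KDE) score. A sketch under Gaussian-kernel assumptions (the trained α_i of a real OCSVM are generally sparse and non-uniform):

```python
import math

def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two points of any dimension."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2 * sigma ** 2))

def parzen_score(x, data, sigma):
    """Unnormalized Parzen-window (KDE) score: mean kernel similarity."""
    return sum(gaussian_kernel(x, xi, sigma) for xi in data) / len(data)

def ocsvm_like_score(x, data, alphas, sigma):
    """OCSVM/SVDD decision value sum_i alpha_i k(x, x_i); with uniform
    alphas it reduces to the Parzen score above."""
    return sum(a * gaussian_kernel(x, xi, sigma) for a, xi in zip(alphas, data))
```

Thresholding either score yields a one-class decision region, which is the density-level-set view the abstract formalizes.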
Funding: Supported by NNSFC (11401313), NSFJS (BK20161579), CPSF (2014M560368, 2015T80475), the 2014 Qing Lan Project, and MEC Project PAI80160047, Conicyt, Chile.
Abstract: In this paper we study a fractional stochastic heat equation on $\mathbb{R}^d$ ($d \ge 1$) with additive noise, $\partial_t u(t,x) = D_\delta^\alpha u(t,x) + b(u(t,x)) + \dot W^H(t,x)$, where $D_\delta^\alpha$ is a nonlocal fractional differential operator and $\dot W^H$ is a Gaussian colored noise. We show the existence and uniqueness of the mild solution of this equation. In addition, in the case of space dimension d = 1, we prove the existence of a density for this solution and establish lower and upper Gaussian bounds for the density by Malliavin calculus.
Funding: Supported by the National Key Research and Development Plan (Grant No. 2023YFB3712400) and the National Key Research and Development Plan (Grant No. 2020YFB1713600).
Abstract: Mechanical properties are critical to the quality of hot-rolled steel pipe products. Accurately understanding the relationship between rolling parameters and mechanical properties is crucial for effective prediction and control. To address this, an industrial big data platform was developed to collect and process multi-source heterogeneous data from the entire production process, providing a complete dataset for mechanical property prediction. The adaptive bandwidth kernel density estimation (ABKDE) method was proposed to adjust the bandwidth dynamically based on data density. Combining long short-term memory neural networks with ABKDE offers robust prediction interval capabilities for mechanical properties. The proposed method was deployed in a large-scale steel plant, where it demonstrated superior prediction interval performance compared to lower upper bound estimation, mean variance estimation, and extreme learning machine-adaptive bandwidth kernel density estimation, achieving a prediction interval normalized average width of 0.37, a prediction interval coverage probability of 0.94, and the lowest coverage width-based criterion of 1.35. Notably, Shapley additive explanations significantly improved the proposed model's credibility by providing a clear analysis of feature impacts.
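One common way to realize a density-dependent bandwidth, in the spirit of ABKDE but not its actual formulation, is sample-point adaptive KDE: a pilot fixed-bandwidth estimate sets each point's bandwidth so that sparse regions are smoothed more:

```python
import math

def fixed_kde(x, data, h):
    """Plain Gaussian KDE with a single global bandwidth h."""
    c = len(data) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-(x - d) ** 2 / (2 * h * h)) for d in data) / c

def make_adaptive_kde(data, h0, alpha=0.5):
    """Sample-point adaptive KDE: point i gets bandwidth
    h_i = h0 * (pilot(x_i)/g)^(-alpha), with g the geometric mean of
    the pilot densities, so low-density points get wider kernels."""
    pilot = [fixed_kde(d, data, h0) for d in data]
    g = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    hs = [h0 * (p / g) ** (-alpha) for p in pilot]

    def density(x):
        return sum(math.exp(-(x - d) ** 2 / (2 * h * h))
                   / (h * math.sqrt(2 * math.pi))
                   for d, h in zip(data, hs)) / len(data)
    return density
```

Because each per-point kernel still integrates to one, the adaptive mixture remains a proper density, which the test below checks numerically.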
Funding: Supported by the National Key Research and Development Project of China (Project 2020YFC2008605), the National Natural Science Foundation of China (Project 52072412), and the Natural Science Foundation of Hunan Province, China (Project 2021JJ30359).
Abstract: Urban air pollution has brought great trouble to physical and mental health, economic development, environmental protection, and other aspects. Predicting the changes and trends of air pollution can provide a scientific basis for governance and prevention efforts. In this paper, we propose an interval prediction method that considers the spatio-temporal characteristic information of PM_(2.5) signals from multiple stations. The K-nearest neighbor (KNN) algorithm interpolates the signals lost in the process of collection, transmission, and storage to ensure the continuity of the data. A graph generative network (GGN) is used to process time-series meteorological data with complex structures. The graph U-Nets framework is introduced into the GGN model to enhance its controllability over the graph generation process, which is beneficial for improving the efficiency and robustness of the model. In addition, sparse Bayesian regression is incorporated to overcome the curse-of-dimensionality defect of traditional kernel density estimation (KDE) interval prediction. With the support of the sparsity strategy, sparse Bayesian regression kernel density estimation (SBR-KDE) is very efficient in processing high-dimensional, large-scale data. PM_(2.5) data of spring, summer, autumn, and winter from 34 air quality monitoring sites in Beijing verified the accuracy, generalization, and superiority of the proposed model in interval prediction.
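The gap-filling step can be sketched with a simple temporal KNN: fill each missing reading with the mean of the k nearest (in time) observed values. This is a minimal stand-in for the paper's KNN interpolation, with illustrative defaults:

```python
def knn_impute(series, k=3):
    """Replace None gaps with the mean of the k temporally nearest
    observed values (ties broken by original order)."""
    observed = [(i, v) for i, v in enumerate(series) if v is not None]
    out = list(series)
    for i, v in enumerate(series):
        if v is None:
            nearest = sorted(observed, key=lambda t: abs(t[0] - i))[:k]
            out[i] = sum(val for _, val in nearest) / len(nearest)
    return out
```

A production version would typically also weight neighbors by distance or use readings from spatially nearby stations.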
Abstract: The development of digital construction management is an important initiative to promote the digital transformation of the construction industry, but attention to the regional differences in the development level of digital construction management in China at the industry level is still relatively scarce. In this paper, a combination weighting method, Dagum's Gini coefficient, and the kernel density estimation method are used to explore the regional differences in China's digital construction management development level and their dynamic evolution trends. The study finds that the overall development level in China's construction industry is on the rise, but is still relatively low. The overall Gini coefficient has increased, mainly due to uneven development between regions. There are large development differences between the eastern region and the other three regions. The inter-regional Gini coefficients for the Central-Northeastern and Central-Western regions are all growing at a higher rate.
Funding: Co-supported by the National Natural Science Foundation of China (Nos. 52272403, 52402506) and the Natural Science Basic Research Program of Shaanxi, China (Nos. 2022JC-27, 2023-JC-QN-0599).
Abstract: Reliable, rapid, and accurate remaining useful life (RUL) prognostics for an aircraft power supply and distribution system are essential for enhancing the reliability and stability of the system and reducing life-cycle costs. To achieve such prognostics, the balance between accuracy and computational burden deserves more attention. In addition, uncertainty is intrinsically present in the RUL prognostic process, and due to the limitations of uncertainty quantification, a point-wise prognostics strategy is not trustworthy. A Dual Adaptive Sliding-window Hybrid (DASH) RUL probabilistic prognostics strategy is proposed to tackle these deficiencies. The DASH strategy contains two adaptive mechanisms: an adaptive Long Short-Term Memory-Polynomial Regression (LSTM-PR) hybrid prognostics mechanism and an adaptive sliding-window Kernel Density Estimation (KDE) probabilistic prognostics mechanism. Owing to these dual adaptive mechanisms, the DASH strategy can balance accuracy and computational burden and obtain trustworthy probabilistic prognostics. Based on a degradation dataset of aircraft electromagnetic contactors, the superiority of the DASH strategy is validated: in terms of probabilistic, point-wise, and integrated prognostics performance, the proposed strategy improves by 66.89%, 81.73%, and 25.84% on average compared with the baseline methods and their variants.
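A sliding-window KDE can turn a point RUL prediction into an interval: fit a Gaussian KDE to the most recent prediction errors and add its quantiles to the point prediction. All names and defaults below are illustrative, not the DASH strategy's actual adaptive rules:

```python
import math

def sliding_window_kde_interval(errors, window, point_pred, coverage=0.9, h=None):
    """Interval from a KDE over the last `window` prediction errors."""
    recent = errors[-window:]
    n = len(recent)
    if h is None:                       # Silverman's rule-of-thumb bandwidth
        mu = sum(recent) / n
        sd = (sum((e - mu) ** 2 for e in recent) / n) ** 0.5 or 1e-6
        h = 1.06 * sd * n ** (-0.2)

    def cdf(x):                         # KDE cdf via the Gaussian erf
        return sum(0.5 * (1.0 + math.erf((x - e) / (h * math.sqrt(2.0))))
                   for e in recent) / n

    def quantile(q):                    # invert the monotone cdf by bisection
        a, b = min(recent) - 5 * h, max(recent) + 5 * h
        for _ in range(60):
            m = (a + b) / 2
            if cdf(m) < q:
                a = m
            else:
                b = m
        return (a + b) / 2

    half = (1.0 - coverage) / 2
    return point_pred + quantile(half), point_pred + quantile(1.0 - half)
```

An adaptive variant would resize the window and bandwidth as the degradation regime changes, which is the role of the second DASH mechanism.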
Funding: Supported by the National Natural Science Foundation of China (No. 60574047) and the Doctorate Foundation of the State Education Ministry of China (No. 20050335018).
Abstract: Data-driven tools such as principal component analysis (PCA) and independent component analysis (ICA) have been applied to different benchmarks as process monitoring methods. The difference between the two methods is that the components of PCA are still dependent, while ICA has no orthogonality constraint and its latent variables are independent. Process monitoring with PCA often assumes that the process data or principal components follow a Gaussian distribution. However, this kind of constraint is not satisfied by several practical processes. To extend the use of PCA, a nonparametric method is added to PCA to overcome this difficulty, and kernel density estimation (KDE) is a good choice. Though ICA is based on non-Gaussian distribution information, KDE can also help in the close monitoring of the data. The methods PCA, ICA, PCA with KDE (KPCA), and ICA with KDE (KICA) are demonstrated and compared by applying them to a practical industrial Spheripol-process polypropylene catalyzer reactor instead of a laboratory emulator.
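The role KDE plays here is to set a monitoring control limit from data rather than from a Gaussian assumption: take the alpha-quantile of a KDE fitted to in-control monitoring statistics (e.g., T² or SPE scores). A sketch with illustrative parameters:

```python
import math

def kde_control_limit(stats, h, alpha=0.99, n_grid=2000):
    """Control limit as the alpha-quantile of a Gaussian KDE fitted to
    in-control statistics, found by integrating the KDE on a grid."""
    lo, hi = min(stats) - 5 * h, max(stats) + 5 * h
    step = (hi - lo) / n_grid
    c = len(stats) * h * math.sqrt(2 * math.pi)
    cum = 0.0
    for i in range(n_grid + 1):
        x = lo + i * step
        dens = sum(math.exp(-(x - s) ** 2 / (2 * h * h)) for s in stats) / c
        cum += dens * step
        if cum >= alpha:
            return x
    return hi
```

New observations whose monitoring statistic exceeds the limit would be flagged, exactly as with a chi-square limit but without the distributional assumption.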
Funding: The authors would like to thank the reviewers for their detailed reviews and constructive comments. We are also grateful for Sophie Song's help in improving the English. This work was supported in part by the 'Twelfth Five-Year' National Science and Technology Support Program of the Ministry of Science and Technology of China (No. 2012BAH35B02) and the National Natural Science Foundation of China (NSFC) (No. 41401107, No. 41201402, and No. 41201417).
Abstract: Crowd density is an important factor in crowd stability. Previous crowd density estimation methods are highly dependent on the specific video scene. This paper presents a video-scene-invariant crowd density estimation method that uses Geographic Information Systems (GIS) to monitor crowd size over large areas. The proposed method maps crowd images to GIS. We can then estimate crowd density for each camera in GIS using an estimation model obtained from one camera. Test results show that a model obtained from one camera in GIS can be adaptively applied to other cameras in outdoor video scenes. A real-time monitoring system for crowd size in large areas based on the scene-invariant model was successfully used at the 2012 Jiangsu Qinhuai Lantern Festival. It provides early-warning information and a scientific basis for safety and security decision-making.
Funding: Supported by the Science and Technology Project of the State Grid Corporation of China, "Research on Active Development Planning Technology and Comprehensive Benefit Analysis Method for Regional Smart Grid Comprehensive Demonstration Zone", and the National Natural Science Foundation of China (51607104).
Abstract: In the process of large-scale, grid-connected wind power operations, it is important to establish an accurate probability distribution model for wind farm fluctuations. In this study, a wind power fluctuation modeling method is proposed based on the method of moving averages and an adaptive nonparametric kernel density estimation (NPKDE) method. Firstly, the method of moving averages is used to reduce the fluctuation of the sampled wind power component, and the probability characteristics of the model are then determined based on the NPKDE. Secondly, the model is improved adaptively and is then solved by using constraint-order optimization. The simulation results show that this method has better accuracy and applicability compared with modeling methods based on traditional parameter estimation, and it solves the local adaptation problem of traditional NPKDE.
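The first step, separating the slow trend from the fast fluctuation component with a moving average, can be sketched as follows (a trailing window is an illustrative choice; the paper does not specify this exact variant):

```python
def moving_average(series, window):
    """Trailing moving average: mean of the last `window` samples
    (shorter at the start of the series)."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def fluctuation_component(series, window):
    """Fast fluctuation = raw signal minus its moving-average trend;
    this residual is what the NPKDE would then model."""
    return [x - m for x, m in zip(series, moving_average(series, window))]
```

The NPKDE is then fitted to the fluctuation component rather than to the raw power signal.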
Abstract: A new algorithm for linear instantaneous independent component analysis is proposed based on maximizing the log-likelihood contrast function, which can be transformed into a gradient equation. An iterative method is introduced to solve this equation efficiently. The unknown probability density functions, as well as their first and second derivatives, in the gradient equation are estimated by the kernel density method. Computer simulations on artificially generated signals and gray-scale natural scene images confirm the efficiency and accuracy of the proposed algorithm.
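The density-estimation step inside such a gradient equation typically amounts to estimating the score function psi(u) = -f'(u)/f(u) of each source from samples. A Gaussian-KDE sketch (the kernel, bandwidth, and function name are illustrative):

```python
import math

def kde_score_function(u, data, h):
    """Estimate psi(u) = -f'(u)/f(u) using a Gaussian KDE and its
    analytic derivative; both share the same unnormalized kernel sums,
    so the normalizing constants cancel in the ratio."""
    num = 0.0   # unnormalized f'(u) estimate
    den = 0.0   # unnormalized f(u) estimate
    for d in data:
        k = math.exp(-(u - d) ** 2 / (2 * h * h))
        den += k
        num += -((u - d) / (h * h)) * k
    return -num / den
```

Plugging such score estimates into the gradient of the log-likelihood contrast yields the update direction for the demixing matrix.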