The pricing of moving window Asian options with an early exercise feature is considered a challenging problem in option pricing. The computational challenge lies in the unknown optimal exercise strategy and in the high dimensionality required for approximating the early exercise boundary. We use sparse grid basis functions in the Least Squares Monte Carlo approach to address this “curse of dimensionality” problem. The resulting algorithm provides a general and convergent method for pricing moving window Asian options. The sparse grid technique presented in this paper can be generalized to pricing other high-dimensional, early-exercisable derivatives.
High-dimensional heterogeneous data have attracted increasing attention and discussion in the past decade. In the context of heterogeneity, semiparametric regression has emerged as a popular statistical method for modeling this type of data. In this paper, we leverage the benefits of expectile regression for computational efficiency and analytical robustness under heterogeneity, and propose a regularized partially linear additive expectile regression model with a nonconvex penalty, such as SCAD or MCP, for high-dimensional heterogeneous data. We focus on a more realistic scenario where the regression error exhibits a heavy-tailed distribution with only finite moments. This scenario challenges the classical sub-Gaussian distribution assumption and is more prevalent in practical applications. Under certain regularity conditions, we demonstrate that, with probability tending to one, the oracle estimator is one of the local minima of the induced optimization problem. Our theoretical analysis suggests that the dimensionality of linear covariates that our estimation procedure can handle is fundamentally limited by the moment condition of the regression error. Computationally, given the nonconvex and nonsmooth nature of the induced optimization problem, we have developed a two-step algorithm. Finally, our method’s effectiveness is demonstrated through its high estimation accuracy and effective model selection, as evidenced by Monte Carlo simulation studies and a real-data application. Furthermore, by taking various expectile weights, our method effectively detects heterogeneity and explores the complete conditional distribution of the response variable, underscoring its utility in analyzing high-dimensional heterogeneous data.
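As a rough illustration of the expectile weights mentioned in the abstract above, the following sketch computes a plain sample expectile by asymmetric least squares. It is not the authors' penalized partially linear additive estimator; the function name `expectile` and the fixed-point iteration are assumptions made only for this example.

```python
def expectile(values, tau, tol=1e-10, max_iter=1000):
    """Sample tau-expectile: the m minimizing sum_i w_i * (y_i - m)^2,
    where w_i = tau if y_i > m and 1 - tau otherwise."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    m = sum(values) / len(values)  # the mean is the 0.5-expectile
    for _ in range(max_iter):
        # Asymmetric weights relative to the current estimate.
        weights = [tau if y > m else 1.0 - tau for y in values]
        # The weighted mean solves the first-order condition for fixed weights.
        new_m = sum(w * y for w, y in zip(weights, values)) / sum(weights)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    for tau in (0.1, 0.5, 0.9):
        print(tau, expectile(data, tau))
```

Varying `tau` moves the estimate from the lower to the upper part of the distribution, which is the mechanism the abstract relies on to explore heterogeneity; the loss stays differentiable, unlike the quantile check loss.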
Owing to their global search capabilities and gradient-free operation, metaheuristic algorithms are applied to a wide range of optimization problems. However, their computational demands become prohibitive when tackling high-dimensional optimization challenges. To address these challenges, this study introduces cooperative metaheuristics integrating dynamic dimension reduction (DR). Building upon particle swarm optimization (PSO) and differential evolution (DE), the proposed cooperative methods C-PSO and C-DE are developed. In the proposed methods, a modified principal component analysis (PCA) is used to reduce the dimension of the design variables, thereby decreasing computational costs. The dynamic DR strategy re-executes the modified PCA after a fixed number of iterations, so that the important dimensions are identified dynamically. Compared with the static strategy, the dynamic DR strategy identifies important dimensions more precisely, thereby accelerating convergence toward optimal solutions. Furthermore, the influence of cumulative contribution rate thresholds on optimization problems with different dimensions is investigated. The metaheuristic algorithms (PSO, DE) and cooperative metaheuristics (C-PSO, C-DE) are tested on 15 benchmark functions and two engineering design problems (speed reducer and composite pressure vessel). Comparative results demonstrate that the cooperative methods achieve significantly superior performance compared to the standard methods in both solution accuracy and computational efficiency. Compared to the standard metaheuristic algorithms, the cooperative metaheuristics reduce computational cost by at least 40%. The cooperative metaheuristics can be effectively used to tackle both high-dimensional unconstrained and constrained optimization problems.
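The role of a cumulative contribution rate threshold in the abstract above can be sketched as follows. This uses the standard PCA rule (keep the smallest number of components whose eigenvalues reach the threshold share of total variance), not the authors' modified PCA; the function name and the example eigenvalues are hypothetical.

```python
def components_for_threshold(eigenvalues, threshold):
    """Return the smallest k such that the k largest eigenvalues account for
    at least `threshold` of the total variance (cumulative contribution rate)."""
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)


if __name__ == "__main__":
    # Hypothetical covariance eigenvalues of the design variables.
    eigs = [4.0, 3.0, 2.0, 1.0]
    for thr in (0.3, 0.65, 0.95):
        print(thr, components_for_threshold(eigs, thr))
```

A higher threshold retains more dimensions and so trades computational savings for fidelity, which is presumably why the study examines its effect across problem sizes.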
High-dimensional data cause difficulties in machine learning due to high time consumption and large memory requirements. In particular, in a multi-label environment, complexity grows with the number of labels. Moreover, an optimization problem that fully considers all dependencies between features and labels is difficult to solve. In this study, we propose a novel regression-based multi-label feature selection method that integrates mutual information to better exploit the underlying data structure. By incorporating mutual information into the regression formulation, the model captures not only linear relationships but also complex nonlinear dependencies. The proposed objective function simultaneously considers three types of relationships: (1) feature redundancy, (2) feature-label relevance, and (3) inter-label dependency. These three quantities are computed using mutual information, allowing the proposed formulation to capture nonlinear dependencies among variables. These three types of relationships are key factors in multi-label feature selection, and our method expresses them within a unified formulation, enabling efficient optimization while simultaneously accounting for all of them. To efficiently solve the proposed optimization problem under non-negativity constraints, we develop a gradient-based optimization algorithm with fast convergence. The experimental results on seven multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques.
The accessibility of urban public transit directly influences residents’ quality of life, travel behavior, and social equity. Its correlation with housing prices has garnered significant attention across disciplines such as geography, economics, and urban planning. Although much existing research focuses on the impact of individual transportation facilities on housing prices, there is a notable gap in comprehensive analyses that assess the influence of overall urban transit accessibility on housing market dynamics. This study selected the main urban area of Hefei, China, as a case to investigate the spatial distribution of housing prices and evaluate public transit accessibility in 2022. Employing techniques such as the optimized parameter geographical detector and local spatial regression models, the study aimed to elucidate the effects and underlying mechanisms of urban transit accessibility on housing prices. The findings revealed that: 1) housing prices in Hefei exhibited a clustered spatial pattern, with high prices concentrated in the city center and lower prices in peripheral areas, forming three distinct high-price hotspots with a ‘belt-like’ distribution; 2) public transit accessibility showed a ‘core-periphery’ structure, with accessibility declining in a ‘circumferential’ pattern around the city center. Based on the ‘housing price-accessibility’ dimension, four categories were identified: high price-high accessibility (37.25%), high price-low accessibility (19.07%), low price-high accessibility (21.95%), and low price-low accessibility (21.73%); 3) the impact of transit accessibility on housing prices was spatially heterogeneous, with bus travel showing the strongest explanatory power (0.692), followed by automobile, subway, and bicycle travel. The interaction of these transportation modes generated a synergistic effect on housing price differentiation, with most influencing factors contributing more than 25%. These findings offer valuable insights for optimizing the spatial distribution of public transit infrastructure and improving both urban housing quality and residents’ living standards.
BACKGROUND: Paternal perinatal depression (PPD) is closely associated with maternal mental health challenges, marital strain, and adverse child developmental outcomes. Despite its significant impact, PPD remains under-recognized in family-centered clinical practice. Concurrently, against the backdrop of rising rates of delayed marriage and China’s Maternity Incentive Policy, the proportion of women giving birth at an advanced maternal age is increasing. Nevertheless, research specifically examining PPD among spouses of older mothers remains critically scarce, both in China and globally. AIM: To investigate PPD and its influencing factors in Chinese advanced maternal age families. METHODS: This cross-sectional study included 358 participants; it was conducted among fathers whose pregnant partners were of advanced maternal age at five hospitals in the Pearl River Delta region of China from September 2023 to June 2024. Data were collected via a general information questionnaire, the Social Support Rating Scale, and the Edinburgh Postnatal Depression Scale. Latent profile analysis and regression mixture models (RMMs) were adopted to analyze the latent PPD types and the factors that influenced PPD. RESULTS: The incidence of PPD was 16.48%, and three profiles were identified: low-symptomatic (175 cases, 48.89%), monophasic (140 cases, 39.10%), and high-symptomatic (43 cases, 12.01%). The RMM analysis revealed that first pregnancy, low income (<¥3000/month), part-time work, and a history of abnormal pregnancy were positively associated with the high-symptomatic type (P<0.05). Conversely, high subjective support and support utilization were negatively associated with the high-symptomatic type compared with the low-symptomatic type (P<0.05). Good couple relationships, high objective and subjective support, and high support utilization were negatively associated with monophasic disorder (P<0.05). CONCLUSION: PPD incidence is high among Chinese fathers with advanced maternal age partners, and the characteristics of depression are varied. Healthcare practitioners should prioritize individuals with low levels of social support.
During the past decade, shrinkage priors have received much attention in Bayesian analysis of high-dimensional data. This paper establishes posterior consistency for high-dimensional linear regression with a class of shrinkage priors that have a heavy and flat tail and allocate a sufficiently large probability mass to a very small neighborhood of zero. While remaining efficient in posterior simulations, the shrinkage prior can lead to a nearly optimal posterior contraction rate and to variable selection consistency, as the spike-and-slab prior does. Our numerical results show that, under posterior consistency, Bayesian methods can yield much better results in variable selection than regularization methods such as LASSO and SCAD. This paper also establishes a BvM-type result, which leads to a convenient way of quantifying uncertainty for regression coefficient estimates.
Nonconvex penalties, including the smoothly clipped absolute deviation (SCAD) penalty and the minimax concave penalty (MCP), enjoy the properties of unbiasedness, continuity and sparsity, and ridge regression can deal with the collinearity problem. Combining the strengths of nonconvex penalties and ridge regression (abbreviated as NPR), we study the oracle property of the NPR estimator in high-dimensional settings with highly correlated predictors, where the dimensionality of covariates p_n is allowed to increase exponentially with the sample size n. Simulation studies and a real data example are presented to verify the performance of the NPR method.
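For readers unfamiliar with the two nonconvex penalties named in the abstract above, a minimal sketch of their standard scalar forms follows. The default shape parameters (`a=3.7` for SCAD, `gamma=3.0` for MCP) are common conventions assumed here, and the abstract does not state how NPR combines these penalties with the ridge term, so no combined penalty is shown.

```python
def scad_penalty(t, lam, a=3.7):
    """Smoothly clipped absolute deviation penalty for a single coefficient t."""
    t = abs(t)
    if t <= lam:
        return lam * t  # behaves like the lasso near zero
    if t <= a * lam:
        # Quadratic transition that gradually relaxes the shrinkage.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2  # constant: large coefficients are not shrunk further


def mcp_penalty(t, lam, gamma=3.0):
    """Minimax concave penalty for a single coefficient t."""
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t * t / (2 * gamma)
    return gamma * lam * lam / 2  # constant beyond gamma * lam


if __name__ == "__main__":
    for t in (0.5, 2.0, 10.0):
        print(t, scad_penalty(t, lam=1.0), mcp_penalty(t, lam=1.0))
```

Both penalties flatten out for large coefficients, which is the source of the near-unbiasedness the abstract refers to; a ridge term would add a separate quadratic component to handle collinearity.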
Background: COVID-19’s impact on influenza activity is of interest for informing future flu prevention and control strategies. Our study aims to examine COVID-19’s effects on influenza in Fujian Province, China, using a regression discontinuity (RD) design. Methods: We used the influenza-like illness (ILI) percentage as an indicator of influenza activity, with data from all sentinel hospitals between Week 4, 2020, and Week 51, 2023. The data were divided into two groups: the COVID-19 epidemic period and the post-epidemic period. Statistical analysis was performed with R software using robust RD design methods to account for potential confounders including seasonality, temperature, and influenza vaccination rates. Results: There was a discernible increase in the ILI percentage during the post-epidemic period. The robustness of the findings was confirmed with various RD design bandwidth selection methods and placebo tests, with the certwo bandwidth providing the largest estimated effect size: a 14.6-percentage-point increase in the ILI percentage (β=0.146; 95% CI: 0.096–0.196). Sensitivity analyses and adjustments for confounders consistently pointed to an increased ILI percentage during the post-epidemic period compared to the epidemic period. Conclusion: The 14.6-percentage-point increase in the ILI percentage in Fujian Province, China, after the end of the COVID-19 pandemic suggests that there may be a need to re-evaluate and possibly enhance public health measures to control influenza transmission. Further research is needed to fully understand the factors contributing to this rise and to assess the ongoing impacts of post-pandemic behavioral changes.
This study numerically examines the heat and mass transfer characteristics of two ternary nanofluids in converging and diverging channels. Furthermore, the study aims to assess two ternary nanofluid combinations to determine which configuration can provide better heat and mass transfer and lower entropy production, while ensuring cost efficiency. This work bridges the gap between academic research and industrial feasibility by incorporating cost analysis, entropy generation, and thermal efficiency. To compare the velocity, temperature, and concentration profiles, we examine two ternary nanofluids, i.e., TiO₂+SiO₂+Al₂O₃/H₂O and TiO₂+SiO₂+Cu/H₂O, while considering the shape of the nanoparticles. The velocity slip and Soret/Dufour effects are taken into consideration. Furthermore, regression analysis for the Nusselt and Sherwood numbers of the model is carried out. The Runge-Kutta fourth-order method with the shooting technique is employed to obtain the numerical solution of the governing system of ordinary differential equations. The flow pattern attributes of the ternary nanofluids are examined and simulated as the flow-dominating parameters vary. Additionally, the influence of these parameters is demonstrated in the flow, temperature, and concentration fields. For variation in the Eckert and Dufour numbers, TiO₂+SiO₂+Al₂O₃/H₂O has a higher temperature than TiO₂+SiO₂+Cu/H₂O. The results obtained indicate that the ternary nanofluid TiO₂+SiO₂+Al₂O₃/H₂O has a higher heat transfer rate, lower entropy generation, greater mass transfer rate, and lower cost than the TiO₂+SiO₂+Cu/H₂O ternary nanofluid.
Knowing how dataset size influences regression models can help improve the accuracy of solar power forecasts and make the most of renewable energy systems. This research explores the influence of dataset size on the accuracy and reliability of regression models for solar power prediction, contributing to better forecasting methods. The study analyzes data from two solar panels, aSiMicro03036 and aSiTandem72-46, over 7, 14, 17, 21, 28, and 38 days, with each dataset comprising five independent parameters and one dependent parameter, split 80–20 for training and testing. Results indicate that Random Forest consistently outperforms other models, achieving the highest correlation coefficient of 0.9822 and the lowest Mean Absolute Error (MAE) of 2.0544 on the aSiTandem72-46 panel with 21 days of data. For the aSiMicro03036 panel, the best MAE of 4.2978 was reached using the k-Nearest Neighbor (k-NN) algorithm, configured as instance-based k-nearest neighbors (IBk) in Weka, after training on 17 days of data. Regression performance for most models (excluding IBk) stabilizes at 14 days or more. Compared to the 7-day dataset, increasing to 21 days reduced the MAE by around 20% and improved correlation coefficients by around 2.1%, highlighting the value of moderate dataset expansion. These findings suggest that datasets spanning 17 to 21 days, with 80% used for training, can significantly enhance the predictive accuracy of solar power generation models.
As the core component of inertial navigation systems, fiber optic gyroscopes (FOGs), with technical advantages such as low power consumption, long lifespan, fast startup speed, and flexible structural design, are widely used in aerospace, unmanned driving, and other fields. However, due to the temperature sensitivity of optical devices, environmental temperature causes errors in FOGs, thereby greatly limiting their output accuracy. This work investigates machine-learning-based temperature error compensation techniques for FOGs. Specifically, it focuses on compensating for the bias errors generated in the fiber ring due to the Shupe effect. This work proposes a composite model based on k-means clustering, support vector regression, and particle swarm optimization algorithms. It significantly reduces redundancy within the samples by adopting the interval sequence sample. Moreover, metrics such as root mean square error (RMSE), mean absolute error (MAE), bias stability, and Allan variance are selected to evaluate the model’s performance and compensation effectiveness. This work effectively enhances the consistency between data and models across different temperature ranges and temperature gradients, improving the bias stability of the FOG from 0.022 °/h to 0.006 °/h. Compared to existing methods that use a single machine learning model, the proposed method raises the improvement in bias stability of the compensated FOG from 57.11% to 71.98%, and enhances the suppression of the rate ramp noise coefficient from 2.29% to 14.83%. This work improves the accuracy of the FOG after compensation, providing theoretical guidance and technical references for sensor error compensation work in other fields.
In this study, we examine the problem of sliced inverse regression (SIR), a widely used method for sufficient dimension reduction (SDR). It was designed to find reduced-dimensional versions of multivariate predictors by replacing them with a minimally adequate collection of their linear combinations without loss of information. Recently, regularization methods have been proposed in SIR to incorporate a sparse structure of predictors for better interpretability. However, existing methods use convex relaxation to bypass the sparsity constraint, which may not lead to the best subset and, in particular, tends to include irrelevant variables when predictors are correlated. In this study, we approach sparse SIR as a nonconvex optimization problem and directly tackle the sparsity constraint by establishing the optimality conditions and iteratively solving them by means of the splicing technique. Without employing convex relaxation on the sparsity constraint and the orthogonality constraint, our algorithm exhibits superior empirical merits, as evidenced by extensive numerical studies. Computationally, our algorithm is much faster than the relaxed approach for the natural sparse SIR estimator. Statistically, our algorithm surpasses existing methods in terms of accuracy for central subspace estimation and best subset selection, and sustains high performance even with correlated predictors.
The impact of different global and local variables on urban development processes requires systematic study to fully comprehend their underlying complexities. The interplay between such variables is crucial for modelling urban growth so that it closely reflects reality. Despite extensive research, ambiguity remains about how variations in these input variables influence urban densification. In this study, we conduct a global sensitivity analysis (SA) using a multinomial logistic regression (MNL) model to assess the model’s explanatory and predictive power. We examine the influence of global variables, including spatial resolution, neighborhood size, and density classes, under different input combinations at a provincial scale to understand their impact on densification. Additionally, we perform a stepwise regression to identify the significant explanatory variables that are important for understanding densification in the Brussels Metropolitan Area (BMA). Our results indicate that a finer spatial resolution of 50 m and 100 m, a smaller neighborhood size of 5×5 and 3×3, and specific density classes, namely 3 (non-built-up, low and high built-up) and 4 (non-built-up, low, medium and high built-up), optimally explain and predict urban densification. Consistent with this, the stepwise regression reveals that models with a coarser resolution of 300 m lack significant variables, reflecting a lower explanatory power for densification. This approach aids in identifying optimal and significant global variables with higher explanatory power for understanding and predicting urban densification. Furthermore, these findings are reproducible in a global urban context, offering valuable insights for planners, modelers and geographers in managing future urban growth and minimizing modelling.
Triaxial tests, a staple in rock engineering, are labor-intensive, sample-demanding, and costly, making their optimization highly advantageous. These tests are essential for characterizing rock strength, and by adopting a failure criterion, they allow for the derivation of criterion parameters through regression, facilitating their integration into modeling programs. In this study, we introduce the application of an underutilized statistical technique, orthogonal regression, which is well-suited for analyzing triaxial test data. Additionally, we present an innovation in this technique by minimizing the Euclidean distance while incorporating orthogonality between vectors as a constraint, for the case of orthogonal linear regression. We also consider the Modified Least Squares method. We exemplify this approach by developing the necessary equations to apply the Mohr-Coulomb, Murrell, Hoek-Brown, and Úcar criteria, and implement these equations in both spreadsheet calculations and R scripts. Finally, we demonstrate the technique's application using five datasets of varied lithologies from specialized literature, showcasing its versatility and effectiveness.
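To make the idea of orthogonal linear regression in the abstract above concrete, here is a minimal sketch of the classical closed-form fit, which minimizes the sum of squared perpendicular distances to a straight line. It is the textbook formulation, not the authors' constrained variant or the Modified Least Squares method, and it is written in Python rather than the R scripts the abstract mentions; the function name and sample data are hypothetical.

```python
import math


def orthogonal_line_fit(xs, ys):
    """Fit y = slope * x + intercept by minimizing perpendicular distances."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxy == 0:
        if sxx > syy:
            return 0.0, my  # best line is horizontal
        raise ValueError("best-fit line is vertical or not unique")
    # Closed-form slope of the first principal axis of the centered data.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    intercept = my - slope * mx  # the line passes through the centroid
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical (confining stress, axial stress at failure) pairs.
    sigma3 = [0.0, 1.0, 2.0, 3.0]
    sigma1 = [1.0, 3.0, 5.0, 7.0]
    print(orthogonal_line_fit(sigma3, sigma1))
```

Unlike ordinary least squares, this fit treats errors in both variables symmetrically, which matters when both stresses in a triaxial test carry measurement error.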
To better capture the characteristics of asymmetry and structural fluctuations observed in count time series, this study delves into the application of the quantile regression (QR) method for analyzing and forecasting nonlinear integer-valued time series exhibiting a piecewise phenomenon. Specifically, we focus on parameter estimation in the first-order Self-Exciting Threshold Integer-valued Autoregressive (SETINAR(2,1)) process with symmetric, asymmetric, and contaminated innovations. We establish the asymptotic properties of the estimator under certain regularity conditions. Monte Carlo simulations demonstrate the superior performance of the QR method compared to the conditional least squares (CLS) approach. Furthermore, we validate the robustness of the proposed method through empirical quantile regression estimation and forecasting for larceny incidents and CAD drug call counts in Pittsburgh, showcasing its effectiveness across diverse levels of data heterogeneity.
The decoherence of high-dimensional orbital angular momentum (OAM) entanglement in the weak scintillation regime has been investigated. In this study, we simulate atmospheric turbulence by utilizing a multiple-phase screen imprinted with anisotropic non-Kolmogorov turbulence. The entanglement negativity and fidelity are introduced to quantify the entanglement of a high-dimensional OAM state. The numerical evaluation results indicate that entanglement negativity and fidelity last longer for a high-dimensional OAM state when the azimuthal mode has a lower value. Additionally, the evolution of higher-dimensional OAM entanglement is significantly influenced by OAM beam parameters and turbulence parameters. Compared to isotropic atmospheric turbulence, anisotropic turbulence has a lesser influence on high-dimensional OAM entanglement.
It is known that monotone recurrence relations can induce a class of twist homeomorphisms on the high-dimensional cylinder, which is an extension of the class of monotone twist maps on the annulus or two-dimensional cylinder. By constructing a bounded solution of the monotone recurrence relation, we obtain the main conclusion of this paper: the induced homeomorphism has Birkhoff orbits provided there is a compact forward-invariant set. This generalizes Angenent's results in low-dimensional cases.
Gastric cancer is the third leading cause of cancer-related mortality and remains a major global health issue [1]. Annually, approximately 479,000 individuals in China are diagnosed with gastric cancer, accounting for almost 45% of all new cases worldwide [2].
Objective: Humans are exposed to complex mixtures of environmental chemicals and other factors that can affect their health. Analysis of these mixture exposures presents several key challenges for environmental epidemiology and risk assessment, including high dimensionality, correlated exposures, and subtle individual effects. Methods: We proposed a novel statistical approach, the generalized functional linear model (GFLM), to analyze the health effects of exposure mixtures. GFLM treats the effect of mixture exposures as a smooth function by reordering exposures based on specific mechanisms and capturing internal correlations to provide a meaningful estimation and interpretation. Its robustness and efficiency were evaluated under various scenarios through extensive simulation studies. Results: We applied the GFLM to two datasets from the National Health and Nutrition Examination Survey (NHANES). In the first application, we examined the effects of 37 nutrients on BMI (2011–2016 cycles). The GFLM identified a significant mixture effect, with fiber and fat emerging as the nutrients with the greatest negative and positive effects on BMI, respectively. For the second application, we investigated the association between four per- and polyfluoroalkyl substances (PFAS) and gout risk (2007–2018 cycles). Unlike traditional methods, the GFLM indicated no significant association, demonstrating its robustness to multicollinearity. Conclusion: The GFLM framework is a powerful tool for mixture exposure analysis, offering improved handling of correlated exposures and interpretable results. It demonstrates robust performance across various scenarios and real-world applications, advancing our understanding of complex environmental exposures and their health impacts in environmental epidemiology and toxicology.
Funding: Supported by the Hangzhou Joint Fund of the Zhejiang Provincial Natural Science Foundation of China (LHZY24A010002) and the MOE Project of Humanities and Social Sciences (21YJCZH235).
Funding: Funded by the National Natural Science Foundation of China (Nos. 12402142, 11832013 and 11572134), the Natural Science Foundation of Hubei Province (No. 2024AFB235), the Hubei Provincial Department of Education Science and Technology Research Project (No. Q20221714), and the Opening Foundation of Hubei Key Laboratory of Digital Textile Equipment (Nos. DTL2023019 and DTL2022012).
Abstract: Owing to their global search capabilities and gradient-free operation, metaheuristic algorithms are applied to a broad range of optimization problems. However, their computational demands become prohibitive in high-dimensional optimization. To address this challenge, this study introduces cooperative metaheuristics that integrate dynamic dimension reduction (DR). Building upon particle swarm optimization (PSO) and differential evolution (DE), the cooperative methods C-PSO and C-DE are developed. In the proposed methods, a modified principal component analysis (PCA) is used to reduce the dimension of the design variables, thereby decreasing computational costs. The dynamic DR strategy executes the modified PCA periodically, after a fixed number of iterations, so that the important dimensions are identified dynamically. Compared with a static strategy, the dynamic DR strategy identifies important dimensions more precisely, thereby accelerating convergence toward optimal solutions. Furthermore, the influence of cumulative contribution rate thresholds on optimization problems of different dimensions is investigated. The metaheuristic algorithms (PSO, DE) and the cooperative metaheuristics (C-PSO, C-DE) are evaluated on 15 benchmark functions and two engineering design problems (a speed reducer and a composite pressure vessel). Comparative results demonstrate that the cooperative methods achieve significantly better performance than the standard methods in both solution accuracy and computational efficiency. Compared to the standard metaheuristic algorithms, the cooperative metaheuristics reduce computational cost by at least 40%. The cooperative metaheuristics can be used effectively to tackle both unconstrained and constrained high-dimensional optimization problems.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (RS-2020-NR049579).
Abstract: High-dimensional data cause difficulties in machine learning because of high time consumption and large memory requirements. In particular, in a multi-label environment, complexity increases with the number of labels. Moreover, an optimization problem that fully considers all dependencies between features and labels is difficult to solve. In this study, we propose a novel regression-based multi-label feature selection method that integrates mutual information to better exploit the underlying data structure. By incorporating mutual information into the regression formulation, the model captures not only linear relationships but also complex nonlinear dependencies. The proposed objective function simultaneously considers three types of relationships: (1) feature redundancy, (2) feature-label relevance, and (3) inter-label dependency. These three quantities are computed using mutual information, allowing the proposed formulation to capture nonlinear dependencies among variables. These relationships are key factors in multi-label feature selection, and our method expresses them within a unified formulation, enabling efficient optimization while accounting for all of them simultaneously. To solve the proposed optimization problem efficiently under non-negativity constraints, we develop a gradient-based optimization algorithm with fast convergence. Experimental results on seven multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques.
Funding: Under the auspices of the National Natural Science Foundation of China (Nos. 42271224, 41901193), the Ministry of Education Humanities and Social Sciences Research Planning Fund Project of China (No. 24YJAZH190), the Anhui Province Excellent Youth Research Project in Universities (No. 2022AH030019), and the Anhui Social Sciences Innovation Development Research Project (No. 2024CXQ503).
Abstract: The accessibility of urban public transit directly influences residents’ quality of life, travel behavior, and social equity. Its correlation with housing prices has garnered significant attention across disciplines such as geography, economics, and urban planning. Although much existing research focuses on the impact of individual transportation facilities on housing prices, there is a notable gap in comprehensive analyses that assess the influence of overall urban transit accessibility on housing market dynamics. This study selected the main urban area of Hefei, China, as a case to investigate the spatial distribution of housing prices and evaluate public transit accessibility in 2022. Employing techniques such as the optimized parameter geographical detector and local spatial regression models, the study aimed to elucidate the effects and underlying mechanisms of urban transit accessibility on housing prices. The findings revealed that: 1) housing prices in Hefei exhibited a clustered spatial pattern, with high prices concentrated in the city center and lower prices in peripheral areas, forming three distinct high-price hotspots with a ‘belt-like’ distribution; 2) public transit accessibility showed a ‘core-periphery’ structure, with accessibility declining in a ‘circumferential’ pattern around the city center. Based on the ‘housing price-accessibility’ dimension, four categories were identified: high price-high accessibility (37.25%), high price-low accessibility (19.07%), low price-high accessibility (21.95%), and low price-low accessibility (21.73%); 3) the impact of transit accessibility on housing prices was spatially heterogeneous, with bus travel showing the strongest explanatory power (0.692), followed by automobile, subway, and bicycle travel. The interaction of these transportation modes generated a synergistic effect on housing price differentiation, with most influencing factors contributing more than 25%. These findings offer valuable insights for optimizing the spatial distribution of public transit infrastructure and improving both urban housing quality and residents’ living standards.
Funding: Supported by High-level Professional Groups in Guangdong Province, No. GSPZYQ2020101, and the Guangdong Province Educational Research Planning Project, No. 2024GXJK742.
Abstract: BACKGROUND: Paternal perinatal depression (PPD) is closely associated with maternal mental health challenges, marital strain, and adverse child developmental outcomes. Despite its significant impact, PPD remains under-recognized in family-centered clinical practice. Concurrently, against the backdrop of rising rates of delayed marriage and China’s Maternity Incentive Policy, the proportion of women giving birth at an advanced maternal age is increasing. Nevertheless, research specifically examining PPD among spouses of older mothers remains critically scarce, both in China and globally. AIM: To investigate PPD and its influencing factors in Chinese families with advanced maternal age. METHODS: This cross-sectional study included 358 participants and was conducted among expectant fathers whose partners were pregnant at an advanced maternal age, at five hospitals in the Pearl River Delta region of China, from September 2023 to June 2024. Data were collected via a general information questionnaire, the Social Support Rating Scale, and the Edinburgh Postnatal Depression Scale. Latent profile analysis and regression mixture models (RMMs) were used to identify the latent PPD types and the factors that influenced PPD. RESULTS: The incidence of PPD was 16.48%, and three profiles were identified: low-symptomatic (175 cases, 48.89%), monophasic (140 cases, 39.10%), and high-symptomatic (43 cases, 12.01%). The RMM analysis revealed that first pregnancy, low income (<¥3000/month), part-time work, and a history of abnormal pregnancy were positively associated with the high-symptomatic type (P<0.05). Conversely, high subjective support and support utilization were negatively associated with the high-symptomatic type compared with the low-symptomatic type (P<0.05). Good couple relationships, high objective and subjective support, and high support utilization were negatively associated with the monophasic type (P<0.05). CONCLUSION: PPD incidence is high among Chinese fathers whose partners are of advanced maternal age, and the characteristics of depression are varied. Healthcare practitioners should prioritize individuals with low levels of social support.
Funding: Supported by the National Science Foundation of the USA (Grant Nos. DMS1811812 and DMS-2015498) and the National Institutes of Health of the USA (Grant Nos. R01GM117597 and R01GM126089).
Abstract: During the past decade, shrinkage priors have received much attention in Bayesian analysis of high-dimensional data. This paper establishes posterior consistency for high-dimensional linear regression with a class of shrinkage priors that have a heavy and flat tail and allocate a sufficiently large probability mass to a very small neighborhood of zero. While remaining efficient in posterior simulations, the shrinkage prior can lead to a nearly optimal posterior contraction rate and to variable selection consistency, as the spike-and-slab prior does. Our numerical results show that, under posterior consistency, Bayesian methods can yield much better results in variable selection than regularization methods such as LASSO and SCAD. This paper also establishes a Bernstein-von Mises (BvM) type result, which leads to a convenient way of quantifying uncertainty in the regression coefficient estimates.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11401340), the China Postdoctoral Science Foundation (Grant No. 2014M561892), and the Foundation of Qufu Normal University (Grant Nos. bsqd2012041 and xkj201304).
Abstract: Nonconvex penalties, including the smoothly clipped absolute deviation (SCAD) penalty and the minimax concave penalty (MCP), enjoy the properties of unbiasedness, continuity and sparsity, while ridge regression can deal with the collinearity problem. Combining the strengths of nonconvex penalties and ridge regression (abbreviated as NPR), we study the oracle property of the NPR estimator in high-dimensional settings with highly correlated predictors, where the dimensionality of covariates p_n is allowed to increase exponentially with the sample size n. Simulation studies and a real data example are presented to verify the performance of the NPR method.
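As a hedged illustration of the penalties named in this abstract, the sketch below evaluates the standard SCAD and MCP penalty functions and adds a ridge term to each coefficient. The combined form in `npr_penalty` (a nonconvex penalty plus `0.5 * ridge * b**2` per coefficient) is an assumption made here for illustration; the abstract does not state the exact NPR objective, and the default shape parameters `a=3.7` and `gamma=3.0` are conventional choices rather than values taken from the paper.

```python
def scad_penalty(t, lam, a=3.7):
    """SCAD penalty at coefficient value t (requires a > 2)."""
    t = abs(t)
    if t <= lam:
        return lam * t                      # L1-like region near zero
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2          # constant: no shrinkage of large effects


def mcp_penalty(t, lam, gamma=3.0):
    """Minimax concave penalty at coefficient value t (requires gamma > 1)."""
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t * t / (2 * gamma)
    return gamma * lam * lam / 2


def npr_penalty(beta, lam, ridge, penalty=scad_penalty):
    """Assumed NPR-style penalty: nonconvex term plus a ridge term per coefficient."""
    return sum(penalty(b, lam) + 0.5 * ridge * b * b for b in beta)


# Hypothetical coefficient vector: two small effects and one large effect.
total = npr_penalty([0.2, -0.5, 4.0], lam=1.0, ridge=0.1)
```

Both penalties flatten out for large coefficients, which is what yields near-unbiasedness for strong signals, while the ridge term keeps the problem better conditioned when predictors are highly correlated.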
Funding: Supported by the Youth Scientific Research Project of Fujian Provincial Center for Disease Control and Prevention (2022QN02) and the Fujian Provincial Health Youth Scientific Research Project (2023QNA040).
Abstract: Background: The impact of COVID-19 on influenza activity is of interest for informing future influenza prevention and control strategies. Our study aims to examine the effects of COVID-19 on influenza in Fujian Province, China, using a regression discontinuity (RD) design. Methods: We used the influenza-like illness (ILI) percentage as an indicator of influenza activity, with data from all sentinel hospitals between Week 4, 2020, and Week 51, 2023. The data were divided into two groups: the COVID-19 epidemic period and the post-epidemic period. Statistical analysis was performed in R using robust RD design methods to account for potential confounders, including seasonality, temperature, and influenza vaccination rates. Results: There was a discernible increase in the ILI percentage during the post-epidemic period. The robustness of the findings was confirmed with various RD bandwidth selection methods and placebo tests, with the certwo bandwidth providing the largest estimated effect size: a 14.6-percentage-point increase in the ILI percentage (β=0.146; 95% CI: 0.096–0.196). Sensitivity analyses and adjustments for confounders consistently pointed to an increased ILI percentage during the post-epidemic period compared with the epidemic period. Conclusion: The 14.6-percentage-point increase in the ILI percentage in Fujian Province, China, after the end of the COVID-19 pandemic suggests that public health measures to control influenza transmission may need to be re-evaluated and possibly enhanced. Further research is needed to fully understand the factors contributing to this rise and to assess the ongoing impacts of post-pandemic behavioral changes.
Funding: Supported by DST-FIST (Government of India) (Grant No. SR/FIST/MS-1/2017/13) and the Seed Money Project (Grant No. DoRDC/733).
Abstract: This study numerically examines the heat and mass transfer characteristics of two ternary nanofluids flowing through converging and diverging channels. The study also assesses the two ternary nanofluid combinations to determine which configuration provides better heat and mass transfer and lower entropy production while remaining cost-efficient. This work bridges the gap between academic research and industrial feasibility by incorporating cost analysis, entropy generation, and thermal efficiency. To compare the velocity, temperature, and concentration profiles, we examine two ternary nanofluids, TiO₂+SiO₂+Al₂O₃/H₂O and TiO₂+SiO₂+Cu/H₂O, while considering the shape of the nanoparticles. Velocity slip and the Soret and Dufour effects are taken into consideration. Furthermore, a regression analysis of the Nusselt and Sherwood numbers of the model is carried out. The fourth-order Runge-Kutta method with a shooting technique is employed to obtain the numerical solution of the governing system of ordinary differential equations. The flow pattern of the ternary nanofluids is examined and simulated as the flow-dominating parameters vary, and the influence of these parameters on the flow, temperature, and concentration fields is demonstrated. As the Eckert and Dufour numbers vary, TiO₂+SiO₂+Al₂O₃/H₂O has a higher temperature than TiO₂+SiO₂+Cu/H₂O. The results indicate that the ternary nanofluid TiO₂+SiO₂+Al₂O₃/H₂O has a higher heat transfer rate, lower entropy generation, a greater mass transfer rate, and a lower cost than the TiO₂+SiO₂+Cu/H₂O ternary nanofluid.
Abstract: Knowing how dataset size influences regression models can help improve the accuracy of solar power forecasts and make the most of renewable energy systems. This research explores the influence of dataset size on the accuracy and reliability of regression models for solar power prediction, contributing to better forecasting methods. The study analyzes data from two solar panels, aSiMicro03036 and aSiTandem72-46, over 7, 14, 17, 21, 28, and 38 days, with each dataset comprising five independent parameters and one dependent parameter, split 80–20 for training and testing. Results indicate that Random Forest consistently outperforms the other models, achieving the highest correlation coefficient of 0.9822 and the lowest Mean Absolute Error (MAE) of 2.0544 on the aSiTandem72-46 panel with 21 days of data. For the aSiMicro03036 panel, the best MAE of 4.2978 was reached with the k-Nearest Neighbor (k-NN) algorithm, implemented as the instance-based learner IBk in Weka and trained on 17 days of data. Regression performance for most models (excluding IBk) stabilizes at 14 days or more. Compared with the 7-day dataset, increasing to 21 days reduced the MAE by around 20% and improved correlation coefficients by around 2.1%, highlighting the value of moderate dataset expansion. These findings suggest that datasets spanning 17 to 21 days, with 80% used for training, can significantly enhance the predictive accuracy of solar power generation models.
Funding: Supported by the National Natural Science Foundation of China (62375013).
Abstract: As the core component of inertial navigation systems, fiber optic gyroscopes (FOGs), with technical advantages such as low power consumption, long lifespan, fast startup, and flexible structural design, are widely used in aerospace, unmanned driving, and other fields. However, because optical devices are temperature sensitive, environmental temperature causes errors in FOGs, greatly limiting their output accuracy. This work investigates machine-learning-based temperature error compensation techniques for FOGs. Specifically, it focuses on compensating for the bias errors generated in the fiber ring by the Shupe effect. This work proposes a composite model based on k-means clustering, support vector regression, and particle swarm optimization algorithms, and it significantly reduces redundancy within the samples by adopting interval sequence sampling. Metrics such as root mean square error (RMSE), mean absolute error (MAE), bias stability, and Allan variance are selected to evaluate the model’s performance and compensation effectiveness. This work effectively enhances the consistency between data and models across different temperature ranges and temperature gradients, improving the bias stability of the FOG from 0.022 °/h to 0.006 °/h. Compared with existing methods that use a single machine learning model, the proposed method raises the improvement in bias stability of the compensated FOG from 57.11% to 71.98%, and raises the suppression of the rate ramp noise coefficient from 2.29% to 14.83%. This work improves the accuracy of the FOG after compensation, providing theoretical guidance and technical references for sensor error compensation in other fields.
Abstract: In this study, we examine sliced inverse regression (SIR), a widely used method for sufficient dimension reduction (SDR). It was designed to find reduced-dimensional versions of multivariate predictors by replacing them with a minimally adequate collection of their linear combinations without loss of information. Recently, regularization methods have been proposed in SIR to incorporate a sparse structure of predictors for better interpretability. However, existing methods rely on convex relaxation to bypass the sparsity constraint, which may not lead to the best subset and tends to include irrelevant variables, particularly when predictors are correlated. In this study, we approach sparse SIR as a nonconvex optimization problem and tackle the sparsity constraint directly by establishing the optimality conditions and solving them iteratively by means of the splicing technique. Without employing convex relaxation on the sparsity constraint or the orthogonality constraint, our algorithm exhibits superior empirical merits, as evidenced by extensive numerical studies. Computationally, our algorithm is much faster than the relaxed approach for the natural sparse SIR estimator. Statistically, our algorithm surpasses existing methods in accuracy for central subspace estimation and best subset selection, and it sustains high performance even with correlated predictors.
Funding: Funded by the INTER program and cofunded by the Fonds National de la Recherche, Luxembourg (FNR) and the Fund for Scientific Research-FNRS, Belgium (F.R.S-FNRS), T.0233.20, ‘Sustainable Residential Densification’ project (SusDens, 2020–2024).
Abstract: The impact of different global and local variables on urban development processes requires systematic study in order to fully comprehend the underlying complexities. The interplay between such variables is crucial for modelling urban growth so that it closely reflects reality. Despite extensive research, ambiguity remains about how variations in these input variables influence urban densification. In this study, we conduct a global sensitivity analysis (SA) using a multinomial logistic regression (MNL) model to assess the model’s explanatory and predictive power. We examine the influence of global variables, including spatial resolution, neighborhood size, and density classes, under different input combinations at a provincial scale to understand their impact on densification. Additionally, we perform a stepwise regression to identify the significant explanatory variables that are important for understanding densification in the Brussels Metropolitan Area (BMA). Our results indicate that finer spatial resolutions of 50 m and 100 m, smaller neighborhood sizes of 5×5 and 3×3, and specific density classes, namely 3 (non-built-up, low and high built-up) and 4 (non-built-up, low, medium and high built-up), optimally explain and predict urban densification. Consistent with this, the stepwise regression reveals that models with a coarser resolution of 300 m lack significant variables, reflecting lower explanatory power for densification. This approach aids in identifying optimal and significant global variables with higher explanatory power for understanding and predicting urban densification. Furthermore, these findings are reproducible in a global urban context, offering valuable insights for planners, modelers and geographers in managing future urban growth and minimizing modelling.
Abstract: Triaxial tests, a staple in rock engineering, are labor-intensive, sample-demanding, and costly, making their optimization highly advantageous. These tests are essential for characterizing rock strength, and, once a failure criterion is adopted, they allow the criterion parameters to be derived through regression, facilitating their integration into modeling programs. In this study, we introduce the application of an underutilized statistical technique, orthogonal regression, which is well suited to analyzing triaxial test data. Additionally, for the case of orthogonal linear regression, we present an innovation in this technique by minimizing the Euclidean distance while incorporating orthogonality between vectors as a constraint. We also consider the Modified Least Squares method. We exemplify this approach by developing the equations needed to apply the Mohr-Coulomb, Murrell, Hoek-Brown, and Úcar criteria, and we implement these equations in both spreadsheet calculations and R scripts. Finally, we demonstrate the technique's application using five datasets of varied lithologies from the specialized literature, showcasing its versatility and effectiveness.
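The idea of orthogonal linear regression can be sketched with the classical closed-form solution, which minimizes the sum of squared perpendicular distances from the points to the fitted line. This is the textbook formula, not the constrained formulation the abstract describes, and it is written in Python rather than the authors' R scripts. The function names, the toy data, and the conversion to Mohr-Coulomb parameters through the linear form σ1 = σc + k·σ3 are assumptions introduced for illustration.

```python
import math


def orthogonal_line_fit(x, y):
    """Fit y = intercept + slope * x by minimising perpendicular distances."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; the slope is not defined")
    # Closed-form slope of the orthogonal (total least squares) line.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return my - slope * mx, slope


def mohr_coulomb_from_fit(sigma_c, k):
    """Cohesion and friction angle (degrees) from sigma1 = sigma_c + k * sigma3."""
    sin_phi = (k - 1) / (k + 1)
    phi = math.asin(sin_phi)
    cohesion = sigma_c * (1 - sin_phi) / (2 * math.cos(phi))
    return cohesion, math.degrees(phi)


# Hypothetical triaxial data (MPa): confining stress and peak axial stress.
sigma3 = [0.0, 1.0, 2.0, 3.0]
sigma1 = [10.0, 13.0, 16.0, 19.0]
sigma_c, k = orthogonal_line_fit(sigma3, sigma1)
cohesion, phi_deg = mohr_coulomb_from_fit(sigma_c, k)
```

Unlike ordinary least squares, this fit treats errors in both stress measurements symmetrically, which is the motivation usually given for orthogonal regression when neither variable is error-free.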
Funding: Supported by the Social Science Planning Foundation of Liaoning Province (Grant No. L22ZD065) and the National Natural Science Foundation of China (Grant Nos. 12271231, 1247012719, 12001229).
Abstract: To better capture the asymmetry and structural fluctuations observed in count time series, this study applies the quantile regression (QR) method to the analysis and forecasting of nonlinear integer-valued time series that exhibit a piecewise phenomenon. Specifically, we focus on parameter estimation in the first-order Self-Exciting Threshold Integer-valued Autoregressive (SETINAR(2,1)) process with symmetric, asymmetric, and contaminated innovations. We establish the asymptotic properties of the estimator under certain regularity conditions. Monte Carlo simulations demonstrate the superior performance of the QR method compared with the conditional least squares (CLS) approach. Furthermore, we validate the robustness of the proposed method through empirical quantile regression estimation and forecasting for larceny incidents and CAD drug call counts in Pittsburgh, showcasing its effectiveness across diverse levels of data heterogeneity.
Funding: Supported by the Project of the Hubei Provincial Department of Science and Technology (Grant Nos. 2022CFB957, 2022CFB475) and the National Natural Science Foundation of China (Grant No. 11847118).
Abstract: The decoherence of high-dimensional orbital angular momentum (OAM) entanglement in the weak scintillation regime is investigated. In this study, we simulate atmospheric turbulence using multiple phase screens imprinted with anisotropic non-Kolmogorov turbulence. The entanglement negativity and fidelity are introduced to quantify the entanglement of a high-dimensional OAM state. The numerical results indicate that entanglement negativity and fidelity last longer for a high-dimensional OAM state when the azimuthal mode has a lower value. Additionally, the evolution of higher-dimensional OAM entanglement is significantly influenced by the OAM beam parameters and the turbulence parameters. Compared with isotropic atmospheric turbulence, anisotropic turbulence has a smaller influence on high-dimensional OAM entanglement.
Funding: Supported by the National Natural Science Foundation of China (12201446), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (22KJB110005), and the Shuangchuang Program of Jiangsu Province (JSSCBS20220898).
Abstract: It is known that monotone recurrence relations can induce a class of twist homeomorphisms on the high-dimensional cylinder, which extends the class of monotone twist maps on the annulus or two-dimensional cylinder. By constructing a bounded solution of the monotone recurrence relation, we obtain the main conclusion of this paper: the induced homeomorphism has Birkhoff orbits provided there is a compact forward-invariant set. This generalizes Angenent's results in low-dimensional cases.
Funding: Supported by the Natural Science Foundation of Shanghai (23ZR1463600), the Shanghai Pudong New Area Health Commission Research Project (PW2021A-69), and the Research Project of Clinical Research Center of Shanghai Health Medical University (22MC2022002).
Abstract: Gastric cancer is the third leading cause of cancer-related mortality and remains a major global health issue [1]. Annually, approximately 479,000 individuals in China are diagnosed with gastric cancer, accounting for almost 45% of all new cases worldwide [2].
Funding: Supported in part by the Young Scientists Fund of the National Natural Science Foundation of China (Grant Nos. 82304253 and 82273709), the Foundation for Young Talents in Higher Education of Guangdong Province (Grant No. 2022KQNCX021), and the PhD Starting Project of Guangdong Medical University (Grant No. GDMUB2022054).
Abstract: Objective: Humans are exposed to complex mixtures of environmental chemicals and other factors that can affect their health. Analysis of these mixture exposures presents several key challenges for environmental epidemiology and risk assessment, including high dimensionality, correlated exposures, and subtle individual effects. Methods: We proposed a novel statistical approach, the generalized functional linear model (GFLM), to analyze the health effects of exposure mixtures. GFLM treats the effect of mixture exposures as a smooth function by reordering exposures according to specific mechanisms and capturing their internal correlations, providing meaningful estimation and interpretation. Its robustness and efficiency were evaluated under various scenarios through extensive simulation studies. Results: We applied the GFLM to two datasets from the National Health and Nutrition Examination Survey (NHANES). In the first application, we examined the effects of 37 nutrients on BMI (2011–2016 cycles). The GFLM identified a significant mixture effect, with fiber and fat emerging as the nutrients with the greatest negative and positive effects on BMI, respectively. In the second application, we investigated the association between four per- and polyfluoroalkyl substances (PFAS) and gout risk (2007–2018 cycles). Unlike traditional methods, the GFLM indicated no significant association, demonstrating its robustness to multicollinearity. Conclusion: The GFLM framework is a powerful tool for mixture exposure analysis, offering improved handling of correlated exposures and interpretable results. It demonstrates robust performance across various scenarios and real-world applications, advancing our understanding of complex environmental exposures and their health impacts in environmental epidemiology and toxicology.