In this paper,we establish and study a single-species logistic model with impulsive age-selective harvesting.First,we prove the ultimate boundedness of the solutions of the system.Then,we obtain conditions for the asy...In this paper,we establish and study a single-species logistic model with impulsive age-selective harvesting.First,we prove the ultimate boundedness of the solutions of the system.Then,we obtain conditions for the asymptotic stability of the trivial solution and the positive periodic solution.Finally,numerical simulations are presented to validate our results.Our results show that age-selective harvesting is more conducive to sustainable population survival than non-age-selective harvesting.展开更多
Earth’s internal core and crustal magnetic fields,as measured by geomagnetic satellites like MSS-1(Macao Science Satellite-1)and Swarm,are vital for understanding core dynamics and tectonic evolution.To model these i...Earth’s internal core and crustal magnetic fields,as measured by geomagnetic satellites like MSS-1(Macao Science Satellite-1)and Swarm,are vital for understanding core dynamics and tectonic evolution.To model these internal magnetic fields accurately,data selection based on specific criteria is often employed to minimize the influence of rapidly changing current systems in the ionosphere and magnetosphere.However,the quantitative impact of various data selection criteria on internal geomagnetic field modeling is not well understood.This study aims to address this issue and provide a reference for constructing and applying geomagnetic field models.First,we collect the latest MSS-1 and Swarm satellite magnetic data and summarize widely used data selection criteria in geomagnetic field modeling.Second,we briefly describe the method to co-estimate the core,crustal,and large-scale magnetospheric fields using satellite magnetic data.Finally,we conduct a series of field modeling experiments with different data selection criteria to quantitatively estimate their influence.Our numerical experiments confirm that without selecting data from dark regions and geomagnetically quiet times,the resulting internal field differences at the Earth’s surface can range from tens to hundreds of nanotesla(nT).Additionally,we find that the uncertainties introduced into field models by different data selection criteria are significantly larger than the measurement accuracy of modern geomagnetic satellites.These uncertainties should be considered when utilizing constructed magnetic field models for scientific research and applications.展开更多
BACKGROUND Relieving pain is central to the early management of knee osteoarthritis,with a plethora of pharmacological agents licensed for this purpose.Intra-articular corticosteroid injections are a widely used optio...BACKGROUND Relieving pain is central to the early management of knee osteoarthritis,with a plethora of pharmacological agents licensed for this purpose.Intra-articular corticosteroid injections are a widely used option,albeit with variable efficacy.AIM To develop a machine learning(ML)model that predicts which patients will benefit from corticosteroid injections.METHODS Data from two prospective cohort studies[Osteoarthritis(OA)Initiative and Multicentre OA Study]was combined.The primary outcome was patientreported pain score following corticosteroid injection,assessed using the Western Ontario and McMaster Universities OA pain scale,with significant change defined using minimally clinically important difference and meaningful within person change.A ML algorithm was developed,utilizing linear discriminant analysis,to predict symptomatic improvement,and examine the association between pain scores and patient factors by calculating the sensitivity,specificity,positive predictive value,negative predictive value,accuracy,and F2 score.RESULTS A total of 330 patients were included,with a mean age of 63.4(SD:8.3).The mean Western Ontario and McMaster Universities OA pain score was 5.2(SD:4.1),with only 25.5%of patients achieving significant improvement in pain following corticosteroid injection.The ML model generated an accuracy of 67.8%(95%confidence interval:64.6%-70.9%),F1 score of 30.8%,and an area under the curve score of 0.60.CONCLUSION The model demonstrated feasibility to assist clinicians with decision-making in patient selection for corticosteroid injections.Further studies are required to improve the model prior to testing in clinical settings.展开更多
In clinical research,subgroup analysis can help identify patient groups that respond better or worse to specific treatments,improve therapeutic effect and safety,and is of great significance in precision medicine.This...In clinical research,subgroup analysis can help identify patient groups that respond better or worse to specific treatments,improve therapeutic effect and safety,and is of great significance in precision medicine.This article considers subgroup analysis methods for longitudinal data containing multiple covariates and biomarkers.We divide subgroups based on whether a linear combination of these biomarkers exceeds a predetermined threshold,and assess the heterogeneity of treatment effects across subgroups using the interaction between subgroups and exposure variables.Quantile regression is used to better characterize the global distribution of the response variable and sparsity penalties are imposed to achieve variable selection of covariates and biomarkers.The effectiveness of our proposed methodology for both variable selection and parameter estimation is verified through random simulations.Finally,we demonstrate the application of this method by analyzing data from the PA.3 trial,further illustrating the practicality of the method proposed in this paper.展开更多
Selecting proper descriptors(also known feature selection,FS)is key in the process of establishing mechanical properties prediction model of hot-rolled microalloyed steels by using machine learning(ML)algorithm.FS met...Selecting proper descriptors(also known feature selection,FS)is key in the process of establishing mechanical properties prediction model of hot-rolled microalloyed steels by using machine learning(ML)algorithm.FS methods based on data-driving can reduce the redundancy of data features and improve the prediction accuracy of mechanical properties.Based on the collected data of hot-rolled microalloyed steels,the association rules are used to mine the correlation information between the data.High-quality feature subsets are selected by the proposed FS method(FS method based on genetic algorithm embedding,GAMIC).Compared with the common FS method,it is shown on dataset that GAMIC selects feature subsets more appropriately.Six different ML algorithms are trained and tested for mechanical properties prediction.The result shows that the root-mean-square error of yield strength,tensile strength and elongation based on limit gradient enhancement(XGBoost)algorithm is 21.95 MPa,20.85 MPa and 1.96%,the correlation coefficient(R^(2))is 0.969,0.968 and 0.830,and the mean absolute error is 16.84 MPa,15.83 MPa and 1.48%,respectively,showing the best prediction performance.Finally,SHapley Additive exPlanation is used to further explore the influence of feature variables on mechanical properties.GAMIC feature selection method proposed is universal,which provides a basis for the development of high-precision mechanical property prediction model.展开更多
In this paper,a feature selection method for determining input parameters in antenna modeling is proposed.In antenna modeling,the input feature of artificial neural network(ANN)is geometric parameters.The selection cr...In this paper,a feature selection method for determining input parameters in antenna modeling is proposed.In antenna modeling,the input feature of artificial neural network(ANN)is geometric parameters.The selection criteria contain correlation and sensitivity between the geometric parameter and the electromagnetic(EM)response.Maximal information coefficient(MIC),an exploratory data mining tool,is introduced to evaluate both linear and nonlinear correlations.The EM response range is utilized to evaluate the sensitivity.The wide response range corresponding to varying values of a parameter implies the parameter is highly sensitive and the narrow response range suggests the parameter is insensitive.Only the parameter which is highly correlative and sensitive is selected as the input of ANN,and the sampling space of the model is highly reduced.The modeling of a wideband and circularly polarized antenna is studied as an example to verify the effectiveness of the proposed method.The number of input parameters decreases from8 to 4.The testing errors of|S_(11)|and axis ratio are reduced by8.74%and 8.95%,respectively,compared with the ANN with no feature selection.展开更多
Feature selection(FS)is a pivotal pre-processing step in developing data-driven models,influencing reliability,performance and optimization.Although existing FS techniques can yield high-performance metrics for certai...Feature selection(FS)is a pivotal pre-processing step in developing data-driven models,influencing reliability,performance and optimization.Although existing FS techniques can yield high-performance metrics for certain models,they do not invariably guarantee the extraction of the most critical or impactful features.Prior literature underscores the significance of equitable FS practices and has proposed diverse methodologies for the identification of appropriate features.However,the challenge of discerning the most relevant and influential features persists,particularly in the context of the exponential growth and heterogeneity of big data—a challenge that is increasingly salient in modern artificial intelligence(AI)applications.In response,this study introduces an innovative,automated statistical method termed Farea Similarity for Feature Selection(FSFS).The FSFS approach computes a similarity metric for each feature by benchmarking it against the record-wise mean,thereby finding feature dependencies and mitigating the influence of outliers that could potentially distort evaluation outcomes.Features are subsequently ranked according to their similarity scores,with the threshold established at the average similarity score.Notably,lower FSFS values indicate higher similarity and stronger data correlations,whereas higher values suggest lower similarity.The FSFS method is designed not only to yield reliable evaluation metrics but also to reduce data complexity without compromising model performance.Comparative analyses were performed against several established techniques,including Chi-squared(CS),Correlation Coefficient(CC),Genetic Algorithm(GA),Exhaustive Approach,Greedy Stepwise Approach,Gain Ratio,and Filtered Subset Eval,using a variety of datasets such as the Experimental Dataset,Breast Cancer Wisconsin(Original),KDD CUP 1999,NSL-KDD,UNSW-NB15,and Edge-IIoT.In the absence of the FSFS method,the highest classifier accuracies observed were 60.00%,95.13%,97.02%,98.17%,95.86%,and 94.62%for the respective datasets.When the FSFS technique was integrated with data normalization,encoding,balancing,and feature importance selection processes,accuracies improved to 100.00%,97.81%,98.63%,98.94%,94.27%,and 98.46%,respectively.The FSFS method,with a computational complexity of O(fn log n),demonstrates robust scalability and is well-suited for datasets of large size,ensuring efficient processing even when the number of features is substantial.By automatically eliminating outliers and redundant data,FSFS reduces computational overhead,resulting in faster training and improved model performance.Overall,the FSFS framework not only optimizes performance but also enhances the interpretability and explainability of data-driven models,thereby facilitating more trustworthy decision-making in AI applications.展开更多
With the development of More Electric Aircraft(MEA),the Permanent Magnet Synchronous Motor(PMSM)is widely used in the MEA field.The PMSM control system of MEA needs to consider the system reliability,and the inverter ...With the development of More Electric Aircraft(MEA),the Permanent Magnet Synchronous Motor(PMSM)is widely used in the MEA field.The PMSM control system of MEA needs to consider the system reliability,and the inverter switching frequency of the inverter is one of the impacting factors.At the same time,the control accuracy of the system also needs to be considered,and the torque ripple and flux ripple are usually considered to be its important indexes.This paper proposes a three-stage series Model Predictive Torque and Flux Control system(three-stage series MPTFC)based on fast optimal voltage vector selection to reduce switching frequency and suppress torque ripple and flux ripple.Firstly,the analytical model of the PMSM is established and the multi-stage series control method is used to reduce the switching frequency.Secondly,selectable voltage vectors are extended from 8 to 26 and a fast selection method for optimal voltage vector sectors is designed based on the hysteresis comparator,which can suppress the torque ripple and flux ripple to improve the control accuracy.Thirdly,a three-stage series control is obtained by expanding the two-stage series control using the P-Q torque decomposition theory.Finally,a model predictive torque and flux control experimental platform is built,and the feasibility and effectiveness of this method are verified through comparison experiments.展开更多
This paper briefs the configuration and performance of large size gas turbines and their composed combined cycle power plants designed and produced by four large renown gas turbine manufacturing firms in the world, pr...This paper briefs the configuration and performance of large size gas turbines and their composed combined cycle power plants designed and produced by four large renown gas turbine manufacturing firms in the world, providing reference for the relevant sectors and enterprises in importing advanced gas turbines and technologies.展开更多
In this paper, based on the theory of parameter estimation, we give a selection method and, in a sense of a good character of the parameter estimation, we think that it is very reasonable. Moreover, we offer a calcula...In this paper, based on the theory of parameter estimation, we give a selection method and, in a sense of a good character of the parameter estimation, we think that it is very reasonable. Moreover, we offer a calculation method of selection statistic and an applied example.展开更多
An improved social force model based on exit selection is proposed to simulate pedestrians' microscopic behaviors in subway station. The modification lies in considering three factors of spatial distance, occupant...An improved social force model based on exit selection is proposed to simulate pedestrians' microscopic behaviors in subway station. The modification lies in considering three factors of spatial distance, occupant density and exit width. In addition, the problem of pedestrians selecting exit frequently is solved as follows: not changing to other exits in the affected area of one exit, using the probability of remaining preceding exit and invoking function of exit selection after several simulation steps. Pedestrians in subway station have some special characteristics, such as explicit destinations, different familiarities with subway station. Finally, Beijing Zoo Subway Station is taken as an example and the feasibility of the model results is verified through the comparison of the actual data and simulation data. The simulation results show that the improved model can depict the microscopic behaviors of pedestrians in subway station.展开更多
In the helicopter transmission systems, it is important to monitor and track the tooth damage evolution using lots of sensors and detection methods. This paper develops a novel approach for sensor selection based on p...In the helicopter transmission systems, it is important to monitor and track the tooth damage evolution using lots of sensors and detection methods. This paper develops a novel approach for sensor selection based on physical model and sensitivity analysis. Firstly, a physical model of tooth damage and mesh stiffness is built. Secondly, some effective condition indicators (Cls) are presented, and the optimal Cls set is selected by comparing their test statistics according to Mann-Kendall test. Afterwards, the selected CIs are used to generate a health indicator (HI) through sen slop estimator. Then, the sensors are selected according to the monotonic relevance and sensitivity to the damage levels. Finally, the proposed method is verified by the simulation and experimental data. The results show that the approach can provide a guide for health monitor- ing of helicopter transmission systems, and it is effective to reduce the test cost and improve the system's reliability.展开更多
The traditional model selection criterions try to make a balance between fitted error and model complexity. Assumptions on the distribution of the response or the noise, which may be misspecified, should be made befor...The traditional model selection criterions try to make a balance between fitted error and model complexity. Assumptions on the distribution of the response or the noise, which may be misspecified, should be made before using the traditional ones. In this ar- ticle, we give a new model selection criterion, based on the assumption that noise term in the model is independent with explanatory variables, of minimizing the association strength between regression residuals and the response, with fewer assumptions. Maximal Information Coe^cient (MIC), a recently proposed dependence measure, captures a wide range of associ- ations, and gives almost the same score to different type of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, partial maximal information coefficient (PMIC) is introduced to capture the association between two variables removing a third controlling random variable. In addition, the definition of general partial relationship is given.展开更多
The performance of six statistical approaches,which can be used for selection of the best model to describe the growth of individual fish,was analyzed using simulated and real length-at-age data.The six approaches inc...The performance of six statistical approaches,which can be used for selection of the best model to describe the growth of individual fish,was analyzed using simulated and real length-at-age data.The six approaches include coefficient of determination(R2),adjusted coefficient of determination(adj.-R2),root mean squared error(RMSE),Akaike's information criterion(AIC),bias correction of AIC(AICc) and Bayesian information criterion(BIC).The simulation data were generated by five growth models with different numbers of parameters.Four sets of real data were taken from the literature.The parameters in each of the five growth models were estimated using the maximum likelihood method under the assumption of the additive error structure for the data.The best supported model by the data was identified using each of the six approaches.The results show that R2 and RMSE have the same properties and perform worst.The sample size has an effect on the performance of adj.-R2,AIC,AICc and BIC.Adj.-R2 does better in small samples than in large samples.AIC is not suitable to use in small samples and tends to select more complex model when the sample size becomes large.AICc and BIC have best performance in small and large sample cases,respectively.Use of AICc or BIC is recommended for selection of fish growth model according to the size of the length-at-age data.展开更多
We present Turing pattern selection in a reaction-diffusion epidemic model under zero-flux boundary conditions. The value of this study is twofold. First, it establishes the amplitude equations for the excited modes, ...We present Turing pattern selection in a reaction-diffusion epidemic model under zero-flux boundary conditions. The value of this study is twofold. First, it establishes the amplitude equations for the excited modes, which determines the stability of amplitudes towards uniform and inhomogeneous perturbations. Second, it illustrates all five categories of Turing patterns close to the onset of Turing bifurcation via numerical simulations which indicates that the model dynamics exhibits complex pattern replication: on increasing the control parameter v, the sequence "H0 hexagons → H0-hexagon-stripe mixtures →stripes → Hπ-hexagon-stripe mixtures → Hπ hexagons" is observed. This may enrich the pattern dynamics in a diffusive epidemic model.展开更多
Covariance functions have been proposed as an alternative to model longitudinal data in animal breeding because of their various merits in comparison to the classical analytical methods.In practical estimation,differe...Covariance functions have been proposed as an alternative to model longitudinal data in animal breeding because of their various merits in comparison to the classical analytical methods.In practical estimation,different models and polynomial orders fitted can influence the estimates of covariance functions and thus genetic parameters.The objective of this study was to select model for estimation of covariance functions for body weights of Angora goats at 7 time points.Covariance functions were estimated by fitting 6 random regression models with birth year,birth month,sex,age of dam,birth type,and relative birth date as fixed effects.Random effects involved were direct and maternal additive genetic,and animal and maternal permanent environmental effects with different orders of fit.Selection of model and orders of fit were carried out by likelihood ratio test and 4 types of information criteria.The results showed that model with 6 orders of polynomial fit for direct additive genetic and animal permanent environmental effects and 4 and 5 orders for maternal genetic and permanent environmental effects,respectively,were preferable for estimation of covariance functions.Models with and without maternal effects influenced the estimates of covariance functions greatly.Maternal permanent environmental effect does not explain the variation of all permanent environments,well suggesting different sources of permanent environmental effects also has large influence on covariance function estimates.展开更多
Genomic selection(GS)can be used to accelerate genetic improvement by shortening the selection interval.The successful application of GS depends largely on the accuracy of the prediction of genomic estimated breeding ...Genomic selection(GS)can be used to accelerate genetic improvement by shortening the selection interval.The successful application of GS depends largely on the accuracy of the prediction of genomic estimated breeding value(GEBV).This study is a fi rst attempt to understand the practicality of GS in Litopenaeus vannamei and aims to evaluate models for GS on growth traits.The performance of GS models in L.vannamei was evaluated in a population consisting of 205 individuals,which were genotyped for 6 359 single nucleotide polymorphism(SNP)markers by specifi c length amplifi ed fragment sequencing(SLAF-seq)and phenotyped for body length and body weight.Three GS models(RR-BLUP,Bayes A,and Bayesian LASSO)were used to obtain the GEBV,and their predictive ability was assessed by the reliability of the GEBV and the bias of the predicted phenotypes.The mean reliability of the GEBVs for body length and body weight predicted by the dif ferent models was 0.296 and 0.411,respectively.For each trait,the performances of the three models were very similar to each other with respect to predictability.The regression coeffi cients estimated by the three models were close to one,suggesting near to zero bias for the predictions.Therefore,when GS was applied in a L.vannamei population for the studied scenarios,all three models appeared practicable.Further analyses suggested that improved estimation of the genomic prediction could be realized by increasing the size of the training population as well as the density of SNPs.展开更多
The test selection and optimization (TSO) can improve the abilities of fault diagnosis, prognosis and health-state evalua- tion for prognostics and health management (PHM) systems. Traditionally, TSO mainly focuse...The test selection and optimization (TSO) can improve the abilities of fault diagnosis, prognosis and health-state evalua- tion for prognostics and health management (PHM) systems. Traditionally, TSO mainly focuses on fault detection and isolation, but they cannot provide an effective guide for the design for testability (DFT) to improve the PHM performance level. To solve the problem, a model of TSO for PHM systems is proposed. Firstly, through integrating the characteristics of fault severity and propa- gation time, and analyzing the test timing and sensitivity, a testability model based on failure evolution mechanism model (FEMM) for PHM systems is built up. This model describes the fault evolution- test dependency using the fault-symptom parameter matrix and symptom parameter-test matrix. Secondly, a novel method of in- herent testability analysis for PHM systems is developed based on the above information. Having completed the analysis, a TSO model, whose objective is to maximize fault trackability and mini- mize the test cost, is proposed through inherent testability analysis results, and an adaptive simulated annealing genetic algorithm (ASAGA) is introduced to solve the TSO problem. Finally, a case of a centrifugal pump system is used to verify the feasibility and effectiveness of the proposed models and methods. The results show that the proposed technology is important for PHM systems to select and optimize the test set in order to improve their performance level.展开更多
The optimal selection of radar clutter model is the premise of target detection,tracking,recognition,and cognitive waveform design in clutter background.Clutter characterization models are usually derived by mathemati...The optimal selection of radar clutter model is the premise of target detection,tracking,recognition,and cognitive waveform design in clutter background.Clutter characterization models are usually derived by mathematical simplification or empirical data fitting.However,the lack of standard model labels is a challenge in the optimal selection process.To solve this problem,a general three-level evaluation system for the model selection performance is proposed,including model selection accuracy index based on simulation data,fit goodness indexs based on the optimally selected model,and evaluation index based on the supporting performance to its third-party.The three-level evaluation system can more comprehensively and accurately describe the selection performance of the radar clutter model in different ways,and can be popularized and applied to the evaluation of other similar characterization model selection.展开更多
In this paper we reparameterize covariance structures in longitudinal data analysis through the modified Cholesky decomposition of itself. Based on this modified Cholesky decomposition, the within-subject covariance m...In this paper we reparameterize covariance structures in longitudinal data analysis through the modified Cholesky decomposition of itself. Based on this modified Cholesky decomposition, the within-subject covariance matrix is decomposed into a unit lower triangular matrix involving moving average coefficients and a diagonal matrix involving innovation variances, which are modeled as linear functions of covariates. Then, we propose a penalized maximum likelihood method for variable selection in joint mean and covariance models based on this decomposition. Under certain regularity conditions, we establish the consistency and asymptotic normality of the penalized maximum likelihood estimators of parameters in the models. Simulation studies are undertaken to assess the finite sample performance of the proposed variable selection procedure.展开更多
基金Supported by the National Natural Science Foundation of China(12261018)Universities Key Laboratory of Mathematical Modeling and Data Mining in Guizhou Province(2023013)。
文摘In this paper,we establish and study a single-species logistic model with impulsive age-selective harvesting.First,we prove the ultimate boundedness of the solutions of the system.Then,we obtain conditions for the asymptotic stability of the trivial solution and the positive periodic solution.Finally,numerical simulations are presented to validate our results.Our results show that age-selective harvesting is more conducive to sustainable population survival than non-age-selective harvesting.
基金supported by the National Natural Science Foundation of China(42250101)the Macao Foundation。
文摘Earth’s internal core and crustal magnetic fields,as measured by geomagnetic satellites like MSS-1(Macao Science Satellite-1)and Swarm,are vital for understanding core dynamics and tectonic evolution.To model these internal magnetic fields accurately,data selection based on specific criteria is often employed to minimize the influence of rapidly changing current systems in the ionosphere and magnetosphere.However,the quantitative impact of various data selection criteria on internal geomagnetic field modeling is not well understood.This study aims to address this issue and provide a reference for constructing and applying geomagnetic field models.First,we collect the latest MSS-1 and Swarm satellite magnetic data and summarize widely used data selection criteria in geomagnetic field modeling.Second,we briefly describe the method to co-estimate the core,crustal,and large-scale magnetospheric fields using satellite magnetic data.Finally,we conduct a series of field modeling experiments with different data selection criteria to quantitatively estimate their influence.Our numerical experiments confirm that without selecting data from dark regions and geomagnetically quiet times,the resulting internal field differences at the Earth’s surface can range from tens to hundreds of nanotesla(nT).Additionally,we find that the uncertainties introduced into field models by different data selection criteria are significantly larger than the measurement accuracy of modern geomagnetic satellites.These uncertainties should be considered when utilizing constructed magnetic field models for scientific research and applications.
基金Supported by National Institute For Health and Care Research,No.NIHR302632.
文摘BACKGROUND Relieving pain is central to the early management of knee osteoarthritis,with a plethora of pharmacological agents licensed for this purpose.Intra-articular corticosteroid injections are a widely used option,albeit with variable efficacy.AIM To develop a machine learning(ML)model that predicts which patients will benefit from corticosteroid injections.METHODS Data from two prospective cohort studies[Osteoarthritis(OA)Initiative and Multicentre OA Study]was combined.The primary outcome was patientreported pain score following corticosteroid injection,assessed using the Western Ontario and McMaster Universities OA pain scale,with significant change defined using minimally clinically important difference and meaningful within person change.A ML algorithm was developed,utilizing linear discriminant analysis,to predict symptomatic improvement,and examine the association between pain scores and patient factors by calculating the sensitivity,specificity,positive predictive value,negative predictive value,accuracy,and F2 score.RESULTS A total of 330 patients were included,with a mean age of 63.4(SD:8.3).The mean Western Ontario and McMaster Universities OA pain score was 5.2(SD:4.1),with only 25.5%of patients achieving significant improvement in pain following corticosteroid injection.The ML model generated an accuracy of 67.8%(95%confidence interval:64.6%-70.9%),F1 score of 30.8%,and an area under the curve score of 0.60.CONCLUSION The model demonstrated feasibility to assist clinicians with decision-making in patient selection for corticosteroid injections.Further studies are required to improve the model prior to testing in clinical settings.
基金Supported by the Natural Science Foundation of Fujian Province(2022J011177,2024J01903)the Key Project of Fujian Provincial Education Department(JZ230054)。
文摘In clinical research,subgroup analysis can help identify patient groups that respond better or worse to specific treatments,improve therapeutic effect and safety,and is of great significance in precision medicine.This article considers subgroup analysis methods for longitudinal data containing multiple covariates and biomarkers.We divide subgroups based on whether a linear combination of these biomarkers exceeds a predetermined threshold,and assess the heterogeneity of treatment effects across subgroups using the interaction between subgroups and exposure variables.Quantile regression is used to better characterize the global distribution of the response variable and sparsity penalties are imposed to achieve variable selection of covariates and biomarkers.The effectiveness of our proposed methodology for both variable selection and parameter estimation is verified through random simulations.Finally,we demonstrate the application of this method by analyzing data from the PA.3 trial,further illustrating the practicality of the method proposed in this paper.
基金supported by the National Key Research and Development Program of China(Grant No.2021YFB3702404)the National Natural Science Foundation of China(Grant No.52104370)+4 种基金the Reviving-Liaoning Excellence Plan(XLYC2203186)Science and Technology Special Projects of Liaoning Province(Grant No.2022JH25/10200001)the Postdoctoral Research Fund for Northeastern(Grant No.20210203)Independent Projects of Basic Scientific Research(ZZ2021005)CITIC Niobium Steel Development Award Fund(2022-M1824).
文摘Selecting proper descriptors(also known feature selection,FS)is key in the process of establishing mechanical properties prediction model of hot-rolled microalloyed steels by using machine learning(ML)algorithm.FS methods based on data-driving can reduce the redundancy of data features and improve the prediction accuracy of mechanical properties.Based on the collected data of hot-rolled microalloyed steels,the association rules are used to mine the correlation information between the data.High-quality feature subsets are selected by the proposed FS method(FS method based on genetic algorithm embedding,GAMIC).Compared with the common FS method,it is shown on dataset that GAMIC selects feature subsets more appropriately.Six different ML algorithms are trained and tested for mechanical properties prediction.The result shows that the root-mean-square error of yield strength,tensile strength and elongation based on limit gradient enhancement(XGBoost)algorithm is 21.95 MPa,20.85 MPa and 1.96%,the correlation coefficient(R^(2))is 0.969,0.968 and 0.830,and the mean absolute error is 16.84 MPa,15.83 MPa and 1.48%,respectively,showing the best prediction performance.Finally,SHapley Additive exPlanation is used to further explore the influence of feature variables on mechanical properties.GAMIC feature selection method proposed is universal,which provides a basis for the development of high-precision mechanical property prediction model.
基金National Natural Science Foundation of China(62161048)Sichuan Science and Technology Program(2022NSFSC0547,2022ZYD0109)。
文摘In this paper,a feature selection method for determining input parameters in antenna modeling is proposed.In antenna modeling,the input feature of artificial neural network(ANN)is geometric parameters.The selection criteria contain correlation and sensitivity between the geometric parameter and the electromagnetic(EM)response.Maximal information coefficient(MIC),an exploratory data mining tool,is introduced to evaluate both linear and nonlinear correlations.The EM response range is utilized to evaluate the sensitivity.The wide response range corresponding to varying values of a parameter implies the parameter is highly sensitive and the narrow response range suggests the parameter is insensitive.Only the parameter which is highly correlative and sensitive is selected as the input of ANN,and the sampling space of the model is highly reduced.The modeling of a wideband and circularly polarized antenna is studied as an example to verify the effectiveness of the proposed method.The number of input parameters decreases from8 to 4.The testing errors of|S_(11)|and axis ratio are reduced by8.74%and 8.95%,respectively,compared with the ANN with no feature selection.
文摘Feature selection(FS)is a pivotal pre-processing step in developing data-driven models,influencing reliability,performance and optimization.Although existing FS techniques can yield high-performance metrics for certain models,they do not invariably guarantee the extraction of the most critical or impactful features.Prior literature underscores the significance of equitable FS practices and has proposed diverse methodologies for the identification of appropriate features.However,the challenge of discerning the most relevant and influential features persists,particularly in the context of the exponential growth and heterogeneity of big data—a challenge that is increasingly salient in modern artificial intelligence(AI)applications.In response,this study introduces an innovative,automated statistical method termed Farea Similarity for Feature Selection(FSFS).The FSFS approach computes a similarity metric for each feature by benchmarking it against the record-wise mean,thereby finding feature dependencies and mitigating the influence of outliers that could potentially distort evaluation outcomes.Features are subsequently ranked according to their similarity scores,with the threshold established at the average similarity score.Notably,lower FSFS values indicate higher similarity and stronger data correlations,whereas higher values suggest lower similarity.The FSFS method is designed not only to yield reliable evaluation metrics but also to reduce data complexity without compromising model performance.Comparative analyses were performed against several established techniques,including Chi-squared(CS),Correlation Coefficient(CC),Genetic Algorithm(GA),Exhaustive Approach,Greedy Stepwise Approach,Gain Ratio,and Filtered Subset Eval,using a variety of datasets such as the Experimental Dataset,Breast Cancer Wisconsin(Original),KDD CUP 1999,NSL-KDD,UNSW-NB15,and Edge-IIoT.In the absence of the FSFS method,the highest classifier accuracies observed were 60.00%,95.13%,97.02%,98.17%,95.86%,and 94.62%for the respective datasets.When the FSFS technique was integrated with data normalization,encoding,balancing,and feature importance selection processes,accuracies improved to 100.00%,97.81%,98.63%,98.94%,94.27%,and 98.46%,respectively.The FSFS method,with a computational complexity of O(fn log n),demonstrates robust scalability and is well-suited for datasets of large size,ensuring efficient processing even when the number of features is substantial.By automatically eliminating outliers and redundant data,FSFS reduces computational overhead,resulting in faster training and improved model performance.Overall,the FSFS framework not only optimizes performance but also enhances the interpretability and explainability of data-driven models,thereby facilitating more trustworthy decision-making in AI applications.
基金co-supported by the National Natural Science Foundation of China(No.52477063)the National Key Research and Development Program of China(No.2023YFF0719100)。
文摘With the development of More Electric Aircraft(MEA),the Permanent Magnet Synchronous Motor(PMSM)is widely used in the MEA field.The PMSM control system of MEA needs to consider the system reliability,and the inverter switching frequency of the inverter is one of the impacting factors.At the same time,the control accuracy of the system also needs to be considered,and the torque ripple and flux ripple are usually considered to be its important indexes.This paper proposes a three-stage series Model Predictive Torque and Flux Control system(three-stage series MPTFC)based on fast optimal voltage vector selection to reduce switching frequency and suppress torque ripple and flux ripple.Firstly,the analytical model of the PMSM is established and the multi-stage series control method is used to reduce the switching frequency.Secondly,selectable voltage vectors are extended from 8 to 26 and a fast selection method for optimal voltage vector sectors is designed based on the hysteresis comparator,which can suppress the torque ripple and flux ripple to improve the control accuracy.Thirdly,a three-stage series control is obtained by expanding the two-stage series control using the P-Q torque decomposition theory.Finally,a model predictive torque and flux control experimental platform is built,and the feasibility and effectiveness of this method are verified through comparison experiments.
文摘This paper briefs the configuration and performance of large size gas turbines and their composed combined cycle power plants designed and produced by four large renown gas turbine manufacturing firms in the world, providing reference for the relevant sectors and enterprises in importing advanced gas turbines and technologies.
基金Supported by the Natural Science Foundation of Anhui Education Committee
文摘In this paper, based on the theory of parameter estimation, we give a selection method and, in a sense of a good character of the parameter estimation, we think that it is very reasonable. Moreover, we offer a calculation method of selection statistic and an applied example.
基金Project(T14JB00200)supported by the Fundamental Research Funds for the Central UniversitiesChina+2 种基金Projects(RCS2012ZZ002RCS2012ZT003)supported by the State Key Laboratory of Rail Traffic Control and SafetyChina
文摘An improved social force model based on exit selection is proposed to simulate pedestrians' microscopic behaviors in subway station. The modification lies in considering three factors of spatial distance, occupant density and exit width. In addition, the problem of pedestrians selecting exit frequently is solved as follows: not changing to other exits in the affected area of one exit, using the probability of remaining preceding exit and invoking function of exit selection after several simulation steps. Pedestrians in subway station have some special characteristics, such as explicit destinations, different familiarities with subway station. Finally, Beijing Zoo Subway Station is taken as an example and the feasibility of the model results is verified through the comparison of the actual data and simulation data. The simulation results show that the improved model can depict the microscopic behaviors of pedestrians in subway station.
基金supported by the National Natural Science Foundation of China (No. 51175502)
文摘In the helicopter transmission systems, it is important to monitor and track the tooth damage evolution using lots of sensors and detection methods. This paper develops a novel approach for sensor selection based on physical model and sensitivity analysis. Firstly, a physical model of tooth damage and mesh stiffness is built. Secondly, some effective condition indicators (Cls) are presented, and the optimal Cls set is selected by comparing their test statistics according to Mann-Kendall test. Afterwards, the selected CIs are used to generate a health indicator (HI) through sen slop estimator. Then, the sensors are selected according to the monotonic relevance and sensitivity to the damage levels. Finally, the proposed method is verified by the simulation and experimental data. The results show that the approach can provide a guide for health monitor- ing of helicopter transmission systems, and it is effective to reduce the test cost and improve the system's reliability.
基金partly supported by National Basic Research Program of China(973 Program,2011CB707802,2013CB910200)National Science Foundation of China(11201466)
文摘The traditional model selection criterions try to make a balance between fitted error and model complexity. Assumptions on the distribution of the response or the noise, which may be misspecified, should be made before using the traditional ones. In this ar- ticle, we give a new model selection criterion, based on the assumption that noise term in the model is independent with explanatory variables, of minimizing the association strength between regression residuals and the response, with fewer assumptions. Maximal Information Coe^cient (MIC), a recently proposed dependence measure, captures a wide range of associ- ations, and gives almost the same score to different type of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, partial maximal information coefficient (PMIC) is introduced to capture the association between two variables removing a third controlling random variable. In addition, the definition of general partial relationship is given.
基金Supported by the High Technology Research and Development Program of China (863 Program,No2006AA100301)
文摘The performance of six statistical approaches,which can be used for selection of the best model to describe the growth of individual fish,was analyzed using simulated and real length-at-age data.The six approaches include coefficient of determination(R2),adjusted coefficient of determination(adj.-R2),root mean squared error(RMSE),Akaike's information criterion(AIC),bias correction of AIC(AICc) and Bayesian information criterion(BIC).The simulation data were generated by five growth models with different numbers of parameters.Four sets of real data were taken from the literature.The parameters in each of the five growth models were estimated using the maximum likelihood method under the assumption of the additive error structure for the data.The best supported model by the data was identified using each of the six approaches.The results show that R2 and RMSE have the same properties and perform worst.The sample size has an effect on the performance of adj.-R2,AIC,AICc and BIC.Adj.-R2 does better in small samples than in large samples.AIC is not suitable to use in small samples and tends to select more complex model when the sample size becomes large.AICc and BIC have best performance in small and large sample cases,respectively.Use of AICc or BIC is recommended for selection of fish growth model according to the size of the length-at-age data.
基金Project supported by the Natural Science Foundation of Zhejiang Province of China (Grant No.Y7080041)
文摘We present Turing pattern selection in a reaction-diffusion epidemic model under zero-flux boundary conditions. The value of this study is twofold. First, it establishes the amplitude equations for the excited modes, which determines the stability of amplitudes towards uniform and inhomogeneous perturbations. Second, it illustrates all five categories of Turing patterns close to the onset of Turing bifurcation via numerical simulations which indicates that the model dynamics exhibits complex pattern replication: on increasing the control parameter v, the sequence "H0 hexagons → H0-hexagon-stripe mixtures →stripes → Hπ-hexagon-stripe mixtures → Hπ hexagons" is observed. This may enrich the pattern dynamics in a diffusive epidemic model.
基金funded by the Young Academic Leaders Supporting Project in Institutions of Higher Education of Shanxi Province,China
文摘Covariance functions have been proposed as an alternative to model longitudinal data in animal breeding because of their various merits in comparison to the classical analytical methods.In practical estimation,different models and polynomial orders fitted can influence the estimates of covariance functions and thus genetic parameters.The objective of this study was to select model for estimation of covariance functions for body weights of Angora goats at 7 time points.Covariance functions were estimated by fitting 6 random regression models with birth year,birth month,sex,age of dam,birth type,and relative birth date as fixed effects.Random effects involved were direct and maternal additive genetic,and animal and maternal permanent environmental effects with different orders of fit.Selection of model and orders of fit were carried out by likelihood ratio test and 4 types of information criteria.The results showed that model with 6 orders of polynomial fit for direct additive genetic and animal permanent environmental effects and 4 and 5 orders for maternal genetic and permanent environmental effects,respectively,were preferable for estimation of covariance functions.Models with and without maternal effects influenced the estimates of covariance functions greatly.Maternal permanent environmental effect does not explain the variation of all permanent environments,well suggesting different sources of permanent environmental effects also has large influence on covariance function estimates.
基金Supported by the National High Technology Research and Development Program of China(863 Program)(No.2012AA10A404)the National Natural Science Foundation of China(No.31502161)Financially Supported by Qingdao National Laboratory for Marine Science and Technology(No.2015ASKJ02)
文摘Genomic selection(GS)can be used to accelerate genetic improvement by shortening the selection interval.The successful application of GS depends largely on the accuracy of the prediction of genomic estimated breeding value(GEBV).This study is a fi rst attempt to understand the practicality of GS in Litopenaeus vannamei and aims to evaluate models for GS on growth traits.The performance of GS models in L.vannamei was evaluated in a population consisting of 205 individuals,which were genotyped for 6 359 single nucleotide polymorphism(SNP)markers by specifi c length amplifi ed fragment sequencing(SLAF-seq)and phenotyped for body length and body weight.Three GS models(RR-BLUP,Bayes A,and Bayesian LASSO)were used to obtain the GEBV,and their predictive ability was assessed by the reliability of the GEBV and the bias of the predicted phenotypes.The mean reliability of the GEBVs for body length and body weight predicted by the dif ferent models was 0.296 and 0.411,respectively.For each trait,the performances of the three models were very similar to each other with respect to predictability.The regression coeffi cients estimated by the three models were close to one,suggesting near to zero bias for the predictions.Therefore,when GS was applied in a L.vannamei population for the studied scenarios,all three models appeared practicable.Further analyses suggested that improved estimation of the genomic prediction could be realized by increasing the size of the training population as well as the density of SNPs.
基金supported by the National Natural Science Foundation of China(51175502)
文摘The test selection and optimization (TSO) can improve the abilities of fault diagnosis, prognosis and health-state evalua- tion for prognostics and health management (PHM) systems. Traditionally, TSO mainly focuses on fault detection and isolation, but they cannot provide an effective guide for the design for testability (DFT) to improve the PHM performance level. To solve the problem, a model of TSO for PHM systems is proposed. Firstly, through integrating the characteristics of fault severity and propa- gation time, and analyzing the test timing and sensitivity, a testability model based on failure evolution mechanism model (FEMM) for PHM systems is built up. This model describes the fault evolution- test dependency using the fault-symptom parameter matrix and symptom parameter-test matrix. Secondly, a novel method of in- herent testability analysis for PHM systems is developed based on the above information. Having completed the analysis, a TSO model, whose objective is to maximize fault trackability and mini- mize the test cost, is proposed through inherent testability analysis results, and an adaptive simulated annealing genetic algorithm (ASAGA) is introduced to solve the TSO problem. Finally, a case of a centrifugal pump system is used to verify the feasibility and effectiveness of the proposed models and methods. The results show that the proposed technology is important for PHM systems to select and optimize the test set in order to improve their performance level.
基金the National Natural Science Foundation of China(6187138461921001).
文摘The optimal selection of radar clutter model is the premise of target detection,tracking,recognition,and cognitive waveform design in clutter background.Clutter characterization models are usually derived by mathematical simplification or empirical data fitting.However,the lack of standard model labels is a challenge in the optimal selection process.To solve this problem,a general three-level evaluation system for the model selection performance is proposed,including model selection accuracy index based on simulation data,fit goodness indexs based on the optimally selected model,and evaluation index based on the supporting performance to its third-party.The three-level evaluation system can more comprehensively and accurately describe the selection performance of the radar clutter model in different ways,and can be popularized and applied to the evaluation of other similar characterization model selection.
文摘In this paper we reparameterize covariance structures in longitudinal data analysis through the modified Cholesky decomposition of itself. Based on this modified Cholesky decomposition, the within-subject covariance matrix is decomposed into a unit lower triangular matrix involving moving average coefficients and a diagonal matrix involving innovation variances, which are modeled as linear functions of covariates. Then, we propose a penalized maximum likelihood method for variable selection in joint mean and covariance models based on this decomposition. Under certain regularity conditions, we establish the consistency and asymptotic normality of the penalized maximum likelihood estimators of parameters in the models. Simulation studies are undertaken to assess the finite sample performance of the proposed variable selection procedure.