Feature selection (FS) is a pivotal pre-processing step in developing data-driven models, influencing reliability, performance and optimization. Although existing FS techniques can yield high performance metrics for certain models, they do not guarantee the extraction of the most critical or impactful features. Prior literature underscores the significance of equitable FS practices and has proposed diverse methodologies for identifying appropriate features. However, the challenge of discerning the most relevant and influential features persists, particularly given the exponential growth and heterogeneity of big data, a challenge that is increasingly salient in modern artificial intelligence (AI) applications. In response, this study introduces an automated statistical method termed Farea Similarity for Feature Selection (FSFS). The FSFS approach computes a similarity metric for each feature by benchmarking it against the record-wise mean, thereby finding feature dependencies and mitigating the influence of outliers that could distort evaluation outcomes. Features are then ranked according to their similarity scores, with the threshold set at the average similarity score. Notably, lower FSFS values indicate higher similarity and stronger data correlations, whereas higher values suggest lower similarity. The FSFS method is designed not only to yield reliable evaluation metrics but also to reduce data complexity without compromising model performance. Comparative analyses were performed against several established techniques, including Chi-squared (CS), Correlation Coefficient (CC), Genetic Algorithm (GA), Exhaustive Approach, Greedy Stepwise Approach, Gain Ratio, and Filtered Subset Eval, using a variety of datasets: the Experimental Dataset, Breast Cancer Wisconsin (Original), KDD CUP 1999, NSL-KDD, UNSW-NB15, and Edge-IIoT. Without the FSFS method, the highest classifier accuracies observed were 60.00%, 95.13%, 97.02%, 98.17%, 95.86%, and 94.62% for the respective datasets. When the FSFS technique was integrated with data normalization, encoding, balancing, and feature importance selection, the accuracies were 100.00%, 97.81%, 98.63%, 98.94%, 94.27%, and 98.46%, respectively, an improvement on every dataset except the fifth listed (UNSW-NB15), where accuracy fell from 95.86% to 94.27%. The FSFS method, with a computational complexity of O(fn log n), scales well and is suited to large datasets, remaining efficient even when the number of features is substantial. By automatically eliminating outliers and redundant data, FSFS reduces computational overhead, resulting in faster training and improved model performance. Overall, the FSFS framework not only optimizes performance but also enhances the interpretability and explainability of data-driven models, thereby facilitating more trustworthy decision-making in AI applications.
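The ranking-and-threshold procedure this abstract describes can be illustrated with a minimal sketch. The abstract does not give the FSFS formula, so the score below (mean absolute gap between a feature and the record-wise mean) is only an assumed stand-in, and the rule of keeping features at or below the average score is likewise an assumption. The function names and toy data are hypothetical, and the features are assumed to be on a comparable scale already.

```python
from statistics import mean

def fsfs_like_scores(rows):
    """Score each feature by its mean absolute gap from the record-wise mean.

    ASSUMPTION: this distance is an illustrative stand-in; the published
    FSFS metric is not specified in the abstract.
    """
    n_features = len(rows[0])
    record_means = [mean(row) for row in rows]  # one mean per record (row)
    return [
        mean(abs(row[j] - m) for row, m in zip(rows, record_means))
        for j in range(n_features)
    ]

def select_features(rows):
    """Rank features by score and threshold at the average score."""
    scores = fsfs_like_scores(rows)
    threshold = mean(scores)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j])
    # ASSUMPTION: a lower score means higher similarity, so low scorers are kept.
    kept = [j for j in ranked if scores[j] <= threshold]
    return scores, threshold, kept

# Hypothetical toy data: three records, three features on a comparable scale.
rows = [
    [1.0, 1.2, 9.0],
    [2.0, 2.1, 0.5],
    [3.0, 2.9, 7.0],
]
scores, threshold, kept = select_features(rows)
```

In this toy data the third feature swings far from the record-wise mean, so it scores above the average and is dropped, while the first two are retained.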
Detecting cyber attacks in networks connected to the Internet of Things (IoT) is of utmost importance because of the growing vulnerabilities in smart environments. Conventional models, such as Naive Bayes and support vector machine (SVM), as well as ensemble methods, such as Gradient Boosting and eXtreme gradient boosting (XGBoost), often incur high computational costs, which makes real-time detection challenging. We therefore propose an attack detection approach that integrates Visual Geometry Group 16 (VGG16), the Artificial Rabbits Optimizer (ARO), and a random forest model to increase detection accuracy and operational efficiency in IoT networks. In the proposed model, VGG16 extracts features from malware images, and the random forest model makes predictions from the extracted features. ARO is used to tune the hyperparameters of the random forest model. With an accuracy of 96.36%, the proposed model outperforms the standard models in terms of accuracy, F1-score, precision, and recall. The comparison shows that the approach improves performance while maintaining a lower computational cost, which makes it both effective and well suited to real-time applications.
Selecting proper descriptors (also known as feature selection, FS) is key to establishing a mechanical property prediction model for hot-rolled microalloyed steels using machine learning (ML) algorithms. Data-driven FS methods can reduce the redundancy of data features and improve the prediction accuracy of mechanical properties. Based on data collected for hot-rolled microalloyed steels, association rules are used to mine the correlation information in the data. High-quality feature subsets are selected by the proposed FS method (an FS method based on genetic algorithm embedding, GAMIC). Results on the dataset show that GAMIC selects more appropriate feature subsets than common FS methods. Six different ML algorithms are trained and tested for mechanical property prediction. With the extreme gradient boosting (XGBoost) algorithm, the root-mean-square errors of yield strength, tensile strength and elongation are 21.95 MPa, 20.85 MPa and 1.96%, the correlation coefficients (R^2) are 0.969, 0.968 and 0.830, and the mean absolute errors are 16.84 MPa, 15.83 MPa and 1.48%, respectively, the best prediction performance among the six. Finally, SHapley Additive exPlanation is used to further explore the influence of feature variables on mechanical properties. The proposed GAMIC feature selection method is general and provides a basis for developing high-precision mechanical property prediction models.
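For readers comparing the three error measures quoted in this abstract, the following is a minimal sketch of how they are commonly computed. It assumes that R^2 denotes the usual coefficient of determination (the abstract calls it a correlation coefficient and does not define it), and the strength values are made-up illustrations, not data from the study.

```python
from math import sqrt

def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = sqrt(ss_res / n)                        # root-mean-square error
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # ASSUMPTION: R^2 is taken as the coefficient of determination.
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2

# Hypothetical yield strengths in MPa (not from the paper).
y_true = [400.0, 450.0, 500.0, 550.0]
y_pred = [410.0, 440.0, 510.0, 540.0]
rmse, mae, r2 = regression_metrics(y_true, y_pred)
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two can differ on the same predictions even though they coincide in this toy case.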
The authors regret that the original publication of this paper did not include Jawad Fayaz as a co-author. After further discussions and a thorough review of the research contributions, it was agreed that his significant contributions to the foundational aspects of the research warranted recognition, and he has now been added as a co-author.
In this paper, a feature selection method for determining input parameters in antenna modeling is proposed. In antenna modeling, the input features of the artificial neural network (ANN) are geometric parameters. The selection criteria are the correlation and the sensitivity between each geometric parameter and the electromagnetic (EM) response. The maximal information coefficient (MIC), an exploratory data mining tool, is introduced to evaluate both linear and nonlinear correlations. The EM response range is used to evaluate sensitivity: a wide response range over the varying values of a parameter implies that the parameter is highly sensitive, and a narrow response range suggests that it is insensitive. Only parameters that are both highly correlated and sensitive are selected as inputs to the ANN, which greatly reduces the sampling space of the model. The modeling of a wideband circularly polarized antenna is studied as an example to verify the effectiveness of the proposed method. The number of input parameters decreases from 8 to 4. The testing errors of |S_11| and axial ratio are reduced by 8.74% and 8.95%, respectively, compared with the ANN without feature selection.
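The two-criterion selection rule in this abstract can be sketched as follows. The sketch takes the correlation scores as given inputs rather than computing MIC, and the thresholds, parameter names, and sweep values are all hypothetical; the abstract does not state how the two criteria are thresholded or combined.

```python
def response_range(responses):
    """Sensitivity proxy: spread of the EM response as one parameter varies."""
    return max(responses) - min(responses)

def select_parameters(correlation, sweeps, corr_min, range_min):
    """Keep parameters that are both highly correlated and sensitive.

    ASSUMPTION: simple fixed thresholds on both criteria; the paper's
    actual cut-offs are not given in the abstract.
    """
    selected = []
    for name, responses in sweeps.items():
        if correlation[name] >= corr_min and response_range(responses) >= range_min:
            selected.append(name)
    return selected

# Hypothetical correlation scores (e.g. MIC values computed elsewhere).
correlation = {"patch_length": 0.82, "slot_width": 0.75, "feed_offset": 0.20}
# Hypothetical |S_11| responses in dB while sweeping each parameter alone.
sweeps = {
    "patch_length": [-8.0, -14.0, -22.0],
    "slot_width": [-15.0, -15.4, -15.1],
    "feed_offset": [-9.0, -18.0, -12.0],
}
selected = select_parameters(correlation, sweeps, corr_min=0.5, range_min=3.0)
```

Here `slot_width` is rejected for its narrow response range and `feed_offset` for its low correlation, so only `patch_length` passes both tests.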
Feature modeling is the key to realizing CAD/CAPP/CAM and the information integration of concurrent engineering. This paper describes a method for the advanced development of a feature-based parametric modeling system using the I-DEAS 5 system. It elaborates the feature-based modeling technique and generates feature-based product information models that provide abundant information for subsequent applications. Developing the feature modeling system on a commercial CAD software platform takes full advantage of the solid modeling resources of the existing software, reduces investment and shortens the development cycle of new systems.
It is well known that the human auditory system possesses remarkable capabilities to analyze and identify signals. Building an auditory model based on the mechanism of the human auditory system could therefore improve mechanical signal analysis and enrich the methods available for extracting mechanical fault features. However, existing methods are all based on explicit mathematical or physical meaning and have shortcomings in distinguishing different faults, in stability, and in suppressing disturbance noise. To improve feature extraction performance, an auditory model, the early auditory (EA) model, is introduced for the first time. This model transforms a time-domain signal into an auditory spectrum via bandpass filtering, nonlinear compression, and lateral inhibition, simulating the principle of the human auditory system. The EA model is developed with the Gammatone filterbank as the basilar membrane. According to the characteristics of vibration signals, a method is proposed for determining the parameter of the inner hair cell model within the EA model. The performance of the EA model is evaluated through experiments on four rotor faults: misalignment, rotor-to-stator rubbing, oil film whirl, and pedestal looseness. The results show that the auditory spectrum, the output of the EA model, can effectively distinguish different faults with satisfactory stability and can suppress disturbance noise. It is therefore feasible to apply the auditory model as a new and effective method of feature extraction for mechanical fault diagnosis.
This paper proposes an approach to developing a feature-based parametric product modeling system suitable for integrated engineering design in a CIMS environment. The architecture of ZD-MCADII and the characteristics of each of its modules are introduced in detail. ZD-MCADII's product data is managed by OSCAR, an object-oriented database management system, and the product model is built according to the STEP standard. Product design is based on a unified product model, and all product data are globally associated in ZD-MCADII. ZD-MCADII provides various design features to facilitate product design and supports the integration of CAD, CAPP and CAM.
Sanduao is an important sea-breeding bay in Fujian, South China and holds a high economic status in aquaculture. Quickly and accurately obtaining information including the distribution area, quantity, and aquaculture area is important for breeding area planning, production value estimation, ecological survey, and storm surge prevention. However, as the aquaculture area expands, the seawater background becomes increasingly complex and spectral characteristics differ dramatically, making it difficult to determine the aquaculture area. In this study, we applied a deep-learning Richer Convolutional Features (RCF) network model to a high-resolution GF-2 remote-sensing satellite image to extract the aquaculture area. We then used the density of aquaculture as an index to assess the vulnerability of aquaculture areas in Sanduao. The results demonstrate that this method does not require land and water to be separated in advance, and that good extraction can be achieved in areas with more sediment and waves, with an extraction accuracy above 93%, which makes it suitable for large-scale aquaculture area extraction. The vulnerability assessment indicates that the density of aquaculture in the eastern part of Sanduao is considerably high, giving it a higher vulnerability level than other parts.
A new digital modulation recognition method is proposed in this paper, based on features extracted from the generalized autoregressive (GAR) model parameters of the received waveform and on a multilayer perceptron (MLP) neural network classifier. Because of the better noise suppression ability of the GAR model and the powerful pattern classification capacity of the MLP classifier, the new method significantly improves recognition performance at lower SNR, with better robustness. Computer simulations are performed to assess the performance of the new method.
Aerodynamic surrogate modeling mostly relies only on integrated loads data obtained from simulation or experiment, neglecting and wasting the valuable distributed physical information on the surface. To make full use of both integrated and distributed loads, a modeling paradigm called heterogeneous data-driven aerodynamic modeling is presented. The essential concept is to incorporate the physical information of distributed loads as additional constraints within end-to-end aerodynamic modeling. For heterogeneous data, a novel and easily applicable physical feature embedding modeling framework is designed. This framework extracts low-dimensional physical features from the pressure distribution and then enhances the modeling of the integrated loads via feature embedding. The proposed framework can be coupled with multiple feature extraction methods, and its generalization capability over different airfoils is verified through a transonic case. Compared with traditional direct modeling, the proposed framework can reduce testing errors by almost 50%. For the same prediction accuracy, it can save more than half of the training samples. Furthermore, visualization analysis reveals a significant correlation between the discovered low-dimensional physical features and the heterogeneous aerodynamic loads, which supports the interpretability and credibility of the performance offered by the proposed deep learning framework.
In recent years, convolutional neural networks (CNNs) have been applied successfully in many fields. However, these deep neural models are still considered a "black box" for most tasks. One of the fundamental issues underlying this problem is understanding which features are most influential in image recognition tasks and how CNNs process these features. It is widely believed that CNN models combine low-level features to form complex shapes until the object can be readily classified; however, several recent studies have argued that texture features are more important than other features. In this paper, we assume that the importance of certain features varies depending on the specific task, that is, specific tasks exhibit feature bias. We designed two classification tasks based on human intuition to train deep neural models to identify the anticipated biases. We designed experiments comprising many tasks to test these biases in the ResNet and DenseNet models. From the results, we conclude that (1) the combined effect of certain features is typically far more influential than any single feature; and (2) neural models can exhibit different biases in different tasks, that is, a specific task can be designed to bias a neural model towards a specific anticipated feature.
To accurately describe the dynamic characteristics of flight vehicles through aerodynamic modeling, an adaptive wavelet neural network (AWNN) aerodynamic modeling method is proposed, based on subset kernel principal components analysis (SKPCA) feature extraction. First, fuzzy C-means clustering is used to select samples from the training set to form a sample subset. Then, SKPCA is performed on this subset to extract the basic features of the training samples. Finally, the AWNN aerodynamic model is established using the extracted features. The experimental results show that, over 50 repeated modeling runs, the modeling ability of the proposed method is better than that of six other methods. It needs only about half the modeling time of KPCA-AWNN at a similar prediction accuracy, and its model parameters are easy to determine. The method is therefore effective and feasible for constructing aerodynamic models of flight vehicles.
To accurately describe damage within coal, digital image processing technology was used to determine texture parameters and obtain quantitative information related to coal meso-cracks. The relationship between damage and mesoscopic information for coal under compression was then analysed. The shape and distribution of damage were both considered in a damage variable defined on the basis of the texture characteristic. An elastic-brittle damage model based on the mesostructure information of coal was established. The damage model can appropriately and reliably replicate the initiation, expansion, cut-through and eventual destruction processes of microscopic damage to coal under compression. A comparison showed that the overall stress-strain response predicted by the model was comparable to the experimental result.
With increasing intelligence and integration, a great number of two-valued variables (generally stored as 0 or 1) often exist in large-scale industrial processes. However, these variables cannot be effectively handled by traditional monitoring methods such as linear discriminant analysis (LDA), principal component analysis (PCA) and partial least squares (PLS) analysis. Recently, a mixed hidden naive Bayesian model (MHNBM) was developed for the first time to use both two-valued and continuous variables for abnormality monitoring. Although the MHNBM is effective, it still has some shortcomings. In the MHNBM, variables with greater correlation to other variables have greater weights, which cannot guarantee that greater weights are assigned to the more discriminating variables. In addition, the conditional probability P(x_j | x_j′, y = k) must be computed from historical data. When the training data is scarce, the conditional probability between continuous variables tends to be uniformly distributed, which degrades the performance of the MHNBM. Here a novel feature weighted mixed naive Bayes model (FWMNBM) is developed to overcome these shortcomings. In the FWMNBM, the variables that are more correlated with the class have greater weights, so the more discriminating variables contribute more to the model. At the same time, the FWMNBM does not have to calculate the conditional probability between variables, so it is less restricted by the number of training samples. Compared with the MHNBM, the FWMNBM performs better, and its effectiveness is validated through a numerical simulation example and a practical case at the Zhoushan thermal power plant (ZTPP), China.
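For orientation, a generic feature-weighted naive Bayes decision rule is written out here. This is the common textbook form and is offered only as an assumption about the general shape of such models; the abstract does not give the FWMNBM equations, so the weights w_j and the number of variables d are illustrative symbols rather than the paper's notation.

```latex
\hat{y} = \arg\max_{k} \left[ \log P(y = k) + \sum_{j=1}^{d} w_j \, \log P(x_j \mid y = k) \right]
```

Setting every w_j to 1 recovers standard naive Bayes. Because each term conditions only on the class, a rule of this form needs no pairwise term such as P(x_j | x_j′, y = k), which is consistent with the abstract's remark that the conditional probability between variables is not calculated.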
Modern medicine is reliant on various medical imaging technologies for non-invasively observing patients' anatomy. However, the interpretation of medical images can be highly subjective and dependent on the expertise of clinicians. Moreover, some potentially useful quantitative information in medical images, especially that which is not visible to the naked eye, is often ignored during clinical practice. In contrast, radiomics performs high-throughput feature extraction from medical images, which enables quantitative analysis of medical images and prediction of various clinical endpoints. Studies have reported that radiomics exhibits promising performance in diagnosis and in predicting treatment responses and prognosis, demonstrating its potential to be a non-invasive auxiliary tool for personalized medicine. However, radiomics remains in a developmental phase, as numerous technical challenges have yet to be solved, especially in feature engineering and statistical modeling. In this review, we introduce the current utility of radiomics by summarizing research on its application in the diagnosis, prognosis, and prediction of treatment responses in patients with cancer. We focus on machine learning approaches for feature extraction and selection during feature engineering, and for imbalanced datasets and multi-modality fusion during statistical modeling. Furthermore, we discuss the stability, reproducibility, and interpretability of features, and the generalizability and interpretability of models. Finally, we offer possible solutions to current challenges in radiomics research.
A novel adaptive subspace ensemble slow feature regression model was developed for soft sensing applications. Compared with traditional single models and random subspace models, the proposed method is improved in three aspects. First, sub-datasets are constructed through slow feature directions, and the variables in each sub-dataset are selected according to an output-related importance index. Second, an adaptive slow feature regression is presented for the sub-models. Finally, a Bayesian inference strategy based on the monitoring statistics of slow feature analysis is developed for probabilistic combination. Two industrial examples were used to evaluate the proposed method.
A tyre pressure monitoring system (TPMS) is compulsory in many countries and regions, such as the United States and the European Union. Existing systems depend on pressure sensors strapped to the tyre or on wheel speed sensor data, where a difference in wheel speed triggers an alarm according to the implemented algorithm. In this paper, a machine learning approach is proposed as a new method of monitoring tyre pressure, using an accelerometer to capture the vertical vibrations of the wheel hub of a moving vehicle. Statistical features and histogram features are computed from the acquired signals in the feature extraction step. The LMT (Logistic Model Tree) was used as the classifier and attained a classification accuracy of 92.5% with 10-fold cross validation for statistical features and 90.5% with 10-fold cross validation for histogram features. The proposed model can be used to monitor automobile tyre pressure successfully.
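The two feature families named in this abstract can be illustrated with a short sketch. The abstract does not say which statistics or which bin layout were used, so the particular statistics, the four equal-width bins, and the sample acceleration values below are all assumptions chosen for illustration; the LMT classifier itself is not shown.

```python
from statistics import mean, pstdev

def statistical_features(signal):
    """Summary statistics of one vibration window (an assumed feature set)."""
    return {
        "mean": mean(signal),
        "std": pstdev(signal),          # population standard deviation
        "min": min(signal),
        "max": max(signal),
        "range": max(signal) - min(signal),
    }

def histogram_features(signal, n_bins, low, high):
    """Counts of samples in n_bins equal-width bins spanning [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for x in signal:
        if low <= x <= high:
            # Clamp so that x == high falls into the last bin.
            idx = min(int((x - low) / width), n_bins - 1)
            counts[idx] += 1
    return counts

# Hypothetical vertical acceleration samples (arbitrary units).
signal = [-0.4, -0.1, 0.0, 0.1, 0.3, 0.5, -0.2, 0.2]
stats = statistical_features(signal)
hist = histogram_features(signal, n_bins=4, low=-0.5, high=0.5)
```

Each window of the signal would yield one such feature vector, which is what a classifier is then trained on.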
Owing to the global financial crisis, risk management has received significant attention as a means of avoiding loss and maximizing profit in business. Since the financial crisis prediction (FCP) process is mainly based on data-driven decision making and intelligent models, artificial intelligence (AI) and machine learning (ML) models are widely used. This article introduces an intelligent feature selection with deep learning based financial risk assessment model (IFSDL-FRA). The proposed IFSDL-FRA technique aims to determine the financial crisis of a company or enterprise. It involves the design of a new water strider optimization algorithm based feature selection (WSOA-FS) method for the optimal selection of feature subsets. A Deep Random Vector Functional Link network (DRVFLN) classification technique is then applied to assign class labels to the financial data. Furthermore, an improved fruit fly optimization algorithm (IFFOA) based hyperparameter tuning process is carried out to optimally tune the hyperparameters of the DRVFLN model. To evaluate the performance of the IFSDL-FRA technique, an extensive set of simulations was carried out on benchmark financial datasets, and the results show that IFSDL-FRA outperforms recent state-of-the-art approaches.
Funding for the IoT attack detection study: Institutional Fund Projects, grant no. IFPDP-261-22.
Funding for the hot-rolled microalloyed steel study: National Key Research and Development Program of China (Grant No. 2021YFB3702404); National Natural Science Foundation of China (Grant No. 52104370); Reviving-Liaoning Excellence Plan (XLYC2203186); Science and Technology Special Projects of Liaoning Province (Grant No. 2022JH25/10200001); Postdoctoral Research Fund for Northeastern (Grant No. 20210203); Independent Projects of Basic Scientific Research (ZZ2021005); CITIC Niobium Steel Development Award Fund (2022-M1824).
文摘Selecting proper descriptors(also known feature selection,FS)is key in the process of establishing mechanical properties prediction model of hot-rolled microalloyed steels by using machine learning(ML)algorithm.FS methods based on data-driving can reduce the redundancy of data features and improve the prediction accuracy of mechanical properties.Based on the collected data of hot-rolled microalloyed steels,the association rules are used to mine the correlation information between the data.High-quality feature subsets are selected by the proposed FS method(FS method based on genetic algorithm embedding,GAMIC).Compared with the common FS method,it is shown on dataset that GAMIC selects feature subsets more appropriately.Six different ML algorithms are trained and tested for mechanical properties prediction.The result shows that the root-mean-square error of yield strength,tensile strength and elongation based on limit gradient enhancement(XGBoost)algorithm is 21.95 MPa,20.85 MPa and 1.96%,the correlation coefficient(R^(2))is 0.969,0.968 and 0.830,and the mean absolute error is 16.84 MPa,15.83 MPa and 1.48%,respectively,showing the best prediction performance.Finally,SHapley Additive exPlanation is used to further explore the influence of feature variables on mechanical properties.GAMIC feature selection method proposed is universal,which provides a basis for the development of high-precision mechanical property prediction model.
Abstract: The authors regret that the original publication of this paper did not include Jawad Fayaz as a co-author. After further discussions and a thorough review of the research contributions, it was agreed that his significant contributions to the foundational aspects of the research warranted recognition, and he has now been added as a co-author.
Funding: National Natural Science Foundation of China (62161048); Sichuan Science and Technology Program (2022NSFSC0547, 2022ZYD0109).
Abstract: In this paper, a feature selection method for determining input parameters in antenna modeling is proposed. In antenna modeling, the input features of the artificial neural network (ANN) are geometric parameters. The selection criteria are the correlation and the sensitivity between each geometric parameter and the electromagnetic (EM) response. The maximal information coefficient (MIC), an exploratory data mining tool, is introduced to evaluate both linear and nonlinear correlations. The EM response range is used to evaluate sensitivity: a wide response range over the varying values of a parameter implies that the parameter is highly sensitive, and a narrow response range suggests that it is insensitive. Only parameters that are both highly correlated and sensitive are selected as inputs of the ANN, so the sampling space of the model is greatly reduced. The modeling of a wideband circularly polarized antenna is studied as an example to verify the effectiveness of the proposed method. The number of input parameters decreases from 8 to 4. The testing errors of |S_(11)| and axial ratio are reduced by 8.74% and 8.95%, respectively, compared with the ANN without feature selection.
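The two-criterion selection described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the absolute Pearson correlation is swapped in for MIC (so it captures only linear dependence), and the parameter names, thresholds, and sweep data are all hypothetical.

```python
import math


def abs_pearson(xs, ys):
    """Absolute Pearson correlation; a linear stand-in for MIC (assumption)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)


def response_range(responses):
    """Sensitivity proxy: spread of the response while one parameter varies."""
    return max(responses) - min(responses)


def select_parameters(sweeps, corr_min, range_min):
    """Keep parameters that are both correlated with and sensitive to the response."""
    selected = []
    for name, (values, responses) in sweeps.items():
        correlated = abs_pearson(values, responses) >= corr_min
        sensitive = response_range(responses) >= range_min
        if correlated and sensitive:
            selected.append(name)
    return selected


# Hypothetical one-at-a-time sweeps: parameter values -> response in dB.
sweeps = {
    "patch_length": ([1.0, 2.0, 3.0, 4.0], [-5.0, -10.0, -15.0, -20.0]),
    "slot_width": ([1.0, 2.0, 3.0, 4.0], [-10.0, -10.2, -9.9, -10.1]),
    "feed_offset": ([1.0, 2.0, 3.0, 4.0], [-8.0, -16.0, -7.0, -15.0]),
}
chosen = select_parameters(sweeps, corr_min=0.8, range_min=3.0)
print(chosen)
```

In this toy data, `slot_width` fails the sensitivity test and `feed_offset` fails the correlation test, so only `patch_length` survives; requiring both criteria is what shrinks the sampling space.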
Abstract: Feature modeling is the key to realizing CAD/CAPP/CAM and the information integration of concurrent engineering. This paper describes a method for the secondary development of a feature-based parametric modeling system on the I-DEAS 5 system. It elaborates the feature-based modeling technique and generates feature-based product information models that provide abundant information for subsequent applications. Developing the feature modeling system on a commercial CAD software platform takes full advantage of the solid modeling resources of the existing software, reduces investment, and shortens the development cycle of new systems.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 50805021).
Abstract: It is well known that the human auditory system possesses remarkable capabilities to analyze and identify signals. It is therefore worthwhile to build an auditory model based on the mechanism of the human auditory system, which may improve mechanical signal analysis and enrich the methods of mechanical fault feature extraction. However, existing methods are all based on explicit mathematical or physical meaning and have shortcomings in distinguishing different faults, in stability, and in suppressing disturbance noise. To improve feature extraction performance, an auditory model, the early auditory (EA) model, is introduced for the first time. This model transforms a time-domain signal into an auditory spectrum via bandpass filtering, nonlinear compression, and lateral inhibition, simulating the principle of the human auditory system. The EA model is developed with the Gammatone filterbank as the basilar membrane. According to the characteristics of vibration signals, a method is proposed for determining the parameter of the inner hair cell model of the EA model. The performance of the EA model is evaluated through experiments on four rotor faults: misalignment, rotor-to-stator rubbing, oil film whirl, and pedestal looseness. The results show that the auditory spectrum, the output of the EA model, can effectively distinguish different faults with satisfactory stability and can suppress disturbance noise. It is therefore feasible to apply the auditory model as a new method of feature extraction for mechanical fault diagnosis.
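As a rough illustration of the bandpass-filtering stage mentioned above, the sketch below samples the impulse response of a single gammatone filter, g(t) = t^(n-1) exp(-2πbt) cos(2πf_c t). The filter order of 4, the bandwidth rule b = 1.019·ERB(f_c), and the Glasberg-Moore ERB approximation are conventional choices from auditory modeling and are assumptions here; the abstract does not state the parameters used for vibration signals.

```python
import math


def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore approximation)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def gammatone_ir(fc_hz, fs_hz, duration_s, order=4):
    """Unnormalized gammatone impulse response sampled at fs_hz.

    Assumption: bandwidth b = 1.019 * ERB(fc), the usual auditory choice.
    """
    b = 1.019 * erb(fc_hz)
    n_samples = round(duration_s * fs_hz)
    ir = []
    for n in range(n_samples):
        t = n / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    return ir


# One channel centred at 1 kHz, 10 ms long, sampled at 16 kHz (illustrative values).
ir = gammatone_ir(fc_hz=1000.0, fs_hz=16000.0, duration_s=0.01)
print(len(ir), erb(1000.0))
```

A filterbank would repeat this for a set of centre frequencies and convolve each impulse response with the input signal; the compression and lateral-inhibition stages are not sketched here.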
Abstract: This paper proposes an approach to developing a feature-based parametric product modeling system that is suitable for integrated engineering design in a CIMS environment. The architecture of ZD-MCADII and the characteristics of each of its modules are introduced in detail. ZD-MCADII's product data is managed by the object-oriented database management system OSCAR, and the product model is built according to the STEP standard. The product design is established on a unified product model, and all product data are globally associated in ZD-MCADII. ZD-MCADII provides various design features to facilitate product design and supports the integration of CAD, CAPP and CAM.
Funding: Supported by the National Key Research and Development Program of China (No. 2016YFC1402003), the National Natural Science Foundation of China (No. 41671436), and the Innovation Project of LREIS (No. O88RAA01YA).
Abstract: Sanduao is an important sea-breeding bay in Fujian, South China, and holds a high economic status in aquaculture. Quickly and accurately obtaining information such as the distribution, quantity, and extent of aquaculture is important for breeding area planning, production value estimation, ecological surveys, and storm surge prevention. However, as the aquaculture area expands, the seawater background becomes increasingly complex and spectral characteristics differ dramatically, making it difficult to delineate the aquaculture area. In this study, we applied a deep-learning Richer Convolutional Features (RCF) network model to a high-resolution GF-2 remote-sensing satellite image to extract the aquaculture area. We then used aquaculture density as an index to assess the vulnerability of aquaculture areas in Sanduao. The results demonstrate that this method does not require land and water to be separated in advance and that good extraction can be achieved in areas with more sediment and waves, with an extraction accuracy >93%, which makes it suitable for large-scale aquaculture area extraction. The vulnerability assessment indicates that aquaculture density in the eastern part of Sanduao is considerably high, giving it a higher vulnerability level than other parts.
Abstract: This paper proposes a new digital modulation recognition method based on features extracted from the generalized autoregressive (GAR) model parameters of the received waveform and a multilayer perceptron (MLP) neural network classifier. Because of the better noise suppression ability of the GAR model and the powerful pattern classification capacity of the MLP classifier, the new method significantly improves recognition performance at low SNR, with better robustness. Computer simulations are performed to assess the performance of the new method.
Funding: Supported by the National Natural Science Foundation of China (Nos. 92152301, 12072282).
Abstract: Aerodynamic surrogate modeling mostly relies only on integrated loads data obtained from simulation or experiment, neglecting and wasting the valuable distributed physical information on the surface. To make full use of both integrated and distributed loads, a modeling paradigm called heterogeneous data-driven aerodynamic modeling is presented. The essential concept is to incorporate the physical information of distributed loads as additional constraints within end-to-end aerodynamic modeling. For heterogeneous data, a novel and easily applicable physical feature embedding modeling framework is designed. This framework extracts low-dimensional physical features from the pressure distribution and then enhances the modeling of the integrated loads via feature embedding. The proposed framework can be coupled with multiple feature extraction methods, and its good generalization over different airfoils is verified through a transonic case. Compared with traditional direct modeling, the proposed framework reduces testing errors by almost 50%. For the same prediction accuracy, it saves more than half of the training samples. Furthermore, visualization analysis reveals a significant correlation between the discovered low-dimensional physical features and the heterogeneous aerodynamic loads, which supports the interpretability and credibility of the superior performance of the proposed deep learning framework.
Funding: National Natural Science Foundation of China, Grant/Award Number: 61936001; Natural Science Foundation of Chongqing, Grant/Award Number: cstc2019jcyj-msxmX0380; China Postdoctoral Science Foundation, Grant/Award Number: 2021M700562.
Abstract: In recent years, convolutional neural networks (CNNs) have been applied successfully in many fields. However, these deep neural models are still regarded as "black boxes" for most tasks. One of the fundamental issues underlying this problem is understanding which features are most influential in image recognition tasks and how CNNs process these features. It is widely believed that CNN models combine low-level features to form complex shapes until the object can be readily classified; however, several recent studies have argued that texture features are more important than other features. In this paper, we assume that the importance of certain features varies depending on the specific task, that is, specific tasks exhibit feature bias. We designed two classification tasks based on human intuition to train deep neural models to identify the anticipated biases. We designed experiments comprising many tasks to test these biases in the ResNet and DenseNet models. From the results, we conclude that (1) the combined effect of certain features is typically far more influential than any single feature; and (2) neural models can exhibit different biases in different tasks, that is, a specific task can be designed to bias a neural model towards a specific anticipated feature.
Funding: Project (51209167) supported by the Youth Project of the National Natural Science Foundation of China; Project (2012JM8026) supported by the Shaanxi Provincial Natural Science Foundation, China.
Abstract: To accurately describe the dynamic characteristics of flight vehicles through aerodynamic modeling, an adaptive wavelet neural network (AWNN) aerodynamic modeling method is proposed, based on subset kernel principal component analysis (SKPCA) feature extraction. First, fuzzy C-means clustering is used to select samples from the training set to form a sample subset. Then, SKPCA is performed on this subset to extract the basic features of the training samples. Finally, the AWNN aerodynamic model is established using the extracted features. The experimental results show that, over 50 repeated modeling runs, the modeling ability of the proposed method is better than that of six other methods. At a similar prediction accuracy it needs only about half the modeling time of KPCA-AWNN, and its model parameters are easy to determine. The method is therefore effective and feasible for aerodynamic modeling of flight vehicles.
Funding: Funded by the National Natural Science Foundation of China (Nos. 51474039 and 51404046), the Project of Shanxi Provincial Federation of Coalbed Methane Research (No. 2013012010), and the Science Foundation of North University of China (No. XJJ2016033).
Abstract: To accurately describe damage within coal, digital image processing technology was used to determine texture parameters and obtain quantitative information on coal meso-cracks. The relationship between damage and mesoscopic information for coal under compression was then analysed. The shape and distribution of damage were both taken into account in a damage variable defined from the texture characteristics. An elastic-brittle damage model based on the mesostructure information of coal was established. The damage model appropriately and reliably replicates the initiation, expansion, cut-through and eventual destruction of microscopic damage in coal under compression. A comparison showed that the overall stress-strain response predicted by the model was comparable to the experimental result.
Funding: Supported by the National Natural Science Foundation of China (62033008, 61873143).
Abstract: With increasing intelligence and integration, a great number of two-valued variables (generally stored as 0 or 1) often exist in large-scale industrial processes. However, these variables cannot be handled effectively by traditional monitoring methods such as linear discriminant analysis (LDA), principal component analysis (PCA) and partial least squares (PLS) analysis. Recently, a mixed hidden naive Bayesian model (MHNBM) was developed for the first time to use both two-valued and continuous variables for abnormality monitoring. Although the MHNBM is effective, it still has some shortcomings. In the MHNBM, variables with greater correlation to other variables have greater weights, which cannot guarantee that greater weights are assigned to the more discriminating variables. In addition, the conditional probability P(x_j | x_j', y = k) must be computed from historical data. When training data is scarce, the conditional probability between continuous variables tends to be uniformly distributed, which degrades the performance of the MHNBM. Here, a novel feature weighted mixed naive Bayes model (FWMNBM) is developed to overcome these shortcomings. In the FWMNBM, variables that are more correlated with the class have greater weights, so that the more discriminating variables contribute more to the model. At the same time, the FWMNBM does not have to calculate the conditional probability between variables, so it is less restricted by the number of training samples. Compared with the MHNBM, the FWMNBM performs better, and its effectiveness is validated through a numerical simulation example and a practical case at the Zhoushan thermal power plant (ZTPP), China.
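The idea of weighting each variable's contribution in a naive Bayes classifier over mixed data can be sketched as below. This is a generic weighted naive Bayes under stated assumptions, not the authors' FWMNBM: two-valued variables are modeled as Bernoulli with Laplace smoothing, continuous variables as Gaussian, the class score is log P(y) + Σ_j w_j log p(x_j | y), and the weights are simply supplied by the caller because the abstract does not say how they are computed. All names and data are hypothetical.

```python
import math
from collections import defaultdict


def fit(rows, labels, binary_idx, cont_idx):
    """Estimate per-class priors, Bernoulli and Gaussian parameters."""
    by_class = defaultdict(list)
    for row, y in zip(rows, labels):
        by_class[y].append(row)
    model = {}
    for y, members in by_class.items():
        n = len(members)
        # Laplace-smoothed probability that a two-valued variable equals 1.
        bern = {j: (sum(r[j] for r in members) + 1) / (n + 2) for j in binary_idx}
        gauss = {}
        for j in cont_idx:
            vals = [r[j] for r in members]
            mu = sum(vals) / n
            var = sum((v - mu) ** 2 for v in vals) / n + 1e-6  # variance floor
            gauss[j] = (mu, var)
        model[y] = (n / len(rows), bern, gauss)
    return model


def predict(model, weights, row):
    """Return the class with the highest weighted log-posterior score."""
    best, best_score = None, -math.inf
    for y, (prior, bern, gauss) in model.items():
        score = math.log(prior)
        for j, p in bern.items():
            score += weights[j] * math.log(p if row[j] == 1 else 1.0 - p)
        for j, (mu, var) in gauss.items():
            log_pdf = -0.5 * math.log(2.0 * math.pi * var) - (row[j] - mu) ** 2 / (2.0 * var)
            score += weights[j] * log_pdf
        if score > best_score:
            best, best_score = y, score
    return best


# Hypothetical data: column 0 is a two-valued flag, column 1 a continuous reading.
rows = [(0, 1.0), (0, 1.2), (0, 0.8), (1, 3.0), (1, 3.2), (1, 2.8)]
labels = ["normal"] * 3 + ["fault"] * 3
model = fit(rows, labels, binary_idx=[0], cont_idx=[1])
weights = {0: 1.0, 1: 1.0}  # assumed weights; equal weights give plain naive Bayes
print(predict(model, weights, (1, 3.1)), predict(model, weights, (0, 0.9)))
```

Because each variable is scored against the class alone, no conditional probability between pairs of variables is needed, which is the property the abstract credits for robustness to small training sets.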
Funding: Supported in part by the National Natural Science Foundation of China (82072019), the Shenzhen Basic Research Program (JCYJ20210324130209023), the Shenzhen-Hong Kong-Macao S&T Program (Category C) (SGDX20201103095002019), the Mainland-Hong Kong Joint Funding Scheme (MHKJFS) (MHP/005/20), the Project of Strategic Importance Fund (P0035421) and the Projects of RISA (P0043001) from the Hong Kong Polytechnic University, the Natural Science Foundation of Jiangsu Province (BK20201441), the Provincial and Ministry Co-constructed Project of Henan Province Medical Science and Technology Research (SBGJ202103038, SBGJ202102056), the Henan Province Key R&D and Promotion Project (Science and Technology Research) (222102310015), the Natural Science Foundation of Henan Province (222300420575), and the Henan Province Science and Technology Research (222102310322).
Abstract: Modern medicine relies on various medical imaging technologies for non-invasively observing patients' anatomy. However, the interpretation of medical images can be highly subjective and dependent on the expertise of clinicians. Moreover, some potentially useful quantitative information in medical images, especially information that is not visible to the naked eye, is often ignored in clinical practice. In contrast, radiomics performs high-throughput feature extraction from medical images, which enables quantitative analysis of the images and prediction of various clinical endpoints. Studies have reported that radiomics shows promising performance in diagnosis and in predicting treatment responses and prognosis, demonstrating its potential as a non-invasive auxiliary tool for personalized medicine. However, radiomics remains in a developmental phase because numerous technical challenges have yet to be solved, especially in feature engineering and statistical modeling. In this review, we introduce the current utility of radiomics by summarizing research on its application in the diagnosis, prognosis, and prediction of treatment responses in patients with cancer. We focus on machine learning approaches for feature extraction and selection during feature engineering, and for imbalanced datasets and multi-modality fusion during statistical modeling. Furthermore, we discuss the stability, reproducibility, and interpretability of features, and the generalizability and interpretability of models. Finally, we offer possible solutions to current challenges in radiomics research.
Funding: Supported by the National Natural Science Foundation of China (No. 21676086).
Abstract: A novel adaptive subspace ensemble slow feature regression model was developed for soft sensing applications. Compared with traditional single models and random subspace models, the proposed method is improved in three respects. First, sub-datasets are constructed along slow feature directions, and the variables in each sub-dataset are selected according to an output-related importance index. Second, an adaptive slow feature regression is presented for the sub-models. Finally, a Bayesian inference strategy based on the monitoring statistics of slow feature analysis is developed for probabilistic combination. Two industrial examples were used to evaluate the proposed method.
Abstract: A tyre pressure monitoring system (TPMS) is compulsory in many jurisdictions, such as the United States and the European Union. Existing systems depend on pressure sensors strapped to the tyre or on wheel speed sensor data, where a difference in wheel speed triggers an alarm according to the implemented algorithm. In this paper, a machine learning approach is proposed as a new method to monitor tyre pressure by measuring the vertical vibrations of the wheel hub of a moving vehicle with an accelerometer. Statistical features and histogram features are computed from the acquired signals in the feature extraction step. A Logistic Model Tree (LMT) was used as the classifier and attained a classification accuracy of 92.5% with 10-fold cross validation for statistical features and 90.5% with 10-fold cross validation for histogram features. The proposed model can be used to monitor automobile tyre pressure successfully.
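The two feature families named in the abstract above can be illustrated with a minimal sketch. The specific statistics, the bin count, the bin limits, and the sample signal are assumptions for illustration; the abstract does not list which features or bins were used.

```python
import statistics


def statistical_features(signal):
    """A few common descriptive statistics of one vibration record."""
    return {
        "mean": statistics.mean(signal),
        "std": statistics.pstdev(signal),   # population standard deviation
        "min": min(signal),
        "max": max(signal),
        "range": max(signal) - min(signal),
    }


def histogram_features(signal, n_bins, low, high):
    """Counts of samples in n_bins equal-width bins spanning [low, high]."""
    counts = [0] * n_bins
    width = (high - low) / n_bins
    for x in signal:
        if low <= x <= high:
            # The upper edge value falls into the last bin.
            idx = min(int((x - low) / width), n_bins - 1)
            counts[idx] += 1
    return counts


# Hypothetical accelerometer samples (arbitrary units).
signal = [0.0, 1.0, -1.0, 2.0, -2.0, 0.5, -0.5, 0.0]
feats = statistical_features(signal)
hist = histogram_features(signal, n_bins=4, low=-2.0, high=2.0)
print(feats, hist)
```

Either feature vector (the statistics or the bin counts) would then be passed to a classifier; the bin limits must be fixed across all records so that the histogram features are comparable.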
Abstract: Since the global financial crisis, risk management has received significant attention as a way to avoid loss and maximize profit in any business. Because the financial crisis prediction (FCP) process is mainly based on data-driven decision making and intelligent models, artificial intelligence (AI) and machine learning (ML) models are widely used. This article introduces an intelligent feature selection with deep learning based financial risk assessment model (IFSDL-FRA). The proposed IFSDL-FRA technique aims to determine whether a company or enterprise is in financial crisis. The technique involves the design of a new water strider optimization algorithm based feature selection (WSOA-FS) method for the optimal selection of feature subsets. A Deep Random Vector Functional Link network (DRVFLN) classifier is then applied to assign class labels to the financial data. Furthermore, an improved fruit fly optimization algorithm (IFFOA) is used to tune the hyperparameters of the DRVFLN model. To evaluate the performance of the IFSDL-FRA technique, an extensive set of simulations was carried out on benchmark financial datasets, and the results show that the IFSDL-FRA technique improves on recent state-of-the-art approaches.