High-dimensional data causes difficulties in machine learning due to high time consumption and large memory requirements. In particular, in a multi-label environment, the complexity grows with the number of labels. Moreover, an optimization problem that fully considers all dependencies between features and labels is difficult to solve. In this study, we propose a novel regression-based multi-label feature selection method that integrates mutual information to better exploit the underlying data structure. By incorporating mutual information into the regression formulation, the model captures not only linear relationships but also complex non-linear dependencies. The proposed objective function simultaneously considers three types of relationships: (1) feature redundancy, (2) feature-label relevance, and (3) inter-label dependency. These three quantities are computed using mutual information, allowing the proposed formulation to capture nonlinear dependencies among variables. These three types of relationships are key factors in multi-label feature selection, and our method expresses them within a unified formulation, enabling efficient optimization while simultaneously accounting for all of them. To efficiently solve the proposed optimization problem under non-negativity constraints, we develop a gradient-based optimization algorithm with fast convergence. The experimental results on seven multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques.
Because the hydraulic directional valve usually operates in a harsh working environment and is disturbed by multi-factor noise, traditional single-sensor monitoring technology struggles to diagnose it accurately. Therefore, a fault diagnosis method based on multi-sensor information fusion is proposed in this paper to reduce the inaccuracy and uncertainty of traditional single-sensor diagnosis and to accurately locate and diagnose early faults in such valves in noisy environments. Firstly, the statistical features of the signals collected by multiple sensors are extracted, and deep features are obtained by a convolutional neural network (CNN), to form a complete and stable multi-dimensional feature set. Secondly, to obtain a weighted multi-dimensional feature set, the multi-dimensional feature sets of similar sensors are combined, and the entropy weight method is used to weight these features to reduce the interference of insensitive features. Finally, an attention mechanism is introduced to improve the dual-channel CNN, which adaptively fuses the weighted multi-dimensional feature sets of heterogeneous sensors and flexibly selects heterogeneous sensor information so as to achieve an accurate diagnosis. Experimental results show that the weighted multi-dimensional feature set obtained by the proposed method has high fault-representation ability and low information redundancy. It can simultaneously diagnose internal wear faults of the hydraulic directional valve and electromagnetic faults of actuators that are difficult to diagnose by traditional methods. The proposed method achieves high fault-diagnosis accuracy under severe working conditions.
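The entropy weight step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook form of the entropy weight method (min-max scaling, column-wise proportions, normalized Shannon entropy, weights proportional to one minus entropy); the abstract does not give the authors' exact formulation, and the function name `entropy_weights` is hypothetical.

```python
import math


def entropy_weights(rows):
    """Textbook entropy weight method (an assumption, not the paper's code).

    rows: list of samples, each a list of feature values; needs >= 2 samples.
    Returns one weight per feature; the weights sum to 1.
    """
    n = len(rows)
    m = len(rows[0])
    divergences = []
    for j in range(m):
        col = [row[j] for row in rows]
        lo, hi = min(col), max(col)
        span = hi - lo
        # Min-max scale to [0, 1]; a constant column is treated as uniform.
        scaled = [(v - lo) / span for v in col] if span > 0 else [1.0] * n
        total = sum(scaled)
        probs = [v / total for v in scaled]
        # Shannon entropy normalized by ln(n) so that it lies in [0, 1].
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        # Low entropy means the feature discriminates between samples.
        divergences.append(max(0.0, 1.0 - entropy))
    norm = sum(divergences)
    if norm == 0:
        return [1.0 / m] * m  # no feature is informative: fall back to equal weights
    return [d / norm for d in divergences]
```

A feature whose values barely vary across samples has entropy close to 1 and therefore receives a weight close to 0, which matches the stated goal of suppressing insensitive features.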
A single-structure deep learning fault diagnosis model suffers from insufficient feature extraction and weak fault classification capability. This paper proposes a multi-scale deep feature fusion intelligent fault diagnosis method based on information entropy. First, a normal autoencoder, denoising autoencoder, sparse autoencoder, and contractive autoencoder are used in parallel to construct a multi-scale deep neural network feature extraction structure. Next, a deep feature fusion strategy based on information entropy is proposed to obtain low-dimensional features and ensure the robustness of the model and the quality of the deep features. Finally, the deep belief network probability model is used as the fault classifier to identify the faults. The effectiveness of the proposed method was verified on a gearbox test bed. Experimental results show that, compared with traditional and existing intelligent fault diagnosis methods, the proposed method can obtain representative information and features from the raw data with higher classification accuracy.
Mutual information is an important information measure for feature subsets. In this paper, a hashing mechanism is proposed to calculate the mutual information of a feature subset. The redundancy-synergy coefficient, a novel measure of the redundancy and synergy of features in expressing the class feature, is defined in terms of mutual information. The information maximization rule is applied to derive a heuristic feature subset selection method based on mutual information and the redundancy-synergy coefficient. Our experimental results show the good performance of the new feature selection method.
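One way to read "a hashing mechanism to calculate the mutual information of a feature subset" is to hash each sample's tuple of subset values and count joint occurrences with the class. The sketch below takes that reading as an assumption; it uses Python's built-in hash tables (`collections.Counter`) and a hypothetical function name, and is not the paper's algorithm.

```python
import math
from collections import Counter


def subset_mutual_information(rows, subset, labels):
    """Estimate I(F_subset; C) in bits from discrete data.

    rows: list of samples (sequences of discrete feature values)
    subset: indices of the features forming the subset
    labels: class value of each sample
    """
    n = len(rows)
    # Each sample is reduced to a hashable tuple of its subset values, so the
    # joint distribution over the subset is counted in one pass.
    keys = [tuple(row[i] for i in subset) for row in rows]
    joint = Counter(zip(keys, labels))
    key_counts = Counter(keys)
    label_counts = Counter(labels)
    mi = 0.0
    for (key, label), count in joint.items():
        # p(x, c) * log2( p(x, c) / (p(x) * p(c)) ), written with raw counts.
        mi += (count / n) * math.log2(
            count * n / (key_counts[key] * label_counts[label])
        )
    return mi
```

With an XOR-style class, each feature alone carries 0 bits about the class while the pair carries 1 bit. This is the kind of synergy between features that a redundancy-synergy measure is meant to capture.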
Intelligence and perception are two operative technologies in 6G scenarios. The intelligent wireless network and information perception require a deep fusion of artificial intelligence (AI) and wireless communications in 6G systems. Therefore, fusion is becoming a typical feature and key challenge of 6G wireless communication systems. In this paper, we focus on the critical issues and propose three application scenarios in 6G wireless systems. Specifically, we first discuss the fusion of AI and 6G networks for the enhancement of 5G-advanced technology and future wireless communication systems. Then, we introduce the wireless AI technology architecture with 6G multi-dimensional information perception, which includes the physical layer technology of multi-dimensional feature information perception, full spectrum fusion technology, and intelligent wireless resource management. The discussion of key technologies for intelligent 6G wireless networks is expected to provide a guideline for future research.
To address the poor text classification performance of the traditional mutual information (MI) formula, a feature selection algorithm based on improved mutual information is proposed. The improved mutual information algorithm builds on earlier improved mutual information methods, which enhance the MI value of negative characteristics and account for feature frequency, and adds the concepts of concentration degree and dispersion degree. Formulas that embody concentration degree and dispersion degree were constructed, and the improved mutual information was implemented based on them. In this paper, the feature selection algorithm based on improved mutual information was applied to a text classifier based on Biomimetic Pattern Recognition and compared with several other feature selection methods. The experimental results showed that the improved mutual information feature selection method greatly enhances performance compared with traditional mutual information feature selection methods, and that its performance is better than that of information gain. Through the introduction of concentration degree and dispersion degree, the improved mutual information feature selection method greatly improves the performance of the text classification system.
Computer-aided detection and diagnosis (CAD) systems are increasingly being used as an aid by clinicians for detection and interpretation of diseases. In general, a CAD system employs a classifier to detect or distinguish between abnormal and normal tissues on images. In the phase of classification, a set of image features and/or texture features extracted from the images are commonly used. In this article, we investigated the characteristics of the output entropy of an image and demonstrated the usefulness of the output entropy acting as a texture feature in CAD systems. In order to validate the effectiveness and superiority of the output-entropy-based texture feature, two well-known texture features, i.e., mean and standard deviation, were used for comparison. The database used in this study comprised 50 CT images obtained from 10 patients with pulmonary nodules, and 50 CT images obtained from 5 normal subjects. We used a support vector machine for classification. A leave-one-out method was employed for training and classification. Three combinations of texture features, i.e., mean and entropy, standard deviation and entropy, and standard deviation and mean, were used as the inputs to the classifier. Three different region of interest (ROI) sizes, i.e., 11 × 11, 9 × 9 and 7 × 7 pixels, were selected from the database for computation of the feature values. Our experimental results show that the combination of entropy and standard deviation is significantly better than both the combination of mean and entropy and that of standard deviation and mean in the case of the ROI size of 11 × 11 pixels (p < 0.05). These results suggest that information entropy of an image can be used as an effective feature for CAD applications.
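The three texture features compared in this abstract can be sketched for a single square ROI. The abstract does not define "output entropy" precisely, so this sketch assumes the Shannon entropy of the gray-level histogram inside the ROI; the function name and the nested-list image layout are likewise illustrative assumptions.

```python
import math
from collections import Counter


def roi_texture_features(image, top, left, size):
    """Return (mean, standard deviation, entropy in bits) of a square ROI.

    image: 2-D list of integer gray levels, indexed as image[row][col]
    top, left: coordinates of the ROI's upper-left pixel
    size: side length of the ROI in pixels (e.g. 7, 9 or 11)
    """
    pixels = [
        image[r][c]
        for r in range(top, top + size)
        for c in range(left, left + size)
    ]
    n = len(pixels)
    mean = sum(pixels) / n
    # Population standard deviation over the ROI pixels.
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    # Assumed entropy definition: Shannon entropy of the gray-level histogram.
    counts = Counter(pixels)
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    return mean, std, entropy
```

Pairs of these values, such as (standard deviation, entropy), would then form the input vector of the classifier for each ROI.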
As a prerequisite for effective prognostics, the quality of the features affects the complexity of the prognostic methods. Compared with feature quality evaluation in diagnostics, feature evaluation for prognostics is a new problem. Normally, the monotonic trend of a feature series can be used as a visual representation of cumulative equipment damage, so that forecasting future health states is easy to implement. By introducing the concept of ranking mutual information for the ordinal case, a monotonicity evaluation method for monitoring feature series is proposed. Finally, the method is verified on simulated feature series, and the results confirm its effectiveness. In industrial applications, the evaluation results can be used as the criterion for selecting prognostic features.
The product information model for welding structures plays an important role in the integration of welding CAD/CAPP/CAM. However, existing CAD modeling systems are not capable of providing enough information for subsequent manufacturing activities such as CAPP and CAM. In this paper, a new design approach using feature techniques and object-oriented programming is put forward to create the product information model of a welding structure. With this approach, the product information model can effectively support computer-aided welding process planning, fixturing, assembling, path planning of welding robots and other manufacturing activities. The feature classification and representation scheme of welding structures are discussed. A prototype system is developed based on features and object-oriented programming. Its structure and functions are given in detail.
With the rapid development of digital and intelligent information systems, display of radar situation interface has become an important challenge in the field of human-computer interaction. We propose a method for the optimization of radar situation interface from error-cognition through the mapping of information characteristics. A mapping method of matrix description is adopted to analyze the association properties between error-cognition sets and design information sets. Based on the mapping relationship between the domain of error-cognition and the domain of design information, a cross-correlational analysis is carried out between error-cognition and design information. We obtain the relationship matrix between the error-cognition of correlation between design information and the degree of importance among design information. Taking the task interface of a warfare navigation display as an example, error factors and the features of design information are extracted. Based on the results, we also propose an optimization design scheme for the radar situation interface.
For data mining tasks on large-scale data, feature selection is a pivotal stage that plays an important role in removing redundant or irrelevant features while improving classifier performance. Traditional wrapper feature selection methodologies typically require extensive model training and evaluation, which cannot deliver desired outcomes within a reasonable computing time. In this paper, an innovative wrapper approach termed Contribution Tracking Feature Selection (CTFS) is proposed for feature selection of large-scale data, which can locate informative features without population-level evolution. In other words, fewer evaluations are needed for CTFS compared to other evolutionary methods. We initially introduce a refined sparse autoencoder to assess the prominence of each feature in the subsequent wrapper method. Subsequently, we utilize an enhanced wrapper feature selection technique that merges Mutual Information (MI) with individual feature contributions. Finally, a fine-tuning contribution tracking mechanism discerns informative features within the optimal feature subset, operating via a dominance accumulation mechanism. Experimental results for multiple classification performance metrics demonstrate that the proposed method effectively yields smaller feature subsets without degrading classification performance in an acceptable runtime compared to state-of-the-art algorithms across most large-scale benchmark datasets.
Feature subset selection is a fundamental problem of data mining. The mutual information of a feature subset measures how much information about the class feature the subset contains. A hashing mechanism is proposed to calculate the mutual information of a feature subset. Feature relevancy is defined by mutual information. The redundancy-synergy coefficient, a novel measure of the redundancy and synergy of features in describing the class feature, is defined. Following the information maximization rule, a bidirectional heuristic feature subset selection method based on mutual information and the redundancy-synergy coefficient is presented. This study's experiments show the good performance of the new method.
To address the difficulty of integrating data built on different models when integrating spatial information, the characteristics of the raster structure, the vector structure and the mixed model were analyzed, and a hierarchical vector-raster integrative full feature model was put forward by combining the advantages of the vector and raster models and using the object-oriented method. The data structures of the four basic features, i.e. point, line, surface and solid, were described. An application was analyzed and described, and the characteristics of this model were described. In this model, all objects in the real world are divided into and described as features with hierarchy, and all the data are organized in vector form. This model can describe data based on feature, field, network and other models, and it avoids the inability to integrate data based on different models and to perform spatial analysis on them during spatial information integration.
Positioning technology based on wireless network signals in indoor environments has developed rapidly in recent years as the demand for location-based services continues to increase. Channel state information (CSI) can be used as location feature information in fingerprint-based positioning systems because it can reflect the characteristics of the signal on multiple subcarriers. However, the random noise contained in the raw CSI increases the likelihood of confusion when matching fingerprint data. In this paper, the Dynamic Fusion Feature (DFF), which combines the pre-processed amplitude and phase data, is proposed as a new fingerprint formation method to remove the noise and improve the feature resolution of the system. Then, the improved edit distance on real sequence (IEDR) is used as a similarity metric for fingerprint matching. Based on the above studies, we propose a new indoor fingerprint positioning method, named DFF-EDR, for improving positioning performance. During the experimental stage, data were collected and analyzed in two typical indoor environments. The results show that the proposed localization method effectively improves the feature resolution of the system in terms of both fingerprint features and similarity measures, has good anti-noise capability, and effectively reduces the localization errors.
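For context on the similarity metric, the following is a minimal sketch of the standard edit distance on real sequence (EDR) for one-dimensional sequences. The abstract does not describe the improvements that make up IEDR, so they are not reproduced here; the threshold name `eps` and the function name are assumptions for illustration.

```python
def edit_distance_on_real_sequence(a, b, eps):
    """Standard EDR between two sequences of real numbers.

    Two elements match (cost 0) when they differ by at most eps; otherwise a
    substitution costs 1. Insertions and deletions also cost 1.
    """
    n, m = len(a), len(b)
    # dist[i][j] holds the EDR between the first i elements of a and the
    # first j elements of b.
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub_cost = 0 if abs(a[i - 1] - b[j - 1]) <= eps else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + sub_cost,  # match or substitute
                dist[i - 1][j] + 1,             # delete from a
                dist[i][j - 1] + 1,             # insert into a
            )
    return dist[n][m]
```

Because matches are decided by a tolerance rather than exact equality, small noise in the measured values does not change the distance, which is why EDR-style metrics suit noisy fingerprint sequences.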
Sentiment analysis is the process of determining the intention or emotion behind an article. Sentiment analysis of people's opinions examines the subjective information in its context. The analyzed data quantifies the reactions or sentiments and reveals the information's contextual polarity. In social behavior, sentiment can be thought of as a latent variable. Measuring and comprehending this behavior could help us to better understand social issues. Because sentiments are domain specific, sentiment analysis in a specific context is critical in any real-world scenario. Textual sentiment analysis is done at the sentence, document and feature levels. This work introduces a new Information Gain based Feature Selection (IGbFS) algorithm for selecting highly correlated features and eliminating irrelevant and redundant ones. Extensive textual sentiment analysis at the sentence, document and feature levels is performed using the proposed Information Gain based Feature Selection algorithm. The analysis is based on datasets from the Cornell and Kaggle repositories. When compared to existing baseline classifiers, the suggested Information Gain based classifier resulted in an increased accuracy of 96% for the document level, 97.4% for the sentence level and 98.5% for the feature level. The proposed method is also tested with the IMDB, Yelp 2013 and Yelp 2014 datasets. Experimental results for these high-dimensional datasets give an increased accuracy of 95%, 96% and 98% for the proposed Information Gain based classifier at the document, sentence and feature levels respectively, compared to existing baseline classifiers.
At present, knowledge embedding methods are widely used in the field of knowledge graph (KG) reasoning, and have been successfully applied to those with large entities and relationships. However, in research and production environments, there are a large number of KGs with a small number of entities and relations, which are called sparse KGs. Limited by the performance of knowledge extraction methods or some other reasons (some common-sense information does not appear in the natural corpus), the relation between entities is often incomplete. To solve this problem, a method of the graph neural network and information enhancement is proposed. The improved method increases the mean reciprocal rank (MRR) and Hit@3 by 1.6% and 1.7%, respectively, when the sparsity of the FB15K-237 dataset is 10%. When the sparsity is 50%, the evaluation indexes MRR and Hit@10 are increased by 0.8% and 1.8%, respectively.
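The two evaluation indexes quoted here follow standard definitions, sketched below for reference. The input is assumed to be the 1-based rank that the model assigns to the correct entity for each test query; the function name is hypothetical.

```python
def mrr_and_hits_at_k(ranks, k):
    """Compute mean reciprocal rank and Hit@k from 1-based ranks.

    ranks: rank of the correct entity for each test query (1 is best)
    k: cutoff for Hit@k (e.g. 3 or 10)
    """
    n = len(ranks)
    # MRR averages 1/rank, so top-ranked answers dominate the score.
    mrr = sum(1.0 / r for r in ranks) / n
    # Hit@k is the fraction of queries whose correct entity is in the top k.
    hits = sum(1 for r in ranks if r <= k) / n
    return mrr, hits
```

For ranks of 1, 2 and 4 with k = 3, the reciprocal ranks are 1, 0.5 and 0.25, so MRR is 1.75 / 3 and Hit@3 is 2 / 3.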
To realize long-text information hiding and covert communication, a binary watermark sequence was first obtained from a text file and encoded by a redundant encoding method. Then, two neighboring blocks were selected at a time from the Hilbert scanning sequence of carrier image blocks and transformed by a 1-level discrete wavelet transform (DWT). The double-block-based just noticeable differences (JNDs) were then calculated with a visual model. According to the codes of each two watermark bits, the average values of two corresponding detail sub-bands were modified by using one of the JNDs to hide information in the carrier image. The experimental results show that the hidden information is invisible to human eyes, and the algorithm is robust to some common image processing operations. The conclusion is that the algorithm is effective and practical.
In multi-target tracking, Multiple Hypothesis Tracking (MHT) can effectively solve the data association problem. However, traditional MHT cannot make full use of motion information. In this work, we combine MHT with the Interactive Multiple Model (IMM) estimator and feature fusion. The new algorithm greatly improves the tracking performance because the IMM estimator provides better estimation and feature information enhances the accuracy of data association. The new algorithm is tested by tracking tropical fish in a fish container. Experimental results show that, compared with traditional MHT, this algorithm can significantly reduce the tracking loss rate and suppress noise with higher computational efficiency.
It is common for datasets to contain both categorical and continuous variables. However, many feature screening methods designed for high-dimensional classification assume that the variables are continuous. This limits the applicability of existing methods in handling this complex scenario. To address this issue, we propose a model-free feature screening approach for ultra-high-dimensional multi-classification that can handle both categorical and continuous variables. Our proposed feature screening method utilizes the Maximal Information Coefficient to assess the predictive power of the variables. Under certain regularity conditions, we prove that our screening procedure possesses the sure screening property and the ranking consistency property. To validate the effectiveness of our approach, we conduct simulation studies and provide real data analysis examples to demonstrate its performance in finite samples. In summary, our proposed method offers a solution for effectively screening features in ultra-high-dimensional datasets with a mixture of categorical and continuous covariates.
Funding (multi-label feature selection study): Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (RS-2020-NR049579).
Funding (hydraulic directional valve fault diagnosis study): Supported by the National Natural Science Foundation of China (Nos. 51805376 and U1709208) and the Zhejiang Provincial Natural Science Foundation of China (Nos. LY20E050028 and LD21E050001).
Funding (multi-scale deep feature fusion fault diagnosis study): Supported by the National Natural Science Foundation of China and Civil Aviation Administration of China Joint Funded Project (Grant No. U1733108) and the Key Project of Tianjin Science and Technology Support Program (Grant No. 16YFZCSY00860).
Funding (redundancy-synergy coefficient feature selection study): Project supported by the National Natural Science Foundation of China (No. 60075007) and the National Basic Research Program (973) of China (No. G1998030401).
Funding: Sponsored by the National Nature Science Foundation Projects (Grant Nos. 60773070, 60736044)
Abstract: To address the poor text classification performance obtained with the traditional mutual information (MI) formula, a feature selection algorithm based on improved mutual information is proposed. The improved algorithm builds on earlier improved mutual information methods, which enhance the MI value of negative features and account for feature frequency, and adds the concepts of concentration degree and dispersion degree. Formulas embodying concentration degree and dispersion degree were constructed, and the improved mutual information was implemented on the basis of these formulas. In this paper, the feature selection algorithm based on improved mutual information was applied to a text classifier based on Biomimetic Pattern Recognition and compared with several other feature selection methods. The experimental results showed that the improved mutual information feature selection method greatly enhances performance compared with traditional mutual information feature selection methods, and that its performance is better than that of information gain. By introducing the concepts of concentration degree and dispersion degree, the improved mutual information feature selection method greatly improves the performance of the text classification system.
Abstract: Computer-aided detection and diagnosis (CAD) systems are increasingly being used as an aid by clinicians for detection and interpretation of diseases. In general, a CAD system employs a classifier to detect or distinguish between abnormal and normal tissues on images. In the classification phase, a set of image features and/or texture features extracted from the images is commonly used. In this article, we investigated the characteristics of the output entropy of an image and demonstrated the usefulness of the output entropy as a texture feature in CAD systems. To validate the effectiveness and superiority of the output-entropy-based texture feature, two well-known texture features, i.e., mean and standard deviation, were used for comparison. The database used in this study comprised 50 CT images obtained from 10 patients with pulmonary nodules, and 50 CT images obtained from 5 normal subjects. We used a support vector machine for classification. A leave-one-out method was employed for training and classification. Three combinations of texture features, i.e., mean and entropy, standard deviation and entropy, and standard deviation and mean, were used as the inputs to the classifier. Three different region of interest (ROI) sizes, i.e., 11 × 11, 9 × 9 and 7 × 7 pixels, were selected from the database for computation of the feature values. Our experimental results show that the combination of entropy and standard deviation is significantly better than both the combination of mean and entropy and that of standard deviation and mean for the ROI size of 11 × 11 pixels (p < 0.05). These results suggest that the information entropy of an image can be used as an effective feature for CAD applications.
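The three texture features compared in this abstract (mean, standard deviation, entropy) can be illustrated with a minimal sketch. It assumes that entropy means the Shannon entropy of the gray-level histogram inside a square ROI, which may differ from the authors' exact definition of "output entropy". The function name `roi_texture_features` and its parameters are hypothetical.

```python
import numpy as np

def roi_texture_features(image, center, size):
    """Return (mean, std, entropy) for a size x size ROI around `center`.

    Assumption: entropy is the Shannon entropy (in bits) of the ROI's
    gray-level histogram; the ROI must lie fully inside the image.
    """
    half = size // 2
    r, c = center
    roi = image[r - half:r + half + 1, c - half:c + half + 1]
    # Relative frequency of each gray level that occurs in the ROI.
    _, counts = np.unique(roi, return_counts=True)
    p = counts / counts.sum()
    entropy = float(-(p * np.log2(p)).sum())
    return float(roi.mean()), float(roi.std()), entropy

# Usage: a flat region carries no texture information, a noisy one does.
flat = np.full((21, 21), 5, dtype=np.int64)
noisy = np.random.default_rng(0).integers(0, 256, size=(21, 21))
flat_features = roi_texture_features(flat, (10, 10), 11)
noisy_features = roi_texture_features(noisy, (10, 10), 11)
```

Pairs of these values (for example, standard deviation and entropy) would then form the input vector of a classifier such as a support vector machine.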
Funding: Supported by National Natural Science Foundation of China (60575036), Natural Science Foundation of Heilongjiang Province of China (F0316), the Science and Technology Foundation for Innovative Talents of Harbin City of China (2007RFXXG023), and the Science Foundation for Top Talents with the Spirit of Innovation of Harbin University of Science and Technology
Funding: the Test Technique Research Project (No. 2014SZJY3101)
Abstract: As a prerequisite for effective prognostics, the quality of the features affects the complexity of the prognostic methods. Compared with feature quality evaluation in diagnostics, feature evaluation for prognostics is a new problem. Normally, the monotonic tendency of a feature series can be used as a visual representation of cumulative equipment damage, which makes forecasting future health states easier to implement. By introducing the concept of ranking mutual information for the ordinal case, a monotonicity evaluation method for monitoring feature series is proposed. The method is verified on simulated feature series, and the results confirm its effectiveness. In industrial applications, the evaluation results can be used as a criterion for selecting prognostic features.
Abstract: The product information model for welding structures plays an important role in the integration of welding CAD/CAPP/CAM. However, existing CAD modeling systems are not capable of providing enough information for subsequent manufacturing activities such as CAPP and CAM. A new design approach using the feature technique and object-oriented programming is put forward in this paper to create the product information model of a welding structure. With this approach, the product information model is able to effectively support computer-aided welding process planning, fixturing, assembling, path planning of welding robots and other manufacturing activities. The feature classification and representation scheme of the welding structure are discussed. A prototype system is developed based on features and object-oriented programming. Its structure and functions are given in detail.
Funding: supported by Jiangsu Province Nature Science Foundation of China (BK20221490), the Key Fundamental Research Funds for the Central Universities (30920041114), the National Natural Science Foundation of China (52175469, 71601068), the Key Research and Development (Social Development) Project of Jiangsu Province (BE2019647), and Jiangsu Province Social Science Foundation of China (20YSB013).
Abstract: With the rapid development of digital and intelligent information systems, the display of the radar situation interface has become an important challenge in the field of human-computer interaction. We propose a method for optimizing the radar situation interface from error-cognition through the mapping of information characteristics. A mapping method of matrix description is adopted to analyze the association properties between error-cognition sets and design information sets. Based on the mapping relationship between the domain of error-cognition and the domain of design information, a cross-correlational analysis is carried out between error-cognition and design information. We obtain the relationship matrix between the error-cognition of correlation between design information and the degree of importance among design information. Taking the task interface of a warfare navigation display as an example, error factors and the features of design information are extracted. Based on the results, we also propose an optimization design scheme for the radar situation interface.
Funding: supported in part by the National Key Research and Development Program of China under Grant (No. 2021YFB3300900), the NSFC Key Supported Project of the Major Research Plan under Grant (No. 92267206), the National Natural Science Foundation of China under Grant (Nos. 72201052, 62032013, 62173076), the Fundamental Research Funds for the Central Universities under Grant (No. N2204017), and the Fundamental Research Funds for State Key Laboratory of Synthetical Automation for Process Industries under Grant (No. 2013ZCX11).
Abstract: For data mining tasks on large-scale data, feature selection is a pivotal stage that plays an important role in removing redundant or irrelevant features while improving classifier performance. Traditional wrapper feature selection methodologies typically require extensive model training and evaluation, and cannot deliver the desired outcomes within a reasonable computing time. In this paper, an innovative wrapper approach termed Contribution Tracking Feature Selection (CTFS) is proposed for feature selection of large-scale data, which can locate informative features without population-level evolution. In other words, CTFS needs fewer evaluations than other evolutionary methods. We first introduce a refined sparse autoencoder to assess the prominence of each feature in the subsequent wrapper method. Subsequently, we utilize an enhanced wrapper feature selection technique that merges Mutual Information (MI) with individual feature contributions. Finally, a fine-tuning contribution tracking mechanism discerns informative features within the optimal feature subset, operating via a dominance accumulation mechanism. Experimental results for multiple classification performance metrics demonstrate that, compared to state-of-the-art algorithms, the proposed method effectively yields smaller feature subsets without degrading classification performance in an acceptable runtime across most large-scale benchmark datasets.
Abstract: Feature subset selection is a fundamental problem of data mining. The mutual information of a feature subset measures how much class feature information the subset contains. A hashing mechanism is proposed to calculate the mutual information of a feature subset. Feature relevancy is defined by mutual information. The redundancy-synergy coefficient, a novel measure of redundancy and synergy among features in describing the class feature, is defined. Following the information maximization rule, a bidirectional heuristic feature subset selection method based on mutual information and the redundancy-synergy coefficient is presented. The experiments in this study show the good performance of the new method.
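As a rough illustration of computing the mutual information between a feature subset and the class with a hash table, the sketch below counts joint occurrences with Python dictionaries. This is only one plausible reading of a "hashing mechanism" and is not taken from the paper. It assumes discrete features, and the function name `subset_mutual_information` is hypothetical.

```python
from collections import Counter
from math import log2

def subset_mutual_information(rows, labels, subset):
    """Estimate I(S; C) in bits from samples with discrete feature values.

    rows   : list of tuples of discrete feature values
    labels : list of class labels, same length as rows
    subset : indices of the features that form the subset S
    """
    n = len(rows)
    # Project every sample onto the subset; the resulting tuples are the
    # hash keys, so each distinct value combination gets one table entry.
    keys = [tuple(row[i] for i in subset) for row in rows]
    joint = Counter(zip(keys, labels))
    count_s = Counter(keys)
    count_c = Counter(labels)
    mi = 0.0
    for (key, c), n_sc in joint.items():
        # p(s,c) * log2( p(s,c) / (p(s) * p(c)) ), written with raw counts.
        mi += (n_sc / n) * log2(n_sc * n / (count_s[key] * count_c[c]))
    return mi

# Usage: the class is the XOR of two binary features, so each feature
# alone is uninformative while the pair determines the class (synergy).
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]
mi_single = subset_mutual_information(rows, labels, [0])
mi_pair = subset_mutual_information(rows, labels, [0, 1])
```

The cost is one pass over the samples per subset, regardless of how many value combinations the subset could take in principle.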
Funding: Project (40473029) supported by the National Natural Science Foundation of China; project (04JJ3046) supported by the Natural Science Foundation of Hunan Province, China
Abstract: To address the difficulty of integrating data built on different models in spatial information integration, the characteristics of the raster structure, the vector structure and the mixed model were analyzed, and a hierarchical vector-raster integrative full feature model was put forward by combining the advantages of the vector and raster models and using the object-oriented method. The data structures of the four basic features, i.e. point, line, surface and solid, were described. An application was analyzed, and the characteristics of the model were described. In this model, all objects in the real world are divided into and described as hierarchical features, and all the data are organized in vector form. The model can describe data based on feature, field, network and other models, and avoids the inability to integrate data based on different models and to perform spatial analysis on them in spatial information integration.
Funding: This work was financially supported by the National Key Research & Development Program of China under Grant No. 2020YFC1511702 and the Beijing Municipal Natural Science Foundation under Grant No. L191003.
Abstract: Positioning technology based on wireless network signals in indoor environments has developed rapidly in recent years as the demand for location-based services continues to increase. Channel state information (CSI) can be used as location feature information in fingerprint-based positioning systems because it can reflect the characteristics of the signal on multiple subcarriers. However, the random noise contained in the raw CSI increases the likelihood of confusion when matching fingerprint data. In this paper, the Dynamic Fusion Feature (DFF), which combines the pre-processed amplitude and phase data, is proposed as a new fingerprint formation method to remove the noise and improve the feature resolution of the system. Then, the improved edit distance on real sequence (IEDR) is used as a similarity metric for fingerprint matching. Based on the above studies, we propose a new indoor fingerprint positioning method, named DFF-EDR, for improving positioning performance. During the experimental stage, data were collected and analyzed in two typical indoor environments. The results show that the proposed localization method effectively improves the feature resolution of the system in terms of both fingerprint features and similarity measures, has good anti-noise capability, and effectively reduces the localization errors.
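To show how an edit distance on real sequences can serve as a fingerprint similarity metric, here is a minimal sketch of the standard EDR distance for one-dimensional sequences. It is not the improved variant (IEDR) from the abstract, whose details are not given here. The tolerance `eps`, the function names and the fingerprint dictionary layout are illustrative assumptions.

```python
def edr_distance(a, b, eps):
    """Standard edit distance on real sequences (EDR).

    Two samples match (cost 0) when they differ by at most `eps`;
    a mismatch, an insertion, or a deletion each cost 1.
    """
    n, m = len(a), len(b)
    prev = list(range(m + 1))  # distance from the empty prefix of a
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            subcost = 0 if abs(a[i - 1] - b[j - 1]) <= eps else 1
            curr[j] = min(prev[j - 1] + subcost,  # match or substitute
                          prev[j] + 1,            # delete from a
                          curr[j - 1] + 1)        # insert into a
        prev = curr
    return prev[m]

def nearest_fingerprint(query, fingerprints, eps):
    """Return the label of the stored fingerprint closest to `query`.

    `fingerprints` maps a location label to its reference sequence.
    """
    return min(fingerprints,
               key=lambda loc: edr_distance(query, fingerprints[loc], eps))

# Usage with made-up amplitude sequences for two reference locations.
database = {"room_a": [1.0, 2.0, 3.0, 4.0], "room_b": [4.0, 3.0, 2.0, 1.0]}
best = nearest_fingerprint([1.1, 2.05, 2.9, 4.2], database, eps=0.25)
```

Because small deviations within `eps` cost nothing, the distance tolerates bounded measurement noise, which is the property that makes this family of metrics attractive for noisy CSI data.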
Abstract: Sentiment analysis is the process of determining the intention or emotion behind an article. Sentiment analysis of people's opinions examines the subjective information in context. The analyzed data quantifies the reactions or sentiments and reveals the information's contextual polarity. In social behavior, sentiment can be thought of as a latent variable. Measuring and comprehending this behavior could help us to better understand social issues. Because sentiments are domain specific, sentiment analysis in a specific context is critical in any real-world scenario. Textual sentiment analysis is done at the sentence, document and feature levels. This work introduces a new Information Gain based Feature Selection (IGbFS) algorithm for selecting highly correlated features and eliminating irrelevant and redundant ones. Extensive textual sentiment analysis at the sentence, document and feature levels is performed by exploiting the proposed Information Gain based Feature Selection algorithm. The analysis is based on datasets from the Cornell and Kaggle repositories. When compared to existing baseline classifiers, the suggested Information Gain based classifier achieved increased accuracies of 96% for the document level, 97.4% for the sentence level and 98.5% for the feature level. The proposed method is also tested with the IMDB, Yelp 2013 and Yelp 2014 datasets. Experimental results for these high-dimensional datasets give increased accuracies of 95%, 96% and 98% for the proposed Information Gain based classifier at the document, sentence and feature levels respectively, compared to existing baseline classifiers.
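For readers unfamiliar with information gain as a feature ranking criterion, the sketch below computes the textbook quantity IG(Y; X) = H(Y) - H(Y | X) for discrete features and ranks features by it. It is a generic illustration, not the IGbFS algorithm from the abstract. The function names and the toy term-presence data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((k / n) * log2(k / n) for k in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG = H(labels) - H(labels | feature) for one discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels within each feature-value group.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def rank_by_information_gain(columns, labels):
    """Return feature names sorted from most to least informative."""
    gains = {name: information_gain(col, labels)
             for name, col in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)

# Usage: presence (1) or absence (0) of a term in four toy reviews.
labels = ["pos", "pos", "neg", "neg"]
columns = {"the": [1, 1, 1, 1], "good": [1, 1, 0, 0], "film": [1, 0, 1, 0]}
ranking = rank_by_information_gain(columns, labels)
```

A term that appears in every document ("the") has zero gain and is a natural candidate for removal, while a term aligned with the class ("good") ranks first.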
Funding: supported by the Sichuan Science and Technology Program under Grants No. 2022YFQ0052 and No. 2021YFQ0009.
Abstract: At present, knowledge embedding methods are widely used in the field of knowledge graph (KG) reasoning, and have been successfully applied to KGs with large numbers of entities and relations. However, in research and production environments, there are many KGs with a small number of entities and relations, which are called sparse KGs. Limited by the performance of knowledge extraction methods or for other reasons (some common-sense information does not appear in the natural corpus), the relations between entities are often incomplete. To solve this problem, a method combining a graph neural network with information enhancement is proposed. The improved method increases the mean reciprocal rank (MRR) and Hit@3 by 1.6% and 1.7%, respectively, when the sparsity of the FB15K-237 dataset is 10%. When the sparsity is 50%, the evaluation indexes MRR and Hit@10 are increased by 0.8% and 1.8%, respectively.
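The two evaluation indexes quoted above follow their usual definitions: MRR averages the reciprocal of the rank at which the correct entity appears, and Hit@k is the fraction of test queries whose correct entity ranks in the top k. A minimal sketch follows; the rank values are made up for illustration and are not from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of queries whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Usage: rank of the correct entity for four hypothetical test triples.
ranks = [1, 2, 4, 10]
mrr = mean_reciprocal_rank(ranks)   # (1 + 0.5 + 0.25 + 0.1) / 4
hit3 = hits_at_k(ranks, 3)          # 2 of 4 queries rank within the top 3
hit10 = hits_at_k(ranks, 10)        # all 4 queries rank within the top 10
```

Both metrics lie in (0, 1], and higher is better, so the reported gains are increases in these fractions.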
Abstract: To realize long text information hiding and covert communication, a binary watermark sequence was first obtained from a text file and encoded by a redundant encoding method. Then, two neighboring blocks were selected at a time from the Hilbert scanning sequence of carrier image blocks and transformed by 1-level discrete wavelet transformation (DWT). The double-block-based JNDs (just noticeable differences) were then calculated with a visual model. According to the different codes of each two watermark bits, the average values of the two corresponding detail sub-bands were modified by using one of the JNDs to hide information in the carrier image. The experimental results show that the hidden information is invisible to human eyes, and the algorithm is robust to some common image processing operations. The conclusion is that the algorithm is effective and practical.
Funding: Supported by the National Natural Science Foundation of China (No. 60772154) and the President Foundation of Graduate University of Chinese Academy of Sciences (No. 085102GN00)
Abstract: In multi-target tracking, Multiple Hypothesis Tracking (MHT) can effectively solve the data association problem. However, traditional MHT cannot make full use of motion information. In this work, we combine MHT with an Interactive Multiple Model (IMM) estimator and feature fusion. The new algorithm greatly improves tracking performance because the IMM estimator provides better estimation and feature information enhances the accuracy of data association. The new algorithm is tested by tracking tropical fish in a fish container. Experimental results show that, compared with traditional MHT, this algorithm can significantly reduce the track loss rate and suppress noise with higher computational efficiency.
Abstract: It is common for datasets to contain both categorical and continuous variables. However, many feature screening methods designed for high-dimensional classification assume that the variables are continuous. This limits the applicability of existing methods in handling this complex scenario. To address this issue, we propose a model-free feature screening approach for ultra-high-dimensional multi-classification that can handle both categorical and continuous variables. Our proposed feature screening method utilizes the Maximal Information Coefficient to assess the predictive power of the variables. Under certain regularity conditions, we prove that our screening procedure possesses the sure screening and ranking consistency properties. To validate the effectiveness of our approach, we conduct simulation studies and provide real data analysis examples to demonstrate its performance in finite samples. In summary, our proposed method offers a solution for effectively screening features in ultra-high-dimensional datasets with a mixture of categorical and continuous covariates.