The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method f...The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.展开更多
Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure p...Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment.Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment.However,traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level.On the other hand,models that focus on global semantic-level information might overlook critical,subtle local pathological features.To address this issue,we propose an adaptive multi-scale feature fusion network called(AMSFuse),which can adaptively combine multi-scale global and local features without compromising their individual representation.Specifically,our model incorporates global features for extracting high-level contextual information from retinal images.Concurrently,local features capture fine-grained details,such as microaneurysms,hemorrhages,and exudates,which are critical for DR diagnosis.These global and local features are adaptively fused using a fusion block,followed by an Integrated Attention Mechanism(IAM)that refines the fused features by emphasizing relevant regions,thereby enhancing classification accuracy for DR classification.Our model achieves 86.3%accuracy on the APTOS dataset and 96.6%RFMiD,both of which are comparable to state-of-the-art methods.展开更多
Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells an...Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size.Pathologists often refer to surrounding cells to identify abnormalities.To emulate this slide examination behavior,this study proposes a Multi-Scale Feature Fusion Network(MSFF-Net)for detecting cervical abnormal cells.MSFF-Net employs a Cross-Scale Pooling Model(CSPM)to effectively capture diverse features and contextual information,ranging from local details to the overall structure.Additionally,a Multi-Scale Fusion Attention(MSFA)module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales.To handle the complex environment of cervical cell images,such as cell adhesion and overlapping,the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes,thereby improving detection accuracy in such scenarios.Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision(mAP)of 63.2%,outperforming state-of-the-art methods while maintaining a relatively small number of parameters(26.8 M).This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells,contributing to more accurate and efficient cervical cancer screening.展开更多
Considering that the algorithm accuracy of the traditional sparse representation models is not high under the influence of multiple complex environmental factors,this study focuses on the improvement of feature extrac...Considering that the algorithm accuracy of the traditional sparse representation models is not high under the influence of multiple complex environmental factors,this study focuses on the improvement of feature extraction and model construction.Firstly,the convolutional neural network(CNN)features of the face are extracted by the trained deep learning network.Next,the steady-state and dynamic classifiers for face recognition are constructed based on the CNN features and Haar features respectively,with two-stage sparse representation introduced in the process of constructing the steady-state classifier and the feature templates with high reliability are dynamically selected as alternative templates from the sparse representation template dictionary constructed using the CNN features.Finally,the results of face recognition are given based on the classification results of the steady-state classifier and the dynamic classifier together.Based on this,the feature weights of the steady-state classifier template are adjusted in real time and the dictionary set is dynamically updated to reduce the probability of irrelevant features entering the dictionary set.The average recognition accuracy of this method is 94.45%on the CMU PIE face database and 96.58%on the AR face database,which is significantly improved compared with that of the traditional face recognition methods.展开更多
With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of...With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of multimodal approaches for fake news detection has gained significant attention.To solve the problems existing in previous multi-modal fake news detection algorithms,such as insufficient feature extraction and insufficient use of semantic relations between modes,this paper proposes the MFFFND-Co(Multimodal Feature Fusion Fake News Detection with Co-Attention Block)model.First,the model deeply explores the textual content,image content,and frequency domain features.Then,it employs a Co-Attention mechanism for cross-modal fusion.Additionally,a semantic consistency detectionmodule is designed to quantify semantic deviations,thereby enhancing the performance of fake news detection.Experimentally verified on two commonly used datasets,Twitter and Weibo,the model achieved F1 scores of 90.0% and 94.0%,respectively,significantly outperforming the pre-modified MFFFND(Multimodal Feature Fusion Fake News Detection with Attention Block)model and surpassing other baseline models.This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.展开更多
To solve the problems of redundant feature information,the insignificant difference in feature representation,and low recognition accuracy of the fine-grained image,based on the ResNeXt50 model,an MSFResNet network mo...To solve the problems of redundant feature information,the insignificant difference in feature representation,and low recognition accuracy of the fine-grained image,based on the ResNeXt50 model,an MSFResNet network model is proposed by fusing multi-scale feature information.Firstly,a multi-scale feature extraction module is designed to obtain multi-scale information on feature images by using different scales of convolution kernels.Meanwhile,the channel attention mechanism is used to increase the global information acquisition of the network.Secondly,the feature images processed by the multi-scale feature extraction module are fused with the deep feature images through short links to guide the full learning of the network,thus reducing the loss of texture details of the deep network feature images,and improving network generalization ability and recognition accuracy.Finally,the validity of the MSFResNet model is verified using public datasets and applied to wild mushroom identification.Experimental results show that compared with ResNeXt50 network model,the accuracy of the MSFResNet model is improved by 6.01%on the FGVC-Aircraft common dataset.It achieves 99.13%classification accuracy on the wild mushroom dataset,which is 0.47%higher than ResNeXt50.Furthermore,the experimental results of the thermal map show that the MSFResNet model significantly reduces the interference of background information,making the network focus on the location of the main body of wild mushroom,which can effectively improve the accuracy of wild mushroom identification.展开更多
Deep Learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty.Due to the outstanding ability for high-level feature extraction,stacked autoencoder(SAE...Deep Learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty.Due to the outstanding ability for high-level feature extraction,stacked autoencoder(SAE)has been widely used to improve the model accuracy of soft sensors.However,with the increase of network layers,SAE may encounter serious information loss issues,which affect the modeling performance of soft sensors.Besides,there are typically very few labeled samples in the data set,which brings challenges to traditional neural networks to solve.In this paper,a multi-scale feature fused stacked autoencoder(MFF-SAE)is suggested for feature representation related to hierarchical output,where stacked autoencoder,mutual information(MI)and multi-scale feature fusion(MFF)strategies are integrated.Based on correlation analysis between output and input variables,critical hidden variables are extracted from the original variables in each autoencoder's input layer,which are correspondingly given varying weights.Besides,an integration strategy based on multi-scale feature fusion is adopted to mitigate the impact of information loss with the deepening of the network layers.Then,the MFF-SAE method is designed and stacked to form deep networks.Two practical industrial processes are utilized to evaluate the performance of MFF-SAE.Results from simulations indicate that in comparison to other cutting-edge techniques,the proposed method may considerably enhance the accuracy of soft sensor modeling,where the suggested method reduces the root mean square error(RMSE)by 71.8%,17.1%and 64.7%,15.1%,respectively.展开更多
Background Three-dimensional(3D)shape representation using mesh data is essential in various applications,such as virtual reality and simulation technologies.Current methods for extracting features from mesh edges or ...Background Three-dimensional(3D)shape representation using mesh data is essential in various applications,such as virtual reality and simulation technologies.Current methods for extracting features from mesh edges or faces struggle with complex 3D models because edge-based approaches miss global contexts and face-based methods overlook variations in adjacent areas,which affects the overall precision.To address these issues,we propose the Feature Discrimination and Context Propagation Network(FDCPNet),which is a novel approach that synergistically integrates local and global features in mesh datasets.Methods FDCPNet is composed of two modules:(1)the Feature Discrimination Module,which employs an attention mechanism to enhance the identification of key local features,and(2)the Context Propagation Module,which enriches key local features by integrating global contextual information,thereby facilitating a more detailed and comprehensive representation of crucial areas within the mesh model.Results Experiments on popular datasets validated the effectiveness of FDCPNet,showing an improvement in the classification accuracy over the baseline MeshNet.Furthermore,even with reduced mesh face numbers and limited training data,FDCPNet achieved promising results,demonstrating its robustness in scenarios of variable complexity.展开更多
Research indicates that microbe activity within the human body significantly influences health by being closely linked to various diseases.Accurately predicting microbe-disease interactions(MDIs)offers critical insigh...Research indicates that microbe activity within the human body significantly influences health by being closely linked to various diseases.Accurately predicting microbe-disease interactions(MDIs)offers critical insights for disease intervention and pharmaceutical research.Current advanced AI-based technologies automatically generate robust representations of microbes and diseases,enabling effective MDI predictions.However,these models continue to face significant challenges.A major issue is their reliance on complex feature extractors and classifiers,which substantially diminishes the models’generalizability.To address this,we introduce a novel graph autoencoder framework that utilizes decoupled representation learning and multi-scale information fusion strategies to efficiently infer potential MDIs.Initially,we randomly mask portions of the input microbe-disease graph based on Bernoulli distribution to boost self-supervised training and minimize noise-related performance degradation.Secondly,we employ decoupled representation learning technology,compelling the graph neural network(GNN)to independently learn the weights for each feature subspace,thus enhancing its expressive power.Finally,we implement multi-scale information fusion technology to amalgamate the multi-layer outputs of GNN,reducing information loss due to occlusion.Extensive experiments on public datasets demonstrate that our model significantly surpasses existing top MDI prediction models.This indicates that our model can accurately predict unknown MDIs and is likely to aid in disease discovery and precision pharmaceutical research.Code and data are accessible at:https://github.com/shmildsj/MDI-IFDRL.展开更多
Deep forgery detection technologies are crucial for image and video recognition tasks,with their performance heavily reliant on the features extracted from both real and fake images.However,most existing methods prima...Deep forgery detection technologies are crucial for image and video recognition tasks,with their performance heavily reliant on the features extracted from both real and fake images.However,most existing methods primarily focus on spatial domain features,which limits their accuracy.To address this limitation,we propose an adaptive dual-domain feature representation method for enhanced deep forgery detection.Specifically,an adaptive region dynamic convolution module is established to efficiently extract facial features from the spatial domain.Then,we introduce an adaptive frequency dynamic filter to capture effective frequency domain features.By fusing both spatial and frequency domain features,our approach significantly improves the accuracy of classifying real and fake facial images.Finally,experimental results on three real-world datasets validate the effectiveness of our dual-domain feature representation method,which substantially improves classification precision.展开更多
Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportatio...Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.展开更多
To minimize the low classification accuracy and low utilization of spatial information in traditional hyperspectral image classification methods, we propose a new hyperspectral image classification method, which is ba...To minimize the low classification accuracy and low utilization of spatial information in traditional hyperspectral image classification methods, we propose a new hyperspectral image classification method, which is based on the Gabor spatial texture features and nonparametric weighted spectral features, and the sparse representation classification method(Gabor–NWSF and SRC), abbreviated GNWSF–SRC. The proposed(GNWSF–SRC) method first combines the Gabor spatial features and nonparametric weighted spectral features to describe the hyperspectral image, and then applies the sparse representation method. Finally, the classification is obtained by analyzing the reconstruction error. We use the proposed method to process two typical hyperspectral data sets with different percentages of training samples. Theoretical analysis and simulation demonstrate that the proposed method improves the classification accuracy and Kappa coefficient compared with traditional classification methods and achieves better classification performance.展开更多
Feature based design has been regarded as a promising approach for CAD/CAM integration.This paper aims to establish a domain independent representation formalism for feature based design in three aspects: formal re...Feature based design has been regarded as a promising approach for CAD/CAM integration.This paper aims to establish a domain independent representation formalism for feature based design in three aspects: formal representation,design process model and design algorithm.The implementing scheme and formal description of feature taxonomy,feature operator,feature model validation and feature transformation are given in the paper.The feature based design process model suited for either sequencial or concurrent engineering is proposed and its application to product structural design and process plan design is presented. Some general design algorithms for developing feature based design system are also addressed.The proposed scheme provides a formal methodology elementary for feature based design system development and operation in a structural way.展开更多
In expression recognition, feature representation is critical for successful recognition since it contains distinctive information of expressions. In this paper, a new approach for representing facial expression featu...In expression recognition, feature representation is critical for successful recognition since it contains distinctive information of expressions. In this paper, a new approach for representing facial expression features is proposed with its objective to describe features in an effective and efficient way in order to improve the recognition performance. The method combines the facial action coding system(FACS) and 'uniform' local binary patterns(LBP) to represent facial expression features from coarse to fine. The facial feature regions are extracted by active shape models(ASM) based on FACS to obtain the gray-level texture. Then, LBP is used to represent expression features for enhancing the discriminant. A facial expression recognition system is developed based on this feature extraction method by using K nearest neighborhood(K-NN) classifier to recognize facial expressions. Finally, experiments are carried out to evaluate this feature extraction method. The significance of removing the unrelated facial regions and enhancing the discrimination ability of expression features in the recognition process is indicated by the results, in addition to its convenience.展开更多
Focused on the task of fast and accurate armored target detection in ground battlefield,a detection method based on multi-scale representation network(MS-RN) and shape-fixed Guided Anchor(SF-GA)scheme is proposed.Firs...Focused on the task of fast and accurate armored target detection in ground battlefield,a detection method based on multi-scale representation network(MS-RN) and shape-fixed Guided Anchor(SF-GA)scheme is proposed.Firstly,considering the large-scale variation and camouflage of armored target,a new MS-RN integrating contextual information in battlefield environment is designed.The MS-RN extracts deep features from templates with different scales and strengthens the detection ability of small targets.Armored targets of different sizes are detected on different representation features.Secondly,aiming at the accuracy and real-time detection requirements,improved shape-fixed Guided Anchor is used on feature maps of different scales to recommend regions of interests(ROIs).Different from sliding or random anchor,the SF-GA can filter out 80% of the regions while still improving the recall.A special detection dataset for armored target,named Armored Target Dataset(ARTD),is constructed,based on which the comparable experiments with state-of-art detection methods are conducted.Experimental results show that the proposed method achieves outstanding performance in detection accuracy and efficiency,especially when small armored targets are involved.展开更多
In modern electromagnetic environment, radar emitter signal recognition is an important research topic. On the basis of multi-resolution wavelet analysis, an adaptive radar emitter signal recognition method based on m...In modern electromagnetic environment, radar emitter signal recognition is an important research topic. On the basis of multi-resolution wavelet analysis, an adaptive radar emitter signal recognition method based on multi-scale wavelet entropy feature extraction and feature weighting was proposed. With the only priori knowledge of signal to noise ratio(SNR), the method of extracting multi-scale wavelet entropy features of wavelet coefficients from different received signals were combined with calculating uneven weight factor and stability weight factor of the extracted multi-dimensional characteristics. Radar emitter signals of different modulation types and different parameters modulated were recognized through feature weighting and feature fusion. Theoretical analysis and simulation results show that the presented algorithm has a high recognition rate. Additionally, when the SNR is greater than-4 d B, the correct recognition rate is higher than 93%. Hence, the proposed algorithm has great application value.展开更多
In the era of Big data,learning discriminant feature representation from network traffic is identified has as an invariably essential task for improving the detection ability of an intrusion detection system(IDS).Owin...In the era of Big data,learning discriminant feature representation from network traffic is identified has as an invariably essential task for improving the detection ability of an intrusion detection system(IDS).Owing to the lack of accurately labeled network traffic data,many unsupervised feature representation learning models have been proposed with state-of-theart performance.Yet,these models fail to consider the classification error while learning the feature representation.Intuitively,the learnt feature representation may degrade the performance of the classification task.For the first time in the field of intrusion detection,this paper proposes an unsupervised IDS model leveraging the benefits of deep autoencoder(DAE)for learning the robust feature representation and one-class support vector machine(OCSVM)for finding the more compact decision hyperplane for intrusion detection.Specially,the proposed model defines a new unified objective function to minimize the reconstruction and classification error simultaneously.This unique contribution not only enables the model to support joint learning for feature representation and classifier training but also guides to learn the robust feature representation which can improve the discrimination ability of the classifier for intrusion detection.Three set of evaluation experiments are conducted to demonstrate the potential of the proposed model.First,the ablation evaluation on benchmark dataset,NSL-KDD validates the design decision of the proposed model.Next,the performance evaluation on recent intrusion dataset,UNSW-NB15 signifies the stable performance of the proposed model.Finally,the comparative evaluation verifies the efficacy of the proposed model against recently published state-of-the-art methods.展开更多
Distributed denial of service(DDoS)attacks launch more and more frequently and are more destructive.Feature representation as an important part of DDoS defense technology directly affects the efficiency of defense.Mos...Distributed denial of service(DDoS)attacks launch more and more frequently and are more destructive.Feature representation as an important part of DDoS defense technology directly affects the efficiency of defense.Most DDoS feature extraction methods cannot fully utilize the information of the original data,resulting in the extracted features losing useful features.In this paper,a DDoS feature representation method based on deep belief network(DBN)is proposed.We quantify the original data by the size of the network flows,the distribution of IP addresses and ports,and the diversity of packet sizes of different protocols and train the DBN in an unsupervised manner by these quantified values.Two feedforward neural networks(FFNN)are initialized by the trained deep belief network,and one of the feedforward neural networks continues to be trained in a supervised manner.The canonical correlation analysis(CCA)method is used to fuse the features extracted by two feedforward neural networks per layer.Experiments show that compared with other methods,the proposed method can extract better features.展开更多
Feature extraction of signals plays an important role in classification problems because of data dimension reduction property and potential improvement of a classification accuracy rate. Principal component analysis (...Feature extraction of signals plays an important role in classification problems because of data dimension reduction property and potential improvement of a classification accuracy rate. Principal component analysis (PCA), wavelets transform or Fourier transform methods are often used for feature extraction. In this paper, we propose a multi-scale PCA, which combines discrete wavelet transform, and PCA for feature extraction of signals in both the spatial and temporal domains. Our study shows that the multi-scale PCA combined with the proposed new classification methods leads to high classification accuracy for the considered signals.展开更多
基金Supported by the Henan Province Key Research and Development Project(231111211300)the Central Government of Henan Province Guides Local Science and Technology Development Funds(Z20231811005)+2 种基金Henan Province Key Research and Development Project(231111110100)Henan Provincial Outstanding Foreign Scientist Studio(GZS2024006)Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan(Application and Overcoming Technical Barriers)(242103810028)。
文摘The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
基金supported by the National Natural Science Foundation of China(No.62376287)the International Science and Technology Innovation Joint Base of Machine Vision and Medical Image Processing in Hunan Province(2021CB1013)the Natural Science Foundation of Hunan Province(Nos.2022JJ30762,2023JJ70016).
文摘Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment.Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment.However,traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level.On the other hand,models that focus on global semantic-level information might overlook critical,subtle local pathological features.To address this issue,we propose an adaptive multi-scale feature fusion network called(AMSFuse),which can adaptively combine multi-scale global and local features without compromising their individual representation.Specifically,our model incorporates global features for extracting high-level contextual information from retinal images.Concurrently,local features capture fine-grained details,such as microaneurysms,hemorrhages,and exudates,which are critical for DR diagnosis.These global and local features are adaptively fused using a fusion block,followed by an Integrated Attention Mechanism(IAM)that refines the fused features by emphasizing relevant regions,thereby enhancing classification accuracy for DR classification.Our model achieves 86.3%accuracy on the APTOS dataset and 96.6%RFMiD,both of which are comparable to state-of-the-art methods.
基金funded by the China Chongqing Municipal Science and Technology Bureau,grant numbers 2024TIAD-CYKJCXX0121,2024NSCQ-LZX0135Chongqing Municipal Commission of Housing and Urban-Rural Development,grant number CKZ2024-87+3 种基金the Chongqing University of Technology graduate education high-quality development project,grant number gzlsz202401the Chongqing University of Technology-Chongqing LINGLUE Technology Co.,Ltd.,Electronic Information(Artificial Intelligence)graduate joint training basethe Postgraduate Education and Teaching Reform Research Project in Chongqing,grant number yjg213116the Chongqing University of Technology-CISDI Chongqing Information Technology Co.,Ltd.,Computer Technology graduate joint training base.
文摘Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size.Pathologists often refer to surrounding cells to identify abnormalities.To emulate this slide examination behavior,this study proposes a Multi-Scale Feature Fusion Network(MSFF-Net)for detecting cervical abnormal cells.MSFF-Net employs a Cross-Scale Pooling Model(CSPM)to effectively capture diverse features and contextual information,ranging from local details to the overall structure.Additionally,a Multi-Scale Fusion Attention(MSFA)module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales.To handle the complex environment of cervical cell images,such as cell adhesion and overlapping,the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes,thereby improving detection accuracy in such scenarios.Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision(mAP)of 63.2%,outperforming state-of-the-art methods while maintaining a relatively small number of parameters(26.8 M).This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells,contributing to more accurate and efficient cervical cancer screening.
基金the financial support from Natural Science Foundation of Gansu Province(Nos.22JR5RA217,22JR5RA216)Lanzhou Science and Technology Program(No.2022-2-111)+1 种基金Lanzhou University of Arts and Sciences School Innovation Fund Project(No.XJ2022000103)Lanzhou College of Arts and Sciences 2023 Talent Cultivation Quality Improvement Project(No.2023-ZL-jxzz-03)。
文摘Considering that the algorithm accuracy of the traditional sparse representation models is not high under the influence of multiple complex environmental factors,this study focuses on the improvement of feature extraction and model construction.Firstly,the convolutional neural network(CNN)features of the face are extracted by the trained deep learning network.Next,the steady-state and dynamic classifiers for face recognition are constructed based on the CNN features and Haar features respectively,with two-stage sparse representation introduced in the process of constructing the steady-state classifier and the feature templates with high reliability are dynamically selected as alternative templates from the sparse representation template dictionary constructed using the CNN features.Finally,the results of face recognition are given based on the classification results of the steady-state classifier and the dynamic classifier together.Based on this,the feature weights of the steady-state classifier template are adjusted in real time and the dictionary set is dynamically updated to reduce the probability of irrelevant features entering the dictionary set.The average recognition accuracy of this method is 94.45%on the CMU PIE face database and 96.58%on the AR face database,which is significantly improved compared with that of the traditional face recognition methods.
基金supported by Communication University of China(HG23035)partly supported by the Fundamental Research Funds for the Central Universities(CUC230A013).
文摘With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of multimodal approaches for fake news detection has gained significant attention.To solve the problems existing in previous multi-modal fake news detection algorithms,such as insufficient feature extraction and insufficient use of semantic relations between modes,this paper proposes the MFFFND-Co(Multimodal Feature Fusion Fake News Detection with Co-Attention Block)model.First,the model deeply explores the textual content,image content,and frequency domain features.Then,it employs a Co-Attention mechanism for cross-modal fusion.Additionally,a semantic consistency detectionmodule is designed to quantify semantic deviations,thereby enhancing the performance of fake news detection.Experimentally verified on two commonly used datasets,Twitter and Weibo,the model achieved F1 scores of 90.0% and 94.0%,respectively,significantly outperforming the pre-modified MFFFND(Multimodal Feature Fusion Fake News Detection with Attention Block)model and surpassing other baseline models.This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.
基金supported by National Natural Science Foundation of China(No.61862037)Lanzhou Jiaotong University Tianyou Innovation Team Project(No.TY202002)。
文摘To solve the problems of redundant feature information,the insignificant difference in feature representation,and low recognition accuracy of the fine-grained image,based on the ResNeXt50 model,an MSFResNet network model is proposed by fusing multi-scale feature information.Firstly,a multi-scale feature extraction module is designed to obtain multi-scale information on feature images by using different scales of convolution kernels.Meanwhile,the channel attention mechanism is used to increase the global information acquisition of the network.Secondly,the feature images processed by the multi-scale feature extraction module are fused with the deep feature images through short links to guide the full learning of the network,thus reducing the loss of texture details of the deep network feature images,and improving network generalization ability and recognition accuracy.Finally,the validity of the MSFResNet model is verified using public datasets and applied to wild mushroom identification.Experimental results show that compared with ResNeXt50 network model,the accuracy of the MSFResNet model is improved by 6.01%on the FGVC-Aircraft common dataset.It achieves 99.13%classification accuracy on the wild mushroom dataset,which is 0.47%higher than ResNeXt50.Furthermore,the experimental results of the thermal map show that the MSFResNet model significantly reduces the interference of background information,making the network focus on the location of the main body of wild mushroom,which can effectively improve the accuracy of wild mushroom identification.
基金supported by the National Key Research and Development Program of China(2023YFB3307800)National Natural Science Foundation of China(62394343,62373155)+2 种基金Major Science and Technology Project of Xinjiang(No.2022A01006-4)State Key Laboratory of Industrial Control Technology,China(Grant No.ICT2024A26)Fundamental Research Funds for the Central Universities.
文摘Deep Learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty.Due to the outstanding ability for high-level feature extraction,stacked autoencoder(SAE)has been widely used to improve the model accuracy of soft sensors.However,with the increase of network layers,SAE may encounter serious information loss issues,which affect the modeling performance of soft sensors.Besides,there are typically very few labeled samples in the data set,which brings challenges to traditional neural networks to solve.In this paper,a multi-scale feature fused stacked autoencoder(MFF-SAE)is suggested for feature representation related to hierarchical output,where stacked autoencoder,mutual information(MI)and multi-scale feature fusion(MFF)strategies are integrated.Based on correlation analysis between output and input variables,critical hidden variables are extracted from the original variables in each autoencoder's input layer,which are correspondingly given varying weights.Besides,an integration strategy based on multi-scale feature fusion is adopted to mitigate the impact of information loss with the deepening of the network layers.Then,the MFF-SAE method is designed and stacked to form deep networks.Two practical industrial processes are utilized to evaluate the performance of MFF-SAE.Results from simulations indicate that in comparison to other cutting-edge techniques,the proposed method may considerably enhance the accuracy of soft sensor modeling,where the suggested method reduces the root mean square error(RMSE)by 71.8%,17.1%and 64.7%,15.1%,respectively.
基金Supported by the National Key R&D Program of China(2022YFC3803600).
文摘Background Three-dimensional(3D)shape representation using mesh data is essential in various applications,such as virtual reality and simulation technologies.Current methods for extracting features from mesh edges or faces struggle with complex 3D models because edge-based approaches miss global contexts and face-based methods overlook variations in adjacent areas,which affects the overall precision.To address these issues,we propose the Feature Discrimination and Context Propagation Network(FDCPNet),which is a novel approach that synergistically integrates local and global features in mesh datasets.Methods FDCPNet is composed of two modules:(1)the Feature Discrimination Module,which employs an attention mechanism to enhance the identification of key local features,and(2)the Context Propagation Module,which enriches key local features by integrating global contextual information,thereby facilitating a more detailed and comprehensive representation of crucial areas within the mesh model.Results Experiments on popular datasets validated the effectiveness of FDCPNet,showing an improvement in the classification accuracy over the baseline MeshNet.Furthermore,even with reduced mesh face numbers and limited training data,FDCPNet achieved promising results,demonstrating its robustness in scenarios of variable complexity.
基金supported by the Natural Science Foundation of Wenzhou University of Technology,China(Grant No.:ky202211).
文摘Research indicates that microbe activity within the human body significantly influences health by being closely linked to various diseases.Accurately predicting microbe-disease interactions(MDIs)offers critical insights for disease intervention and pharmaceutical research.Current advanced AI-based technologies automatically generate robust representations of microbes and diseases,enabling effective MDI predictions.However,these models continue to face significant challenges.A major issue is their reliance on complex feature extractors and classifiers,which substantially diminishes the models’generalizability.To address this,we introduce a novel graph autoencoder framework that utilizes decoupled representation learning and multi-scale information fusion strategies to efficiently infer potential MDIs.Initially,we randomly mask portions of the input microbe-disease graph based on Bernoulli distribution to boost self-supervised training and minimize noise-related performance degradation.Secondly,we employ decoupled representation learning technology,compelling the graph neural network(GNN)to independently learn the weights for each feature subspace,thus enhancing its expressive power.Finally,we implement multi-scale information fusion technology to amalgamate the multi-layer outputs of GNN,reducing information loss due to occlusion.Extensive experiments on public datasets demonstrate that our model significantly surpasses existing top MDI prediction models.This indicates that our model can accurately predict unknown MDIs and is likely to aid in disease discovery and precision pharmaceutical research.Code and data are accessible at:https://github.com/shmildsj/MDI-IFDRL.
基金supported in part by the National Natural Science Foundation of China under No.12401679the Nature Science Foundation of the Jiangsu Higher Education Institutions of China under No.23KJB520006the Haizhou Bay Talent Innovation Program of Jiangsu Ocean University under No.PD2024026。
文摘Deep forgery detection technologies are crucial for image and video recognition tasks,with their performance heavily reliant on the features extracted from both real and fake images.However,most existing methods primarily focus on spatial domain features,which limits their accuracy.To address this limitation,we propose an adaptive dual-domain feature representation method for enhanced deep forgery detection.Specifically,an adaptive region dynamic convolution module is established to efficiently extract facial features from the spatial domain.Then,we introduce an adaptive frequency dynamic filter to capture effective frequency domain features.By fusing both spatial and frequency domain features,our approach significantly improves the accuracy of classifying real and fake facial images.Finally,experimental results on three real-world datasets validate the effectiveness of our dual-domain feature representation method,which substantially improves classification precision.
基金funded by the Deanship of Scientific Research at Northern Border University,Arar,Saudi Arabia through research group No.(RG-NBU-2022-1234).
文摘Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
基金supported by the National Natural Science Foundation of China(No.61275010)the Ph.D.Programs Foundation of Ministry of Education of China(No.20132304110007)+1 种基金the Heilongjiang Natural Science Foundation(No.F201409)the Fundamental Research Funds for the Central Universities(No.HEUCFD1410)
文摘To minimize the low classification accuracy and low utilization of spatial information in traditional hyperspectral image classification methods, we propose a new hyperspectral image classification method, which is based on the Gabor spatial texture features and nonparametric weighted spectral features, and the sparse representation classification method(Gabor–NWSF and SRC), abbreviated GNWSF–SRC. The proposed(GNWSF–SRC) method first combines the Gabor spatial features and nonparametric weighted spectral features to describe the hyperspectral image, and then applies the sparse representation method. Finally, the classification is obtained by analyzing the reconstruction error. We use the proposed method to process two typical hyperspectral data sets with different percentages of training samples. Theoretical analysis and simulation demonstrate that the proposed method improves the classification accuracy and Kappa coefficient compared with traditional classification methods and achieves better classification performance.
文摘Feature based design has been regarded as a promising approach for CAD/CAM integration.This paper aims to establish a domain independent representation formalism for feature based design in three aspects: formal representation,design process model and design algorithm.The implementing scheme and formal description of feature taxonomy,feature operator,feature model validation and feature transformation are given in the paper.The feature based design process model suited for either sequencial or concurrent engineering is proposed and its application to product structural design and process plan design is presented. Some general design algorithms for developing feature based design system are also addressed.The proposed scheme provides a formal methodology elementary for feature based design system development and operation in a structural way.
基金supported by National Natural Science Foundation of China(No.61273339)
文摘In expression recognition, feature representation is critical for successful recognition since it contains distinctive information of expressions. In this paper, a new approach for representing facial expression features is proposed with its objective to describe features in an effective and efficient way in order to improve the recognition performance. The method combines the facial action coding system(FACS) and 'uniform' local binary patterns(LBP) to represent facial expression features from coarse to fine. The facial feature regions are extracted by active shape models(ASM) based on FACS to obtain the gray-level texture. Then, LBP is used to represent expression features for enhancing the discriminant. A facial expression recognition system is developed based on this feature extraction method by using K nearest neighborhood(K-NN) classifier to recognize facial expressions. Finally, experiments are carried out to evaluate this feature extraction method. The significance of removing the unrelated facial regions and enhancing the discrimination ability of expression features in the recognition process is indicated by the results, in addition to its convenience.
基金supported by the National Key Research and Development Program of China under grant 2016YFC0802904National Natural Science Foundation of China under grant61671470the Postdoctoral Science Foundation Funded Project of China under grant 2017M623423。
文摘Focused on the task of fast and accurate armored target detection in ground battlefield,a detection method based on multi-scale representation network(MS-RN) and shape-fixed Guided Anchor(SF-GA)scheme is proposed.Firstly,considering the large-scale variation and camouflage of armored target,a new MS-RN integrating contextual information in battlefield environment is designed.The MS-RN extracts deep features from templates with different scales and strengthens the detection ability of small targets.Armored targets of different sizes are detected on different representation features.Secondly,aiming at the accuracy and real-time detection requirements,improved shape-fixed Guided Anchor is used on feature maps of different scales to recommend regions of interests(ROIs).Different from sliding or random anchor,the SF-GA can filter out 80% of the regions while still improving the recall.A special detection dataset for armored target,named Armored Target Dataset(ARTD),is constructed,based on which the comparable experiments with state-of-art detection methods are conducted.Experimental results show that the proposed method achieves outstanding performance in detection accuracy and efficiency,especially when small armored targets are involved.
基金Project(61301095)supported by the National Natural Science Foundation of ChinaProject(QC2012C070)supported by Heilongjiang Provincial Natural Science Foundation for the Youth,ChinaProjects(HEUCF130807,HEUCFZ1129)supported by the Fundamental Research Funds for the Central Universities of China
文摘In modern electromagnetic environment, radar emitter signal recognition is an important research topic. On the basis of multi-resolution wavelet analysis, an adaptive radar emitter signal recognition method based on multi-scale wavelet entropy feature extraction and feature weighting was proposed. With the only priori knowledge of signal to noise ratio(SNR), the method of extracting multi-scale wavelet entropy features of wavelet coefficients from different received signals were combined with calculating uneven weight factor and stability weight factor of the extracted multi-dimensional characteristics. Radar emitter signals of different modulation types and different parameters modulated were recognized through feature weighting and feature fusion. Theoretical analysis and simulation results show that the presented algorithm has a high recognition rate. Additionally, when the SNR is greater than-4 d B, the correct recognition rate is higher than 93%. Hence, the proposed algorithm has great application value.
基金This work was supported by the Research Deanship of Prince Sattam Bin Abdulaziz University,Al-Kharj,Saudi Arabia(Grant No.2020/01/17215).Also,the author thanks Deanship of college of computer engineering and sciences for technical support provided to complete the project successfully。
文摘In the era of Big data,learning discriminant feature representation from network traffic is identified has as an invariably essential task for improving the detection ability of an intrusion detection system(IDS).Owing to the lack of accurately labeled network traffic data,many unsupervised feature representation learning models have been proposed with state-of-theart performance.Yet,these models fail to consider the classification error while learning the feature representation.Intuitively,the learnt feature representation may degrade the performance of the classification task.For the first time in the field of intrusion detection,this paper proposes an unsupervised IDS model leveraging the benefits of deep autoencoder(DAE)for learning the robust feature representation and one-class support vector machine(OCSVM)for finding the more compact decision hyperplane for intrusion detection.Specially,the proposed model defines a new unified objective function to minimize the reconstruction and classification error simultaneously.This unique contribution not only enables the model to support joint learning for feature representation and classifier training but also guides to learn the robust feature representation which can improve the discrimination ability of the classifier for intrusion detection.Three set of evaluation experiments are conducted to demonstrate the potential of the proposed model.First,the ablation evaluation on benchmark dataset,NSL-KDD validates the design decision of the proposed model.Next,the performance evaluation on recent intrusion dataset,UNSW-NB15 signifies the stable performance of the proposed model.Finally,the comparative evaluation verifies the efficacy of the proposed model against recently published state-of-the-art methods.
基金supported by the National Natural Science Foundation of Hainan(2018CXTD333,617048)National Natural Science Foundation of China(61762033,61702539)+4 种基金The National Natural Science Foundation of Hunan(2018JJ3611)Social Development Project of Public Welfare Technology Application of Zhejiang Province(LGF18F020019)Hainan University Doctor Start Fund Project(kyqd1328)Hainan University Youth Fund Project(qnjj1444)State Key Laboratory of Marine Resource Utilization in South China Sea Funding.
文摘Distributed denial of service(DDoS)attacks launch more and more frequently and are more destructive.Feature representation as an important part of DDoS defense technology directly affects the efficiency of defense.Most DDoS feature extraction methods cannot fully utilize the information of the original data,resulting in the extracted features losing useful features.In this paper,a DDoS feature representation method based on deep belief network(DBN)is proposed.We quantify the original data by the size of the network flows,the distribution of IP addresses and ports,and the diversity of packet sizes of different protocols and train the DBN in an unsupervised manner by these quantified values.Two feedforward neural networks(FFNN)are initialized by the trained deep belief network,and one of the feedforward neural networks continues to be trained in a supervised manner.The canonical correlation analysis(CCA)method is used to fuse the features extracted by two feedforward neural networks per layer.Experiments show that compared with other methods,the proposed method can extract better features.
文摘Feature extraction of signals plays an important role in classification problems because of data dimension reduction property and potential improvement of a classification accuracy rate. Principal component analysis (PCA), wavelets transform or Fourier transform methods are often used for feature extraction. In this paper, we propose a multi-scale PCA, which combines discrete wavelet transform, and PCA for feature extraction of signals in both the spatial and temporal domains. Our study shows that the multi-scale PCA combined with the proposed new classification methods leads to high classification accuracy for the considered signals.