Anomaly Detection (AD) has been extensively adopted in industrial settings to facilitate quality control of products. It is critical to industrial production, especially in areas such as aircraft manufacturing, which require strict part qualification rates. Although it is more efficient and practical, few-shot AD has not been well explored. Existing AD methods extract features in only a single frequency domain, whereas defects exist in multiple frequency domains. Moreover, current methods have not fully leveraged the few-shot support samples to extract input-related normal patterns. To address these issues, we propose an industrial few-shot AD method, Feature Extender for Anomaly Detection (FEAD), which extracts normal patterns in multiple frequency domains from few-shot samples under the guidance of the input sample. First, to achieve better coverage of normal patterns in the input sample, we introduce a Sample-Conditioned Transformation Module (SCTM), which transforms support features under the guidance of the input sample to obtain extra normal patterns. Second, to effectively distinguish and localize anomaly patterns in multiple frequency domains, we devise an Adaptive Descriptor Construction Module (ADCM) to adaptively build and select pattern descriptors across a series of frequencies. Finally, an auxiliary task for SCTM is designed to ensure the diversity of transformations and to include more normal patterns in the support features. Extensive experiments on two widely used industrial AD datasets (MVTec-AD and VisA) demonstrate the effectiveness of the proposed FEAD.
Remote sensing plays a pivotal role in environmental monitoring, disaster relief, and urban planning, where accurate scene classification of aerial images is essential. However, conventional convolutional neural networks (CNNs) struggle with long-range dependencies and with preserving high-resolution features, limiting their effectiveness in complex aerial image analysis. To address these challenges, we propose a Hybrid HRNet-Swin Transformer model that combines the strengths of HRNet-W48 for high-resolution segmentation and the Swin Transformer for global feature extraction. This hybrid architecture ensures robust multi-scale feature fusion, capturing fine-grained details and broader contextual relationships in aerial imagery. Our methodology begins with preprocessing steps, including normalization, histogram equalization, and noise reduction, to enhance input data quality. The HRNet-W48 backbone maintains high-resolution feature maps throughout the network, enabling precise segmentation, while the Swin Transformer leverages hierarchical self-attention to model long-range dependencies efficiently. By integrating these components, our model achieves superior performance in segmentation and classification tasks compared to traditional CNNs and standalone transformer models. We evaluate our approach on two benchmark datasets: UC Merced and WHU-RS19. Experimental results demonstrate that the proposed hybrid model outperforms existing methods, achieving state-of-the-art accuracy while maintaining computational efficiency. Specifically, it excels in preserving fine spatial details and contextual understanding, which are critical for applications such as land-use classification and disaster assessment.
Hematoxylin and Eosin (H&E) images, widely used in digital pathology, often pose challenges due to their limited color richness, which hinders the differentiation of subtle cell features crucial for accurate classification. Enhancing the visibility of these elusive cell features helps train robust deep-learning models. However, the selection and application of image processing techniques for such enhancement have not been systematically explored in the research community. To address this challenge, we introduce Salient Features Guided Augmentation (SFGA), an approach that strategically integrates machine learning and image processing. SFGA uses machine learning algorithms to identify crucial features within cell images and then maps these features to appropriate image processing techniques to enhance training images. By emphasizing salient features and aligning them with corresponding image processing methods, SFGA is designed to enhance the discriminating power of deep learning models in cell classification tasks. Our research undertakes a series of experiments, each exploring the performance of different datasets and data enhancement techniques in classifying cell types, highlighting the significance of data quality and enhancement in mitigating overfitting and distinguishing cell characteristics. Specifically, SFGA focuses on identifying tumor cells from tissue for extranodal extension detection, with the SFGA-enhanced dataset showing notable advantages in accuracy. We conducted a preliminary study of five experiments, among which the accuracy of the pleomorphism experiment improved significantly from 50.81% to 95.15%. The accuracy of the other four experiments also increased, with improvements ranging from 3 to 43 percentage points. Our preliminary study shows the potential to enhance the diagnostic accuracy of deep learning models and proposes a systematic approach that could improve cancer diagnosis, serving as a first step in using SFGA for medical image enhancement.
Detecting cyber attacks in Internet of Things (IoT) networks is of utmost importance because of the growing vulnerabilities in smart environments. Conventional models, such as Naive Bayes and support vector machine (SVM), as well as ensemble methods, such as Gradient Boosting and eXtreme gradient boosting (XGBoost), are often plagued by high computational costs, which makes real-time detection challenging. We therefore propose an attack detection approach that integrates Visual Geometry Group 16 (VGG16), the Artificial Rabbits Optimizer (ARO), and a Random Forest model to increase detection accuracy and operational efficiency in IoT networks. In the proposed model, VGG16 extracts features from malware images, and the random forest model makes predictions from the extracted features. Additionally, ARO is used to tune the hyperparameters of the random forest model. With an accuracy of 96.36%, the proposed model outperforms the standard models in terms of accuracy, F1-score, precision, and recall. The comparative study highlights the success of our strategy, which improves performance while maintaining a lower computational cost. The method is both effective and well suited to real-time applications.
[Objective] Accurate prediction of tomato growth height is crucial for optimizing production environments in smart farming. However, current prediction methods predominantly rely on empirical, mechanistic, or learning-based models that use either image data or environmental data. These methods fail to fully leverage multi-modal data to capture the diverse aspects of plant growth comprehensively. [Methods] To address this limitation, a two-stage phenotypic feature extraction (PFE) model based on the deep learning algorithms of recurrent neural network (RNN) and long short-term memory (LSTM) was developed. The model integrated environment and plant information to provide a holistic view of the growth process and employed phenotypic and temporal feature extractors to capture both types of features comprehensively. This enabled a deeper understanding of the interaction between tomato plants and their environment, ultimately leading to highly accurate predictions of growth height. [Results and Discussions] The experimental results showed the model's effectiveness: when predicting the next two days based on the past five days, the PFE-based RNN and LSTM models achieved mean absolute percentage errors (MAPE) of 0.81% and 0.40%, respectively, which were significantly lower than the 8.00% MAPE of the large language model (LLM) and the 6.72% MAPE of the Transformer-based model. In longer-term predictions (using the past 10 days to predict the next 4 days, and the past 30 days to predict the next 12 days), the PFE-RNN model continued to outperform the other two baseline models, with MAPE of 2.66% and 14.05%, respectively. [Conclusions] The proposed method, which leverages phenotypic-temporal collaboration, shows great potential for intelligent, data-driven management of tomato cultivation, making it a promising approach for enhancing the efficiency and precision of smart tomato planting management.
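The MAPE figures quoted above follow the standard definition, the mean of |actual - predicted| / |actual| expressed in percent. A minimal sketch of that metric is given here; the function name `mape` and the sample heights are illustrative assumptions, not values from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical plant heights in cm (illustrative only, not data from the paper).
heights_true = [100.0, 110.0]
heights_pred = [99.0, 112.2]
error = mape(heights_true, heights_pred)  # relative errors of 1% and 2% average to 1.5%
```

Because each error is divided by the actual value, the metric is scale-free, which is why it suits comparisons across prediction horizons of different lengths.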
Background: Early and accurate diagnosis of cataracts, which rank among the leading preventable causes of blindness, is critical to securing positive outcomes for patients. Recently, eye image analyses have used deep learning (DL) approaches to automate cataract classification more precisely, leading to the development of the Multiscale Parallel Feature Aggregation Network with Attention Fusion (MPFAN-AF). To improve model performance, this approach applies multiscale feature extraction, parallel feature fusion, and attention-based fusion to sharpen its focus on the salient features that are crucial in detecting cataracts. Methods: Coarse-level features are captured by convolutional layers, and these features are refined through layered kernels of varying sizes. Moreover, parallel feature aggregation allows the method to capture the diverse representations of cataracts. The model was trained and tested on the Cataract Eye Dataset available on Kaggle, which contains 612 labelled images of eyes with and without cataracts (normal vs. pathological). Results: The proposed model achieved greater precision than traditional convolutional neural network (CNN) models, with a classification accuracy of 97.52%. The ablation studies validated that every component added value to the prediction process, particularly the attention fusion module. Conclusion: The MPFAN-AF model demonstrates high efficiency together with interpretability and shows promise as an integrated solution for real-time mobile cataract screening systems. Standard performance indicators suggest that AI-based ophthalmology tools have a promising future for use in remote settings that lack medical resources.
Medical visual question answering (MedVQA) faces unique challenges due to the high precision required for images and the specialized nature of the questions. These challenges include insufficient feature extraction capabilities, a lack of textual priors, and incomplete information fusion and interaction. To address these issues, this paper proposes an enhanced bootstrapping language-image pre-training (BLIP) model for MedVQA based on multimodal feature augmentation and triple-path collaborative attention (FCA-BLIP). First, FCA-BLIP employs a unified bootstrap multimodal model architecture that integrates ResNet and bidirectional encoder representations from Transformers (BERT) models to enhance feature extraction capabilities. This enables a more precise analysis of the details in images and questions. Next, the pre-trained BLIP model is used to extract features from image-text sample pairs, so that the model can capture the semantic relationships and shared information between images and text. Finally, a novel attention structure is developed to fuse the multimodal feature vectors, thereby improving the alignment accuracy between modalities. Experimental results demonstrate that the proposed method performs well in clinical visual question-answering tasks. For the MedVQA task of staging diabetic macular edema in fundus imaging, the proposed method outperforms the existing major models on several performance metrics.
Recent advancements in smart-meter technology are transforming traditional power systems into intelligent smart grids. This transformation offers substantial benefits across social, environmental, and economic dimensions. To effectively realize these advantages, fine-grained collection and analysis of smart meter data is essential. However, the high dimensionality and volume of such time series present significant challenges, including increased computational load, data transmission overhead, latency, and complexity in real-time analysis. This study proposes a novel, computationally efficient framework for feature extraction and selection tailored to smart meter time-series data. The approach begins with an extensive offline analysis, in which features are derived from multiple domains (time, frequency, and statistical) to capture diverse signal characteristics. Various feature sets are fused and evaluated using robust machine learning classifiers to identify the most informative combinations for automated appliance categorization. The best-performing fused feature set is further refined using Analysis of Variance (ANOVA) to identify the most discriminative features. The mathematical models used to compute the selected features are optimized so that they can be extracted with computational efficiency during online processing. Moreover, a notable dimension reduction is achieved, which facilitates data storage, transmission, and post-processing. Subsequently, a specifically designed LogitBoost (LB) based ensemble of Random Forest base learners is used for automated classification. The proposed solution demonstrates a high classification accuracy (97.93%) for a nine-class problem and a 17.33-fold dimension reduction with minimal front-end computational requirements, making it well suited for real-world applications in smart grid environments.
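The ANOVA-based refinement step can be illustrated with the one-way F-statistic, which compares between-class variance of a feature to its within-class variance. The sketch below is an assumption about how such a ranking might look, not the authors' implementation; the function name `anova_f`, the feature names, and the sample values are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for one feature, given its values per class."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two classes and more samples than classes")
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        class_mean = sum(g) / len(g)
        ss_between += len(g) * (class_mean - grand_mean) ** 2
        ss_within += sum((x - class_mean) ** 2 for x in g)
    if ss_within == 0.0:
        # Classes are perfectly separated, or the feature is constant everywhere.
        return float("inf") if ss_between > 0.0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical feature values grouped by appliance class (three classes here).
features = {
    "mean_power": [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]],
    "noise_floor": [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]],
}
f_scores = {name: anova_f(groups) for name, groups in features.items()}
# Features sorted from most to least discriminative.
ranking = sorted(f_scores, key=f_scores.get, reverse=True)
```

A feature whose class means coincide scores zero and would be dropped first, which is the mechanism behind the dimension reduction described above.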
This study proposes a multi-scene smoke detection algorithm based on a multi-feature extraction method to address the problems of varying smoke shapes in different scenes, difficulty in locating and detecting translucent smoke, and variable smoke scales. First, the feature extraction convolution module in the YOLOv5s backbone network is replaced with an asymmetric convolution block re-parameterization convolution to improve the detection of smoke with different shapes. Then, a coordinate attention mechanism is introduced in the deeper layers of the backbone network to further improve the localization of translucent smoke. Finally, the detection of smoke at different scales is further improved by using a feature pyramid convolution module instead of the standard convolution module of the feature pyramid in the model. The experimental results demonstrate the feasibility and superiority of the proposed model for multi-scene smoke detection.
The rapid development of the electricity retail market has prompted an increasing number of electricity consumers to sign green electricity contracts with retail electricity companies, which poses greater challenges for market services aimed at green energy consumers. This study proposes a two-stage feature extraction approach for green energy consumers that leverages clustering and term frequency-inverse document frequency (TF-IDF) algorithms within a knowledge graph framework to provide an information basis that supports the green development of the retail electricity market. First, the multi-source heterogeneous data of green energy consumers in an actual market environment are systematically introduced, and the information is categorized into discrete, interval, and relational features. In the first stage, a clustering algorithm is employed to extract features of the trading behavior of green energy consumers from the parameter data of green retail electricity contracts. In the second stage, the TF-IDF algorithm is applied to extract features for green energy consumers in different clusters. Finally, the effectiveness of the proposed approach is validated on actual operational data from a southern province of China. The results show that the most significant discrepancy between the retail trading behaviors of green energy consumers is the power share of green retail packages, whose average values are 25.64%, 50%, 39.66%, and 24.89% in four different clusters, respectively. Additionally, power supply bureaus and electricity retail companies affect the behavior of green energy consumers most significantly.
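The second stage can be pictured by treating each cluster as a "document" and each consumer attribute as a "term", so that TF-IDF highlights attributes that are frequent in one cluster but rare across clusters. The following is a minimal sketch under that assumption; the cluster names, attribute tags, and the plain `tf * log(N / df)` weighting are illustrative choices, not details taken from the study.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {doc_name: {term: tf-idf score}} using tf * log(N / df)."""
    n_docs = len(docs)
    doc_freq = Counter()
    for terms in docs.values():
        doc_freq.update(set(terms))  # count each term once per document
    scores = {}
    for name, terms in docs.items():
        counts = Counter(terms)
        total = len(terms)
        scores[name] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


# Hypothetical attribute tags of consumers assigned to each cluster.
clusters = {
    "cluster_0": ["industrial", "industrial", "long_term"],
    "cluster_1": ["commercial", "long_term"],
}
scores = tf_idf(clusters)
```

An attribute shared by every cluster, such as `long_term` here, receives a score of zero, so only cluster-specific attributes stand out as distinguishing features.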
Real-time detection of surface defects on cables is crucial for ensuring the safe operation of power systems. However, existing methods struggle with small target sizes, complex backgrounds, low-quality image acquisition, and interference from contamination. To address these challenges, this paper proposes the Real-time Cable Defect Detection Network (RC2DNet), which achieves an optimal balance between detection accuracy and computational efficiency. Unlike conventional approaches, RC2DNet introduces a small object feature extraction module that enhances the semantic representation of small targets through feature pyramids, multi-level feature fusion, and an adaptive weighting mechanism. Additionally, a boundary feature enhancement module is designed, incorporating boundary-aware convolution, a novel boundary attention mechanism, and an improved loss function to significantly enhance boundary localization accuracy. Experimental results demonstrate that RC2DNet outperforms state-of-the-art methods in precision, recall, F1-score, mean Intersection over Union (mIoU), and frame rate, enabling real-time and highly accurate cable defect detection in complex backgrounds.
Because the accuracy of traditional sparse representation models is low under the influence of multiple complex environmental factors, this study focuses on improving feature extraction and model construction. First, the convolutional neural network (CNN) features of the face are extracted by a trained deep learning network. Next, the steady-state and dynamic classifiers for face recognition are constructed based on the CNN features and Haar features, respectively. Two-stage sparse representation is introduced in the construction of the steady-state classifier, and feature templates with high reliability are dynamically selected as alternative templates from the sparse representation template dictionary constructed using the CNN features. Finally, the face recognition result is obtained by combining the classification results of the steady-state classifier and the dynamic classifier. On this basis, the feature weights of the steady-state classifier template are adjusted in real time and the dictionary set is dynamically updated to reduce the probability of irrelevant features entering the dictionary set. The average recognition accuracy of this method is 94.45% on the CMU PIE face database and 96.58% on the AR face database, a significant improvement over traditional face recognition methods.
Recent advances in convolutional neural networks (CNNs) have fostered progress in object recognition and semantic segmentation, which in turn has improved the performance of hyperspectral image (HSI) classification. Nevertheless, the difficulty of high-dimensional feature extraction and the shortage of training samples seriously hinder the future development of HSI classification. In this paper, we propose a novel algorithm for HSI classification based on a three-dimensional (3D) CNN and a feature pyramid network (FPN), called 3D-FPN. The framework contains a principal component analysis, a feature extraction structure, and a logistic regression. Specifically, the FPN built with 3D convolutions not only retains the advantage of 3D convolution in fully extracting spectral-spatial feature maps, but also concentrates on more detailed information and performs multi-scale feature fusion. This method avoids excessive model complexity and is suitable for small-sample hyperspectral classification with varying categories and spatial resolutions. To test the performance of the proposed 3D-FPN method, rigorous experimental analysis was performed on three public hyperspectral data sets and on hyperspectral data from the GF-5 satellite. Quantitative and qualitative results indicate that the proposed method attains the best performance among current state-of-the-art end-to-end deep learning-based methods.
1 Introduction Sound event detection (SED) aims to identify and locate specific sound event categories and their corresponding timestamps within continuous audio streams. To overcome the limitations posed by the scarcity of strongly labeled training data, researchers have increasingly turned to semi-supervised learning (SSL) [1], which leverages unlabeled data to augment training and improve detection performance. Among many SSL methods [2-4].
Advanced Persistent Threats (APTs) are difficult to detect due to their "low-and-slow" attack patterns and frequent use of zero-day vulnerabilities. Within this task, the extraction of long-term features is often crucial. In this work, we propose a novel end-to-end APT detection framework named Long-Term Feature Association Provenance Graph Detector (LT-ProveGD). Specifically, LT-ProveGD encodes contextual information of the dynamic provenance graph while preserving the topological information with space efficiency. To combat "low-and-slow" attacks, LT-ProveGD develops an autoencoder with an integrated multi-head attention mechanism to extract long-term dependencies within the encoded representations. Furthermore, to facilitate the detection of previously unknown attacks, we leverage the Jenks natural breaks method, enabling detection without relying on specific attack information. By conducting extensive experiments on five widely used datasets against state-of-the-art attack detection methods, we demonstrate the superior effectiveness of LT-ProveGD.
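Jenks natural breaks chooses class boundaries that minimize the within-class sum of squared deviations. As a rough illustration of how it can set a detection threshold without attack labels, the sketch below handles only the simplified two-class, one-dimensional case by exhaustive search. The two-class restriction, the function name, and the reconstruction-error values are assumptions made for illustration and are not taken from the paper.

```python
def two_class_jenks_break(values):
    """Upper bound of the lower class for a two-class Jenks split of 1-D data."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values to split")

    def squared_deviation(chunk):
        mean = sum(chunk) / len(chunk)
        return sum((x - mean) ** 2 for x in chunk)

    best_cost = None
    best_break = None
    # Try every split point and keep the one with the smallest total
    # within-class squared deviation.
    for i in range(1, len(data)):
        cost = squared_deviation(data[:i]) + squared_deviation(data[i:])
        if best_cost is None or cost < best_cost:
            best_cost = cost
            best_break = data[i - 1]
    return best_break


# Hypothetical autoencoder reconstruction errors (illustrative values).
errors = [0.10, 0.12, 0.11, 0.13, 0.90, 0.95]
threshold = two_class_jenks_break(errors)
flagged = [e for e in errors if e > threshold]
```

The threshold is derived from the score distribution alone, which matches the stated goal of detecting unknown attacks without attack-specific information.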
The success of robot-assisted pelvic fracture reduction surgery relies heavily on the accuracy of 3D/3D feature-based registration. This process involves extracting anatomical feature points from pre-operative 3D images, which can be challenging because of the complex and variable structure of the pelvis. PointMLP_RegNet, a modified PointMLP, was introduced to address this issue. It retains the feature extraction module of PointMLP but replaces the classification layer with a regression layer to predict the coordinates of feature points instead of performing conventional classification. A workflow for automatic feature point extraction was presented, and a series of experiments was conducted on a clinical pelvic dataset to confirm the accuracy and effectiveness of the method. PointMLP_RegNet extracted feature points accurately, with 8 out of 10 points showing errors of less than 4 mm and the remaining two less than 5 mm. Compared to PointNet++ and PointNet, it exhibited higher accuracy, robustness, and space efficiency. The proposed method will improve the accuracy of anatomical feature point extraction, enhance intra-operative registration precision, and facilitate the widespread clinical application of robot-assisted pelvic fracture reduction.
To address the issues of unknown target size, blurred edges, background interference, and low contrast in infrared small target detection, this paper proposes a method based on density peak searching and weighted multi-feature local difference. First, an improved high-boost filter is used for preprocessing to eliminate background clutter and high-brightness interference, thereby increasing the probability of capturing real targets in the density peak search. Second, a triple-layer window is used to extract features from the area surrounding candidate targets, addressing the uncertainty of small target sizes. By calculating multi-feature local differences between the layers of the triple-layer window, the problems of blurred target edges and low contrast are resolved. To balance the contribution of different features, intra-class distance is used to calculate weights, achieving weighted fusion of the multi-feature local differences of candidate targets. The real targets are then extracted using the interquartile range. Experiments on datasets such as SIRST and IRSTD-1k show that the proposed method is suitable for various complex scene types and demonstrates good robustness and detection performance.
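The final interquartile-range step can be sketched as an outlier test on the fused saliency scores of the candidate targets. The abstract does not state the exact rule, so the sketch below assumes the common Tukey upper fence, Q3 + 1.5 × IQR; the function names and the scores are hypothetical.

```python
import statistics


def iqr_upper_fence(values, k=1.5):
    """Tukey upper fence Q3 + k * IQR (k = 1.5 is an assumed default)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + k * (q3 - q1)


def select_targets(saliency, k=1.5):
    """Indices of candidates whose saliency exceeds the upper fence."""
    fence = iqr_upper_fence(saliency, k)
    return [i for i, score in enumerate(saliency) if score > fence]


# Hypothetical weighted multi-feature local differences of eight candidates.
candidate_saliency = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 9.5]
target_indices = select_targets(candidate_saliency)
```

Because quartiles are insensitive to a few extreme values, the fence stays close to the background level even when a strong target is present, so no fixed global threshold is required.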
Nonlinear time series analysis is based on the theory of phase space reconstruction. It studies how to transform a response signal into a reconstructed phase space in order to extract dynamic feature information, and it provides an effective approach for nonlinear signal analysis and fault diagnosis of nonlinear dynamic systems. It has already become an important branch of nonlinear science. However, traditional methods cannot extract chaos features automatically and require human participation throughout the process. A new method is put forward that automatically extracts chaos features from nonlinear time series. First, the time delay τ is determined by the autocorrelation method. Second, the embedding dimension m and the correlation dimension D are computed. Third, the maximum Lyapunov exponent λmax is computed. Finally, the chaos degree Dch of the Poincaré map and the non-circle degree Dnc and non-order degree Dno of the quasi-phase orbit are calculated. Chaos feature extraction is important for fault diagnosis of nonlinear systems based on nonlinear chaos features. Examples show the validity of the proposed method.
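The first step, choosing the time delay by autocorrelation and then reconstructing the phase space, can be sketched as follows. The abstract does not give the selection criterion, so this sketch assumes one common convention: take the first lag at which the autocorrelation drops below 1/e. The function names and the sine test signal are illustrative.

```python
import math


def autocorrelation(series, lag):
    """Normalized (biased) autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0.0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def choose_delay(series, max_lag, threshold=1.0 / math.e):
    """First lag whose autocorrelation falls below the threshold (assumed rule)."""
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) < threshold:
            return lag
    return max_lag


def delay_embed(series, dim, delay):
    """Delay-coordinate vectors (x[i], x[i + delay], ..., x[i + (dim-1)*delay])."""
    count = len(series) - (dim - 1) * delay
    return [tuple(series[i + j * delay] for j in range(dim)) for i in range(count)]


# Illustrative signal: a sine wave with a period of 40 samples.
signal = [math.sin(2 * math.pi * i / 40) for i in range(400)]
tau = choose_delay(signal, max_lag=40)
vectors = delay_embed(signal, dim=3, delay=tau)
```

The embedded vectors are the input that later steps, such as estimating the correlation dimension or the maximum Lyapunov exponent, would operate on.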
With the development of modern industry, mass-produced sheet-metal parts have been widely applied in the mechanical, communication, electronics, and light industries in recent decades. However, advances in sheet-metal part design and manufacturing remain too slow relative to the increasing importance of sheet-metal parts in modern industry. This paper proposes a method for automatically extracting features from an arbitrary solid model of a sheet-metal part, in which the characteristics of the model are used for classification and graph-based representation of sheet-metal features in order to extract the features embodied in the part. The feature extraction process can be divided into validity checking of the model geometry, feature matching, and determination of feature relationships. Since the extracted features include abundant geometric and engineering information, they will be effective for downstream applications such as feature rebuilding and stamping process planning.
Feature information extraction is one of the key steps in prognostics and health management of rotating machinery. The present study investigates the feasibility of a methodology based on the generalized S transform (GST) and singular value decomposition (SVD) for extracting features of local damage in rolling bearings under variable conditions. The technique first applies the GST, a time-frequency analysis method, to transform a raw fault signal of the rolling bearing into a two-dimensional complex matrix. The SVD is then performed to decompose the matrix and obtain the feature vectors. This procedure makes it possible to obtain the fault feature information of rolling bearings under different speeds and loads. To reduce the number of feature parameters in the feature vectors and thus train simpler models, principal component analysis (PCA) is subsequently performed. A particle swarm optimization-support vector machine (PSO-SVM) model is used to identify and classify the different fault states of the rolling bearing. Furthermore, to highlight the superiority of the proposed method, comparisons are conducted with conventional methods. The results show that the proposed method can effectively extract fault features of rolling bearings under variable conditions.
Funding: Supported by the National Natural Science Foundation of China (No. 52188102).
Abstract: Anomaly Detection (AD) has been extensively adopted in industrial settings to facilitate quality control of products. It is critical to industrial production, especially in areas such as aircraft manufacturing, which require strict part qualification rates. Although it is more efficient and practical, few-shot AD has not been well explored. Existing AD methods extract features in only a single frequency domain, while defects exist in multiple frequency domains. Moreover, current methods have not fully leveraged the few-shot support samples to extract input-related normal patterns. To address these issues, we propose an industrial few-shot AD method, Feature Extender for Anomaly Detection (FEAD), which extracts normal patterns in multiple frequency domains from few-shot samples under the guidance of the input sample. Firstly, to achieve better coverage of normal patterns in the input sample, we introduce a Sample-Conditioned Transformation Module (SCTM), which transforms support features under the guidance of the input sample to obtain extra normal patterns. Secondly, to effectively distinguish and localize anomaly patterns in multiple frequency domains, we devise an Adaptive Descriptor Construction Module (ADCM) to adaptively build and select pattern descriptors in a series of frequencies. Finally, an auxiliary task for SCTM is designed to ensure the diversity of transformations and include more normal patterns in the support features. Extensive experiments on two widely used industrial AD datasets (MVTec-AD and VisA) demonstrate the effectiveness of the proposed FEAD.
Funding: Supported by the ITP (Institute of Information & Communications Technology Planning & Evaluation)-ICAN (ICT Challenge and Advanced Network of HRD) (ITP-2025-RS-2022-00156326,33) grant funded by the Korea government (Ministry of Science and ICT); the Deanship of Research and Graduate Studies at King Khalid University, for funding this work through the Large Group Project under grant number (RGP2/568/45); the Deanship of Scientific Research at Northern Border University, Arar, Saudi Arabia, for funding this research work through the Project Number "NBU-FFR-2025-231-03".
Abstract: Remote sensing plays a pivotal role in environmental monitoring, disaster relief, and urban planning, where accurate scene classification of aerial images is essential. However, conventional convolutional neural networks (CNNs) struggle with long-range dependencies and with preserving high-resolution features, limiting their effectiveness in complex aerial image analysis. To address these challenges, we propose a Hybrid HRNet-Swin Transformer model that combines the strengths of HRNet-W48 for high-resolution segmentation and the Swin Transformer for global feature extraction. This hybrid architecture provides robust multi-scale feature fusion, capturing fine-grained details and broader contextual relationships in aerial imagery. Our methodology begins with preprocessing steps, including normalization, histogram equalization, and noise reduction, to enhance input data quality. The HRNet-W48 backbone maintains high-resolution feature maps throughout the network, enabling precise segmentation, while the Swin Transformer leverages hierarchical self-attention to model long-range dependencies efficiently. By integrating these components, our model achieves superior performance in segmentation and classification tasks compared to traditional CNNs and standalone transformer models. We evaluate our approach on two benchmark datasets: UC Merced and WHU-RS19. Experimental results demonstrate that the proposed hybrid model outperforms existing methods, achieving state-of-the-art accuracy while maintaining computational efficiency. Specifically, it excels in preserving fine spatial details and contextual understanding, which are critical for applications like land-use classification and disaster assessment.
Funding: Supported by grants from the North China University of Technology Research Start-Up Fund (11005136024XN147-14) and (110051360024XN151-97); Guangzhou Development Zone Science and Technology Project (2023GH02); the National Key R&D Program of China (2021YFE0201100 and 2022YFA1103401 to Juntao Gao); National Natural Science Foundation of China (981890991 to Juntao Gao); Beijing Municipal Natural Science Foundation (Z200021 to Juntao Gao); CAS Interdisciplinary Innovation Team (JCTD-2020-04 to Juntao Gao); 0032/2022/A, by Macao FDCT, and MYRG2022-00271-FST.
Abstract: Hematoxylin and Eosin (H&E) images, popularly used in the field of digital pathology, often pose challenges due to their limited color richness, which hinders the differentiation of subtle cell features crucial for accurate classification. Enhancing the visibility of these elusive cell features helps train robust deep-learning models. However, the selection and application of image processing techniques for such enhancement have not been systematically explored in the research community. To address this challenge, we introduce Salient Features Guided Augmentation (SFGA), an approach that strategically integrates machine learning and image processing. SFGA uses machine learning algorithms to identify crucial features within cell images, and then maps these features to appropriate image processing techniques to enhance training images. By emphasizing salient features and aligning them with corresponding image processing methods, SFGA is designed to enhance the discriminating power of deep learning models in cell classification tasks. Our research undertakes a series of experiments, each exploring the performance of different datasets and data enhancement techniques in classifying cell types, highlighting the significance of data quality and enhancement in mitigating overfitting and distinguishing cell characteristics. Specifically, SFGA focuses on identifying tumor cells from tissue for extranodal extension detection, with the SFGA-enhanced dataset showing notable advantages in accuracy. We conducted a preliminary study of five experiments, among which the accuracy of the pleomorphism experiment improved significantly from 50.81% to 95.15%. The accuracy of the other four experiments also increased, with improvements ranging from 3 to 43 percentage points. Our preliminary study shows that the diagnostic accuracy of deep learning models can be enhanced and proposes a systematic approach that could improve cancer diagnosis, serving as a first step in using SFGA in medical image enhancement.
Funding: Funded by Institutional Fund Projects under grant no. (IFPDP-261-22).
Abstract: Detecting cyber attacks in networks connected to the Internet of Things (IoT) is of utmost importance because of the growing vulnerabilities in smart environments. Conventional models, such as Naive Bayes and support vector machine (SVM), as well as ensemble methods, such as Gradient Boosting and eXtreme gradient boosting (XGBoost), are often plagued by high computational costs, which makes real-time detection challenging. We therefore propose an attack detection approach that integrates Visual Geometry Group 16 (VGG16), the Artificial Rabbits Optimizer (ARO), and a Random Forest model to increase detection accuracy and operational efficiency in IoT networks. In the proposed model, VGG16 extracts features from malware images, and the random forest model performs prediction using those extracted features. Additionally, ARO is used to tune the hyper-parameters of the random forest model. With an accuracy of 96.36%, the proposed model outperforms the standard models in terms of accuracy, F1-score, precision, and recall. The comparative study highlights the success of our strategy, which improves performance while maintaining a lower computational cost. This makes the method both effective and well suited to real-time applications.
Abstract: [Objective] Accurate prediction of tomato growth height is crucial for optimizing production environments in smart farming. However, current prediction methods predominantly rely on empirical, mechanistic, or learning-based models that use either image data or environmental data. These methods fail to fully leverage multi-modal data to capture the diverse aspects of plant growth comprehensively. [Methods] To address this limitation, a two-stage phenotypic feature extraction (PFE) model based on the deep learning algorithms of recurrent neural network (RNN) and long short-term memory (LSTM) was developed. The model integrated environment and plant information to provide a holistic understanding of the growth process. It employed phenotypic and temporal feature extractors to capture both types of features comprehensively, which enabled a deeper understanding of the interaction between tomato plants and their environment and ultimately led to highly accurate predictions of growth height. [Results and Discussions] The experimental results showed the model's effectiveness: when predicting the next two days based on the past five days, the PFE-based RNN and LSTM models achieved mean absolute percentage errors (MAPE) of 0.81% and 0.40%, respectively, which were significantly lower than the 8.00% MAPE of the large language model (LLM) and the 6.72% MAPE of the Transformer-based model. In longer-term predictions (predicting 4 days ahead from 10 days, and 12 days ahead from 30 days), the PFE-RNN model continued to outperform the other two baseline models, with MAPE of 2.66% and 14.05%, respectively. [Conclusions] The proposed method, which leverages phenotypic-temporal collaboration, shows great potential for intelligent, data-driven management of tomato cultivation, making it a promising approach for enhancing the efficiency and precision of smart tomato planting management.
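The abstract above reports its results as mean absolute percentage error (MAPE). As a reminder of what that metric measures, here is a minimal sketch of the standard definition; the function name `mape` and the sample heights are illustrative assumptions and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the true value,
    # so true values must be non-zero.
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical plant heights in cm (illustrative only, not from the paper).
heights_true = [100.0, 200.0]
heights_pred = [99.0, 202.0]
print(mape(heights_true, heights_pred))
```

Because each error is divided by the true value, MAPE is scale-free, which is why a single percentage can summarize predictions over plants of different heights.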
Abstract: Background: Early and accurate diagnosis of cataracts, which rank among the leading preventable causes of blindness, is critical to securing positive outcomes for patients. Recently, eye image analyses have used deep learning (DL) approaches to automate cataract classification more precisely, leading to the development of the Multiscale Parallel Feature Aggregation Network with Attention Fusion (MPFAN-AF). To improve model performance, this approach applies multiscale feature extraction, parallel feature fusion, and attention-based fusion to sharpen its focus on the salient features that are crucial in detecting cataracts. Methods: Coarse-level features are captured by convolutional layers, and these features are refined through layered kernels of varying sizes. Moreover, the method captures the diverse representations of cataracts through parallel feature aggregation. The model was trained and tested on the Cataract Eye Dataset available on Kaggle, which contains 612 labelled images of eyes with and without cataracts (normal vs. pathological). Results: The proposed model achieves greater precision than traditional convolutional neural network (CNN) models, with a classification accuracy of 97.52%. The ablation studies validated that all components added value to the prediction process, particularly the attention fusion module. Conclusion: The MPFAN-AF model demonstrates high efficiency together with interpretability and shows promise as an integration solution for real-time mobile cataract screening systems. Standard performance indicators suggest that AI-based ophthalmology tools have a promising future in remote settings that lack medical resources.
Funding: Supported by the Program for Liaoning Excellent Talents in University (No. LR15045); the Liaoning Provincial Science and Technology Department Applied Basic Research Plan (No. 101300243).
Abstract: Medical visual question answering (MedVQA) faces unique challenges due to the high precision required for images and the specialized nature of the questions. These challenges include insufficient feature extraction capabilities, a lack of textual priors, and incomplete information fusion and interaction. To address these issues, this paper proposes an enhanced bootstrapping language-image pre-training (BLIP) model for MedVQA based on multimodal feature augmentation and triple-path collaborative attention (FCA-BLIP). First, FCA-BLIP employs a unified bootstrap multimodal model architecture that integrates ResNet and bidirectional encoder representations from Transformer (BERT) models to enhance feature extraction capabilities. This enables a more precise analysis of the details in images and questions. Next, the pre-trained BLIP model is used to extract features from image-text sample pairs, so that the model can understand the semantic relationships and shared information between images and text. Finally, a novel attention structure is developed to fuse the multimodal feature vectors, thereby improving the alignment accuracy between modalities. Experimental results demonstrate that the proposed method performs well in clinical visual question-answering tasks. For the MedVQA task of staging diabetic macular edema in fundus imaging, the proposed method outperforms the existing major models on several performance metrics.
Abstract: Recent advancements in smart-meter technology are transforming traditional power systems into intelligent smart grids, which offer substantial benefits across social, environmental, and economic dimensions. To realize these advantages effectively, fine-grained collection and analysis of smart meter data is essential. However, the high dimensionality and volume of such time series present significant challenges, including increased computational load, data transmission overhead, latency, and complexity in real-time analysis. This study proposes a novel, computationally efficient framework for feature extraction and selection tailored to smart meter time-series data. The approach begins with an extensive offline analysis, where features are derived from multiple domains (time, frequency, and statistical) to capture diverse signal characteristics. Various feature sets are fused and evaluated using robust machine learning classifiers to identify the most informative combinations for automated appliance categorization. The best-performing fused feature set is further refined using Analysis of Variance (ANOVA) to identify the most discriminative features. The mathematical models used to compute the selected features are optimized so that they can be extracted with computational efficiency during online processing. Moreover, a notable dimension reduction is achieved, which facilitates data storage, transmission, and post-processing. A specifically designed LogitBoost (LB) based ensemble of Random Forest base learners is then used for automated classification. The proposed solution demonstrates a high classification accuracy (97.93%) for a nine-class problem and a 17.33-fold dimension reduction with minimal front-end computational requirements, making it well suited for real-world applications in smart grid environments.
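The abstract above says ANOVA is used to pick the most discriminative features. A minimal sketch of the underlying idea, assuming a plain one-way ANOVA F-statistic computed per feature, is given below; the function name `anova_f` and the sample readings are hypothetical, and the paper's actual selection procedure may differ.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for one feature.

    `groups` holds one list of feature values per class. Assumes at least
    two classes and some within-class variation (otherwise the division
    below fails).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own class mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical feature values for two appliance classes (illustrative only).
separating_feature = [[1.0, 1.1, 0.9], [3.0, 3.1, 2.9]]
overlapping_feature = [[1.0, 3.0, 2.0], [2.0, 1.0, 3.0]]
print(anova_f(separating_feature), anova_f(overlapping_feature))
```

A feature whose class means sit far apart relative to the within-class spread gets a large F value, so ranking features by F and keeping the top ones is one simple way to reduce dimensionality.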
Funding: The Natural Science Foundation of Zhejiang Province (Nos. LY20F020015 and LY21F020015); the National Natural Science Foundation of China (Nos. 61972121 and 61902099).
Abstract: This study proposes a multi-scene smoke detection algorithm based on a multi-feature extraction method to address the problems of varying smoke shapes in different scenes, difficulty in locating and detecting translucent smoke, and variable smoke scales. First, the feature extraction convolution module in the YOLOv5s backbone network is replaced with an asymmetric convolution block re-parameterization convolution to improve the detection of different shapes of smoke. Then, a coordinate attention mechanism is introduced in the deeper layers of the backbone network to further improve the localization of translucent smoke. Finally, the detection of smoke at different scales is further improved by using the feature pyramid convolution module instead of the standard convolution module of the feature pyramid in the model. The experimental results demonstrate the feasibility and superiority of the proposed model for multi-scene smoke detection.
Funding: Supported by the Science and Technology Project of Guangdong Power Exchange Center Co., Ltd. (No. GDKJXM20222599); National Natural Science Foundation of China (No. 52207104); Natural Science Foundation of Guangdong Province (No. 2024A1515010426).
Abstract: The rapid development of the electricity retail market has prompted an increasing number of electricity consumers to sign green electricity contracts with retail electricity companies, which poses greater challenges for market services for green energy consumers. This study proposes a two-stage feature extraction approach for green energy consumers that leverages clustering and term frequency-inverse document frequency (TF-IDF) algorithms within a knowledge graph framework, to provide an information basis that supports the green development of the retail electricity market. First, the multi-source heterogeneous data of green energy consumers in an actual market environment is systematically introduced, and the information is categorized into discrete, interval, and relational features. In the first stage, a clustering algorithm is employed to extract features of the trading behavior of green energy consumers using the parameter data of green retail electricity contracts. In the second stage, the TF-IDF algorithm is applied to extract features for green energy consumers in different clusters. Finally, the effectiveness of the proposed approach is validated on actual operational data from a southern province of China. The results show that the most significant discrepancy between the retail trading behaviors of green energy consumers is the power share of green retail packages, whose average values are 25.64%, 50%, 39.66%, and 24.89% in the four clusters, respectively. Additionally, power supply bureaus and electricity retail companies affect the behavior of green energy consumers most significantly.
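The second stage described above applies TF-IDF to consumers grouped by cluster. The following is a minimal sketch under the assumption that each cluster is treated as a "document" and each consumer attribute label as a "term"; the attribute labels, the function name `tf_idf`, and the plain `log(N / df)` weighting are illustrative choices, not details confirmed by the abstract.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document (a document is a list of terms)."""
    n_docs = len(documents)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical attribute labels for consumers in two clusters.
clusters = [
    ["high_green_share", "retailer_A", "retailer_A"],
    ["low_green_share", "retailer_A"],
]
for cluster_scores in tf_idf(clusters):
    print(cluster_scores)
```

An attribute that appears in every cluster (here `retailer_A`) scores zero, while an attribute concentrated in one cluster scores high, which is what makes TF-IDF useful for surfacing cluster-specific features.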
Funding: Supported by the National Natural Science Foundation of China under Grant 62306128; the Basic Science Research Project of Jiangsu Provincial Department of Education under Grant 23KJD520003; the Leading Innovation Project of Changzhou Science and Technology Bureau under Grant CQ20230072.
Abstract: Real-time detection of surface defects on cables is crucial for ensuring the safe operation of power systems. However, existing methods struggle with small target sizes, complex backgrounds, low-quality image acquisition, and interference from contamination. To address these challenges, this paper proposes the Real-time Cable Defect Detection Network (RC2DNet), which balances detection accuracy and computational efficiency. Unlike conventional approaches, RC2DNet introduces a small object feature extraction module that enhances the semantic representation of small targets through feature pyramids, multi-level feature fusion, and an adaptive weighting mechanism. Additionally, a boundary feature enhancement module is designed, incorporating boundary-aware convolution, a novel boundary attention mechanism, and an improved loss function to significantly enhance boundary localization accuracy. Experimental results demonstrate that RC2DNet outperforms state-of-the-art methods in precision, recall, F1-score, mean Intersection over Union (mIoU), and frame rate, enabling real-time and highly accurate cable defect detection in complex backgrounds.
Funding: Financial support from the Natural Science Foundation of Gansu Province (Nos. 22JR5RA217, 22JR5RA216); Lanzhou Science and Technology Program (No. 2022-2-111); Lanzhou University of Arts and Sciences School Innovation Fund Project (No. XJ2022000103); Lanzhou College of Arts and Sciences 2023 Talent Cultivation Quality Improvement Project (No. 2023-ZL-jxzz-03).
Abstract: Because the accuracy of traditional sparse representation models is low under the influence of multiple complex environmental factors, this study focuses on improving feature extraction and model construction. Firstly, the convolutional neural network (CNN) features of the face are extracted by a trained deep learning network. Next, the steady-state and dynamic classifiers for face recognition are constructed based on the CNN features and Haar features, respectively. A two-stage sparse representation is introduced when constructing the steady-state classifier, and feature templates with high reliability are dynamically selected as alternative templates from the sparse representation template dictionary built from the CNN features. Finally, the face recognition result is obtained by combining the classification results of the steady-state classifier and the dynamic classifier. On this basis, the feature weights of the steady-state classifier template are adjusted in real time and the dictionary set is dynamically updated to reduce the probability of irrelevant features entering the dictionary set. The average recognition accuracy of this method is 94.45% on the CMU PIE face database and 96.58% on the AR face database, a significant improvement over traditional face recognition methods.
Funding: The National Natural Science Foundation of China (No. 51975374).
Abstract: Recent advances in convolutional neural networks (CNNs) have fostered progress in object recognition and semantic segmentation, which in turn has improved the performance of hyperspectral image (HSI) classification. Nevertheless, the difficulty of high-dimensional feature extraction and the shortage of training samples seriously hinder the further development of HSI classification. In this paper, we propose a novel algorithm for HSI classification based on a three-dimensional (3D) CNN and a feature pyramid network (FPN), called 3D-FPN. The framework contains a principal component analysis, a feature extraction structure, and a logistic regression. Specifically, the FPN built with 3D convolutions not only retains the advantages of 3D convolution to fully extract the spectral-spatial feature maps, but also concentrates on more detailed information and performs multi-scale feature fusion. This method avoids excessive model complexity and is suitable for small-sample hyperspectral classification with varying categories and spatial resolutions. To test the performance of the proposed 3D-FPN method, rigorous experimental analysis was performed on three public hyperspectral data sets and on hyperspectral data from the GF-5 satellite. Quantitative and qualitative results indicated that the proposed method attained the best performance among current state-of-the-art end-to-end deep learning-based methods.
Funding: Supported by the Zhejiang Provincial Key R&D Program (Nos. 2024C01108, 2023C01030, 2023C01034); the Hangzhou Key R&D Program (Nos. 2023SZD0046, 2024SZD1A03); the Ningbo Key R&D Program (No. 2024Z114).
Abstract: 1 Introduction. Sound event detection (SED) aims to identify and locate specific sound event categories and their corresponding timestamps within continuous audio streams. To overcome the limitations posed by the scarcity of strongly labeled training data, researchers have increasingly turned to semi-supervised learning (SSL) [1], which leverages unlabeled data to augment training and improve detection performance. Among many SSL methods [2-4].
Funding: Supported in part by the Fundamental Research Funds for the Central Universities (2024JBMC031); the OpenFund of Advanced Cryptography and System Security Key Laboratory of Sichuan Province (No. SKLACSS-202312); the CCF-NSFOCUS Open Fund; the National Natural Science Foundation of China (Grant Nos. 62202042, U20A6003, 62076146, 62021002, U19A2062, 62127803, U1911401 and 6212780016); the Fundamental Research Funds for the Central Universities, JLU; the Industrial Technology Infrastructure Public Service Platform Project 'Public Service Platform for Urban Rail Transit Equipment Signal System Testing and Safety Evaluation' (No. 2022-233-225), Ministry of Industry and Information Technology of China.
Abstract: Advanced Persistent Threats (APTs) are difficult to detect because of their “low-and-slow” attack patterns and frequent use of zero-day vulnerabilities. In this task, the extraction of long-term features is often crucial. In this work, we propose a novel end-to-end APT detection framework named Long-Term Feature Association Provenance Graph Detector (LT-ProveGD). Specifically, LT-ProveGD encodes contextual information of the dynamic provenance graph while preserving the topological information in a space-efficient manner. To combat “low-and-slow” attacks, LT-ProveGD develops an autoencoder with an integrated multi-head attention mechanism to extract long-term dependencies within the encoded representations. Furthermore, to facilitate the detection of previously unknown attacks, we leverage Jenks' natural breaks methodology, enabling detection without relying on specific attack information. Through extensive experiments on five widely used datasets against state-of-the-art attack detection methods, we demonstrate the superior effectiveness of LT-ProveGD.
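The abstract above mentions Jenks' natural breaks as the way a detection threshold is obtained without attack labels. A minimal sketch of the idea follows, simplified to two classes (normal vs. anomalous) and solved by brute force; the two-class restriction, the function name `two_class_jenks_break`, and the example scores are assumptions for illustration and do not describe how LT-ProveGD applies the method.

```python
def two_class_jenks_break(values):
    """Return the break value for a two-class Jenks natural breaks split.

    The split is chosen so that the summed within-class squared deviation
    is minimal; values strictly above the returned break form the upper class.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")

    def sse(chunk):
        mean = sum(chunk) / len(chunk)
        return sum((x - mean) ** 2 for x in chunk)

    # Try every split position between consecutive sorted values.
    best_split = min(range(1, len(data)),
                     key=lambda i: sse(data[:i]) + sse(data[i:]))
    return data[best_split - 1]


# Hypothetical reconstruction errors from an autoencoder (illustrative only).
scores = [0.10, 0.12, 0.11, 0.13, 0.90, 0.95]
threshold = two_class_jenks_break(scores)
anomalies = [s for s in scores if s > threshold]
print(threshold, anomalies)
```

The appeal of this kind of threshold is that it adapts to the score distribution itself, so no labeled attack samples are needed to set it.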
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2020YFB1313800); the National Science Foundation of China (Grant No. NSFC62373259); the Natural Science Foundation of Top Talent of SZTU (Grant No. GDRC202303); the Education Promotion Foundation of Guangdong Province (Grant No. 2022ZDJS115).
Abstract: The success of robot-assisted pelvic fracture reduction surgery relies heavily on the accuracy of 3D/3D feature-based registration. This process involves extracting anatomical feature points from pre-operative 3D images, which can be challenging because of the complex and variable structure of the pelvis. PointMLP_RegNet, a modified PointMLP, was introduced to address this issue. It retains the feature extraction module of PointMLP but replaces the classification layer with a regression layer, so that it predicts the coordinates of feature points instead of performing classification. A flowchart for an automatic feature point extraction method was presented, and a series of experiments was conducted on a clinical pelvic dataset to confirm the accuracy and effectiveness of the method. PointMLP_RegNet extracted feature points accurately, with 8 out of 10 points showing errors of less than 4 mm and the remaining two less than 5 mm. Compared to PointNet++ and PointNet, it exhibited higher accuracy, robustness, and space efficiency. The proposed method will improve the accuracy of anatomical feature point extraction, enhance intra-operative registration precision, and facilitate the widespread clinical application of robot-assisted pelvic fracture reduction.
Funding: Supported by the National Natural Science Foundation of China (No. 52205548).
Abstract: To address the issues of unknown target size, blurred edges, background interference, and low contrast in infrared small target detection, this paper proposes a method based on density peak searching and weighted multi-feature local difference. Firstly, an improved high-boost filter is used for preprocessing to eliminate background clutter and high-brightness interference, thereby increasing the probability of capturing real targets in the density peak search. Secondly, a triple-layer window is used to extract features from the area surrounding candidate targets, addressing the uncertainty of small target sizes. By calculating multi-feature local differences between the triple-layer windows, the problems of blurred target edges and low contrast are resolved. To balance the contribution of different features, intra-class distance is used to calculate weights, achieving a weighted fusion of multi-feature local differences for each candidate target. The real targets are then extracted using the interquartile range. Experiments on datasets such as SIRST and IRSTD-1K show that the proposed method is suitable for various complex scene types and demonstrates good robustness and detection performance.
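The final step above extracts real targets with the interquartile range. A minimal sketch of one common way to do that is shown below: candidates whose response exceeds the upper fence Q3 + k·IQR are kept. The fence form, the multiplier k = 1.5, the function name `iqr_outlier_indices`, and the sample responses are conventional or hypothetical choices, not values stated in the abstract.

```python
import statistics


def iqr_outlier_indices(responses, k=1.5):
    """Return indices of responses above the upper IQR fence Q3 + k * IQR."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(responses, n=4)
    fence = q3 + k * (q3 - q1)
    return [i for i, r in enumerate(responses) if r > fence]


# Hypothetical weighted local-difference responses of eight candidates.
candidate_responses = [1.0, 1.2, 0.9, 1.1, 1.0, 9.0, 1.3, 0.8]
print(iqr_outlier_indices(candidate_responses))
```

Because the fence is derived from the quartiles of the candidates themselves, the threshold adjusts per image rather than relying on a fixed global value.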
Abstract: Nonlinear time series analysis is based on phase space reconstruction theory. It studies how to transform the response signal into a reconstructed phase space in order to extract dynamic feature information, and provides an effective approach for nonlinear signal analysis and fault diagnosis of nonlinear dynamic systems. It has already become an important branch of nonlinear science. However, the traditional method cannot extract chaos features automatically and requires human participation throughout the process. A new method is put forward that automatically extracts chaos features from nonlinear time series. Firstly, the time delay τ is determined by the autocorrelation method; secondly, the embedding dimension m and the correlation dimension D are computed; thirdly, the maximum Lyapunov exponent λmax is computed; finally, the chaos degree Dch of the Poincaré map and the non-circle degree Dnc and non-order degree Dno of the quasi-phase orbit are calculated. Chaos feature extraction is important for fault diagnosis of nonlinear systems based on nonlinear chaos features. Examples show the validity of the proposed method.
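The first step above selects the time delay by the autocorrelation method. A minimal sketch follows, assuming the common convention of taking the first lag at which the normalized autocorrelation falls below 1/e; the abstract does not say which criterion the authors use, and the function names and the sine test signal are illustrative.

```python
import math


def autocorrelation(series, lag):
    """Normalized autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    den = sum((x - mean) ** 2 for x in series)
    return num / den


def time_delay(series, max_lag=None):
    """First lag whose autocorrelation drops below 1/e, or None if none does."""
    max_lag = max_lag or len(series) // 2
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) < 1 / math.e:
            return lag
    return None


# Illustrative test signal: a sine wave with a period of 40 samples.
signal = [math.sin(2 * math.pi * t / 40) for t in range(400)]
print(time_delay(signal))
```

The selected delay is then used to build delay vectors of the form (x[t], x[t + delay], ..., x[t + (m - 1) * delay]) for the phase space reconstruction that the remaining steps operate on.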
Abstract: With the development of modern industry, mass-produced sheet-metal parts have been widely applied in the mechanical, communication, electronics, and light industries in recent decades. However, advances in sheet-metal part design and manufacturing remain too slow relative to the increasing importance of sheet-metal parts in modern industry. This paper proposes a method for automatically extracting features from an arbitrary solid model of a sheet-metal part. The characteristics of the model are used for classification and graph-based representation of sheet-metal features, from which the features embodied in the part are extracted. The feature extraction process can be divided into validity checking of the model geometry, feature matching, and establishing feature relationships. Since the extracted features include abundant geometric and engineering information, they will be effective for downstream applications such as feature rebuilding and stamping process planning.
Funding: Guangdong Provincial Natural Science Foundation of China (Grant No. 2020B1515120006); Guangdong Innovation Team (Grant Nos. 2020KCXTD015, 2022KCXTD029); Guangdong Universities New Information Field (Grant No. 2021ZDZX1057).
Abstract: Feature information extraction is one of the key steps in prognostics and health management of rotating machinery. The present study investigates the feasibility of a methodology based on the generalized S transform (GST) and singular value decomposition (SVD) for extracting features of rolling bearings with local damage under variable conditions. The technique uses the GST, a time-frequency analysis method, to transform a raw fault signal of the rolling bearing into a two-dimensional complex matrix. The SVD is then performed to decompose the matrix and obtain the feature vectors. This procedure makes it possible to obtain the fault feature information of the rolling bearing under different speeds and loads. To reduce the number of feature parameters in the feature vectors and train simpler models, principal component analysis (PCA) is subsequently performed. The particle swarm optimization-support vector machine (PSO-SVM) model is used to identify and classify the different fault states of the rolling bearing. Furthermore, to highlight the superiority of the proposed method, comparisons are conducted with conventional methods. The results show that the proposed method can effectively extract fault features of the rolling bearing under variable conditions.