Graph Neural Networks(GNNs)have proven highly effective for graph classification across diverse fields such as social networks,bioinformatics,and finance,due to their capability to learn complex graph structures.Howev...Graph Neural Networks(GNNs)have proven highly effective for graph classification across diverse fields such as social networks,bioinformatics,and finance,due to their capability to learn complex graph structures.However,despite their success,GNNs remain vulnerable to adversarial attacks that can significantly degrade their classification accuracy.Existing adversarial attack strategies primarily rely on label information to guide the attacks,which limits their applicability in scenarios where such information is scarce or unavailable.This paper introduces an innovative unsupervised attack method for graph classification,which operates without relying on label information,thereby enhancing its applicability in a broad range of scenarios.Specifically,our method first leverages a graph contrastive learning loss to learn high-quality graph embeddings by comparing different stochastic augmented views of the graphs.To effectively perturb the graphs,we then introduce an implicit estimator that measures the impact of various modifications on graph structures.The proposed strategy identifies and flips edges with the top-K highest scores,determined by the estimator,to maximize the degradation of the model’s performance.In addition,to defend against such attack,we propose a lightweight regularization-based defense mechanism that is specifically tailored to mitigate the structural perturbations introduced by our attack strategy.It enhances model robustness by enforcing embedding consistency and edge-level smoothness during training.We conduct experiments on six public TU graph classification datasets:NCI1,NCI109,Mutagenicity,ENZYMES,COLLAB,and DBLP_v1,to evaluate the effectiveness of our attack and defense strategies.Under an attack budget of 3,the maximum reduction in model accuracy reaches 6.67%on the Graph Convolutional Network(GCN)and 11.67%on the Graph Attention Network(GAT)across different datasets,indicating that our unsupervised method induces degradation comparable to state-of-the-art supervised attacks.Meanwhile,our defense achieves the highest accuracy recovery of 3.89%(GCN)and 5.00%(GAT),demonstrating improved robustness against structural perturbations.展开更多
Accurate fine-grained geospatial scene classification using remote sensing imagery is essential for a wide range of applications.However,existing approaches often rely on manually zooming remote sensing images at diff...Accurate fine-grained geospatial scene classification using remote sensing imagery is essential for a wide range of applications.However,existing approaches often rely on manually zooming remote sensing images at different scales to create typical scene samples.This approach fails to adequately support the fixed-resolution image interpretation requirements in real-world scenarios.To address this limitation,we introduce the million-scale fine-grained geospatial scene classification dataset(MEET),which contains over 1.03 million zoom-free remote sensing scene samples,manually annotated into 80 fine-grained categories.In MEET,each scene sample follows a scene-in-scene layout,where the central scene serves as the reference,and auxiliary scenes provide crucial spatial context for fine-grained classification.Moreover,to tackle the emerging challenge of scene-in-scene classification,we present the context-aware transformer(CAT),a model specifically designed for this task,which adaptively fuses spatial context to accurately classify the scene samples.CAT adaptively fuses spatial context to accurately classify the scene samples by learning attentional features that capture the relationships between the center and auxiliary scenes.Based on MEET,we establish a comprehensive benchmark for fine-grained geospatial scene classification,evaluating CAT against 11 competitive baselines.The results demonstrate that CAT significantly outperforms these baselines,achieving a 1.88%higher balanced accuracy(BA)with the Swin-Large backbone,and a notable 7.87%improvement with the Swin-Huge backbone.Further experiments validate the effectiveness of each module in CAT and show the practical applicability of CAT in the urban functional zone mapping.The source code and dataset will be publicly available at https://jerrywyn.github.io/project/MEET.html.展开更多
In this paper,we propose hierarchical attention dual network(DNet)for fine-grained image classification.The DNet can randomly select pairs of inputs from the dataset and compare the differences between them through hi...In this paper,we propose hierarchical attention dual network(DNet)for fine-grained image classification.The DNet can randomly select pairs of inputs from the dataset and compare the differences between them through hierarchical attention feature learning,which are used simultaneously to remove noise and retain salient features.In the loss function,it considers the losses of difference in paired images according to the intra-variance and inter-variance.In addition,we also collect the disaster scene dataset from remote sensing images and apply the proposed method to disaster scene classification,which contains complex scenes and multiple types of disasters.Compared to other methods,experimental results show that the DNet with hierarchical attention is robust to different datasets and performs better.展开更多
Bird monitoring and protection are essential for maintaining biodiversity,and fine-grained bird classification has become a key focus in this field.Audio-visual modalities provide critical cues for this task,but robus...Bird monitoring and protection are essential for maintaining biodiversity,and fine-grained bird classification has become a key focus in this field.Audio-visual modalities provide critical cues for this task,but robust feature extraction and efficient fusion remain major challenges.We introduce a multi-stage fine-grained audiovisual fusion network(MSFG-AVFNet) for fine-grained bird species classification,which addresses these challenges through two key components:(1) the audiovisual feature extraction module,which adopts a multi-stage finetuning strategy to provide high-quality unimodal features,laying a solid foundation for modality fusion;(2) the audiovisual feature fusion module,which combines a max pooling aggregation strategy with a novel audiovisual loss function to achieve effective and robust feature fusion.Experiments were conducted on the self-built AVB81and the publicly available SSW60 datasets,which contain data from 81 and 60 bird species,respectively.Comprehensive experiments demonstrate that our approach achieves notable performance gains,outperforming existing state-of-the-art methods.These results highlight its effectiveness in leveraging audiovisual modalities for fine-grained bird classification and its potential to support ecological monitoring and biodiversity research.展开更多
This study proposes an efficient traffic classification model to address the growing threat of distributed denial-of-service(DDoS)attacks in 5th generation technology standard(5G)slicing networks.The proposed method u...This study proposes an efficient traffic classification model to address the growing threat of distributed denial-of-service(DDoS)attacks in 5th generation technology standard(5G)slicing networks.The proposed method utilizes an ensemble of encoder components from multiple autoencoders to compress and extract latent representations from high-dimensional traffic data.These representations are then used as input for a support vector machine(SVM)-based metadata classifier,enabling precise detection of attack traffic.This architecture is designed to achieve both high detection accuracy and training efficiency,while adapting flexibly to the diverse service requirements and complexity of 5G network slicing.The model was evaluated using the DDoS Datasets 2022,collected in a simulated 5G slicing environment.Experiments were conducted under both class-balanced and class-imbalanced conditions.In the balanced setting,the model achieved an accuracy of 89.33%,an F1-score of 88.23%,and an Area Under the Curve(AUC)of 89.45%.In the imbalanced setting(attack:normal 7:3),the model maintained strong robustness,=achieving a recall of 100%and an F1-score of 90.91%,demonstrating its effectiveness in diverse real-world scenarios.Compared to existing AI-based detection methods,the proposed model showed higher precision,better handling of class imbalance,and strong generalization performance.Moreover,its modular structure is well-suited for deployment in containerized network function(NF)environments,making it a practical solution for real-world 5G infrastructure.These results highlight the potential of the proposed approach to enhance both the security and operational resilience of 5G slicing networks.展开更多
The deep learning technology has shown impressive performance in various vision tasks such as image classification, object detection and semantic segmentation. In particular, recent advances of deep learning technique...The deep learning technology has shown impressive performance in various vision tasks such as image classification, object detection and semantic segmentation. In particular, recent advances of deep learning techniques bring encouraging performance to fine-grained image classification which aims to distinguish subordinate-level categories, such as bird species or dog breeds. This task is extremely challenging due to high intra-class and low inter-class variance. In this paper, we review four types of deep learning based fine-grained image classification approaches, including the general convolutional neural networks (CNNs), part detection based, ensemble of networks based and visual attention based fine-grained image classification approaches. Besides, the deep learning based semantic segmentation approaches are also covered in this paper. The region proposal based and fully convolutional networks based approaches for semantic segmentation are introduced respectively.展开更多
Fine-grained image classification, which aims to distinguish images with subtle distinctions, is a challenging task for two main reasons: lack of sufficient training data for every class and difficulty in learning dis...Fine-grained image classification, which aims to distinguish images with subtle distinctions, is a challenging task for two main reasons: lack of sufficient training data for every class and difficulty in learning discriminative features for representation. In this paper, to address the two issues, we propose a two-phase framework for recognizing images from unseen fine-grained classes, i.e., zeroshot fine-grained classification. In the first feature learning phase, we finetune deep convolutional neural networks using hierarchical semantic structure among fine-grained classes to extract discriminative deep visual features. Meanwhile, a domain adaptation structure is induced into deep convolutional neural networks to avoid domain shift from training data to test data. In the second label inference phase, a semantic directed graph is constructed over attributes of fine-grained classes. Based on this graph, we develop a label propagation algorithm to infer the labels of images in the unseen classes. Experimental results on two benchmark datasets demonstrate that our model outperforms the state-of-the-art zero-shot learning models. In addition, the features obtained by our feature learning model also yield significant gains when they are used by other zero-shot learning models, which shows the flexility of our model in zero-shot finegrained classification.展开更多
Fine-grained sedimentary rocks are defined as rocks which mainly compose of fine grains(〈62.5 μm). The detailed studies on these rocks have revealed the need of a more unified, comprehensive and inclusive classifi...Fine-grained sedimentary rocks are defined as rocks which mainly compose of fine grains(〈62.5 μm). The detailed studies on these rocks have revealed the need of a more unified, comprehensive and inclusive classification. The study focuses on fine-grained rocks has turned from the differences of inorganic mineral components to the significance of organic matter and microorganisms. The proposed classification is based on mineral composition, and it is noted that organic matters have been taken as a very important parameter in this classification scheme. Thus, four parameters, the TOC content, silica(quartz plus feldspars), clay minerals and carbonate minerals, are considered to divide the fine-grained sedimentary rocks into eight categories, and the further classification within every category is refined depending on subordinate mineral composition. The nomenclature consists of a root name preceded by a primary adjective. The root names reflect mineral constituent of the rock, including low organic(TOC〈2%), middle organic(2%4%) claystone, siliceous mudstone, limestone, and mixed mudstone. Primary adjectives convey structure and organic content information, including massive or limanited. The lithofacies are closely related to the reservoir storage space, porosity, permeability, hydrocarbon potential and shale oil/gas sweet spot, and are the key factor for the shale oil and gas exploration. The classification helps to systematically and practicably describe variability within fine-grained sedimentary rocks, what's more, it helps to guide the hydrocarbon exploration.展开更多
Based on reviews and summaries of the naming schemes of fine-grained sedimentary rocks, and analysis of characteristics of fine-grained sedimentary rocks, the problems existing in the classification and naming of fine...Based on reviews and summaries of the naming schemes of fine-grained sedimentary rocks, and analysis of characteristics of fine-grained sedimentary rocks, the problems existing in the classification and naming of fine-grained sedimentary rocks are discussed. On this basis, following the principle of three-level nomenclature, a new scheme of rock classification and naming for fine-grained sedimentary rocks is determined from two perspectives: First, fine-grained sedimentary rocks are divided into 12 types in two major categories, mudstone and siltstone, according to particle size(sand, silt and mud). Second,fine-grained sedimentary rocks are divided into 18 types in four categories, carbonate rock, fine-grained felsic sedimentary rock,clay rock and mixed fine-grained sedimentary rock according to mineral composition(carbonate minerals, felsic detrital minerals and clay minerals as three end elements). Considering the importance of organic matter in unconventional oil and gas generation and evaluation, organic matter is taken as the fourth element in the scheme. Taking the organic matter contents of 0.5% and 2% as dividing points, fine grained sedimentary rocks are divided into three categories, organic-poor, organic-bearing,and organic-rich ones. The new scheme meets the requirement of unconventional oil and gas exploration and development today and solves the problem of conceptual confusion in fine-grained sedimentary rocks, providing a unified basic term system for the research of fine-grained sedimentology.展开更多
The remote sensing ships’fine-grained classification technology makes it possible to identify certain ship types in remote sensing images,and it has broad application prospects in civil and military fields.However,th...The remote sensing ships’fine-grained classification technology makes it possible to identify certain ship types in remote sensing images,and it has broad application prospects in civil and military fields.However,the current model does not examine the properties of ship targets in remote sensing images with mixed multi-granularity features and a complicated backdrop.There is still an opportunity for future enhancement of the classification impact.To solve the challenges brought by the above characteristics,this paper proposes a Metaformer and Residual fusion network based on Visual Attention Network(VAN-MR)for fine-grained classification tasks.For the complex background of remote sensing images,the VAN-MR model adopts the parallel structure of large kernel attention and spatial attention to enhance the model’s feature extraction ability of interest targets and improve the classification performance of remote sensing ship targets.For the problem of multi-grained feature mixing in remote sensing images,the VAN-MR model uses a Metaformer structure and a parallel network of residual modules to extract ship features.The parallel network has different depths,considering both high-level and lowlevel semantic information.The model achieves better classification performance in remote sensing ship images with multi-granularity mixing.Finally,the model achieves 88.73%and 94.56%accuracy on the public fine-grained ship collection-23(FGSC-23)and FGSCR-42 datasets,respectively,while the parameter size is only 53.47 M,the floating point operations is 9.9 G.The experimental results show that the classification effect of VAN-MR is superior to that of traditional CNNs model and visual model with Transformer structure under the same parameter quantity.展开更多
The rapid and increasing growth in the volume and number of cyber threats from malware is not a real danger;the real threat lies in the obfuscation of these cyberattacks,as they constantly change their behavior,making...The rapid and increasing growth in the volume and number of cyber threats from malware is not a real danger;the real threat lies in the obfuscation of these cyberattacks,as they constantly change their behavior,making detection more difficult.Numerous researchers and developers have devoted considerable attention to this topic;however,the research field has not yet been fully saturated with high-quality studies that address these problems.For this reason,this paper presents a novel multi-objective Markov-enhanced adaptive whale optimization(MOMEAWO)cybersecurity model to improve the classification of binary and multi-class malware threats through the proposed MOMEAWO approach.The proposed MOMEAWO cybersecurity model aims to provide an innovative solution for analyzing,detecting,and classifying the behavior of obfuscated malware within their respective families.The proposed model includes three classification types:Binary classification and multi-class classification(e.g.,four families and 16 malware families).To evaluate the performance of this model,we used a recently published dataset called the Canadian Institute for Cybersecurity Malware Memory Analysis(CIC-MalMem-2022)that contains balanced data.The results show near-perfect accuracy in binary classification and high accuracy in multi-class classification compared with related work using the same dataset.展开更多
Recently developed fault classification methods for industrial processes are mainly data-driven.Notably,models based on deep neural networks have significantly improved fault classification accuracy owing to the inclu...Recently developed fault classification methods for industrial processes are mainly data-driven.Notably,models based on deep neural networks have significantly improved fault classification accuracy owing to the inclusion of a large number of data patterns.However,these data-driven models are vulnerable to adversarial attacks;thus,small perturbations on the samples can cause the models to provide incorrect fault predictions.Several recent studies have demonstrated the vulnerability of machine learning methods and the existence of adversarial samples.This paper proposes a black-box attack method with an extreme constraint for a safe-critical industrial fault classification system:Only one variable can be perturbed to craft adversarial samples.Moreover,to hide the adversarial samples in the visualization space,a Jacobian matrix is used to guide the perturbed variable selection,making the adversarial samples in the dimensional reduction space invisible to the human eye.Using the one-variable attack(OVA)method,we explore the vulnerability of industrial variables and fault types,which can help understand the geometric characteristics of fault classification systems.Based on the attack method,a corresponding adversarial training defense method is also proposed,which efficiently defends against an OVA and improves the prediction accuracy of the classifiers.In experiments,the proposed method was tested on two datasets from the Tennessee–Eastman process(TEP)and steel plates(SP).We explore the vulnerability and correlation within variables and faults and verify the effectiveness of OVAs and defenses for various classifiers and datasets.For industrial fault classification systems,the attack success rate of our method is close to(on TEP)or even higher than(on SP)the current most effective first-order white-box attack method,which requires perturbation of all variables.展开更多
In the era of the Internet of Things(IoT),the proliferation of connected devices has raised security concerns,increasing the risk of intrusions into diverse systems.Despite the convenience and efficiency offered by Io...In the era of the Internet of Things(IoT),the proliferation of connected devices has raised security concerns,increasing the risk of intrusions into diverse systems.Despite the convenience and efficiency offered by IoT technology,the growing number of IoT devices escalates the likelihood of attacks,emphasizing the need for robust security tools to automatically detect and explain threats.This paper introduces a deep learning methodology for detecting and classifying distributed denial of service(DDoS)attacks,addressing a significant security concern within IoT environments.An effective procedure of deep transfer learning is applied to utilize deep learning backbones,which is then evaluated on two benchmarking datasets of DDoS attacks in terms of accuracy and time complexity.By leveraging several deep architectures,the study conducts thorough binary and multiclass experiments,each varying in the complexity of classifying attack types and demonstrating real-world scenarios.Additionally,this study employs an explainable artificial intelligence(XAI)AI technique to elucidate the contribution of extracted features in the process of attack detection.The experimental results demonstrate the effectiveness of the proposed method,achieving a recall of 99.39%by the XAI bidirectional long short-term memory(XAI-BiLSTM)model.展开更多
In recent times among the multitude of attacks present in network system, DDoS attacks have emerged to be the attacks with the most devastating effects. The main objective of this paper is to propose a system that eff...In recent times among the multitude of attacks present in network system, DDoS attacks have emerged to be the attacks with the most devastating effects. The main objective of this paper is to propose a system that effectively detects DDoS attacks appearing in any networked system using the clustering technique of data mining followed by classification. This method uses a Heuristics Clustering Algorithm (HCA) to cluster the available data and Na?ve Bayes (NB) classification to classify the data and detect the attacks created in the system based on some network attributes of the data packet. The clustering algorithm is based in unsupervised learning technique and is sometimes unable to detect some of the attack instances and few normal instances, therefore classification techniques are also used along with clustering to overcome this classification problem and to enhance the accuracy. Na?ve Bayes classifiers are based on very strong independence assumptions with fairly simple construction to derive the conditional probability for each relationship. A series of experiment is performed using “The CAIDA UCSD DDoS Attack 2007 Dataset” and “DARPA 2000 Dataset” and the efficiency of the proposed system has been tested based on the following performance parameters: Accuracy, Detection Rate and False Positive Rate and the result obtained from the proposed system has been found that it has enhanced accuracy and detection rate with low false positive rate.展开更多
With the rapid development of deepfake technology,the authenticity of various types of fake synthetic content is increasing rapidly,which brings potential security threats to people’s daily life and social stability....With the rapid development of deepfake technology,the authenticity of various types of fake synthetic content is increasing rapidly,which brings potential security threats to people’s daily life and social stability.Currently,most algorithms define deepfake detection as a binary classification problem,i.e.,global features are first extracted using a backbone network and then fed into a binary classifier to discriminate true or false.However,the differences between real and fake samples are often subtle and local,and such global feature-based detection algorithms are not optimal in efficiency and accuracy.To this end,to enhance the extraction of forgery details in deep forgery samples,we propose a multi-branch deepfake detection algorithm based on fine-grained features from the perspective of fine-grained classification.First,to address the critical problem in locating discriminative feature regions in fine-grained classification tasks,we investigate a method for locating multiple different discriminative regions and design a lightweight feature localization module to obtain crucial feature representations by augmenting the most significant parts of the feature map.Second,using information complementation,we introduce a correlation-guided fusion module to enhance the discriminative feature information of different branches.Finally,we use the global attention module in the multi-branch model to improve the cross-dimensional interaction of spatial domain and channel domain information and increase the weights of crucial feature regions and feature channels.We conduct sufficient ablation experiments and comparative experiments.The experimental results show that the algorithm outperforms the detection accuracy and effectiveness on the FaceForensics++and Celeb-DF-v2 datasets compared with the representative detection algorithms in recent years,which can achieve better detection results.展开更多
Pneumonia is part of the main diseases causing the death of children.It is generally diagnosed through chest Xray images.With the development of Deep Learning(DL),the diagnosis of pneumonia based on DL has received ex...Pneumonia is part of the main diseases causing the death of children.It is generally diagnosed through chest Xray images.With the development of Deep Learning(DL),the diagnosis of pneumonia based on DL has received extensive attention.However,due to the small difference between pneumonia and normal images,the performance of DL methods could be improved.This research proposes a new fine-grained Convolutional Neural Network(CNN)for children’s pneumonia diagnosis(FG-CPD).Firstly,the fine-grainedCNNclassificationwhich can handle the slight difference in images is investigated.To obtain the raw images from the real-world chest X-ray data,the YOLOv4 algorithm is trained to detect and position the chest part in the raw images.Secondly,a novel attention network is proposed,named SGNet,which integrates the spatial information and channel information of the images to locate the discriminative parts in the chest image for expanding the difference between pneumonia and normal images.Thirdly,the automatic data augmentation method is adopted to increase the diversity of the images and avoid the overfitting of FG-CPD.The FG-CPD has been tested on the public Chest X-ray 2017 dataset,and the results show that it has achieved great effect.Then,the FG-CPD is tested on the real chest X-ray images from children aged 3–12 years ago from Tongji Hospital.The results show that FG-CPD has achieved up to 96.91%accuracy,which can validate the potential of the FG-CPD.展开更多
In view of the fact that adversarial examples can lead to high-confidence erroneous outputs of deep neural networks,this study aims to improve the safety of deep neural networks by distinguishing adversarial examples....In view of the fact that adversarial examples can lead to high-confidence erroneous outputs of deep neural networks,this study aims to improve the safety of deep neural networks by distinguishing adversarial examples.A classification model based on filter residual network structure is used to accurately classify adversarial examples.The filter-based classification model includes residual network feature extraction and classification modules,which are iteratively optimized by an adversarial training strategy.Three mainstream adversarial attack methods are improved,and adversarial samples are generated on the Mini-ImageNet dataset.Subsequently,these samples are used to attack the EfficientNet and the filter-based classification model respectively,and the attack effects are compared.Experimental results show that the filter-based classification model has high classification accuracy when dealing with Mini-ImageNet adversarial examples.Adversarial training can effectively enhance the robustness of deep neural network models.展开更多
Fine-grained classification of fashion items is essential for enhancing user experiences in online shopping,enabling more effective categorization and personalized recommendations.While recent advances in Artificial I...Fine-grained classification of fashion items is essential for enhancing user experiences in online shopping,enabling more effective categorization and personalized recommendations.While recent advances in Artificial Intelligence(AI)have shown promise,achieving high classification accuracy typically requires large volumes of labeled data—an expensive and labor-intensive requirement,particularly burdensome for smaller retailers.Existing approaches attempt to mitigate this challenge by pretraining models on synthetic or geometric data and fine-tuning on limited fashion images,but these methods have thus far plateaued at around 90%accuracy,falling short of practical deployment standards.In this paper,we introduce a novel iterative image generation framework designed to overcome data scarcity and significantly improve classification accuracy.Our method combines a conditional Generative Adversarial Network(cGAN)with a ResNet50-based image classifier in a closed-loop system.The cGAN generates synthetic fashion images conditioned on class labels,while the classifier filters outputs based on confidence scores and predefined criteria.This cycle is repeated iteratively,progressively enriching the training dataset with highquality synthetic images and refining the classifier.We validate our approach on a task involving classification of five distinct neckline types.The converged model achieves an average accuracy of 94.6%,substantially outperforming previous methods despite using a limited amount of real labeled data.These results highlight the effectiveness of iterative data augmentation using generative models for fine-grained visual classification in resource-constrained settings.展开更多
Finding more specific subcategories within a larger category is the goal of fine-grained image classification(FGIC),and the key is to find local discriminative regions of visual features.Most existing methods use trad...Finding more specific subcategories within a larger category is the goal of fine-grained image classification(FGIC),and the key is to find local discriminative regions of visual features.Most existing methods use traditional convolutional operations to achieve fine-grained image classification.However,traditional convolution cannot extract multi-scale features of an image and existing methods are susceptible to interference from image background information.Therefore,to address the above problems,this paper proposes an FGIC model(Attention-PCNN)based on hybrid attention mechanism and pyramidal convolution.The model feeds the multi-scale features extracted by the pyramidal convolutional neural network into two branches capturing global and local information respectively.In particular,a hybrid attention mechanism is added to the branch capturing global information in order to reduce the interference of image background information and make the model pay more attention to the target region with fine-grained features.In addition,the mutual-channel loss(MC-LOSS)is introduced in the local information branch to capture fine-grained features.We evaluated the model on three publicly available datasets CUB-200-2011,Stanford Cars,FGVCAircraft,etc.Compared to the state-of-the-art methods,the results show that Attention-PCNN performs better.展开更多
Face Presentation Attack Detection(fPAD)plays a vital role in securing face recognition systems against various presentation attacks.While supervised learning-based methods demonstrate effectiveness,they are prone to ...Face Presentation Attack Detection(fPAD)plays a vital role in securing face recognition systems against various presentation attacks.While supervised learning-based methods demonstrate effectiveness,they are prone to overfitting to known attack types and struggle to generalize to novel attack scenarios.Recent studies have explored formulating fPAD as an anomaly detection problem or one-class classification task,enabling the training of generalized models for unknown attack detection.However,conventional anomaly detection approaches encounter difficulties in precisely delineating the boundary between bonafide samples and unknown attacks.To address this challenge,we propose a novel framework focusing on unknown attack detection using exclusively bonafide facial data during training.The core innovation lies in our pseudo-negative sample synthesis(PNSS)strategy,which facilitates learning of compact decision boundaries between bonafide faces and potential attack variations.Specifically,PNSS generates synthetic negative samples within low-likelihood regions of the bonafide feature space to represent diverse unknown attack patterns.To overcome the inherent imbalance between positive and synthetic negative samples during iterative training,we implement a dual-loss mechanism combining focal loss for classification optimization with pairwise confusion loss as a regularizer.This architecture effectively mitigates model bias towards bonafide samples while maintaining discriminative power.Comprehensive evaluations across three benchmark datasets validate the framework’s superior performance.Notably,our PNSS achieves 8%–18% average classification error rate(ACER)reduction compared with state-of-the-art one-class fPAD methods in cross-dataset evaluations on Idiap Replay-Attack and MSU-MFSD datasets.展开更多
基金funded by the National Key Research and Development Program of China(Grant No.2024YFE0209000)the NSFC(Grant No.U23B2019).
文摘Graph Neural Networks(GNNs)have proven highly effective for graph classification across diverse fields such as social networks,bioinformatics,and finance,due to their capability to learn complex graph structures.However,despite their success,GNNs remain vulnerable to adversarial attacks that can significantly degrade their classification accuracy.Existing adversarial attack strategies primarily rely on label information to guide the attacks,which limits their applicability in scenarios where such information is scarce or unavailable.This paper introduces an innovative unsupervised attack method for graph classification,which operates without relying on label information,thereby enhancing its applicability in a broad range of scenarios.Specifically,our method first leverages a graph contrastive learning loss to learn high-quality graph embeddings by comparing different stochastic augmented views of the graphs.To effectively perturb the graphs,we then introduce an implicit estimator that measures the impact of various modifications on graph structures.The proposed strategy identifies and flips edges with the top-K highest scores,determined by the estimator,to maximize the degradation of the model’s performance.In addition,to defend against such attack,we propose a lightweight regularization-based defense mechanism that is specifically tailored to mitigate the structural perturbations introduced by our attack strategy.It enhances model robustness by enforcing embedding consistency and edge-level smoothness during training.We conduct experiments on six public TU graph classification datasets:NCI1,NCI109,Mutagenicity,ENZYMES,COLLAB,and DBLP_v1,to evaluate the effectiveness of our attack and defense strategies.Under an attack budget of 3,the maximum reduction in model accuracy reaches 6.67%on the Graph Convolutional Network(GCN)and 11.67%on the Graph Attention Network(GAT)across different datasets,indicating that our unsupervised method induces degradation comparable to state-of-the-art supervised attacks.Meanwhile,our defense achieves the highest accuracy recovery of 3.89%(GCN)and 5.00%(GAT),demonstrating improved robustness against structural perturbations.
基金supported by the National Natural Science Foundation of China(42030102,42371321).
文摘Accurate fine-grained geospatial scene classification using remote sensing imagery is essential for a wide range of applications.However,existing approaches often rely on manually zooming remote sensing images at different scales to create typical scene samples.This approach fails to adequately support the fixed-resolution image interpretation requirements in real-world scenarios.To address this limitation,we introduce the million-scale fine-grained geospatial scene classification dataset(MEET),which contains over 1.03 million zoom-free remote sensing scene samples,manually annotated into 80 fine-grained categories.In MEET,each scene sample follows a scene-in-scene layout,where the central scene serves as the reference,and auxiliary scenes provide crucial spatial context for fine-grained classification.Moreover,to tackle the emerging challenge of scene-in-scene classification,we present the context-aware transformer(CAT),a model specifically designed for this task,which adaptively fuses spatial context to accurately classify the scene samples.CAT adaptively fuses spatial context to accurately classify the scene samples by learning attentional features that capture the relationships between the center and auxiliary scenes.Based on MEET,we establish a comprehensive benchmark for fine-grained geospatial scene classification,evaluating CAT against 11 competitive baselines.The results demonstrate that CAT significantly outperforms these baselines,achieving a 1.88%higher balanced accuracy(BA)with the Swin-Large backbone,and a notable 7.87%improvement with the Swin-Huge backbone.Further experiments validate the effectiveness of each module in CAT and show the practical applicability of CAT in the urban functional zone mapping.The source code and dataset will be publicly available at https://jerrywyn.github.io/project/MEET.html.
基金Supported by the National Natural Science Foundation of China(61601176)。
文摘In this paper,we propose hierarchical attention dual network(DNet)for fine-grained image classification.The DNet can randomly select pairs of inputs from the dataset and compare the differences between them through hierarchical attention feature learning,which are used simultaneously to remove noise and retain salient features.In the loss function,it considers the losses of difference in paired images according to the intra-variance and inter-variance.In addition,we also collect the disaster scene dataset from remote sensing images and apply the proposed method to disaster scene classification,which contains complex scenes and multiple types of disasters.Compared to other methods,experimental results show that the DNet with hierarchical attention is robust to different datasets and performs better.
基金supported by the Beijing Natural Science Foundation(No.5252014)the Open Fund of The Key Laboratory of Urban Ecological Environment Simulation and Protection,Ministry of Ecology and Environment of the People's Republic of China (No.UEESP-202502)the National Natural Science Foundation of China (No.62303063&32371874)。
文摘Bird monitoring and protection are essential for maintaining biodiversity,and fine-grained bird classification has become a key focus in this field.Audio-visual modalities provide critical cues for this task,but robust feature extraction and efficient fusion remain major challenges.We introduce a multi-stage fine-grained audiovisual fusion network(MSFG-AVFNet) for fine-grained bird species classification,which addresses these challenges through two key components:(1) the audiovisual feature extraction module,which adopts a multi-stage finetuning strategy to provide high-quality unimodal features,laying a solid foundation for modality fusion;(2) the audiovisual feature fusion module,which combines a max pooling aggregation strategy with a novel audiovisual loss function to achieve effective and robust feature fusion.Experiments were conducted on the self-built AVB81and the publicly available SSW60 datasets,which contain data from 81 and 60 bird species,respectively.Comprehensive experiments demonstrate that our approach achieves notable performance gains,outperforming existing state-of-the-art methods.These results highlight its effectiveness in leveraging audiovisual modalities for fine-grained bird classification and its potential to support ecological monitoring and biodiversity research.
基金supported by an Institute of Information&Communications Technology Planning&Evaluation(IITP)grant funded by the Korean government(MSIT)(RS-2024-00438156,Development of Security Resilience Technology Based on Network Slicing Services in a 5G Specialized Network).
文摘This study proposes an efficient traffic classification model to address the growing threat of distributed denial-of-service(DDoS)attacks in 5th generation technology standard(5G)slicing networks.The proposed method utilizes an ensemble of encoder components from multiple autoencoders to compress and extract latent representations from high-dimensional traffic data.These representations are then used as input for a support vector machine(SVM)-based metadata classifier,enabling precise detection of attack traffic.This architecture is designed to achieve both high detection accuracy and training efficiency,while adapting flexibly to the diverse service requirements and complexity of 5G network slicing.The model was evaluated using the DDoS Datasets 2022,collected in a simulated 5G slicing environment.Experiments were conducted under both class-balanced and class-imbalanced conditions.In the balanced setting,the model achieved an accuracy of 89.33%,an F1-score of 88.23%,and an Area Under the Curve(AUC)of 89.45%.In the imbalanced setting(attack:normal 7:3),the model maintained strong robustness,=achieving a recall of 100%and an F1-score of 90.91%,demonstrating its effectiveness in diverse real-world scenarios.Compared to existing AI-based detection methods,the proposed model showed higher precision,better handling of class imbalance,and strong generalization performance.Moreover,its modular structure is well-suited for deployment in containerized network function(NF)environments,making it a practical solution for real-world 5G infrastructure.These results highlight the potential of the proposed approach to enhance both the security and operational resilience of 5G slicing networks.
基金supported by the National Natural Science Foundation of China(Nos.61373121 and 61328205)Program for Sichuan Provincial Science Fund for Distinguished Young Scholars(No.13QNJJ0149)+1 种基金the Fundamental Research Funds for the Central UniversitiesChina Scholarship Council(No.201507000032)
文摘The deep learning technology has shown impressive performance in various vision tasks such as image classification, object detection and semantic segmentation. In particular, recent advances of deep learning techniques bring encouraging performance to fine-grained image classification which aims to distinguish subordinate-level categories, such as bird species or dog breeds. This task is extremely challenging due to high intra-class and low inter-class variance. In this paper, we review four types of deep learning based fine-grained image classification approaches, including the general convolutional neural networks (CNNs), part detection based, ensemble of networks based and visual attention based fine-grained image classification approaches. Besides, the deep learning based semantic segmentation approaches are also covered in this paper. The region proposal based and fully convolutional networks based approaches for semantic segmentation are introduced respectively.
基金supported by National Basic Research Program of China (973 Program) (No. 2015CB352502)National Nature Science Foundation of China (No. 61573026)Beijing Nature Science Foundation (No. L172037)
文摘Fine-grained image classification, which aims to distinguish images with subtle distinctions, is a challenging task for two main reasons: lack of sufficient training data for every class and difficulty in learning discriminative features for representation. In this paper, to address the two issues, we propose a two-phase framework for recognizing images from unseen fine-grained classes, i.e., zeroshot fine-grained classification. In the first feature learning phase, we finetune deep convolutional neural networks using hierarchical semantic structure among fine-grained classes to extract discriminative deep visual features. Meanwhile, a domain adaptation structure is induced into deep convolutional neural networks to avoid domain shift from training data to test data. In the second label inference phase, a semantic directed graph is constructed over attributes of fine-grained classes. Based on this graph, we develop a label propagation algorithm to infer the labels of images in the unseen classes. Experimental results on two benchmark datasets demonstrate that our model outperforms the state-of-the-art zero-shot learning models. In addition, the features obtained by our feature learning model also yield significant gains when they are used by other zero-shot learning models, which shows the flexility of our model in zero-shot finegrained classification.
基金supported by the Certificate of China Postdoctoral Science Foundation (No. 2015M582165)the National Natural Science Foundation of China (Nos. 41602142, 41772090)the National Science and Technology Special (No. 2017ZX05009-002)
文摘Fine-grained sedimentary rocks are defined as rocks which mainly compose of fine grains(〈62.5 μm). The detailed studies on these rocks have revealed the need of a more unified, comprehensive and inclusive classification. The study focuses on fine-grained rocks has turned from the differences of inorganic mineral components to the significance of organic matter and microorganisms. The proposed classification is based on mineral composition, and it is noted that organic matters have been taken as a very important parameter in this classification scheme. Thus, four parameters, the TOC content, silica(quartz plus feldspars), clay minerals and carbonate minerals, are considered to divide the fine-grained sedimentary rocks into eight categories, and the further classification within every category is refined depending on subordinate mineral composition. The nomenclature consists of a root name preceded by a primary adjective. The root names reflect mineral constituent of the rock, including low organic(TOC〈2%), middle organic(2%4%) claystone, siliceous mudstone, limestone, and mixed mudstone. Primary adjectives convey structure and organic content information, including massive or limanited. The lithofacies are closely related to the reservoir storage space, porosity, permeability, hydrocarbon potential and shale oil/gas sweet spot, and are the key factor for the shale oil and gas exploration. The classification helps to systematically and practicably describe variability within fine-grained sedimentary rocks, what's more, it helps to guide the hydrocarbon exploration.
基金Supported by the National Natural Science Foundation of China (41872166)。
文摘Based on reviews and summaries of the naming schemes of fine-grained sedimentary rocks, and analysis of characteristics of fine-grained sedimentary rocks, the problems existing in the classification and naming of fine-grained sedimentary rocks are discussed. On this basis, following the principle of three-level nomenclature, a new scheme of rock classification and naming for fine-grained sedimentary rocks is determined from two perspectives: First, fine-grained sedimentary rocks are divided into 12 types in two major categories, mudstone and siltstone, according to particle size(sand, silt and mud). Second,fine-grained sedimentary rocks are divided into 18 types in four categories, carbonate rock, fine-grained felsic sedimentary rock,clay rock and mixed fine-grained sedimentary rock according to mineral composition(carbonate minerals, felsic detrital minerals and clay minerals as three end elements). Considering the importance of organic matter in unconventional oil and gas generation and evaluation, organic matter is taken as the fourth element in the scheme. Taking the organic matter contents of 0.5% and 2% as dividing points, fine grained sedimentary rocks are divided into three categories, organic-poor, organic-bearing,and organic-rich ones. The new scheme meets the requirement of unconventional oil and gas exploration and development today and solves the problem of conceptual confusion in fine-grained sedimentary rocks, providing a unified basic term system for the research of fine-grained sedimentology.
文摘The remote sensing ships’fine-grained classification technology makes it possible to identify certain ship types in remote sensing images,and it has broad application prospects in civil and military fields.However,the current model does not examine the properties of ship targets in remote sensing images with mixed multi-granularity features and a complicated backdrop.There is still an opportunity for future enhancement of the classification impact.To solve the challenges brought by the above characteristics,this paper proposes a Metaformer and Residual fusion network based on Visual Attention Network(VAN-MR)for fine-grained classification tasks.For the complex background of remote sensing images,the VAN-MR model adopts the parallel structure of large kernel attention and spatial attention to enhance the model’s feature extraction ability of interest targets and improve the classification performance of remote sensing ship targets.For the problem of multi-grained feature mixing in remote sensing images,the VAN-MR model uses a Metaformer structure and a parallel network of residual modules to extract ship features.The parallel network has different depths,considering both high-level and lowlevel semantic information.The model achieves better classification performance in remote sensing ship images with multi-granularity mixing.Finally,the model achieves 88.73%and 94.56%accuracy on the public fine-grained ship collection-23(FGSC-23)and FGSCR-42 datasets,respectively,while the parameter size is only 53.47 M,the floating point operations is 9.9 G.The experimental results show that the classification effect of VAN-MR is superior to that of traditional CNNs model and visual model with Transformer structure under the same parameter quantity.
文摘The rapid and increasing growth in the volume and number of cyber threats from malware is not a real danger;the real threat lies in the obfuscation of these cyberattacks,as they constantly change their behavior,making detection more difficult.Numerous researchers and developers have devoted considerable attention to this topic;however,the research field has not yet been fully saturated with high-quality studies that address these problems.For this reason,this paper presents a novel multi-objective Markov-enhanced adaptive whale optimization(MOMEAWO)cybersecurity model to improve the classification of binary and multi-class malware threats through the proposed MOMEAWO approach.The proposed MOMEAWO cybersecurity model aims to provide an innovative solution for analyzing,detecting,and classifying the behavior of obfuscated malware within their respective families.The proposed model includes three classification types:Binary classification and multi-class classification(e.g.,four families and 16 malware families).To evaluate the performance of this model,we used a recently published dataset called the Canadian Institute for Cybersecurity Malware Memory Analysis(CIC-MalMem-2022)that contains balanced data.The results show near-perfect accuracy in binary classification and high accuracy in multi-class classification compared with related work using the same dataset.
基金This work was supported in part by the National Natural Science Foundation of China(NSFC)(92167106,62103362,and 61833014)the Natural Science Foundation of Zhejiang Province(LR18F030001).
文摘Recently developed fault classification methods for industrial processes are mainly data-driven.Notably,models based on deep neural networks have significantly improved fault classification accuracy owing to the inclusion of a large number of data patterns.However,these data-driven models are vulnerable to adversarial attacks;thus,small perturbations on the samples can cause the models to provide incorrect fault predictions.Several recent studies have demonstrated the vulnerability of machine learning methods and the existence of adversarial samples.This paper proposes a black-box attack method with an extreme constraint for a safe-critical industrial fault classification system:Only one variable can be perturbed to craft adversarial samples.Moreover,to hide the adversarial samples in the visualization space,a Jacobian matrix is used to guide the perturbed variable selection,making the adversarial samples in the dimensional reduction space invisible to the human eye.Using the one-variable attack(OVA)method,we explore the vulnerability of industrial variables and fault types,which can help understand the geometric characteristics of fault classification systems.Based on the attack method,a corresponding adversarial training defense method is also proposed,which efficiently defends against an OVA and improves the prediction accuracy of the classifiers.In experiments,the proposed method was tested on two datasets from the Tennessee–Eastman process(TEP)and steel plates(SP).We explore the vulnerability and correlation within variables and faults and verify the effectiveness of OVAs and defenses for various classifiers and datasets.For industrial fault classification systems,the attack success rate of our method is close to(on TEP)or even higher than(on SP)the current most effective first-order white-box attack method,which requires perturbation of all variables.
文摘In the era of the Internet of Things(IoT),the proliferation of connected devices has raised security concerns,increasing the risk of intrusions into diverse systems.Despite the convenience and efficiency offered by IoT technology,the growing number of IoT devices escalates the likelihood of attacks,emphasizing the need for robust security tools to automatically detect and explain threats.This paper introduces a deep learning methodology for detecting and classifying distributed denial of service(DDoS)attacks,addressing a significant security concern within IoT environments.An effective procedure of deep transfer learning is applied to utilize deep learning backbones,which is then evaluated on two benchmarking datasets of DDoS attacks in terms of accuracy and time complexity.By leveraging several deep architectures,the study conducts thorough binary and multiclass experiments,each varying in the complexity of classifying attack types and demonstrating real-world scenarios.Additionally,this study employs an explainable artificial intelligence(XAI)AI technique to elucidate the contribution of extracted features in the process of attack detection.The experimental results demonstrate the effectiveness of the proposed method,achieving a recall of 99.39%by the XAI bidirectional long short-term memory(XAI-BiLSTM)model.
基金The authors would like to extend their gratitude to Department of Graduate StudiesNepal College of Information Technology for its constant support and motivationWe would also like to thank the Journal of Information Security for its feedbacks and reviews
文摘In recent times among the multitude of attacks present in network system, DDoS attacks have emerged to be the attacks with the most devastating effects. The main objective of this paper is to propose a system that effectively detects DDoS attacks appearing in any networked system using the clustering technique of data mining followed by classification. This method uses a Heuristics Clustering Algorithm (HCA) to cluster the available data and Na?ve Bayes (NB) classification to classify the data and detect the attacks created in the system based on some network attributes of the data packet. The clustering algorithm is based in unsupervised learning technique and is sometimes unable to detect some of the attack instances and few normal instances, therefore classification techniques are also used along with clustering to overcome this classification problem and to enhance the accuracy. Na?ve Bayes classifiers are based on very strong independence assumptions with fairly simple construction to derive the conditional probability for each relationship. A series of experiment is performed using “The CAIDA UCSD DDoS Attack 2007 Dataset” and “DARPA 2000 Dataset” and the efficiency of the proposed system has been tested based on the following performance parameters: Accuracy, Detection Rate and False Positive Rate and the result obtained from the proposed system has been found that it has enhanced accuracy and detection rate with low false positive rate.
基金supported by the 2023 Open Project of Key Laboratory of Ministry of Public Security for Artificial Intelligence Security(RGZNAQ-2304)the Fundamental Research Funds for the Central Universities of PPSUC(2023JKF01ZK08).
文摘With the rapid development of deepfake technology,the authenticity of various types of fake synthetic content is increasing rapidly,which brings potential security threats to people’s daily life and social stability.Currently,most algorithms define deepfake detection as a binary classification problem,i.e.,global features are first extracted using a backbone network and then fed into a binary classifier to discriminate true or false.However,the differences between real and fake samples are often subtle and local,and such global feature-based detection algorithms are not optimal in efficiency and accuracy.To this end,to enhance the extraction of forgery details in deep forgery samples,we propose a multi-branch deepfake detection algorithm based on fine-grained features from the perspective of fine-grained classification.First,to address the critical problem in locating discriminative feature regions in fine-grained classification tasks,we investigate a method for locating multiple different discriminative regions and design a lightweight feature localization module to obtain crucial feature representations by augmenting the most significant parts of the feature map.Second,using information complementation,we introduce a correlation-guided fusion module to enhance the discriminative feature information of different branches.Finally,we use the global attention module in the multi-branch model to improve the cross-dimensional interaction of spatial domain and channel domain information and increase the weights of crucial feature regions and feature channels.We conduct sufficient ablation experiments and comparative experiments.The experimental results show that the algorithm outperforms the detection accuracy and effectiveness on the FaceForensics++and Celeb-DF-v2 datasets compared with the representative detection algorithms in recent years,which can achieve better detection results.
基金supported in part by the Natural Science Foundation of China(NSFC)underGrant No.51805192,Major Special Science and Technology Project of Hubei Province under Grant No.2020AEA009sponsored by the State Key Laboratory of Digital Manufacturing Equipment and Technology(DMET)of Huazhong University of Science and Technology(HUST)under Grant No.DMETKF2020029.
文摘Pneumonia is part of the main diseases causing the death of children.It is generally diagnosed through chest Xray images.With the development of Deep Learning(DL),the diagnosis of pneumonia based on DL has received extensive attention.However,due to the small difference between pneumonia and normal images,the performance of DL methods could be improved.This research proposes a new fine-grained Convolutional Neural Network(CNN)for children’s pneumonia diagnosis(FG-CPD).Firstly,the fine-grainedCNNclassificationwhich can handle the slight difference in images is investigated.To obtain the raw images from the real-world chest X-ray data,the YOLOv4 algorithm is trained to detect and position the chest part in the raw images.Secondly,a novel attention network is proposed,named SGNet,which integrates the spatial information and channel information of the images to locate the discriminative parts in the chest image for expanding the difference between pneumonia and normal images.Thirdly,the automatic data augmentation method is adopted to increase the diversity of the images and avoid the overfitting of FG-CPD.The FG-CPD has been tested on the public Chest X-ray 2017 dataset,and the results show that it has achieved great effect.Then,the FG-CPD is tested on the real chest X-ray images from children aged 3–12 years ago from Tongji Hospital.The results show that FG-CPD has achieved up to 96.91%accuracy,which can validate the potential of the FG-CPD.
文摘In view of the fact that adversarial examples can lead to high-confidence erroneous outputs of deep neural networks,this study aims to improve the safety of deep neural networks by distinguishing adversarial examples.A classification model based on filter residual network structure is used to accurately classify adversarial examples.The filter-based classification model includes residual network feature extraction and classification modules,which are iteratively optimized by an adversarial training strategy.Three mainstream adversarial attack methods are improved,and adversarial samples are generated on the Mini-ImageNet dataset.Subsequently,these samples are used to attack the EfficientNet and the filter-based classification model respectively,and the attack effects are compared.Experimental results show that the filter-based classification model has high classification accuracy when dealing with Mini-ImageNet adversarial examples.Adversarial training can effectively enhance the robustness of deep neural network models.
基金supported by the Institute of Information&communications Technology Planning&Evaluation grant funded by the Republic of Korea government(No.RS-2024-00431388,the Global Research Support Program in the Digital Field program)the National Science Foundation of USA(Nos.2429960 and 2434899).
文摘Fine-grained classification of fashion items is essential for enhancing user experiences in online shopping,enabling more effective categorization and personalized recommendations.While recent advances in Artificial Intelligence(AI)have shown promise,achieving high classification accuracy typically requires large volumes of labeled data—an expensive and labor-intensive requirement,particularly burdensome for smaller retailers.Existing approaches attempt to mitigate this challenge by pretraining models on synthetic or geometric data and fine-tuning on limited fashion images,but these methods have thus far plateaued at around 90%accuracy,falling short of practical deployment standards.In this paper,we introduce a novel iterative image generation framework designed to overcome data scarcity and significantly improve classification accuracy.Our method combines a conditional Generative Adversarial Network(cGAN)with a ResNet50-based image classifier in a closed-loop system.The cGAN generates synthetic fashion images conditioned on class labels,while the classifier filters outputs based on confidence scores and predefined criteria.This cycle is repeated iteratively,progressively enriching the training dataset with highquality synthetic images and refining the classifier.We validate our approach on a task involving classification of five distinct neckline types.The converged model achieves an average accuracy of 94.6%,substantially outperforming previous methods despite using a limited amount of real labeled data.These results highlight the effectiveness of iterative data augmentation using generative models for fine-grained visual classification in resource-constrained settings.
基金supported by the National Natural Science Foundation of China(Nos.62372266,61832012,12271295,and 62072273)the Natural Science Foundation of Shandong Province(Nos.ZR2020MF149,ZR2022MF304,ZR2021MF075,ZR2021QF050,and ZR2019ZD10)the Key Research and Development Program Project of Shandong Province(No.2022CXPT055).
文摘Finding more specific subcategories within a larger category is the goal of fine-grained image classification(FGIC),and the key is to find local discriminative regions of visual features.Most existing methods use traditional convolutional operations to achieve fine-grained image classification.However,traditional convolution cannot extract multi-scale features of an image and existing methods are susceptible to interference from image background information.Therefore,to address the above problems,this paper proposes an FGIC model(Attention-PCNN)based on hybrid attention mechanism and pyramidal convolution.The model feeds the multi-scale features extracted by the pyramidal convolutional neural network into two branches capturing global and local information respectively.In particular,a hybrid attention mechanism is added to the branch capturing global information in order to reduce the interference of image background information and make the model pay more attention to the target region with fine-grained features.In addition,the mutual-channel loss(MC-LOSS)is introduced in the local information branch to capture fine-grained features.We evaluated the model on three publicly available datasets CUB-200-2011,Stanford Cars,FGVCAircraft,etc.Compared to the state-of-the-art methods,the results show that Attention-PCNN performs better.
基金supported in part by the National Natural Science Foundation of China under Grants 61972267,and 61772070in part by the Natural Science Foundation of Hebei Province under Grant F2024210005.
文摘Face Presentation Attack Detection(fPAD)plays a vital role in securing face recognition systems against various presentation attacks.While supervised learning-based methods demonstrate effectiveness,they are prone to overfitting to known attack types and struggle to generalize to novel attack scenarios.Recent studies have explored formulating fPAD as an anomaly detection problem or one-class classification task,enabling the training of generalized models for unknown attack detection.However,conventional anomaly detection approaches encounter difficulties in precisely delineating the boundary between bonafide samples and unknown attacks.To address this challenge,we propose a novel framework focusing on unknown attack detection using exclusively bonafide facial data during training.The core innovation lies in our pseudo-negative sample synthesis(PNSS)strategy,which facilitates learning of compact decision boundaries between bonafide faces and potential attack variations.Specifically,PNSS generates synthetic negative samples within low-likelihood regions of the bonafide feature space to represent diverse unknown attack patterns.To overcome the inherent imbalance between positive and synthetic negative samples during iterative training,we implement a dual-loss mechanism combining focal loss for classification optimization with pairwise confusion loss as a regularizer.This architecture effectively mitigates model bias towards bonafide samples while maintaining discriminative power.Comprehensive evaluations across three benchmark datasets validate the framework’s superior performance.Notably,our PNSS achieves 8%–18% average classification error rate(ACER)reduction compared with state-of-the-art one-class fPAD methods in cross-dataset evaluations on Idiap Replay-Attack and MSU-MFSD datasets.