1. Graph Attention Networks for Skin Lesion Classification with CNN-Driven Node Features
Authors: Ghadah Naif Alwakid, Samabia Tehsin, Mamoona Humayun, Asad Farooq, Ibrahim Alrashdi, Amjad Alsirhani. Computers, Materials & Continua, 2026, Issue 1, pp. 1964-1984 (21 pages)
Skin diseases affect millions worldwide. Early detection is key to preventing disfigurement, lifelong disability, or death. Dermoscopic images acquired in primary-care settings show high intra-class visual similarity and severe class imbalance, and occasional imaging artifacts can create ambiguity for state-of-the-art convolutional neural networks (CNNs). We frame skin lesion recognition as graph-based reasoning and, to ensure fair evaluation and avoid data leakage, adopt a strict lesion-level partitioning strategy. Each image is first over-segmented using SLIC (Simple Linear Iterative Clustering) to produce perceptually homogeneous superpixels. These superpixels form the nodes of a region-adjacency graph whose edges encode spatial continuity. Node attributes are 1280-dimensional embeddings extracted with a lightweight yet expressive EfficientNet-B0 backbone, providing strong representational power at modest computational cost. The resulting graphs are processed by a five-layer Graph Attention Network (GAT) that learns to weight inter-node relationships dynamically and aggregates multi-hop context before classifying lesions into seven classes with a log-softmax output. Extensive experiments on the DermaMNIST benchmark show the proposed pipeline achieves 88.35% accuracy and 98.04% AUC, outperforming contemporary CNNs, AutoML approaches, and alternative graph neural networks. An ablation study indicates EfficientNet-B0 produces superior node descriptors compared with ResNet-18 and DenseNet, and that roughly five GAT layers strike a good balance between being too shallow and over-deep while avoiding oversmoothing. The method requires no data augmentation or external metadata, making it a drop-in upgrade for clinical computer-aided diagnosis systems.
Keywords: Graph neural network; image classification; DermaMNIST dataset; graph representation
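The region-adjacency graph described in the abstract above can be illustrated with a minimal pure-Python sketch. The label map is a hand-written toy grid rather than real SLIC output, and the function name `region_adjacency_edges` and the choice of 4-connectivity are assumptions made for illustration; the paper's own implementation is not shown in this listing.

```python
from typing import List, Set, Tuple


def region_adjacency_edges(labels: List[List[int]]) -> Set[Tuple[int, int]]:
    """Return undirected edges between superpixel labels that touch (4-connectivity)."""
    edges: Set[Tuple[int, int]] = set()
    rows = len(labels)
    cols = len(labels[0]) if rows else 0
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            # Checking only the right and bottom neighbours visits each adjacent pixel pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        edges.add((min(a, b), max(a, b)))
    return edges


# Toy 3x3 "superpixel" label map with three regions (hypothetical data).
toy_labels = [
    [0, 0, 1],
    [0, 2, 1],
    [2, 2, 1],
]
edges = region_adjacency_edges(toy_labels)
print(sorted(edges))
```

In a pipeline like the one the abstract describes, each node would additionally carry a feature vector (1280-dimensional in the abstract), and the edge set would be handed to a graph attention layer; both steps are outside this sketch.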
2. MMGC-Net: Deep neural network for classification of mineral grains using multi-modal polarization images (cited by: 1)
Authors: Jun Shu, Xiaohai He, Qizhi Teng, Pengcheng Yan, Haibo He, Honggang Chen. Journal of Rock Mechanics and Geotechnical Engineering, 2025, Issue 6, pp. 3894-3909 (16 pages)
The multi-modal characteristics of mineral particles play a pivotal role in enhancing the classification accuracy, which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation and utilization of its resources. However, the existing methods for classifying mineral particles do not fully utilize these multi-modal features, thereby limiting the classification accuracy. Furthermore, when conventional multi-modal image classification methods are applied to plane-polarized and cross-polarized sequence images of mineral particles, they encounter issues such as information loss, misaligned features, and challenges in spatiotemporal feature extraction. To address these challenges, we propose a multi-modal mineral particle polarization image classification network (MMGC-Net) for precise mineral particle classification. Initially, MMGC-Net employs a two-dimensional (2D) backbone network with shared parameters to extract features from two types of polarized images to ensure feature alignment. Subsequently, a cross-polarized intra-modal feature fusion module is designed to refine the spatiotemporal features from the extracted features of the cross-polarized sequence images. Ultimately, the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision. Quantitative and qualitative experimental results indicate that when compared with the current state-of-the-art multi-modal image classification methods, MMGC-Net demonstrates marked superiority in terms of mineral particle multi-modal feature learning and four classification evaluation metrics. It also demonstrates better stability than the existing models.
Keywords: Mineral particles; Multi-modal image classification; Shared parameters; Feature fusion; Spatiotemporal feature
3. Congruent Feature Selection Method to Improve the Efficacy of Machine Learning-Based Classification in Medical Image Processing
Authors: Mohd Anjum, Naoufel Kraiem, Hong Min, Ashit Kumar Dutta, Yousef Ibrahim Daradkeh. Computer Modeling in Engineering & Sciences (SCIE, EI), 2025, Issue 1, pp. 357-384 (28 pages)
Machine learning (ML) is increasingly applied for medical image processing with appropriate learning paradigms. These applications include analyzing images of various organs, such as the brain, lung, eye, etc., to identify specific flaws/diseases for diagnosis. The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification. Most of the extracted image features are irrelevant and lead to an increase in computation time. Therefore, this article uses an analytical learning paradigm to design a Congruent Feature Selection Method to select the most relevant image features. This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions. The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis. Later, the correlation based on intensity and distribution is analyzed to improve the feature selection congruency. Therefore, the more congruent pixels are sorted in the descending order of the selection, which identifies better regions than the distribution. Now, the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection. Therefore, the probability of feature selection, regardless of the textures and medical image patterns, is improved. This process enhances the performance of ML applications for different medical image processing. The proposed method improves the accuracy, precision, and training rate by 13.19%, 10.69%, and 11.06%, respectively, compared to other models for the selected dataset. The mean error and selection time are also reduced by 12.56% and 13.56%, respectively, compared to the same models and dataset.
Keywords: Computer vision; feature selection; machine learning; region detection; texture analysis; image classification; medical images
4. Dual networks with hierarchical attention for fine-grained image classification
Authors: YANG Tao, WANG Gaihua. 《中国科学院大学学报(中英文)》 (PKU Core journal), 2025, Issue 6, pp. 806-813 (8 pages)
In this paper, we propose a hierarchical attention dual network (DNet) for fine-grained image classification. The DNet can randomly select pairs of inputs from the dataset and compare the differences between them through hierarchical attention feature learning, which are used simultaneously to remove noise and retain salient features. In the loss function, it considers the losses of difference in paired images according to the intra-variance and inter-variance. In addition, we collect a disaster scene dataset from remote sensing images and apply the proposed method to disaster scene classification, which contains complex scenes and multiple types of disasters. Compared to other methods, experimental results show that the DNet with hierarchical attention is robust to different datasets and performs better.
Keywords: dual network (DNet); fine-grained image classification; hierarchical attention features
5. Hybrid Fusion Net with Explanability: A Novel Explainable Deep Learning-Based Hybrid Framework for Enhanced Skin Lesion Classification Using Dermoscopic Images
Authors: Mohamed Hammad, Mohammed El Affendi, Souham Meshoul. Computer Modeling in Engineering & Sciences, 2025, Issue 10, pp. 1055-1086 (32 pages)
Skin cancer is among the most common malignancies worldwide, but its mortality burden is largely driven by aggressive subtypes such as melanoma, with outcomes varying across regions and healthcare settings. These variations emphasize the importance of reliable diagnostic technologies that support clinicians in detecting skin malignancies with higher accuracy. Traditional diagnostic methods often rely on subjective visual assessments, which can lead to misdiagnosis. This study addresses these challenges by developing HybridFusionNet, a novel model that integrates Convolutional Neural Networks (CNN) with 1D feature extraction techniques to enhance diagnostic accuracy. Utilizing two extensive datasets, BCN20000 and HAM10000, the methodology includes data preprocessing, application of Synthetic Minority Oversampling Technique combined with Edited Nearest Neighbors (SMOTEENN) for data balancing, and optimization of feature selection using the Tree-based Pipeline Optimization Tool (TPOT). The results demonstrate significant performance improvements over traditional CNN models, achieving an accuracy of 0.9693 on the BCN20000 dataset and 0.9909 on the HAM10000 dataset. The HybridFusionNet model not only outperforms conventional methods but also effectively addresses class imbalance. To enhance transparency, it integrates post-hoc explanation techniques such as LIME, which highlight the features influencing predictions. These findings highlight the potential of HybridFusionNet to support real-world applications, including physician-assist systems, teledermatology, and large-scale skin cancer screening programs. By improving diagnostic efficiency and enabling access to expert-level analysis, the model may enhance patient outcomes and foster greater trust in artificial intelligence (AI)-assisted clinical decision-making.
Keywords: AI; CNN; deep learning; image classification; model optimization; skin cancer detection
6. CloudViT: A Lightweight Ground-Based Cloud Image Classification Model with the Ability to Capture Global Features
Authors: Daoming Wei, Fangyan Ge, Bopeng Zhang, Zhiqiang Zhao, Dequan Li, Lizong Xi, Jinrong Hu, Xin Wang. Computers, Materials & Continua, 2025, Issue 6, pp. 5729-5746 (18 pages)
Accurate cloud classification plays a crucial role in aviation safety, climate monitoring, and localized weather forecasting. Current research has been focusing on machine learning techniques, particularly deep learning-based models, for cloud type identification. However, traditional approaches such as convolutional neural networks (CNNs) encounter difficulties in capturing global contextual information. In addition, they are computationally expensive, which restricts their usability in resource-limited environments. To tackle these issues, we present the Cloud Vision Transformer (CloudViT), a lightweight model that integrates CNNs with Transformers. The integration enables an effective balance between local and global feature extraction. To be specific, CloudViT comprises two innovative modules: Feature Extraction (E_Module) and Downsampling (D_Module). These modules are able to significantly reduce the number of model parameters and computational complexity while maintaining translation invariance and enhancing contextual comprehension. Overall, CloudViT has 0.93×10^6 parameters, more than ten times fewer than the SOTA (State-of-the-Art) model CloudNet. Comprehensive evaluations conducted on the HBMCD and SWIMCAT datasets showcase the outstanding performance of CloudViT. It achieves classification accuracies of 98.45% and 100%, respectively. Moreover, the efficiency and scalability of CloudViT make it an ideal candidate for deployment in mobile cloud observation systems, enabling real-time cloud image classification. The proposed hybrid architecture of CloudViT offers a promising approach for advancing ground-based cloud image classification. It holds significant potential for both optimizing performance and facilitating practical deployment scenarios.
Keywords: Image classification; ground-based cloud images; lightweight neural networks; attention mechanism; deep learning; vision transformer
7. Compressed meta-optical encoder for image classification
Authors: Anna Wirth-Singh, Jinlin Xiang, Minho Choi, Johannes E. Fröch, Luocheng Huang, Shane Colburn, Eli Shlizerman, Arka Majumdar. Advanced Photonics Nexus, 2025, Issue 2, pp. 87-96 (10 pages)
Optical and hybrid convolutional neural networks (CNNs) recently have become of increasing interest to achieve low-latency, low-power image classification and computer-vision tasks. However, implementing optical nonlinearity is challenging, and omitting the nonlinear layers in a standard CNN comes with a significant reduction in accuracy. We use knowledge distillation to compress a modified AlexNet to a single linear convolutional layer and an electronic backend (two fully connected layers). We obtain comparable performance with a purely electronic CNN with five convolutional layers and three fully connected layers. We implement the convolution optically via engineering the point spread function of an inverse-designed meta-optic. Using this hybrid approach, we estimate a reduction in multiply-accumulate operations from 17M in a conventional electronic modified AlexNet to only 86K in the hybrid compressed network enabled by the optical front end. This constitutes over 2 orders of magnitude of reduction in latency and power consumption. Furthermore, we experimentally demonstrate that the classification accuracy of the system exceeds 93% on the MNIST dataset of handwritten digits.
Keywords: neural network; meta-optics; image classification; knowledge distillation; optical computing
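The "over 2 orders of magnitude" figure in the abstract above can be checked with a short piece of arithmetic. This sketch uses only the two multiply-accumulate (MAC) counts quoted in the abstract; the variable names are illustrative, and treating the MAC ratio as a proxy for latency and power follows the abstract's own claim rather than anything verified here.

```python
import math

macs_electronic = 17_000_000  # "17M" MACs for the electronic modified AlexNet, as quoted
macs_hybrid = 86_000          # "86K" MACs for the hybrid compressed network, as quoted

# Ratio of the two counts, and the same ratio expressed in powers of ten.
reduction_factor = macs_electronic / macs_hybrid
orders_of_magnitude = math.log10(reduction_factor)

print(f"reduction factor: {reduction_factor:.1f}x")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

The ratio comes out just under 200, which is consistent with the abstract's wording of "over 2 orders of magnitude".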
8. Multi-Scale Feature Fusion and Advanced Representation Learning for Multi Label Image Classification
Authors: Naikang Zhong, Xiao Lin, Wen Du, Jin Shi. Computers, Materials & Continua, 2025, Issue 3, pp. 5285-5306 (22 pages)
Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images. Obtaining class-specific precise representations at different scales is a key aspect of feature representation. However, existing methods often rely on the single-scale deep feature, neglecting shallow and deeper layer features, which poses challenges when predicting objects of varying scales within the same image. Although some studies have explored multi-scale features, they rarely address the flow of information between scales or efficiently obtain class-specific precise representations for features at different scales. To address these issues, we propose a two-stage, three-branch Transformer-based framework. The first stage incorporates multi-scale image feature extraction and hierarchical scale attention. This design enables the model to consider objects at various scales while enhancing the flow of information across different feature scales, improving the model's generalization to diverse object scales. The second stage includes a global feature enhancement module and a region selection module. The global feature enhancement module strengthens interconnections between different image regions, mitigating the issue of incomplete representations, while the region selection module models the cross-modal relationships between image features and labels. Together, these components enable the efficient acquisition of class-specific precise feature representations. Extensive experiments on public datasets, including COCO2014, VOC2007, and VOC2012, demonstrate the effectiveness of our proposed method. Our approach achieves consistent performance gains of 0.3%, 0.4%, and 0.2% over state-of-the-art methods on the three datasets, respectively. These results validate the reliability and superiority of our approach for multi-label image classification.
Keywords: Image classification; multi-label; multi scale; attention mechanisms; feature fusion
9. A Hybrid Model of Transfer Learning and Convolutional Neural Networks for Accurate Coffee Leaf Miner (CLM) Classification
Authors: Nameer Baht, Enrique Domínguez. Computers, Materials & Continua, 2025, Issue 12, pp. 4441-4455 (15 pages)
Coffee is an important agricultural commodity, and its production is threatened by various diseases. It is also a source of concern for coffee-exporting countries, which is causing them to rethink their strategies for the future. Maintaining crop production requires early diagnosis. Notably, for Coffee Leaf Miner (CLM), machine learning (ML) offers promising tools for automated disease detection. Early detection of CLM is crucial for minimising yield losses. This study explores the effectiveness of using Convolutional Neural Networks (CNNs) with the transfer learning algorithms ResNet50, DenseNet121, MobileNet, Inception, and hybrid VGG19 for classifying coffee leaf images as healthy or CLM-infected. Leveraging the JMuBEN1 dataset, the proposed hybrid VGG19 model achieved exceptional performance, reaching 97% accuracy on both training and validation data, along with high scores for precision, recall, and F1-score. The confusion matrix shows that all the test samples were correctly classified, which indicates the model's strong performance on this dataset and demonstrates that the model is effective in distinguishing between healthy and CLM-infected leaves. This suggests strong potential for implementing this approach in real-world coffee plantations for early disease detection and improved disease management, and for adapting it for practical deployment in agricultural settings. It also supports farmers in detecting diseases using modern, inexpensive deep learning methods that do not require specialists.
Keywords: Coffee leaf disease; transfer learning; image classification; disease detection; JMuBEN1 dataset; VGG19 architecture
10. DenseSwinGNNNet: A Novel Deep Learning Framework for Accurate Turmeric Leaf Disease Classification
Authors: Seerat Singla, Gunjan Shandilya, Ayman Altameem, Ruby Pant, Ajay Kumar, Ateeq Ur Rehman, Ahmad Almogren. Phyton-International Journal of Experimental Botany, 2025, Issue 12, pp. 4021-4057 (37 pages)
Turmeric leaf diseases pose a major threat to turmeric cultivation, causing significant yield loss and economic impact. Early and accurate identification of these diseases is essential for effective crop management and timely intervention. This study proposes DenseSwinGNNNet, a hybrid deep learning framework that integrates DenseNet-121, the Swin Transformer, and a Graph Neural Network (GNN) to enhance the classification of turmeric leaf conditions. DenseNet121 extracts discriminative low-level features, the Swin Transformer captures long-range contextual relationships through hierarchical self-attention, and the GNN models inter-feature dependencies to refine the final representation. A total of 4361 images from the Mendeley turmeric leaf dataset were used, categorized into four classes: Aphids Disease, Blotch, Leaf Spot, and Healthy Leaf. The dataset underwent extensive preprocessing, including augmentation, normalization, and resizing, to improve generalization. An 80:10:10 split was applied for training, validation, and testing respectively. Model performance was evaluated using accuracy, precision, recall, F1-score, confusion matrices, and ROC curves. Optimized with the Adam optimizer at a learning rate of 0.0001, DenseSwinGNNNet achieved an overall accuracy of 99.7%, with precision, recall, and F1-scores exceeding 99% across all classes. The ROC curves reported AUC values near 1.0, indicating excellent class separability, while the confusion matrix showed minimal misclassification. Beyond high predictive performance, the framework incorporates considerations for cybersecurity and privacy in data-driven agriculture, supporting secure data handling and robust model deployment. This work contributes a reliable and scalable approach for turmeric leaf disease detection and advances the application of AI-driven precision agriculture.
Keywords: Turmeric leaf disease; deep learning; DenseNet121; swin transformer; graph neural network (GNN); image classification
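To make the 80:10:10 split in the abstract above concrete, here is a small sketch of how 4361 images could be divided. The helper name `split_sizes` and the rounding convention (floor for training and validation, remainder assigned to the test split) are assumptions; the abstract does not state the exact per-split counts, so the numbers this produces are illustrative only.

```python
def split_sizes(n_items: int, percents=(80, 10, 10)):
    """Integer split sizes; any rounding remainder goes to the last (test) split."""
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    train = n_items * percents[0] // 100  # floor of 80%
    val = n_items * percents[1] // 100    # floor of 10%
    test = n_items - train - val          # whatever is left, so the sizes always sum to n_items
    return train, val, test


train_n, val_n, test_n = split_sizes(4361)
print(train_n, val_n, test_n)
```

Using integer arithmetic avoids floating-point surprises when the dataset size is not a multiple of ten.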
11. Multi-Label Image Classification Model Based on Multiscale Fusion and Adaptive Label Correlation
Authors: YE Jihua, JIANG Lu, XIAO Shunjie, ZONG Yi, JIANG Aiwen. Journal of Shanghai Jiaotong University (Science), 2025, Issue 5, pp. 889-898 (10 pages)
At present, research on multi-label image classification mainly focuses on exploring the correlation between labels to improve the classification accuracy of multi-label images. However, in existing methods, label correlation is calculated based on the statistical information of the data. This label correlation is global and depends on the dataset, so it is not suitable for all samples. In the process of extracting image features, the characteristic information of small objects in the image is easily lost, resulting in a low classification accuracy of small objects. To this end, this paper proposes a multi-label image classification model based on multiscale fusion and adaptive label correlation. The main idea is as follows. First, the feature maps of multiple scales are fused to enhance the feature information of small objects. Semantic guidance then decomposes the fusion feature map into feature vectors of each category, the correlation between categories in the image is adaptively mined through the self-attention mechanism of a graph attention network, and feature vectors containing category-related information are obtained for the final classification. The mean average precision of the model on the two public datasets of VOC 2007 and MS COCO 2014 reached 95.6% and 83.6%, respectively, and most of the indicators are better than those of the existing latest methods.
Keywords: image classification; label correlation; graph attention network; small object; multi-scale fusion
12. Marine organism classification method based on hierarchical multi-scale attention mechanism
Authors: XU Haotian, CHENG Yuanzhi, ZHAO Dong, XIE Peidong. Optoelectronics Letters, 2025, Issue 6, pp. 354-361 (8 pages)
We propose a hierarchical multi-scale attention mechanism-based model in response to the low accuracy and inefficient manual classification of existing oceanic biological image classification methods. Firstly, the hierarchical efficient multi-scale attention (H-EMA) module is designed for lightweight feature extraction, achieving outstanding performance at a relatively low cost. Secondly, an improved EfficientNetV2 block is used to integrate information from different scales better and enhance inter-layer message passing. Furthermore, introducing the convolutional block attention module (CBAM) enhances the model's perception of critical features, optimizing its generalization ability. Lastly, Focal Loss is introduced to adjust the weights of complex samples to address the issue of imbalanced categories in the dataset, further improving the model's performance. The model achieved 96.11% accuracy on the intertidal marine organism dataset of Nanji Islands and 84.78% accuracy on the CIFAR-100 dataset, demonstrating its strong generalization ability to meet the demands of oceanic biological image classification.
Keywords: integrate information; different scales; hierarchical multi scale attention; lightweight feature extraction; focal loss; efficientnetv; marine organism classification; oceanic biological image classification methods; convolutional block attention module
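The Focal Loss mentioned in the abstract above is commonly written as FL(p_t) = -α (1 - p_t)^γ log(p_t), where p_t is the predicted probability of the true class. The sketch below implements that standard single-sample form in plain Python; the defaults γ = 2 and α = 1 are conventional illustrative values and are not taken from the paper, whose exact settings are not given in this listing.

```python
import math


def focal_loss(p_true: float, gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one sample, given the predicted probability of its true class."""
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must be in (0, 1]")
    # The (1 - p_true) ** gamma factor shrinks the loss of well-classified samples.
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


easy = focal_loss(0.9)                # confidently correct prediction
hard = focal_loss(0.1)                # badly misclassified sample
ce_easy = focal_loss(0.9, gamma=0.0)  # gamma = 0 recovers plain cross-entropy

print(f"easy={easy:.5f} hard={hard:.5f} cross-entropy(easy)={ce_easy:.5f}")
```

With γ > 0 the easy sample contributes far less than it would under cross-entropy, so training is dominated by hard samples, which is the mechanism usually relied on to counter class imbalance.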
13. Exploratory Research on Defense against Natural Adversarial Examples in Image Classification
Authors: Yaoxuan Zhu, Hua Yang, Bin Zhu. Computers, Materials & Continua, 2025, Issue 2, pp. 1947-1968 (22 pages)
The emergence of adversarial examples has revealed the inadequacies in the robustness of image classification models based on Convolutional Neural Networks (CNNs). Particularly in recent years, the discovery of natural adversarial examples has posed significant challenges, as traditional defense methods against adversarial attacks have proven to be largely ineffective against these natural adversarial examples. This paper explores defenses against these natural adversarial examples from three perspectives: adversarial examples, model architecture, and dataset. First, it employs Class Activation Mapping (CAM) to visualize how models classify natural adversarial examples, identifying several typical attack patterns. Next, various common CNN models are analyzed to evaluate their susceptibility to these attacks, revealing that different architectures exhibit varying defensive capabilities. The study finds that as the depth of a network increases, its defenses against natural adversarial examples strengthen. Finally, the impact of dataset class distribution on the defense capability of models is examined, focusing on two aspects: the number of classes in the training set and the number of predicted classes. This study investigates how these factors influence the model's ability to defend against natural adversarial examples. Results indicate that reducing the number of training classes enhances the model's defense against natural adversarial examples. Additionally, under a fixed number of training classes, some CNN models show an optimal range of predicted classes for achieving the best defense performance against these adversarial examples.
Keywords: Image classification; convolutional neural network; natural adversarial example; data set; defense against adversarial examples
14. A Prior Causality-Guided Multi-View Diffusion Network for Brain Disorder Classification
Authors: Xubin Wu, Yan Niu, Xia Li, Jie Xiang, Yidi Li. CAAI Transactions on Intelligence Technology, 2025, Issue 6, pp. 1731-1744 (14 pages)
Functional brain networks have been used to diagnose brain disorders such as autism spectrum disorder (ASD) and attention-deficit/hyperactivity disorder (ADHD). However, existing methods not only fail to fully consider various levels of interaction information between brain regions, but also limit the transmission of information among unconnected regions, resulting in node information loss and bias. To address these issues, we propose a causality-guided multi-view diffusion (CG-MVD) network, which can more comprehensively capture node information that is difficult to observe when aggregating direct neighbours alone. Specifically, our approach designs multi-view brain graphs and multi-hop causality graphs to represent multilevel node interactions and guide the diffusion of interaction information. Building on this, a multi-view diffusion graph attention module is put forward to learn node multi-dimensional embedding features by broadening the interaction range and extending the receptive field. Additionally, we propose a bilinear adaptive fusion module to generate and fuse connectivity-based features, addressing the challenge of high-dimensional node-level features and integrating richer feature information to enhance classification. Experimental results on the ADHD-200 and ABIDE-I datasets demonstrate the effectiveness of the CG-MVD network, achieving average accuracies of 79.47% and 80.90%, respectively, and surpassing state-of-the-art methods.
Keywords: bioinformatics; data analysis; deep learning; image classification
Improving long-tail classification via decoupling and regularisation
15
作者 Shuzheng Gao Chaozheng Wang +4 位作者 Cuiyun Gao Wenjian Luo Peiyi Han Qing Liao Guandong Xu 《CAAI Transactions on Intelligence Technology》 2025年第1期62-71,共10页
Real-world data always exhibit an imbalanced and long-tailed distribution, which leads to poor performance for neural network-based classification. Existing methods mainly tackle this problem by reweighting the loss function or rebalancing the classifier. However, one crucial aspect overlooked by previous research is the imbalanced feature space problem caused by the imbalanced angle distribution. In this paper, the authors shed light on the significance of the angle distribution in achieving a balanced feature space, which is essential for improving model performance under long-tailed distributions. Nevertheless, it is challenging to effectively balance both the classifier norms and the angle distribution due to problems such as the low feature norm. To tackle these challenges, the authors first thoroughly analyse the classifier and feature space by decoupling the classification logits into three key components: the classifier norm (i.e., the magnitude of the classifier vector), the feature norm (i.e., the magnitude of the feature vector), and the cosine similarity between the classifier vector and the feature vector. In this way, the authors analyse how each component changes during training and reveal three critical problems that should be solved: the imbalanced angle distribution, the lack of feature discrimination, and the low feature norm. Drawing from this analysis, the authors propose a novel loss function that incorporates hyperspherical uniformity, additive angular margin, and feature norm regularisation. Each component of the loss function addresses a specific problem and contributes to achieving a balanced classifier and feature space. The authors conduct extensive experiments on three popular benchmark datasets: CIFAR-10/100-LT, ImageNet-LT, and iNaturalist 2018. The experimental results demonstrate that the authors' loss function outperforms several previous state-of-the-art methods on imbalanced and long-tailed datasets, improving upon the best-performing baselines on CIFAR-100-LT by 1.34, 1.41, 1.41 and 1.33, respectively.
Keywords: computer vision; image classification; long-tailed data; machine learning
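The logit decomposition described in the abstract above can be illustrated with a small sketch. It assumes a bias-free linear classifier, so that each logit is the dot product of a feature vector and a class weight vector; the function name `decompose_logit` and the sample numbers are hypothetical and are not taken from the paper.

```python
import math

def decompose_logit(feature, class_weight):
    """Split a bias-free linear logit into its three factors.

    Returns (classifier_norm, feature_norm, cosine), whose product
    equals the dot product of the two vectors.
    """
    feature_norm = math.sqrt(sum(x * x for x in feature))
    classifier_norm = math.sqrt(sum(w * w for w in class_weight))
    dot = sum(x * w for x, w in zip(feature, class_weight))
    cosine = dot / (feature_norm * classifier_norm)
    return classifier_norm, feature_norm, cosine

# Hypothetical feature vector and class weight vector (illustration only).
feature = [0.5, -1.0, 2.0]
class_weight = [1.5, 0.25, -0.75]

classifier_norm, feature_norm, cosine = decompose_logit(feature, class_weight)

# The ordinary logit and the product of the three factors should agree.
logit = sum(x * w for x, w in zip(feature, class_weight))
reconstructed = classifier_norm * feature_norm * cosine
```

Tracking the three returned values separately per class over training is one way to observe the kinds of imbalance the abstract names; the loss function itself is not reproduced here.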
Highly-reliable ferroelectric thin-film transistors array for hardware implementation of image classification
16
Authors: Peng Yang, Peiwen Tong, Hui Xu, Sen Liu, Changlin Chen, Yefan Zhang, Shihao Yu, Wei Wang, Rongrong Cao, Haijun Liu, Lei Liao, Qingjiang Li. Journal of Materials Science & Technology, 2025, No. 28, pp. 20-29 (10 pages)
Ferroelectric thin-film transistors (FeTFTs) have attracted great attention for in-memory computing applications due to their low power consumption and monolithic three-dimensional integration capability. Herein, we propose a planar integrated, highly reliable metal-ferroelectric-metal-insulator-semiconductor FeTFT device, in which the weak erase issue is suppressed by implanting a floating gate and the interface defects are reduced by simplifying the fabrication process. These lead to significant improvements in device performance, including a large memory window (4.3 V), high conductance dynamic range (1400), high endurance (10^12), and low variation (cycle-to-cycle: 2.5%; device-to-device: 3.5%). Moreover, we fabricated a 16×16 FeTFT pseudo-crossbar array for in-memory computing and experimentally demonstrated a full hardware implementation of a multi-layer perceptron for the classification of the four fundamental arithmetic operation symbols. This work provides a potential hardware solution for implementing a highly efficient in-memory computing system based on a highly reliable FeTFT array.
Keywords: ferroelectric; thin-film transistors array; in-memory computing; image classification
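As a rough illustration of how a crossbar-style array performs the multiply-accumulate step of a perceptron layer, the sketch below sums the currents on each column using Ohm's law and Kirchhoff's current law. This is the generic idealised crossbar model, not the authors' pseudo-crossbar circuit; the function name `crossbar_mac`, the 2×2 size, and all conductance and voltage values are assumptions chosen for readability.

```python
def crossbar_mac(conductances, voltages):
    """Idealised crossbar read: column current I_j = sum_i G[i][j] * V[i].

    conductances: rows x cols matrix of cell conductances (e.g. in microsiemens)
    voltages:     one input voltage per row (e.g. in volts)
    Returns one summed current per column (microamperes for the units above).
    """
    n_rows = len(voltages)
    n_cols = len(conductances[0])
    return [
        sum(conductances[i][j] * voltages[i] for i in range(n_rows))
        for j in range(n_cols)
    ]

# Hypothetical 2x2 example; a real array such as the 16x16 one would
# simply use larger matrices.
G = [[1.0, 2.0],
     [3.0, 4.0]]
V = [0.5, 1.0]
currents = crossbar_mac(G, V)
```

In this model the stored conductances play the role of the layer weights and the input voltages play the role of the activations, so one read operation yields a full matrix-vector product.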
Central-Pixel Guiding Sub-Pixel and Sub-Channel Convolution Network for Hyperspectral Image Classification
17
Authors: Xin Guan, Shan Wang, Qiang Li. Journal of Beijing Institute of Technology, 2025, No. 5, pp. 510-525 (16 pages)
In hyperspectral image classification (HSIC), accurately extracting spatial and spectral information from hyperspectral images (HSIs) is crucial for achieving precise classification. However, due to low spatial resolution and complex category boundaries, mixed pixels containing features from multiple classes are inevitable in HSIs. Additionally, the spectral similarity among different classes makes it difficult to extract the distinctive spectral features essential for HSIC. To address the impact of mixed pixels and spectral similarity on HSIC, we propose a central-pixel guiding sub-pixel and sub-channel convolution network (CP-SPSC) to extract more precise spatial and spectral features. First, we design spatial attention (CP-SPA) and spectral attention (CP-SPE) informed by the central pixel to effectively reduce spectral interference from irrelevant categories in the same patch. Furthermore, we use CP-SPA to guide 2D sub-pixel convolution (SPConv2d) to capture spatial features finer than the pixel level. Meanwhile, CP-SPE is used to guide 1D sub-channel convolution (SCConv1d) in selecting more precise spectral channels. To fuse spatial and spectral information at the feature level, the spectral feature extension transformation module (SFET) adopts mirror padding and snake permutation to transform the 1D spectral information of the center pixel into 2D spectral features. Experiments on three popular datasets demonstrate that our method outperforms several state-of-the-art methods in accuracy.
Keywords: hyperspectral image classification; similar spectra; mixed pixel; attention
New MDA Transformation Process from Urban Satellite Image Classification to Specific Urban Landsat Satellite Image Classification
18
Authors: Hafsa Ouchra, Abdessamad Belangour, Allae Erraissi, Maria Labied. Journal of Environmental & Earth Sciences, 2025, No. 1, pp. 81-91 (11 pages)
In a context where urban satellite image processing technologies are undergoing rapid evolution, this article presents an innovative and rigorous approach to satellite image classification applied to urban planning. This research proposes an integrated methodological framework, based on the principles of model-driven engineering (MDE), to transform a generic meta-model into a meta-model specifically dedicated to urban satellite image classification. We implemented this transformation using the Atlas Transformation Language (ATL), guaranteeing a smooth and consistent transition from a platform-independent model (PIM) to a platform-specific model (PSM), according to the principles of model-driven architecture (MDA). The application of this MDE methodology enables advanced structuring of satellite data for targeted urban planning analyses, making it possible to classify various urban zones such as built-up, cultivated, arid and water areas. The novelty of this approach lies in the automation and standardization of the classification process, which significantly reduces the need for manual intervention and thus improves the reliability, reproducibility and efficiency of urban data analysis. By adopting this method, decision-makers and urban planners are provided with a powerful tool for systematically and consistently analyzing and interpreting satellite images, facilitating decision-making in critical areas such as urban space management, infrastructure planning and environmental preservation.
Keywords: Model-Driven Engineering; meta-model; ATL transformation; urban satellite image classification meta-model
Enhancing Medical Image Classification with BSDA-Mamba:Integrating Bayesian Random Semantic Data Augmentation and Residual Connections
19
Authors: Honglin Wang, Yaohua Xu, Cheng Zhu. Computers, Materials & Continua, 2025, No. 6, pp. 4999-5018 (20 pages)
Medical image classification is crucial in disease diagnosis, treatment planning, and clinical decision-making. We introduce a novel medical image classification approach that integrates Bayesian Random Semantic Data Augmentation (BSDA) with MedMamba, a Vision Mamba-based model for medical image classification, enhanced by residual connection blocks. We name the model BSDA-Mamba. BSDA augments medical image data semantically, enhancing the model's generalization ability and classification performance. MedMamba, a deep learning-based state space model, excels in capturing long-range dependencies in medical images. By incorporating residual connections, BSDA-Mamba further improves feature extraction capabilities. Through comprehensive experiments on eight medical image datasets, we demonstrate that BSDA-Mamba outperforms existing models in accuracy, area under the curve, and F1-score. Our results highlight BSDA-Mamba's potential as a reliable tool for medical image analysis, particularly in handling diverse imaging modalities from X-rays to MRI. The open-sourcing of our model's code and datasets will facilitate the reproduction and extension of our work.
Keywords: deep learning; medical image classification; data augmentation; visual state space model
Recognition and Classification of Concrete Surface Cracks with an Inception Quantum Convolutional Neural Network Algorithm
20
Authors: Bu Yun-zhe, Xiao Yi-lei, Li Ya-jun, Meng Ling-guang. Applied Geophysics, 2025, No. 4, pp. 1475-1490, 1502 (17 pages)
Current concrete surface crack detection methods cannot simultaneously achieve high detection accuracy and efficiency. Thus, this study focuses on the recognition and classification of crack images and proposes a concrete crack detection method that integrates the Inception module and a quantum convolutional neural network. First, the features of concrete cracks are highlighted by grayscale processing, morphological operations, and threshold segmentation, and the image is then quantum-encoded by angle encoding to transform the classical image information into quantum image information. Next, quantum circuits are used to implement classical image convolution operations to improve the convergence speed of the model and enhance the image representation. Two image input paths are then designed: one with a quantum convolutional layer and the other with a classical convolutional layer. Finally, comparative experiments are conducted using different parameters to determine the optimal parameter values for concrete crack image classification. Experimental results show that the method is suitable for crack classification in different scenarios and that training speed is greatly improved compared with that of existing deep learning models. The two evaluation metrics, accuracy and recall, are considerably enhanced.
Keywords: concrete crack; quantum computing; image recognition and classification; quantum convolutional neural network
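The angle-encoding step mentioned in the abstract above can be sketched in plain Python. The abstract does not state which rotation gate or scaling the authors use, so this sketch assumes the common convention of mapping a pixel intensity p in [0, 1] to the angle theta = pi * p and applying an RY rotation to the |0> state, which gives the amplitudes (cos(theta/2), sin(theta/2)). The function name `angle_encode` and the sample patch are hypothetical.

```python
import math

def angle_encode(pixels):
    """Map each normalised pixel to single-qubit amplitudes.

    Assumed convention: theta = pi * p, state = RY(theta)|0>
                        = (cos(theta / 2), sin(theta / 2)).
    Returns a list of (amplitude_of_0, amplitude_of_1) pairs, one per pixel.
    """
    states = []
    for p in pixels:
        theta = math.pi * p
        states.append((math.cos(theta / 2), math.sin(theta / 2)))
    return states

# Hypothetical flattened 2x2 patch of a thresholded, normalised crack image.
patch = [0.0, 0.5, 1.0, 0.25]
states = angle_encode(patch)
```

Under this convention a black pixel stays in |0>, a white pixel is rotated to |1>, and intermediate intensities become superpositions, with one qubit used per pixel in the patch.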