Fine-grained classification of ships in remote sensing images makes it possible to identify specific ship types, and it has broad application prospects in civil and military fields. However, current models do not account for the properties of ship targets in remote sensing images, namely mixed multi-granularity features and complicated backgrounds, so there is still room to improve classification performance. To address these challenges, this paper proposes a Metaformer and Residual fusion network based on the Visual Attention Network (VAN-MR) for fine-grained classification tasks. For the complex background of remote sensing images, the VAN-MR model adopts a parallel structure of large kernel attention and spatial attention to enhance the model's ability to extract features of targets of interest and improve the classification performance on remote sensing ship targets. For the problem of multi-grained feature mixing in remote sensing images, the VAN-MR model uses a Metaformer structure and a parallel network of residual modules to extract ship features. The parallel network has different depths, capturing both high-level and low-level semantic information, so the model achieves better classification performance on remote sensing ship images with mixed granularity. Finally, the model achieves 88.73% and 94.56% accuracy on the public fine-grained ship collection-23 (FGSC-23) and FGSCR-42 datasets, respectively, with only 53.47 M parameters and 9.9 G floating point operations. The experimental results show that VAN-MR classifies better than traditional CNN models and Transformer-based vision models with the same number of parameters.
Fine-grained image classification, which aims to distinguish images with subtle distinctions, is a challenging task for two main reasons: lack of sufficient training data for every class and difficulty in learning discriminative features for representation. In this paper, to address these two issues, we propose a two-phase framework for recognizing images from unseen fine-grained classes, i.e., zero-shot fine-grained classification. In the first phase, feature learning, we fine-tune deep convolutional neural networks using the hierarchical semantic structure among fine-grained classes to extract discriminative deep visual features. Meanwhile, a domain adaptation structure is introduced into the deep convolutional neural networks to avoid domain shift from training data to test data. In the second phase, label inference, a semantic directed graph is constructed over the attributes of fine-grained classes. Based on this graph, we develop a label propagation algorithm to infer the labels of images in the unseen classes. Experimental results on two benchmark datasets demonstrate that our model outperforms the state-of-the-art zero-shot learning models. In addition, the features obtained by our feature learning model also yield significant gains when they are used by other zero-shot learning models, which shows the flexibility of our model in zero-shot fine-grained classification.
With the rapid development of the Internet of things and e-commerce, feature-based image retrieval and classification have become a serious challenge for shoppers searching websites for relevant product information. The last decade has witnessed great interest in research on content-based feature extraction techniques. Moreover, semantic attributes cannot fully express the rich image information. This paper designs and trains a deep convolutional neural network in which the convolution kernel sizes and the order of network connections are chosen for efficient filter capacity and coverage. To address the long training time and high resource usage of the deep convolutional neural network, this paper also designs a shallow convolutional neural network that achieves similar classification accuracy. Both the deep and the shallow networks consist of data pre-processing, feature extraction, and softmax classification. To evaluate the classification performance of the networks, experiments were conducted on the public Caltech256 database and on a homemade product image database containing 15 types of garments and 5 types of shoes, with a total of 20,000 color images from shopping websites. Compared with approaches that combine content-based feature extraction with traditional support vector machines, whose classification accuracy ranges from 76.3% to 86.2%, the deep convolutional neural network obtains a state-of-the-art classification accuracy of 92.1%, and the shallow convolutional neural network reaches 90.6%. Moreover, the proposed convolutional neural networks can be integrated and implemented for other color image databases.
In this paper, we propose a hierarchical attention dual network (DNet) for fine-grained image classification. The DNet randomly selects pairs of inputs from the dataset and compares the differences between them through hierarchical attention feature learning, which is used simultaneously to remove noise and retain salient features. In the loss function, it considers the losses of difference in paired images according to the intra-variance and inter-variance. In addition, we collect a disaster scene dataset from remote sensing images, which contains complex scenes and multiple types of disasters, and apply the proposed method to disaster scene classification. Experimental results show that, compared to other methods, the DNet with hierarchical attention is robust to different datasets and performs better.
Bird monitoring and protection are essential for maintaining biodiversity, and fine-grained bird classification has become a key focus in this field. Audio-visual modalities provide critical cues for this task, but robust feature extraction and efficient fusion remain major challenges. We introduce a multi-stage fine-grained audiovisual fusion network (MSFG-AVFNet) for fine-grained bird species classification, which addresses these challenges through two key components: (1) the audiovisual feature extraction module, which adopts a multi-stage fine-tuning strategy to provide high-quality unimodal features, laying a solid foundation for modality fusion; (2) the audiovisual feature fusion module, which combines a max pooling aggregation strategy with a novel audiovisual loss function to achieve effective and robust feature fusion. Experiments were conducted on the self-built AVB81 and the publicly available SSW60 datasets, which contain data from 81 and 60 bird species, respectively. Comprehensive experiments demonstrate that our approach achieves notable performance gains, outperforming existing state-of-the-art methods. These results highlight its effectiveness in leveraging audiovisual modalities for fine-grained bird classification and its potential to support ecological monitoring and biodiversity research.
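The abstract above mentions a max pooling aggregation strategy for fusing audio and visual features but does not spell it out. One common reading, sketched below, is an element-wise maximum over two unimodal feature vectors of equal length. The function name `max_pool_fuse` and the example vectors are hypothetical, and this is an illustrative assumption rather than the MSFG-AVFNet implementation.

```python
def max_pool_fuse(audio_feat, visual_feat):
    """Fuse two equal-length feature vectors by element-wise maximum.

    Assumption: "max pooling aggregation" is read here as taking, for each
    feature dimension, the larger of the audio and visual activations.
    """
    if len(audio_feat) != len(visual_feat):
        raise ValueError("feature vectors must have the same length")
    return [max(a, v) for a, v in zip(audio_feat, visual_feat)]


# Hypothetical unimodal embeddings for one bird recording/video pair.
audio = [0.1, 0.9, -0.3]
visual = [0.4, 0.2, -0.5]
fused = max_pool_fuse(audio, visual)
```

In this reading, each fused dimension keeps whichever modality responded more strongly, so a weak or noisy modality cannot suppress a confident one.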
Intelligent vehicle applications provide convenience but raise privacy and security concerns. Misuse of sensitive data, including vehicle location and facial recognition information, poses a threat to user privacy. Hence, traffic classification is vital for promptly overseeing and controlling applications that handle sensitive information. In this paper, we propose ET-Net, a framework that combines multiple features and leverages self-attention mechanisms to learn deep relationships between packets. ET-Net employs a multi-similarity triplet network to extract features from raw bytes, and exploits self-attention to capture long-range dependencies within packets in a session and contextual information features. Additionally, we utilize the loss function to more effectively integrate information acquired from both byte sequences and their corresponding lengths. Through simulated evaluations on datasets with similar attributes, ET-Net demonstrates the ability to finely distinguish between nine categories of applications, achieving superior results compared to existing methods.
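The triplet network mentioned in the abstract above builds on the idea of a triplet loss, which pulls an anchor sample toward a sample of the same class and pushes it away from a sample of a different class. The sketch below shows only the standard margin-based triplet loss on plain Python lists. It is not ET-Net's multi-similarity variant, whose exact form the abstract does not give, and the names `euclidean`, `triplet_loss`, and the default `margin` are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings of three traffic sessions.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])  # negative is far away
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])  # negative is closer
```

The first call returns zero because the triplet is already well separated, while the second is penalized because the negative lies closer to the anchor than the positive.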
In this paper, we introduce an image dataset for fine-grained classification of dog breeds: the Tsinghua Dogs Dataset. It is currently the largest dataset for fine-grained classification of dogs, including 130 dog breeds and 70,428 real-world images. It has only one dog in each image and provides annotated bounding boxes for the whole body and head. In comparison to previous similar datasets, it contains more breeds and more carefully chosen images for each breed. The diversity within each breed is greater, with between 200 and 7000+ images for each breed. Annotation of the whole body and head makes the dataset suitable not only for improving fine-grained image classification models based on overall features, but also for those that locate local informative parts. We show that the dataset provides a tough challenge by benchmarking several state-of-the-art deep neural models. The dataset is available for academic purposes at https://cg.cs.tsinghua.edu.cn/ThuDogs/.
The value of grape cultivars varies. The use of a mixture of cultivars can negate the benefits of improved cultivars and hamper the protection of genetic resources and the identification of new hybrid cultivars. Classifying cultivars based on their leaves is therefore highly practical. Transplanted grape seedlings take years to bear fruit, but leaves mature in months. Foliar morphology differs among cultivars, so identifying cultivars based on leaves is feasible. Different cultivars, however, can be bred from the same parents, so the leaves of some cultivars can have similar morphologies. In this work, a pyramid residual convolution neural network was developed to classify images of eleven grape cultivars. The model extracts multi-scale feature maps of the leaf images through the convolution layer and enters them into three residual convolution neural networks. Features are fused by adding the value of the convolution kernel feature matrix to enhance the attention on the edge and center regions of the leaves and classify the images. The results indicated that the average accuracy of the model was 92.26% for the proposed leaf dataset. The proposed model is superior to previous models and provides a reliable method for the fine-grained classification and identification of plant cultivars.
Deep learning technology has shown impressive performance in various vision tasks such as image classification, object detection, and semantic segmentation. In particular, recent advances in deep learning techniques bring encouraging performance to fine-grained image classification, which aims to distinguish subordinate-level categories such as bird species or dog breeds. This task is extremely challenging due to high intra-class and low inter-class variance. In this paper, we review four types of deep learning based fine-grained image classification approaches: general convolutional neural networks (CNNs), part detection based, ensemble of networks based, and visual attention based approaches. In addition, deep learning based semantic segmentation approaches are also covered in this paper. The region proposal based and fully convolutional networks based approaches for semantic segmentation are introduced in turn.
Fine-grained sedimentary rocks are defined as rocks that are mainly composed of fine grains (<62.5 μm). Detailed studies of these rocks have revealed the need for a more unified, comprehensive, and inclusive classification. The focus of research on fine-grained rocks has shifted from differences in inorganic mineral components to the significance of organic matter and microorganisms. The proposed classification is based on mineral composition, with organic matter taken as a very important parameter in the scheme. Thus, four parameters, the TOC content, silica (quartz plus feldspars), clay minerals, and carbonate minerals, are used to divide fine-grained sedimentary rocks into eight categories, and further classification within every category is refined according to subordinate mineral composition. The nomenclature consists of a root name preceded by a primary adjective. The root names reflect the mineral constituents of the rock, including low organic (TOC<2%), middle organic (2% to 4%) claystone, siliceous mudstone, limestone, and mixed mudstone. Primary adjectives convey structure and organic content information, including massive or laminated. The lithofacies are closely related to the reservoir storage space, porosity, permeability, hydrocarbon potential, and shale oil/gas sweet spots, and are the key factor for shale oil and gas exploration. The classification helps to describe variability within fine-grained sedimentary rocks systematically and practicably, and it also helps to guide hydrocarbon exploration.
Based on reviews and summaries of the naming schemes of fine-grained sedimentary rocks, and on analysis of their characteristics, the problems existing in the classification and naming of fine-grained sedimentary rocks are discussed. On this basis, following the principle of three-level nomenclature, a new scheme of rock classification and naming for fine-grained sedimentary rocks is determined from two perspectives. First, fine-grained sedimentary rocks are divided into 12 types in two major categories, mudstone and siltstone, according to particle size (sand, silt, and mud). Second, fine-grained sedimentary rocks are divided into 18 types in four categories, carbonate rock, fine-grained felsic sedimentary rock, clay rock, and mixed fine-grained sedimentary rock, according to mineral composition (carbonate minerals, felsic detrital minerals, and clay minerals as three end elements). Considering the importance of organic matter in unconventional oil and gas generation and evaluation, organic matter is taken as the fourth element in the scheme. Taking organic matter contents of 0.5% and 2% as dividing points, fine-grained sedimentary rocks are divided into three categories: organic-poor, organic-bearing, and organic-rich. The new scheme meets the requirements of unconventional oil and gas exploration and development today and solves the problem of conceptual confusion in fine-grained sedimentary rocks, providing a unified basic term system for research in fine-grained sedimentology.
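The organic-matter part of the scheme above reduces to a two-threshold rule, which can be written as a small function. This is a minimal sketch: the thresholds 0.5% and 2% and the three category names come from the abstract, while the function name `organic_class` and the treatment of values exactly at a dividing point (assigned to the higher category here) are assumptions, since the abstract does not say which side the boundaries fall on.

```python
def organic_class(organic_percent):
    """Classify a fine-grained sedimentary rock by organic matter content (%).

    Dividing points of 0.5% and 2% follow the abstract; assigning boundary
    values to the higher category is an assumption made for this sketch.
    """
    if organic_percent < 0:
        raise ValueError("organic matter content cannot be negative")
    if organic_percent < 0.5:
        return "organic-poor"
    if organic_percent < 2.0:
        return "organic-bearing"
    return "organic-rich"


# Hypothetical samples with organic matter contents in percent.
labels = [organic_class(p) for p in (0.3, 1.2, 3.5)]
```

In the full scheme this label would be combined with the particle-size and mineral-composition names to form the three-level rock name.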
Urban tree species provide various essential ecosystem services in cities, such as regulating urban temperatures, reducing noise, capturing carbon, and mitigating the urban heat island effect. The quality of these services is influenced by species diversity, tree health, and the distribution and composition of trees. Traditionally, data on urban trees has been collected through field surveys and manual interpretation of remote sensing images. In this study, we evaluated the effectiveness of multispectral airborne laser scanning (ALS) data in classifying 24 common urban roadside tree species in Espoo, Finland. Tree crown structure information, intensity features, and spectral data were used for classification. Eight different machine learning algorithms were tested, with the extra trees (ET) algorithm performing the best, achieving an overall accuracy of 71.7% using multispectral LiDAR data. This result highlights that integrating structural and spectral information within a single framework can improve classification accuracy. Future research will focus on identifying the most important features for species classification and developing algorithms with greater efficiency and accuracy.
Purpose: Interdisciplinary research has become a critical approach to addressing complex societal, economic, technological, and environmental challenges, driving innovation and integrating scientific knowledge. While interdisciplinarity indicators are widely used to evaluate research performance, the impact of classification granularity on these assessments remains underexplored. Design/methodology/approach: This study investigates how different levels of classification granularity (macro, meso, and micro) affect the evaluation of interdisciplinarity in research institutes. Using a dataset of 262 institutes from four major German non-university organizations (FHG, HGF, MPG, WGL) from 2018 to 2022, we examine inconsistencies in interdisciplinarity across levels, analyze ranking changes, and explore the influence of institutional fields and research focus (applied vs. basic). Findings: Our findings reveal significant inconsistencies in interdisciplinarity across classification levels, with rankings varying substantially. Notably, the Fraunhofer Society (FHG), which performs well at the macro level, experiences significant ranking declines at the meso and micro levels. Normalizing interdisciplinarity by research field confirmed that these declines persist. The research focus of institutes, whether applied, basic, or mixed, does not significantly explain the observed ranking dynamics. Research limitations: This study has only considered the publication-based dimension of institutional interdisciplinarity and has not explored other aspects. Practical implications: The findings provide insights for policymakers, research managers, and scholars to better interpret interdisciplinarity metrics and support interdisciplinary research effectively. Originality/value: This study underscores the critical role of classification granularity in interdisciplinarity assessment and emphasizes the need for standardized approaches to ensure robust and fair evaluations.
Fine-grained sediments are widely distributed and constitute the most abundant component in sedimentary systems, so research on their genesis and distribution is of great significance. In recent years, fine-grained sediment gravity-flows (FGSGF) have been recognized as an important transportation and depositional mechanism for accumulating thick successions of fine-grained sediments. Through a comprehensive review and synthesis of global research on FGSGF deposition, the characteristics, depositional mechanisms, and distribution patterns of fine-grained sediment gravity-flow deposits (FGSGFD) are discussed, and future research prospects are clarified. In addition to the traditionally recognized low-density turbidity current and muddy debris flow, wave-enhanced gravity flow, low-density muddy hyperpycnal flow, and hypopycnal plumes can all form widely distributed FGSGFD. At the same time, the evolution of FGSGF during transportation can result in transitional and hybrid gravity-flow deposits. The combination of multiple triggering mechanisms promotes the widespread development of FGSGFD, without temporal and spatial limitations. Different types and concentrations of clay minerals, organic matter, and organo-clay complexes are the keys to controlling the flow transformation of FGSGF from low-concentration turbidity currents to high-concentration muddy debris flows. Further study is needed on the interaction mechanism of FGSGF caused by different initiations, the evolution of FGSGF under organic-inorganic synergy, and the controlling factors of the distribution patterns of FGSGFD. The study of FGSGFD can shed new light on the formation of widely developed thin-bedded siltstones within shales. At the same time, these insights may broaden the exploration scope of shale oil and gas, which is of considerable geological significance for unconventional shale oil and gas.
Preserving crops depends on early and accurate detection of pests, which cause diseases that reduce crop production and quality. Several deep-learning techniques have been applied to pest detection on crops. We have developed the YOLOCSP-PEST model for pest localization and classification. The proposed model is a modified version of You Only Look Once Version 7 (YOLOv7) with a Cross Stage Partial Network (CSPNET) backbone, intended primarily for pest localization and classification. It gives exceptionally good results under conditions that are very challenging for comparable models, especially where the luminance and orientation of the images are problematic. It helps farmers working on crops in remote areas to detect infestations quickly and accurately, which benefits both the quality and the quantity of the yield. The model was trained and tested on two datasets, the IP102 dataset and a local crop dataset, and showed exceptional results on both. It achieved a mean average precision (mAP) of 88.40%, a precision of 85.55%, and a recall of 84.25% on the IP102 dataset, and a mAP of 97.18%, a recall of 94.88%, and a precision of 97.50% on the local dataset. These findings demonstrate that the proposed model is very effective at detecting pests in real-life scenarios and can help improve both the quality and the quantity of crop yield.
The cleanliness of seed cotton plays a critical role in the pre-treatment of cotton textiles, and the removal of impurities during the harvesting process directly determines the quality and market value of cotton textiles. By fusing band combination optimization with deep learning, this study aims to achieve more efficient and accurate detection of film impurities in seed cotton on the production line. By applying hyperspectral imaging and a one-dimensional deep learning algorithm, we detect and classify impurities in seed cotton after harvest. The main categories detected include pure cotton, conveyor belt, film covering seed cotton, and film adhered to the conveyor belt. The proposed method achieves an impurity detection rate of 99.698%. To further verify the feasibility and practical potential of this strategy, we compare our results against existing mainstream methods. In addition, the model shows excellent recognition performance on pseudo-color images of real samples. With a processing time of 11.764 μs per pixel on experimental data, it is markedly faster while maintaining the accuracy required on real production lines. This strategy provides an accurate and efficient method for removing impurities during cotton processing.
Myocardial perfusion imaging (MPI), which uses single-photon emission computed tomography (SPECT), is a well-known tool for medical diagnosis that uses image classification to reveal the state of coronary artery disease (CAD). The automatic classification of SPECT images for different techniques has achieved near-optimal accuracy when using convolutional neural networks (CNNs). This paper uses a SPECT classification framework with three steps: 1) image denoising, 2) attenuation correction, and 3) image classification. Image denoising is done by a U-Net architecture. Attenuation correction is implemented by a convolutional neural network model that removes the attenuation that affects the feature extraction process of classification. Finally, a novel multi-scale diluted convolution (MSDC) network is proposed. It merges the features extracted at different scales and makes the model learn the features more efficiently. Three scales of filters with size 3×3 are used to extract features. All three steps are compared with state-of-the-art methods. The proposed denoising architecture produces a high-quality image with the highest peak signal-to-noise ratio (PSNR) value of 39.7. The proposed classification method is compared with five different CNN models and performs better, with an accuracy of 96%, precision of 87%, sensitivity of 87%, specificity of 89%, and F1-score of 87%. To demonstrate the importance of preprocessing, the classification model was also analyzed without denoising and attenuation correction.
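Two of the metrics quoted in the abstract above have simple closed forms, sketched here for reference. The F1-score is the harmonic mean of precision and sensitivity (recall), and PSNR is computed from the mean squared error between a reference and a denoised image. The function names and the default peak value of 255 (8-bit images) are assumptions made for illustration; the abstract does not state the image bit depth or how the authors computed these values.

```python
import math


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def psnr(mse, max_value=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error.

    `max_value` is the largest possible pixel value; 255 assumes 8-bit images.
    """
    if mse <= 0:
        raise ValueError("mse must be positive (identical images have infinite PSNR)")
    return 10 * math.log10(max_value ** 2 / mse)


# Consistency check on the reported figures: precision 87% and sensitivity 87%
# give an F1-score of 87%, matching the value quoted in the abstract.
reported_f1 = f1_score(0.87, 0.87)
```

Because precision and sensitivity are equal in the reported results, their harmonic mean equals the same value, which is why all three figures coincide at 87%.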
Joint Multimodal Aspect-based Sentiment Analysis (JMASA) is a significant task in multimodal fine-grained sentiment analysis that combines two subtasks: Multimodal Aspect Term Extraction (MATE) and Multimodal Aspect-oriented Sentiment Classification (MASC). Most existing models for JMASA only encode text and image features at a basic level and often neglect in-depth analysis of unimodal intrinsic features, which may lead to low accuracy of aspect term extraction and poor sentiment prediction due to insufficient learning of intra-modal features. To address this problem, we propose a Text-Image Feature Fine-grained Learning (TIFFL) model for JMASA. First, we construct an enhanced adjacency matrix of word dependencies and adopt a graph convolutional network to learn the syntactic structure features of text, which addresses the context interference problem in identifying different aspect terms. Then, adjective-noun pairs extracted from the image are introduced to make the semantic representation of visual features more intuitive, which addresses the ambiguous semantic extraction problem during image feature learning. In this way, the model's performance on aspect term extraction and sentiment polarity prediction can be further optimized and enhanced. Experiments on two Twitter benchmark datasets demonstrate that TIFFL achieves competitive results for JMASA, MATE, and MASC, validating the effectiveness of the proposed methods.
Diagnosing cardiac diseases relies heavily on electrocardiogram (ECG) analysis, but detecting myocardial infarction-related arrhythmias remains challenging due to irregular heartbeats and signal variations. Despite advancements in machine learning, achieving both high accuracy and low computational cost for arrhythmia classification remains a critical issue. Computer-aided diagnosis systems can play a key role in early detection, reducing mortality rates associated with cardiac disorders. This study proposes a fully automated approach for ECG arrhythmia classification using deep learning and machine learning techniques to improve diagnostic accuracy while minimizing processing time. The methodology consists of three stages: 1) preprocessing, where ECG signals undergo noise reduction and feature extraction; 2) feature identification, where deep convolutional neural network (CNN) blocks, combined with data augmentation and transfer learning, extract key parameters; 3) classification, where a hybrid CNN-SVM model is employed for arrhythmia recognition. CNN-extracted features were fed into a binary support vector machine (SVM) classifier, and model performance was assessed using five-fold cross-validation. Experimental findings demonstrated that the CNN2 model achieved 85.52% accuracy, while the hybrid CNN2-SVM approach significantly improved accuracy to 97.33%, outperforming conventional methods. This model enhances classification efficiency while reducing computational complexity. The proposed approach bridges the gap between accuracy and processing speed in ECG arrhythmia classification, offering a promising solution for real-time clinical applications. Its superior performance compared to nonlinear classifiers highlights its potential for improving automated cardiac diagnosis.
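The five-fold cross-validation used to assess the model above can be illustrated with a small index-splitting helper. This sketch only shows how samples are partitioned so that each one is used for testing exactly once. The function name `k_fold_indices` is hypothetical, no shuffling or class stratification is applied, and the authors' actual splitting procedure is not described in the abstract.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for each of k contiguous folds.

    Fold sizes differ by at most one sample. Indices are not shuffled here;
    in practice the data would usually be shuffled or stratified first.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Hypothetical dataset of 10 ECG feature vectors split into five folds.
folds = list(k_fold_indices(10, k=5))
```

A classifier would be trained on each `train_idx` subset and scored on the matching `test_idx` subset, and the five scores would then be averaged.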
In the era of precision medicine, the classification of diabetes mellitus has evolved beyond the traditional categories. Various classification methods now account for a multitude of factors, including variations in specific genes, type of β-cell impairment, degree of insulin resistance, and clinical characteristics of metabolic profiles. Improved classification methods enable healthcare providers to formulate blood glucose management strategies more precisely. Applying these updated classification systems will assist clinicians in further optimising treatment plans, including targeted drug therapies, personalized dietary advice, and specific exercise plans. Ultimately, this will facilitate stricter blood glucose control, minimize the risks of hypoglycaemia and hyperglycaemia, and reduce long-term complications associated with diabetes.
Abstract: Fine-grained classification of ships in remote sensing images makes it possible to identify specific ship types, and it has broad application prospects in civil and military fields. However, current models do not take into account the properties of ship targets in remote sensing images, namely mixed multi-granularity features and complicated backgrounds, so classification performance can still be improved. To address the challenges posed by these characteristics, this paper proposes a Metaformer and Residual fusion network based on the Visual Attention Network (VAN-MR) for fine-grained classification tasks. For the complex backgrounds of remote sensing images, the VAN-MR model adopts a parallel structure of large kernel attention and spatial attention to enhance the model's ability to extract features of targets of interest and to improve classification performance for remote sensing ship targets. For the problem of multi-granularity feature mixing in remote sensing images, the VAN-MR model uses a Metaformer structure and a parallel network of residual modules to extract ship features. The parallel network has branches of different depths, so it takes both high-level and low-level semantic information into account, and the model achieves better classification performance on remote sensing ship images with mixed granularity. Finally, the model achieves 88.73% and 94.56% accuracy on the public fine-grained ship collection-23 (FGSC-23) and FGSCR-42 datasets, respectively, with a parameter size of only 53.47 M and 9.9 G floating-point operations. The experimental results show that VAN-MR classifies better than traditional CNN models and Transformer-based vision models with the same number of parameters.
Funding: Supported by the National Basic Research Program of China (973 Program) (No. 2015CB352502), the National Natural Science Foundation of China (No. 61573026), and the Beijing Natural Science Foundation (No. L172037).
Abstract: Fine-grained image classification, which aims to distinguish images with subtle distinctions, is a challenging task for two main reasons: the lack of sufficient training data for every class and the difficulty of learning discriminative features for representation. In this paper, to address these two issues, we propose a two-phase framework for recognizing images from unseen fine-grained classes, i.e., zero-shot fine-grained classification. In the first phase, feature learning, we fine-tune deep convolutional neural networks using the hierarchical semantic structure among fine-grained classes to extract discriminative deep visual features. Meanwhile, a domain adaptation structure is introduced into the deep convolutional neural networks to avoid domain shift from training data to test data. In the second phase, label inference, a semantic directed graph is constructed over the attributes of fine-grained classes. Based on this graph, we develop a label propagation algorithm to infer the labels of images in the unseen classes. Experimental results on two benchmark datasets demonstrate that our model outperforms state-of-the-art zero-shot learning models. In addition, the features obtained by our feature learning model also yield significant gains when they are used by other zero-shot learning models, which shows the flexibility of our model in zero-shot fine-grained classification.
Abstract: With the rapid development of the Internet of Things and e-commerce, feature-based image retrieval and classification have become a serious challenge for shoppers searching websites for relevant product information. The last decade has witnessed great interest in research on content-based feature extraction techniques. Moreover, semantic attributes cannot fully express the rich information in an image. This paper designs and trains a deep convolutional neural network in which the convolution kernel size and the order of network connections are chosen for efficient filter capacity and coverage. To address the long training time and high resource usage of the deep convolutional neural network, this paper also designs a shallow convolutional neural network that achieves similar classification accuracy. Both the deep and the shallow convolutional neural networks comprise data pre-processing, feature extraction, and softmax classification. To evaluate the classification performance of the networks, experiments were conducted using the public Caltech256 database and a homemade product image database containing 15 types of garments and 5 types of shoes, with a total of 20,000 color images from shopping websites. Compared with combining content-based feature extraction techniques with traditional support vector machines, for which classification accuracy ranges from 76.3% to 86.2%, the deep convolutional neural network obtains a classification accuracy of 92.1%, and the shallow convolutional neural network reaches a classification accuracy of 90.6%. Moreover, the proposed convolutional neural networks can be integrated and implemented in other colour image databases.
Funding: Supported by the National Natural Science Foundation of China (61601176).
Abstract: In this paper, we propose a hierarchical attention dual network (DNet) for fine-grained image classification. The DNet randomly selects pairs of inputs from the dataset and compares the differences between them through hierarchical attention feature learning, which simultaneously removes noise and retains salient features. The loss function considers the losses for differences between paired images according to the intra-class and inter-class variance. In addition, we collect a disaster scene dataset from remote sensing images, which contains complex scenes and multiple types of disasters, and apply the proposed method to disaster scene classification. Experimental results show that, compared with other methods, the DNet with hierarchical attention is robust across different datasets and performs better.
Funding: Supported by the Beijing Natural Science Foundation (No. 5252014), the Open Fund of the Key Laboratory of Urban Ecological Environment Simulation and Protection, Ministry of Ecology and Environment of the People's Republic of China (No. UEESP-202502), and the National Natural Science Foundation of China (Nos. 62303063 and 32371874).
Abstract: Bird monitoring and protection are essential for maintaining biodiversity, and fine-grained bird classification has become a key focus in this field. Audio-visual modalities provide critical cues for this task, but robust feature extraction and efficient fusion remain major challenges. We introduce a multi-stage fine-grained audiovisual fusion network (MSFG-AVFNet) for fine-grained bird species classification, which addresses these challenges through two key components: (1) the audiovisual feature extraction module, which adopts a multi-stage fine-tuning strategy to provide high-quality unimodal features, laying a solid foundation for modality fusion; (2) the audiovisual feature fusion module, which combines a max pooling aggregation strategy with a novel audiovisual loss function to achieve effective and robust feature fusion. Experiments were conducted on the self-built AVB81 and the publicly available SSW60 datasets, which contain data from 81 and 60 bird species, respectively. Comprehensive experiments demonstrate that our approach achieves notable performance gains, outperforming existing state-of-the-art methods. These results highlight its effectiveness in leveraging audiovisual modalities for fine-grained bird classification and its potential to support ecological monitoring and biodiversity research.
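As a rough illustration of what a max pooling aggregation over two modalities can look like, the sketch below takes the element-wise maximum of an audio feature vector and a visual feature vector of the same length. This is only one plausible reading of the phrase; the abstract does not give the details of the MSFG-AVFNet fusion module, and the function name `max_pool_fuse` and the toy feature values are hypothetical.

```python
from typing import List, Sequence


def max_pool_fuse(audio_feat: Sequence[float], visual_feat: Sequence[float]) -> List[float]:
    """Fuse two equal-length feature vectors by element-wise maximum.

    Assumption: both unimodal features were already projected to the
    same dimensionality. This is an illustrative sketch, not the
    fusion module described in the paper.
    """
    if len(audio_feat) != len(visual_feat):
        raise ValueError("feature vectors must have the same length")
    # Keep, per dimension, the stronger response of the two modalities.
    return [max(a, v) for a, v in zip(audio_feat, visual_feat)]


if __name__ == "__main__":
    audio = [0.1, 0.9, -0.3]    # toy audio embedding (made up)
    visual = [0.4, 0.2, -0.5]   # toy visual embedding (made up)
    print(max_pool_fuse(audio, visual))
```

Element-wise maximum keeps the fused vector the same size as each input, which is one reason it is sometimes preferred over concatenation when a fixed-size representation is needed.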
Funding: Supported by the National Key Research and Development Program of China (2022YFB3104903) and the S&T Program of Hebei (No. SZX2020034).
Abstract: Intelligent vehicle applications provide convenience but raise privacy and security concerns. Misuse of sensitive data, including vehicle location and facial recognition information, poses a threat to user privacy. Hence, traffic classification is vital for promptly overseeing and controlling applications that handle sensitive information. In this paper, we propose ET-Net, a framework that combines multiple features and leverages self-attention mechanisms to learn deep relationships between packets. ET-Net employs a multi-similarity triplet network to extract features from raw bytes, and exploits self-attention to capture long-range dependencies among packets in a session as well as contextual information features. Additionally, we use the loss function to integrate more effectively the information acquired from both byte sequences and their corresponding lengths. In simulated evaluations on datasets with similar attributes, ET-Net distinguishes finely between nine categories of applications and achieves superior results compared to existing methods.
Funding: Supported by the National Natural Science Foundation of China (Project Nos. 61521002 and 61772298), a Research Grant of Beijing Higher Institution Engineering Research Center, and the Tsinghua–Tencent Joint Laboratory for Internet Innovation Technology.
Abstract: In this paper, we introduce an image dataset for fine-grained classification of dog breeds: the Tsinghua Dogs Dataset. It is currently the largest dataset for fine-grained classification of dogs, including 130 dog breeds and 70,428 real-world images. Each image contains only one dog and comes with annotated bounding boxes for the whole body and the head. In comparison to previous similar datasets, it contains more breeds and more carefully chosen images for each breed. The diversity within each breed is greater, with between 200 and 7000+ images for each breed. Annotation of the whole body and head makes the dataset suitable not only for improving fine-grained image classification models based on overall features, but also for those that locate informative local parts. We show that the dataset provides a tough challenge by benchmarking several state-of-the-art deep neural models. The dataset is available for academic purposes at https://cg.cs.tsinghua.edu.cn/ThuDogs/.
Funding: This work was financially supported by the National Key Research and Development Project (Grant No. 2020YFD1100601).
Abstract: The value of grape cultivars varies. The use of a mixture of cultivars can negate the benefits of improved cultivars and hamper the protection of genetic resources and the identification of new hybrid cultivars. Classifying cultivars based on their leaves is therefore highly practical. Transplanted grape seedlings take years to bear fruit, but leaves mature in months. Foliar morphology differs among cultivars, so identifying cultivars based on leaves is feasible. Different cultivars, however, can be bred from the same parents, so the leaves of some cultivars can have similar morphologies. In this work, a pyramid residual convolution neural network was developed to classify images of eleven grape cultivars. The model extracts multi-scale feature maps of the leaf images through the convolution layer and feeds them into three residual convolution neural networks. Features are fused by adding the values of the convolution kernel feature matrices, which enhances attention on the edge and center regions of the leaves, and the images are then classified. The results indicated that the average accuracy of the model was 92.26% on the proposed leaf dataset. The proposed model is superior to previous models and provides a reliable method for the fine-grained classification and identification of plant cultivars.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61373121 and 61328205), the Program for Sichuan Provincial Science Fund for Distinguished Young Scholars (No. 13QNJJ0149), the Fundamental Research Funds for the Central Universities, and the China Scholarship Council (No. 201507000032).
Abstract: Deep learning technology has shown impressive performance in various vision tasks such as image classification, object detection, and semantic segmentation. In particular, recent advances in deep learning techniques have brought encouraging performance to fine-grained image classification, which aims to distinguish subordinate-level categories such as bird species or dog breeds. This task is extremely challenging due to high intra-class and low inter-class variance. In this paper, we review four types of deep learning based fine-grained image classification approaches: general convolutional neural networks (CNNs), part detection based approaches, ensemble of networks based approaches, and visual attention based approaches. In addition, deep learning based semantic segmentation approaches are covered; the region proposal based and fully convolutional network based approaches for semantic segmentation are each introduced.
Funding: Supported by the Certificate of China Postdoctoral Science Foundation (No. 2015M582165), the National Natural Science Foundation of China (Nos. 41602142, 41772090), and the National Science and Technology Special (No. 2017ZX05009-002).
Abstract: Fine-grained sedimentary rocks are defined as rocks that are mainly composed of fine grains (<62.5 μm). Detailed studies of these rocks have revealed the need for a more unified, comprehensive, and inclusive classification. The focus of studies on fine-grained rocks has shifted from differences in inorganic mineral components to the significance of organic matter and microorganisms. The proposed classification is based on mineral composition, and organic matter is taken as a very important parameter in this classification scheme. Thus, four parameters, the TOC content, silica (quartz plus feldspars), clay minerals, and carbonate minerals, are used to divide fine-grained sedimentary rocks into eight categories, and the classification within every category is further refined according to subordinate mineral composition. The nomenclature consists of a root name preceded by a primary adjective. The root names reflect the mineral constituents of the rock, including low organic (TOC <2%), middle organic (2%4%) claystone, siliceous mudstone, limestone, and mixed mudstone. Primary adjectives convey structure and organic content information, including massive or laminated. The lithofacies are closely related to the reservoir storage space, porosity, permeability, hydrocarbon potential, and shale oil/gas sweet spots, and are the key factor for shale oil and gas exploration. The classification helps to describe variability within fine-grained sedimentary rocks systematically and practically; moreover, it helps to guide hydrocarbon exploration.
Funding: Supported by the National Natural Science Foundation of China (41872166).
Abstract: Based on reviews and summaries of the naming schemes of fine-grained sedimentary rocks, and an analysis of their characteristics, the problems existing in the classification and naming of fine-grained sedimentary rocks are discussed. On this basis, following the principle of three-level nomenclature, a new scheme of rock classification and naming for fine-grained sedimentary rocks is determined from two perspectives. First, fine-grained sedimentary rocks are divided into 12 types in two major categories, mudstone and siltstone, according to particle size (sand, silt, and mud). Second, fine-grained sedimentary rocks are divided into 18 types in four categories, carbonate rock, fine-grained felsic sedimentary rock, clay rock, and mixed fine-grained sedimentary rock, according to mineral composition (carbonate minerals, felsic detrital minerals, and clay minerals as three end elements). Considering the importance of organic matter in unconventional oil and gas generation and evaluation, organic matter is taken as the fourth element in the scheme. Taking organic matter contents of 0.5% and 2% as dividing points, fine-grained sedimentary rocks are divided into three categories: organic-poor, organic-bearing, and organic-rich. The new scheme meets the requirements of unconventional oil and gas exploration and development today and resolves the conceptual confusion around fine-grained sedimentary rocks, providing a unified basic terminology for research in fine-grained sedimentology.
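The organic-matter part of this scheme can be written down as a small threshold rule. The sketch below uses the two dividing points stated in the abstract (0.5% and 2%); which category a sample exactly on a dividing point belongs to is not stated there, so the choice to place boundary values in the higher category is an assumption, as is the function name `organic_category`.

```python
def organic_category(organic_matter_pct: float) -> str:
    """Assign an organic-richness category from organic matter content (%).

    Dividing points 0.5% and 2% come from the abstract. Assumption:
    a value exactly on a dividing point falls into the higher category.
    """
    if organic_matter_pct < 0:
        raise ValueError("organic matter content cannot be negative")
    if organic_matter_pct < 0.5:
        return "organic-poor"
    if organic_matter_pct < 2.0:
        return "organic-bearing"
    return "organic-rich"


if __name__ == "__main__":
    for pct in (0.3, 1.2, 3.5):  # illustrative values, not measured data
        print(pct, organic_category(pct))
```

In a full three-level name this category would be combined with the particle-size and mineral-composition classes, which are not reproduced here because the abstract does not list their boundaries.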
Abstract: Urban tree species provide various essential ecosystem services in cities, such as regulating urban temperatures, reducing noise, capturing carbon, and mitigating the urban heat island effect. The quality of these services is influenced by species diversity, tree health, and the distribution and composition of trees. Traditionally, data on urban trees have been collected through field surveys and manual interpretation of remote sensing images. In this study, we evaluated the effectiveness of multispectral airborne laser scanning (ALS) data in classifying 24 common urban roadside tree species in Espoo, Finland. Tree crown structure information, intensity features, and spectral data were used for classification. Eight different machine learning algorithms were tested, with the extra trees (ET) algorithm performing the best, achieving an overall accuracy of 71.7% using multispectral LiDAR data. This result highlights that integrating structural and spectral information within a single framework can improve classification accuracy. Future research will focus on identifying the most important features for species classification and developing algorithms with greater efficiency and accuracy.
Abstract: Purpose: Interdisciplinary research has become a critical approach to addressing complex societal, economic, technological, and environmental challenges, driving innovation and integrating scientific knowledge. While interdisciplinarity indicators are widely used to evaluate research performance, the impact of classification granularity on these assessments remains underexplored. Design/methodology/approach: This study investigates how different levels of classification granularity (macro, meso, and micro) affect the evaluation of interdisciplinarity in research institutes. Using a dataset of 262 institutes from four major German non-university organizations (FHG, HGF, MPG, WGL) from 2018 to 2022, we examine inconsistencies in interdisciplinarity across levels, analyze ranking changes, and explore the influence of institutional fields and research focus (applied vs. basic). Findings: Our findings reveal significant inconsistencies in interdisciplinarity across classification levels, with rankings varying substantially. Notably, the Fraunhofer Society (FHG), which performs well at the macro level, experiences significant ranking declines at the meso and micro levels. Normalizing interdisciplinarity by research field confirmed that these declines persist. The research focus of institutes, whether applied, basic, or mixed, does not significantly explain the observed ranking dynamics. Research limitations: This study has only considered the publication-based dimension of institutional interdisciplinarity and has not explored other aspects. Practical implications: The findings provide insights for policymakers, research managers, and scholars to better interpret interdisciplinarity metrics and support interdisciplinary research effectively. Originality/value: This study underscores the critical role of classification granularity in interdisciplinarity assessment and emphasizes the need for standardized approaches to ensure robust and fair evaluations.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 42072126, 42372139) and the Natural Science Foundation of Sichuan Province (Grant No. 2022NSFSC0990).
Abstract: Fine-grained sediments are widely distributed and constitute the most abundant component in sedimentary systems, so research on their genesis and distribution is of great significance. In recent years, fine-grained sediment gravity flows (FGSGF) have been recognized as an important transportation and depositional mechanism for accumulating thick successions of fine-grained sediments. Through a comprehensive review and synthesis of global research on FGSGF deposition, the characteristics, depositional mechanisms, and distribution patterns of fine-grained sediment gravity-flow deposits (FGSGFD) are discussed, and future research prospects are clarified. In addition to the traditionally recognized low-density turbidity current and muddy debris flow, wave-enhanced gravity flow, low-density muddy hyperpycnal flow, and hypopycnal plumes can all form widely distributed FGSGFD. At the same time, the evolution of FGSGF during transportation can result in transitional and hybrid gravity-flow deposits. The combination of multiple triggering mechanisms promotes the widespread development of FGSGFD, without temporal or spatial limitations. Different types and concentrations of clay minerals, organic matter, and organo-clay complexes are the keys to controlling the flow transformation of FGSGF from low-concentration turbidity currents to high-concentration muddy debris flows. Further study is needed on the interaction mechanisms of FGSGF with different initiations, the evolution of FGSGF under organic-inorganic synergy, and the controlling factors of the distribution patterns of FGSGFD. The study of FGSGFD can shed new light on the formation of widely developed thin-bedded siltstones within shales. At the same time, these insights may broaden the exploration scope of shale oil and gas, which is of important geological significance for unconventional shale oil and gas.
Funding: Supported by King Saud University, Riyadh, Saudi Arabia, through the Researchers Supporting Project under Grant RSPD2025R697.
Abstract: Preserving crops depends on early and accurate detection of pests, which cause several diseases that decrease crop production and quality. Several deep-learning techniques have been applied to the problem of pest detection on crops. We have developed the YOLOCSP-PEST model for pest localization and classification. With a Cross Stage Partial Network (CSPNET) backbone, the proposed model is a modified version of You Only Look Once Version 7 (YOLOv7) intended primarily for pest localization and classification. Our proposed model gives good results under conditions that are challenging for comparable models, especially when the luminance and orientation of the images vary. It helps farmers working on crops in distant areas to determine any infestation quickly and accurately, which benefits the quality and quantity of the production yield. The model has been trained and tested on two datasets, the IP102 dataset and a local crop dataset. It achieved a mean average precision (mAP) of 88.40%, a precision of 85.55%, and a recall of 84.25% on the IP102 dataset, and a mAP of 97.18%, a recall of 94.88%, and a precision of 97.50% on the local dataset. These findings demonstrate that the proposed model is effective at detection in real-life scenarios and can help crop production by improving both yield quality and quantity.
Funding: Supported in part by the Six Talent Peaks Project in Jiangsu Province under Grant 013040315, in part by the China Textile Industry Federation Science and Technology Guidance Project under Grant 2017107, in part by the National Natural Science Foundation of China under Grant 31570714, and in part by the China Scholarship Council under Grant 202108320290.
Abstract: The cleanliness of seed cotton plays a critical role in the pre-treatment of cotton textiles, and the removal of impurities during the harvesting process directly determines the quality and market value of cotton textiles. By fusing band combination optimization with deep learning, this study aims to achieve more efficient and accurate detection of film impurities in seed cotton on the production line. By applying hyperspectral imaging and a one-dimensional deep learning algorithm, we detect and classify impurities in seed cotton after harvest. The main categories detected include pure cotton, conveyor belt, film covering seed cotton, and film adhered to the conveyor belt. The proposed method achieves an impurity detection rate of 99.698%. To further establish the feasibility and practical application potential of this strategy, we compare our results against existing mainstream methods. In addition, the model shows excellent recognition performance on pseudo-color images of real samples. With a processing time of 11.764 μs per pixel on experimental data, it comes much closer to the speed requirements of real production lines while maintaining accuracy. This strategy provides an accurate and efficient method for removing impurities during cotton processing.
Funding: Supported by the Research Grant of Kwangwoon University in 2024.
Abstract: Myocardial perfusion imaging (MPI), which uses single-photon emission computed tomography (SPECT), is a well-known tool for medical diagnosis that relies on image classification to reveal the state of coronary artery disease (CAD). Automatic classification of SPECT images with various techniques has achieved near-optimal accuracy when using convolutional neural networks (CNNs). This paper uses a SPECT classification framework with three steps: 1) image denoising, 2) attenuation correction, and 3) image classification. Image denoising is performed by a U-Net architecture. Attenuation correction is implemented by a convolutional neural network model that removes the attenuation that affects the feature extraction process of classification. Finally, a novel multi-scale diluted convolution (MSDC) network is proposed. It merges the features extracted at different scales and lets the model learn the features more efficiently. Three scales of filters with size 3×3 are used to extract features. All three steps are compared with state-of-the-art methods. The proposed denoising architecture yields high-quality images with the highest peak signal-to-noise ratio (PSNR) value of 39.7. The proposed classification method is compared with five different CNN models and provides better classification, with an accuracy of 96%, precision of 87%, sensitivity of 87%, specificity of 89%, and F1-score of 87%. To demonstrate the importance of preprocessing, the classification model was also analyzed without denoising and attenuation correction.
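For readers less familiar with the five reported classification metrics, the sketch below computes them from the counts of a confusion matrix using their standard definitions. It assumes a binary (disease vs. no disease) setting, which the abstract does not state explicitly; the function name `binary_metrics` and the counts in the usage example are illustrative and are not taken from the paper.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics from binary confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives. Ratios with an
    empty denominator are reported as 0.0 (a convention chosen here).
    """
    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)   # also called recall
    specificity = ratio(tn, tn + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        # F1 is the harmonic mean of precision and sensitivity.
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Made-up counts purely to exercise the function.
    for name, value in binary_metrics(tp=8, fp=2, tn=9, fn=1).items():
        print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters in diagnostic settings because accuracy alone can look high when one class dominates the data.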
Funding: Supported by the Science and Technology Project of Henan Province (No. 222102210081).
Abstract: Joint Multimodal Aspect-based Sentiment Analysis (JMASA) is a significant task in multimodal fine-grained sentiment analysis research that combines two subtasks: Multimodal Aspect Term Extraction (MATE) and Multimodal Aspect-oriented Sentiment Classification (MASC). Most existing models for JMASA encode text and image features only at a basic level and often neglect in-depth analysis of unimodal intrinsic features; this insufficient learning of intra-modal features may lead to low accuracy in aspect term extraction and poor sentiment prediction. To address this problem, we propose a Text-Image Feature Fine-grained Learning (TIFFL) model for JMASA. First, we construct an enhanced adjacency matrix of word dependencies and adopt a graph convolutional network to learn syntactic structure features for text, which addresses the problem of context interference when identifying different aspect terms. Then, adjective-noun pairs extracted from the image are introduced to make the semantic representation of visual features more intuitive, which addresses the problem of ambiguous semantic extraction during image feature learning. In this way, the model's performance on aspect term extraction and sentiment polarity prediction can be further improved. Experiments on two Twitter benchmark datasets demonstrate that TIFFL achieves competitive results for JMASA, MATE, and MASC, validating the effectiveness of the proposed methods.
Abstract: Diagnosing cardiac diseases relies heavily on electrocardiogram (ECG) analysis, but detecting myocardial infarction-related arrhythmias remains challenging due to irregular heartbeats and signal variations. Despite advancements in machine learning, achieving both high accuracy and low computational cost for arrhythmia classification remains a critical issue. Computer-aided diagnosis systems can play a key role in early detection, reducing mortality rates associated with cardiac disorders. This study proposes a fully automated approach for ECG arrhythmia classification using deep learning and machine learning techniques to improve diagnostic accuracy while minimizing processing time. The methodology consists of three stages: 1) preprocessing, where ECG signals undergo noise reduction and feature extraction; 2) feature identification, where deep convolutional neural network (CNN) blocks, combined with data augmentation and transfer learning, extract key parameters; 3) classification, where a hybrid CNN-SVM model is employed for arrhythmia recognition. CNN-extracted features were fed into a binary support vector machine (SVM) classifier, and model performance was assessed using five-fold cross-validation. Experimental findings demonstrated that the CNN2 model achieved 85.52% accuracy, while the hybrid CNN2-SVM approach significantly improved accuracy to 97.33%, outperforming conventional methods. This model enhances classification efficiency while reducing computational complexity. The proposed approach bridges the gap between accuracy and processing speed in ECG arrhythmia classification, offering a promising solution for real-time clinical applications. Its superior performance compared to nonlinear classifiers highlights its potential for improving automated cardiac diagnosis.
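The five-fold cross-validation used to assess the model can be illustrated with a short, library-free sketch that partitions sample indices into five folds and yields one train/test split per fold. The shuffling seed, the function name `k_fold_indices`, and the sample count in the usage example are assumptions made for illustration; the abstract does not describe how the folds were formed (for example, whether they were stratified by class or by patient).

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 5, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k folds.

    Indices are shuffled once with a fixed seed, then dealt into k
    folds; every sample appears in exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Training set is the union of all other folds.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for fold_no, (train, test) in enumerate(k_fold_indices(10), start=1):
        # A classifier would be fitted on `train` and scored on `test` here.
        print(fold_no, sorted(test))
```

The reported accuracy of such a procedure is typically the mean of the per-fold scores, which reduces the dependence of the estimate on any single train/test partition.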
Abstract: In the era of precision medicine, the classification of diabetes mellitus has evolved beyond the traditional categories. Various classification methods now account for a multitude of factors, including variations in specific genes, type of β-cell impairment, degree of insulin resistance, and clinical characteristics of metabolic profiles. Improved classification methods enable healthcare providers to formulate blood glucose management strategies more precisely. Applying these updated classification systems will assist clinicians in further optimising treatment plans, including targeted drug therapies, personalized dietary advice, and specific exercise plans. Ultimately, this will facilitate stricter blood glucose control, minimize the risks of hypoglycaemia and hyperglycaemia, and reduce long-term complications associated with diabetes.