Machine learning (ML) is increasingly applied to medical image processing with appropriate learning paradigms. These applications include analyzing images of various organs, such as the brain, lung, and eye, to identify specific flaws or diseases for diagnosis. The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification. Most of the extracted image features are irrelevant and lead to an increase in computation time. Therefore, this article uses an analytical learning paradigm to design a Congruent Feature Selection Method to select the most relevant image features. This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions. The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis. Later, the correlation based on intensity and distribution is analyzed to improve the feature selection congruency. Therefore, the more congruent pixels are sorted in the descending order of the selection, which identifies better regions than the distribution. Now, the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection. Therefore, the probability of feature selection, regardless of the textures and medical image patterns, is improved. This process enhances the performance of ML applications for different medical image processing tasks. The proposed method improves the accuracy, precision, and training rate by 13.19%, 10.69%, and 11.06%, respectively, compared to other models for the selected dataset. The mean error and selection time are also reduced by 12.56% and 13.56%, respectively, compared to the same models and dataset.
Accurate cloud classification plays a crucial role in aviation safety, climate monitoring, and localized weather forecasting. Current research has focused on machine learning techniques, particularly deep learning-based models, for cloud type identification. However, traditional approaches such as convolutional neural networks (CNNs) encounter difficulties in capturing global contextual information. In addition, they are computationally expensive, which restricts their usability in resource-limited environments. To tackle these issues, we present the Cloud Vision Transformer (CloudViT), a lightweight model that integrates CNNs with Transformers. The integration enables an effective balance between local and global feature extraction. Specifically, CloudViT comprises two innovative modules: Feature Extraction (E_Module) and Downsampling (D_Module). These modules significantly reduce the number of model parameters and the computational complexity while maintaining translation invariance and enhancing contextual comprehension. Overall, CloudViT has 0.93×10^6 parameters, more than ten times fewer than the state-of-the-art (SOTA) model CloudNet. Comprehensive evaluations conducted on the HBMCD and SWIMCAT datasets showcase the outstanding performance of CloudViT. It achieves classification accuracies of 98.45% and 100%, respectively. Moreover, the efficiency and scalability of CloudViT make it an ideal candidate for deployment in mobile cloud observation systems, enabling real-time cloud image classification. The proposed hybrid architecture of CloudViT offers a promising approach for advancing ground-based cloud image classification. It holds significant potential for both optimizing performance and facilitating practical deployment scenarios.
Optical and hybrid convolutional neural networks (CNNs) have recently become of increasing interest for achieving low-latency, low-power image classification and computer-vision tasks. However, implementing optical nonlinearity is challenging, and omitting the nonlinear layers in a standard CNN comes with a significant reduction in accuracy. We use knowledge distillation to compress a modified AlexNet to a single linear convolutional layer and an electronic backend (two fully connected layers). We obtain performance comparable to a purely electronic CNN with five convolutional layers and three fully connected layers. We implement the convolution optically by engineering the point spread function of an inverse-designed meta-optic. Using this hybrid approach, we estimate a reduction in multiply-accumulate operations from 17M in a conventional electronic modified AlexNet to only 86K in the hybrid compressed network enabled by the optical front end. This constitutes a reduction of over 2 orders of magnitude in latency and power consumption. Furthermore, we experimentally demonstrate that the classification accuracy of the system exceeds 93% on the MNIST dataset of handwritten digits.
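The abstract above does not state which distillation objective was used. As a rough illustration only, the sketch below assumes the widely used temperature-scaled formulation: a weighted sum of cross-entropy on the true label and the KL divergence between softened teacher and student outputs. The function names and the `temperature` and `alpha` values are hypothetical choices for this sketch, not taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Assumed generic distillation loss (not the paper's confirmed objective).

    hard term: cross-entropy of the student against the true label
    soft term: KL(teacher || student) on temperature-softened outputs,
               scaled by temperature**2 as is conventional
    """
    p_student = softmax(student_logits)
    hard = -math.log(p_student[label])
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(q_teacher, q_student))
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft

# A student that matches the teacher incurs no soft-target penalty.
matched = distillation_loss([2.0, 1.0, 0.0], [2.0, 1.0, 0.0], label=0)
mismatched = distillation_loss([0.0, 1.0, 2.0], [2.0, 1.0, 0.0], label=0)
```

In this formulation the soft term is what lets a small student, such as a single linear convolutional layer with a shallow backend, learn from the teacher's full output distribution instead of the labels alone.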
Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images. Obtaining class-specific precise representations at different scales is a key aspect of feature representation. However, existing methods often rely on the single-scale deep feature, neglecting shallow and deeper layer features, which poses challenges when predicting objects of varying scales within the same image. Although some studies have explored multi-scale features, they rarely address the flow of information between scales or efficiently obtain class-specific precise representations for features at different scales. To address these issues, we propose a two-stage, three-branch Transformer-based framework. The first stage incorporates multi-scale image feature extraction and hierarchical scale attention. This design enables the model to consider objects at various scales while enhancing the flow of information across different feature scales, improving the model’s generalization to diverse object scales. The second stage includes a global feature enhancement module and a region selection module. The global feature enhancement module strengthens interconnections between different image regions, mitigating the issue of incomplete representations, while the region selection module models the cross-modal relationships between image features and labels. Together, these components enable the efficient acquisition of class-specific precise feature representations. Extensive experiments on public datasets, including COCO2014, VOC2007, and VOC2012, demonstrate the effectiveness of our proposed method. Our approach achieves consistent performance gains of 0.3%, 0.4%, and 0.2% over state-of-the-art methods on the three datasets, respectively. These results validate the reliability and superiority of our approach for multi-label image classification.
The multi-modal characteristics of mineral particles play a pivotal role in enhancing classification accuracy, which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation and utilization of its resources. However, the existing methods for classifying mineral particles do not fully utilize these multi-modal features, thereby limiting the classification accuracy. Furthermore, when conventional multi-modal image classification methods are applied to plane-polarized and cross-polarized sequence images of mineral particles, they encounter issues such as information loss, misaligned features, and challenges in spatiotemporal feature extraction. To address these challenges, we propose a multi-modal mineral particle polarization image classification network (MMGC-Net) for precise mineral particle classification. Initially, MMGC-Net employs a two-dimensional (2D) backbone network with shared parameters to extract features from the two types of polarized images to ensure feature alignment. Subsequently, a cross-polarized intra-modal feature fusion module is designed to refine the spatiotemporal features from the extracted features of the cross-polarized sequence images. Ultimately, the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision. Quantitative and qualitative experimental results indicate that, compared with the current state-of-the-art multi-modal image classification methods, MMGC-Net demonstrates marked superiority in terms of mineral particle multi-modal feature learning and four classification evaluation metrics. It also demonstrates better stability than the existing models.
We propose a hierarchical multi-scale attention mechanism-based model in response to the low accuracy of existing oceanic biological image classification methods and the inefficiency of manual classification. Firstly, the hierarchical efficient multi-scale attention (H-EMA) module is designed for lightweight feature extraction, achieving outstanding performance at a relatively low cost. Secondly, an improved EfficientNetV2 block is used to better integrate information from different scales and enhance inter-layer message passing. Furthermore, introducing the convolutional block attention module (CBAM) enhances the model's perception of critical features, optimizing its generalization ability. Lastly, Focal Loss is introduced to adjust the weights of hard samples to address the issue of imbalanced categories in the dataset, further improving the model's performance. The model achieved 96.11% accuracy on the intertidal marine organism dataset of Nanji Islands and 84.78% accuracy on the CIFAR-100 dataset, demonstrating its strong generalization ability to meet the demands of oceanic biological image classification.
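The Focal Loss step mentioned above can be illustrated with a minimal sketch. It assumes the standard form FL = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The `gamma` and `alpha` values are illustrative defaults, not the settings used in the abstract's experiments.

```python
import math

def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for a single sample.

    probs  -- predicted class probabilities (assumed to sum to 1)
    target -- index of the true class
    gamma  -- focusing parameter; gamma = 0 reduces to cross-entropy
    alpha  -- optional class weight (illustrative default)
    """
    p_t = max(probs[target], 1e-12)  # guard against log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently correct ("easy") sample is down-weighted far more
# than a poorly classified ("hard") one.
easy = focal_loss([0.9, 0.05, 0.05], target=0)
hard = focal_loss([0.2, 0.5, 0.3], target=0)
```

The factor (1 - p_t)^gamma shrinks the contribution of well-classified samples, so training is dominated by hard samples, which in an imbalanced dataset often come from the rare classes.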
The emergence of adversarial examples has revealed the inadequacies in the robustness of image classification models based on Convolutional Neural Networks (CNNs). Particularly in recent years, the discovery of natural adversarial examples has posed significant challenges, as traditional defense methods against adversarial attacks have proven to be largely ineffective against these natural adversarial examples. This paper explores defenses against these natural adversarial examples from three perspectives: adversarial examples, model architecture, and dataset. First, it employs Class Activation Mapping (CAM) to visualize how models classify natural adversarial examples, identifying several typical attack patterns. Next, various common CNN models are analyzed to evaluate their susceptibility to these attacks, revealing that different architectures exhibit varying defensive capabilities. The study finds that as the depth of a network increases, its defenses against natural adversarial examples strengthen. Finally, the impact of dataset class distribution on the defense capability of models is examined, focusing on two aspects: the number of classes in the training set and the number of predicted classes. This study investigates how these factors influence the model’s ability to defend against natural adversarial examples. Results indicate that reducing the number of training classes enhances the model’s defense against natural adversarial examples. Additionally, under a fixed number of training classes, some CNN models show an optimal range of predicted classes for achieving the best defense performance against these adversarial examples.
Real-world data always exhibit an imbalanced and long-tailed distribution, which leads to poor performance for neural network-based classification. Existing methods mainly tackle this problem by reweighting the loss function or rebalancing the classifier. However, one crucial aspect overlooked by previous research studies is the imbalanced feature space problem caused by the imbalanced angle distribution. In this paper, the authors shed light on the significance of the angle distribution in achieving a balanced feature space, which is essential for improving model performance under long-tailed distributions. Nevertheless, it is challenging to effectively balance both the classifier norms and angle distribution due to problems such as the low feature norm. To tackle these challenges, the authors first thoroughly analyse the classifier and feature space by decoupling the classification logits into three key components: classifier norm (i.e. the magnitude of the classifier vector), feature norm (i.e. the magnitude of the feature vector), and cosine similarity between the classifier vector and feature vector. In this way, the authors analyse the change of each component in the training process and reveal three critical problems that should be solved, that is, the imbalanced angle distribution, the lack of feature discrimination, and the low feature norm. Drawing from this analysis, the authors propose a novel loss function that incorporates hyperspherical uniformity, additive angular margin, and feature norm regularisation. Each component of the loss function addresses a specific problem and synergistically contributes to achieving a balanced classifier and feature space. The authors conduct extensive experiments on three popular benchmark datasets including CIFAR-10/100-LT, ImageNet-LT, and iNaturalist 2018. The experimental results demonstrate that the authors’ loss function outperforms several previous state-of-the-art methods in addressing the challenges posed by imbalanced and long-tailed datasets, that is, by improving upon the best-performing baselines on CIFAR-100-LT by 1.34, 1.41, 1.41 and 1.33, respectively.
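The three-way decoupling of a classification logit described above follows from the identity w·f = ‖w‖ ‖f‖ cos θ. The sketch below works through it for one class, assuming a bias-free linear classifier. The vectors and function names are made up for illustration.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def decompose_logit(w, f):
    """Split the logit w . f into (classifier norm, feature norm, cosine).

    Assumes a linear classifier without bias, so logit = |w| * |f| * cos(theta).
    """
    w_norm, f_norm = norm(w), norm(f)
    dot = sum(a * b for a, b in zip(w, f))
    cos = dot / (w_norm * f_norm)
    return w_norm, f_norm, cos

w = [3.0, 4.0]   # hypothetical classifier vector for one class
f = [1.0, 0.0]   # hypothetical feature vector for one sample
w_norm, f_norm, cos = decompose_logit(w, f)
logit = sum(a * b for a, b in zip(w, f))
```

Tracking the three factors separately during training makes it possible to tell whether a low logit for a tail class comes from a small classifier norm, a small feature norm, or a poor angle.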
In a context where urban satellite image processing technologies are undergoing rapid evolution, this article presents an innovative and rigorous approach to satellite image classification applied to urban planning. This research proposes an integrated methodological framework, based on the principles of model-driven engineering (MDE), to transform a generic meta-model into a meta-model specifically dedicated to urban satellite image classification. We implemented this transformation using the Atlas Transformation Language (ATL), guaranteeing a smooth and consistent transition from platform-independent model (PIM) to platform-specific model (PSM), according to the principles of model-driven architecture (MDA). The application of this MDE methodology enables advanced structuring of satellite data for targeted urban planning analyses, making it possible to classify various urban zones such as built-up, cultivated, arid and water areas. The novelty of this approach lies in the automation and standardization of the classification process, which significantly reduces the need for manual intervention and thus improves the reliability, reproducibility and efficiency of urban data analysis. By adopting this method, decision-makers and urban planners are provided with a powerful tool for systematically and consistently analyzing and interpreting satellite images, facilitating decision-making in critical areas such as urban space management, infrastructure planning and environmental preservation.
Medical image classification is crucial in disease diagnosis, treatment planning, and clinical decision-making. We introduce a novel medical image classification approach that integrates Bayesian Random Semantic Data Augmentation (BSDA) with a Vision Mamba-based model for medical image classification (MedMamba), enhanced by residual connection blocks; we name the model BSDA-Mamba. BSDA augments medical image data semantically, enhancing the model’s generalization ability and classification performance. MedMamba, a deep learning-based state space model, excels in capturing long-range dependencies in medical images. By incorporating residual connections, BSDA-Mamba further improves feature extraction capabilities. Through comprehensive experiments on eight medical image datasets, we demonstrate that BSDA-Mamba outperforms existing models in accuracy, area under the curve, and F1-score. Our results highlight BSDA-Mamba’s potential as a reliable tool for medical image analysis, particularly in handling diverse imaging modalities from X-rays to MRI. The open-sourcing of our model’s code and datasets will facilitate the reproduction and extension of our work.
Few-shot image classification is the task of classifying novel classes using extremely limited labelled samples. To perform classification using the limited samples, one solution is to learn the feature alignment (FA) information between the labelled and unlabelled sample features. Most FA methods use the feature mean as the class prototype and calculate the correlation between prototype and unlabelled features to learn an alignment strategy. However, mean prototypes tend to degrade informative features because spatial features at the same position may not be equally important for the final classification, leading to inaccurate correlation calculations. Therefore, the authors propose an effective intraclass FA strategy that aggregates semantically similar spatial features from an adaptive reference prototype in low-dimensional feature space to obtain an informative prototype feature map for precise correlation computation. Moreover, the authors develop a dual correlation module to learn the hard and soft correlations. This module combines the correlation information between the prototype and unlabelled features in both the original and learnable feature spaces, aiming to produce a comprehensive cross-correlation between the prototypes and unlabelled features. Using both FA and cross-attention modules, the model can maintain informative class features and capture important shared features for classification. Experimental results on three few-shot classification benchmarks show that the proposed method outperformed related methods and resulted in a 3% performance boost in the 1-shot setting when the proposed module was inserted into the related methods.
Gliomas have the highest mortality rate of all brain tumors. Correctly classifying the glioma risk period can help doctors make reasonable treatment plans and improve patients’ survival rates. This paper proposes a hierarchical multi-scale attention feature fusion medical image classification network (HMAC-Net), which effectively combines global features and local features. The network framework consists of three parallel layers: the global feature extraction layer, the local feature extraction layer, and the multi-scale feature fusion layer. A linear sparse attention mechanism is designed in the global feature extraction layer to reduce information redundancy. In the local feature extraction layer, a bilateral local attention mechanism is introduced to improve the extraction of relevant information between adjacent slices. In the multi-scale feature fusion layer, a channel fusion block combining a convolutional attention mechanism and a residual inverse multi-layer perceptron is proposed to prevent gradient disappearance and network degradation and improve feature representation capability. The double-branch iterative multi-scale classification block is used to improve the classification performance. On the brain glioma risk grading dataset, the results of the ablation experiment and comparison experiment show that the proposed HMAC-Net has the best performance in both qualitative analysis of heat maps and quantitative analysis of evaluation indicators. On the dataset of skin cancer classification, the generalization experiment results show that the proposed HMAC-Net has a good generalization effect.
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in the DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
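The two effects described above can be checked with simple arithmetic: depthwise separable convolution needs fewer parameters than standard convolution, and dilation widens the field of view without adding weights. The sketch below uses the usual textbook formulas with biases ignored. The layer sizes are hypothetical and do not reproduce the DDSC layer's actual configuration.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise

def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)

# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)
sep = depthwise_separable_params(3, 64, 128)
span = effective_kernel_size(3, dilation=2)
```

For this hypothetical layer the separable form uses 8,768 weights instead of 73,728, and a dilation of 2 lets the same 3 × 3 filter cover a 5 × 5 window.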
Research has shown that chest radiography images of patients with different diseases, such as pneumonia, COVID-19, SARS, and pneumothorax, all exhibit some form of abnormality. Several deep learning techniques can be used to identify each of these anomalies in chest x-ray images. Convolutional neural networks (CNNs) have shown great success in the fields of image recognition and image classification since there are numerous large-scale annotated image datasets available. The classification of medical images, particularly radiographic images, remains one of the biggest hurdles in medical diagnosis because of the restricted availability of annotated medical images. However, this difficulty can be addressed by utilizing several deep learning strategies, including data augmentation and transfer learning. The aim was to build a model that would detect abnormalities in chest x-ray images with the highest probability. To do that, different models were built with different features. When building a CNN model, one of the main tasks is to tune the model by changing the hyperparameters and layers so that the model gives good training and testing results. In our case, three different models were built, and the last one gave the best predictions. From that last model, we obtained 98% training accuracy, 84% validation accuracy, and 81% testing accuracy. The final model gave the best evaluation scores because it was well fitted: there were no overfitting or underfitting issues. Our aim with this project was to make a tool using the CNN model in the R language, which will help detect abnormalities in radiography images. The tool will be able to detect diseases such as pneumonia, COVID-19, effusions, infiltration, pneumothorax, and others. Because of their high accuracy, this research chose to use supervised multi-class classification techniques together with CNNs to classify different chest x-ray images. CNNs are extremely efficient and successful at reducing the number of parameters while maintaining the quality of the primary model. CNNs are also trained to recognize the edges of various objects in any batch of images. CNNs automatically discover the relevant aspects in labeled data and learn the distinguishing features for each class by themselves.
The use of Explainable Artificial Intelligence (XAI) models is becoming increasingly important for making decisions in smart healthcare environments. The goal is to make sure that decisions are based on trustworthy algorithms and that healthcare workers understand the decisions made by these algorithms. These models can potentially enhance interpretability and explainability in decision-making processes that rely on artificial intelligence. Nevertheless, the intricate nature of the healthcare field necessitates the utilization of sophisticated models to classify cancer images. This research presents an advanced investigation of XAI models to classify cancer images. It describes the different levels of explainability and interpretability associated with XAI models and the challenges faced in deploying them in healthcare applications. In addition, this study proposes a novel framework for cancer image classification that incorporates XAI models with deep learning and advanced medical imaging techniques. The proposed model integrates several techniques, including end-to-end explainable evaluation, rule-based explanation, and user-adaptive explanation. The proposed XAI reaches 97.72% accuracy, 90.72% precision, 93.72% recall, 96.72% F1-score, 9.55% FDR, 9.66% FOR, and 91.18% DOR. The study also discusses the potential applications of the proposed XAI models in the smart healthcare environment. These models will help ensure trust and accountability in AI-based decisions, which is essential for achieving a safe and reliable smart healthcare environment.
The research aims to improve the performance of image recognition methods based on a description in the form of a set of keypoint descriptors. The main focus is on increasing the speed of establishing the relevance of object and etalon descriptions while maintaining the required level of classification efficiency. The class to be recognized is represented by an infinite set of images obtained from the etalon by applying arbitrary geometric transformations. It is proposed to reduce the descriptions for the etalon database by selecting the most significant descriptor components according to the information content criterion. The informativeness of an etalon descriptor is estimated by the difference of the closest distances to its own and other descriptions. The developed method determines the relevance of the full description of the recognized object with the reduced description of the etalons. Several practical models of the classifier with different options for establishing the correspondence between object descriptors and etalons are considered. The results of the experimental modeling of the proposed methods for a database including images of museum jewelry are presented. The test sample is formed as a set of images from the etalon database and out of the database with the application of geometric transformations of scale and rotation in the field of view. The practical problems of determining the threshold for the number of votes, based on which a classification decision is made, have been researched. Modeling has revealed the practical possibility of tenfold reducing descriptions with full preservation of classification accuracy. Reducing the descriptions by twenty times in the experiment leads to slightly decreased accuracy. The speed of the analysis increases in proportion to the degree of reduction. The use of reduction by the informativeness criterion confirmed the possibility of obtaining the most significant subset of features for classification, which guarantees a decent level of accuracy.
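The informativeness criterion above is stated only in words, so the sketch below is one possible reading and not the authors' implementation. It assumes Euclidean distance, and it scores each descriptor as the nearest distance to any descriptor of the other etalons minus the nearest distance to another descriptor of its own etalon. The function names and the toy two-dimensional descriptors are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two descriptors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def informativeness(idx, own, others):
    """Score descriptor own[idx]; larger means more class-specific.

    own    -- descriptors of this etalon (needs at least two)
    others -- list of descriptor lists, one per other etalon
    """
    d = own[idx]
    d_own = min(dist(d, o) for j, o in enumerate(own) if j != idx)
    d_other = min(dist(d, o) for desc in others for o in desc)
    return d_other - d_own

def reduce_description(own, others, keep):
    """Keep the `keep` most informative descriptors of an etalon."""
    ranked = sorted(range(len(own)),
                    key=lambda i: informativeness(i, own, others),
                    reverse=True)
    return [own[i] for i in ranked[:keep]]

own = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
others = [[[5.0, 6.0], [9.0, 9.0]]]
reduced = reduce_description(own, others, keep=2)
```

In this toy case the descriptor at (5, 5) lies next to a descriptor of another etalon, so it scores lowest and is the one dropped. Keeping a tenth of the descriptors, as in the experiment above, would correspond to `keep = len(own) // 10`.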
Total shoulder arthroplasty is a standard restorative procedure practiced by orthopedists to treat shoulder arthritis, in which a prosthesis replaces the whole joint or a part of the joint. It is often challenging for doctors to identify the exact model and manufacturer of the prosthesis when it is unknown. This paper proposes a transfer learning-based class imbalance-aware prosthesis detection method to detect the implant’s manufacturer automatically from shoulder X-ray images. The framework of the method proposes a novel training approach and a new set of batch-normalization, dropout, and fully convolutional layers in the head network. It employs cyclical learning rates and a weighting-based loss calculation mechanism. These modifications aid in faster convergence, avoid local-minima stagnation, and remove the training bias caused by the imbalanced dataset. The proposed method is evaluated using seven well-known pre-trained models of the VGGNet, ResNet, and DenseNet families. Experimentation is performed on a shoulder implant benchmark dataset consisting of 597 shoulder X-ray images. The proposed method improves the classification performance of all pre-trained models by 10–12%. The DenseNet-201-based variant has achieved the highest classification accuracy of 89.5%, which is 10% higher than existing methods. Further, to validate and generalize the proposed method, the existing baseline dataset is extended to six classes, including samples of two more implant manufacturers. Experimental results have shown an average accuracy of 86.7% for the extended dataset and show the preeminence of the proposed method.
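The abstract above names cyclical learning rates and a weighting-based loss without giving details. The sketch below assumes the common triangular schedule and inverse-frequency class weights. These are illustrative choices and may differ from the paper's, and the learning-rate bounds and class counts are hypothetical.

```python
import math

def triangular_lr(iteration, base_lr, max_lr, step_size):
    """Triangular cyclical learning rate.

    Rises linearly from base_lr to max_lr over step_size iterations,
    then falls back over the next step_size iterations, and repeats.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

def inverse_frequency_weights(class_counts):
    """Per-class loss weights that grow as a class gets rarer.

    The weights equal 1.0 for every class when the classes are balanced.
    """
    total = sum(class_counts)
    n = len(class_counts)
    return [total / (n * c) for c in class_counts]

# Hypothetical settings, not the values used in the paper.
lr_start = triangular_lr(0, base_lr=0.001, max_lr=0.01, step_size=100)
lr_peak = triangular_lr(100, base_lr=0.001, max_lr=0.01, step_size=100)
lr_end = triangular_lr(200, base_lr=0.001, max_lr=0.01, step_size=100)
weights = inverse_frequency_weights([100, 50])
```

The periodic rise in learning rate is one way to move the optimizer off plateaus, and multiplying each sample's loss by its class weight keeps the majority manufacturer from dominating the gradient.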
We redesign the parameterized quantum circuit in the quantum deep neural network, construct a three-layer structure as the hidden layer, and use classical optimization algorithms to train the parameterized quantum circuit, thereby proposing a novel hybrid quantum deep neural network (HQDNN) for image classification. After bilinear interpolation reduces the original image to a suitable size, an improved novel enhanced quantum representation (INEQR) encodes it into quantum states as the input of the HQDNN. Multi-layer parameterized quantum circuits serve as the main structure for feature extraction and classification. The outputs of the parameterized quantum circuits are converted into classical data through quantum measurements and then optimized on a classical computer. To verify the performance of the HQDNN, we conduct binary and three-class classification experiments on the MNIST (Modified National Institute of Standards and Technology) data set. In the first binary classification, the accuracy for digits 0 and 4 exceeds 98%. We then compare the three-class performance with other algorithms; the results on two datasets show that the classification accuracy is higher than that of the quantum deep neural network and the general quantum convolutional neural network.
Leukemia is a form of cancer of the blood or bone marrow. A person with leukemia has an expansion of white blood cells (WBCs). It primarily affects children and rarely affects adults. Treatment depends on the type of leukemia and the extent to which the cancer has spread throughout the body. Identifying leukemia at an early stage is vital to providing timely patient care. Medical image-analysis approaches offer safer, quicker, and less costly solutions while avoiding the difficulties of invasive procedures. Computer vision (CV)-based and image-processing techniques can be simple to generalize and can eliminate human error. Many researchers have implemented computer-aided diagnostic methods and machine learning (ML) for laboratory image analysis, aiming to overcome the limitations of late leukemia detection and to determine its subgroups. This study establishes a Marine Predators Algorithm with Deep Learning Leukemia Cancer Classification (MPADL-LCC) algorithm on medical images. The proposed MPADL-LCC system uses a bilateral filtering (BF) technique to pre-process medical images. It uses Faster SqueezeNet for feature extraction, with the Marine Predators Algorithm (MPA) as a hyperparameter optimizer. Lastly, a denoising autoencoder (DAE) is applied to detect and classify leukemia cancer. The hyperparameter tuning process using MPA helps enhance leukemia classification performance. Simulation results are compared with other recent approaches on various measures, and the MPADL-LCC algorithm exhibits the best results.
The use of visual attention enhances the performance of image classification tasks. Previous attention-based models have demonstrated notable performance, but many of them lose accuracy when confronted with inter-class and intra-class similarities and differences. Neural Controlled Differential Equations (N-CDEs) and Neural Ordinary Differential Equations (NODEs) are extensively used in this context. N-CDEs can represent both inter-class and intra-class similarities and differences with enhanced clarity. To this end, an attentive neural network is proposed to generate attention maps; it uses two different types of N-CDEs, one for evolving hidden layers and the other for generating attention values. Two distinct attention techniques are implemented: time-wise attention, also referred to as bottom N-CDEs, and element-wise attention, called top N-CDEs. Additionally, a training methodology is proposed to guarantee that the training problem is well-posed. Two classification tasks, fine-grained visual classification and multi-label classification, are used to evaluate the proposed model. The proposed methodology is applied to five publicly available datasets: CUB-200-2011, ImageNet-1K, PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO. The resulting visualizations demonstrate that N-CDEs are better suited to attention-based tasks than conventional NODEs.
Funding: the Deanship of Scientific Research at King Khalid University, for funding this work through a Large Group Research Project under grant number RGP2/421/45; Prince Sattam bin Abdulaziz University, project number PSAU/2024/R/1446; the Researchers Supporting Project, number UM-DSR-IG-2023-07, Almaarefa University, Riyadh, Saudi Arabia; the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2021R1F1A1055408).
Abstract: Machine learning (ML) is increasingly applied to medical image processing with appropriate learning paradigms. These applications include analyzing images of various organs, such as the brain, lung, and eye, to identify specific flaws or diseases for diagnosis. The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification. Most of the extracted image features are irrelevant and increase computation time. Therefore, this article uses an analytical learning paradigm to design a Congruent Feature Selection Method that selects the most relevant image features. This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions. The similarity between pixels over the various distribution patterns with high indexes is recommended for disease diagnosis. Later, the correlation based on intensity and distribution is analyzed to improve the feature selection congruency. The more congruent pixels are then sorted in descending order of selection, which identifies better regions than the distribution. Next, the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection. As a result, the probability of feature selection is improved regardless of the textures and medical image patterns. This process enhances the performance of ML applications for different kinds of medical image processing. The proposed method improves the accuracy, precision, and training rate by 13.19%, 10.69%, and 11.06%, respectively, compared to other models for the selected dataset. The mean error and selection time are also reduced by 12.56% and 13.56%, respectively, compared to the same models and dataset.
Funding: the Innovation and Development Special Project of China Meteorological Administration (CXFZ2022J038, CXFZ2024J035); the Sichuan Science and Technology Program (No. 2023YFQ0072); the Key Laboratory of Smart Earth (No. KF2023YB03-07); the Automatic Software Generation and Intelligent Service Key Laboratory of Sichuan Province (CUIT-SAG202210).
Abstract: Accurate cloud classification plays a crucial role in aviation safety, climate monitoring, and localized weather forecasting. Current research has focused on machine learning techniques, particularly deep learning-based models, for cloud-type identification. However, traditional approaches such as convolutional neural networks (CNNs) have difficulty capturing global contextual information. In addition, they are computationally expensive, which restricts their usability in resource-limited environments. To tackle these issues, we present the Cloud Vision Transformer (CloudViT), a lightweight model that integrates CNNs with Transformers. The integration enables an effective balance between local and global feature extraction. Specifically, CloudViT comprises two innovative modules: Feature Extraction (E_Module) and Downsampling (D_Module). These modules significantly reduce the number of model parameters and the computational complexity while maintaining translation invariance and enhancing contextual comprehension. Overall, CloudViT has 0.93×10^6 parameters, more than ten times fewer than the SOTA (state-of-the-art) model CloudNet. Comprehensive evaluations on the HBMCD and SWIMCAT datasets showcase the outstanding performance of CloudViT: it achieves classification accuracies of 98.45% and 100%, respectively. Moreover, the efficiency and scalability of CloudViT make it an ideal candidate for deployment in mobile cloud observation systems, enabling real-time cloud image classification. The proposed hybrid architecture offers a promising approach for advancing ground-based cloud image classification, with significant potential for both optimizing performance and facilitating practical deployment.
Funding: supported by the National Science Foundation (Grant Nos. NSF-ECCS-2127235 and EFRI-BRAID-2223495). Part of this work was conducted at the Washington Nanofabrication Facility/Molecular Analysis Facility, a National Nanotechnology Coordinated Infrastructure (NNCI) site at the University of Washington, with partial support from the National Science Foundation (Grant Nos. NNCI-1542101 and NNCI-2025489).
Abstract: Optical and hybrid convolutional neural networks (CNNs) have recently attracted increasing interest as a way to achieve low-latency, low-power image classification and computer-vision tasks. However, implementing optical nonlinearity is challenging, and omitting the nonlinear layers in a standard CNN comes with a significant reduction in accuracy. We use knowledge distillation to compress a modified AlexNet to a single linear convolutional layer and an electronic backend (two fully connected layers). We obtain performance comparable to a purely electronic CNN with five convolutional layers and three fully connected layers. We implement the convolution optically by engineering the point spread function of an inverse-designed meta-optic. Using this hybrid approach, we estimate a reduction in multiply-accumulate operations from 17M in a conventional electronic modified AlexNet to only 86K in the hybrid compressed network enabled by the optical front end. This constitutes a reduction of over two orders of magnitude in latency and power consumption. Furthermore, we experimentally demonstrate that the classification accuracy of the system exceeds 93% on the MNIST dataset of handwritten digits.
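The "over two orders of magnitude" figure in this abstract can be checked directly from the two operation counts it reports. The short sketch below does only that arithmetic; the variable names are illustrative, and treating the MAC count as a proxy for latency and power follows the abstract's own framing rather than any measurement reproduced here.

```python
import math

# Multiply-accumulate (MAC) counts as reported in the abstract.
macs_electronic = 17_000_000  # conventional electronic modified AlexNet (17M)
macs_hybrid = 86_000          # hybrid compressed network with optical front end (86K)

# Reduction factor and its size in orders of magnitude (base-10 logarithm).
reduction_factor = macs_electronic / macs_hybrid
orders_of_magnitude = math.log10(reduction_factor)

print(f"reduction factor: {reduction_factor:.1f}x")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

The factor comes out just under 200, which is indeed slightly more than two orders of magnitude.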
Funding: supported by the National Natural Science Foundation of China (62302167, 62477013); the Natural Science Foundation of Shanghai (No. 24ZR1456100); the Science and Technology Commission of Shanghai Municipality (No. 24DZ2305900); the Shanghai Municipal Special Fund for Promoting High-Quality Development of Industries (2211106).
Abstract: Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images. Obtaining class-specific precise representations at different scales is a key aspect of feature representation. However, existing methods often rely on a single-scale deep feature, neglecting shallow and deeper layer features, which poses challenges when predicting objects of varying scales within the same image. Although some studies have explored multi-scale features, they rarely address the flow of information between scales or efficiently obtain class-specific precise representations for features at different scales. To address these issues, we propose a two-stage, three-branch Transformer-based framework. The first stage incorporates multi-scale image feature extraction and hierarchical scale attention. This design enables the model to consider objects at various scales while enhancing the flow of information across different feature scales, improving the model's generalization to diverse object scales. The second stage includes a global feature enhancement module and a region selection module. The global feature enhancement module strengthens interconnections between different image regions, mitigating the issue of incomplete representations, while the region selection module models the cross-modal relationships between image features and labels. Together, these components enable the efficient acquisition of class-specific precise feature representations. Extensive experiments on public datasets, including COCO2014, VOC2007, and VOC2012, demonstrate the effectiveness of our proposed method. Our approach achieves consistent performance gains of 0.3%, 0.4%, and 0.2% over state-of-the-art methods on the three datasets, respectively. These results validate the reliability and superiority of our approach for multi-label image classification.
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 62071315 and 62271336).
Abstract: The multi-modal characteristics of mineral particles play a pivotal role in enhancing classification accuracy, which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation and utilization of its resources. However, existing methods for classifying mineral particles do not fully utilize these multi-modal features, thereby limiting the classification accuracy. Furthermore, when conventional multi-modal image classification methods are applied to plane-polarized and cross-polarized sequence images of mineral particles, they encounter issues such as information loss, misaligned features, and challenges in spatiotemporal feature extraction. To address these challenges, we propose a multi-modal mineral particle polarization image classification network (MMGC-Net) for precise mineral particle classification. Initially, MMGC-Net employs a two-dimensional (2D) backbone network with shared parameters to extract features from the two types of polarized images and ensure feature alignment. Subsequently, a cross-polarized intra-modal feature fusion module is designed to refine the spatiotemporal features from the extracted features of the cross-polarized sequence images. Finally, the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision. Quantitative and qualitative experimental results indicate that, compared with current state-of-the-art multi-modal image classification methods, MMGC-Net demonstrates marked superiority in mineral particle multi-modal feature learning and on four classification evaluation metrics. It also demonstrates better stability than the existing models.
Funding: supported by the National Natural Science Foundation of China (Nos. 61806107 and 61702135).
Abstract: We propose a model based on a hierarchical multi-scale attention mechanism in response to the low accuracy of existing oceanic biological image classification methods and the inefficiency of manual classification. First, the hierarchical efficient multi-scale attention (H-EMA) module is designed for lightweight feature extraction, achieving outstanding performance at a relatively low cost. Second, an improved EfficientNetV2 block is used to better integrate information from different scales and enhance inter-layer message passing. Furthermore, introducing the convolutional block attention module (CBAM) enhances the model's perception of critical features, optimizing its generalization ability. Lastly, Focal Loss is introduced to adjust the weights of complex samples and address the issue of imbalanced categories in the dataset, further improving the model's performance. The model achieved 96.11% accuracy on the intertidal marine organism dataset of Nanji Islands and 84.78% accuracy on the CIFAR-100 dataset, demonstrating its strong generalization ability to meet the demands of oceanic biological image classification.
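To make the role of Focal Loss in this abstract concrete, the sketch below evaluates the standard per-sample focal loss, FL(p) = -alpha * (1 - p)^gamma * log(p), where p is the predicted probability of the true class. The values gamma = 2 and alpha = 1 and the function name `focal_loss` are assumptions for illustration; the abstract does not report the settings the authors used.

```python
import math


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample, given the predicted probability of its true class.

    With gamma = 0 and alpha = 1 this reduces to ordinary cross-entropy, -log(p_true).
    Larger gamma shrinks the loss of well-classified (easy) samples more strongly.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


# Hypothetical predictions: a confidently correct sample and a badly missed one.
easy = focal_loss(0.9)
hard = focal_loss(0.1)
cross_entropy_easy = focal_loss(0.9, gamma=0.0)
```

The easy sample's loss is scaled down by a factor of (1 - 0.9)^2 relative to cross-entropy, while the hard sample keeps most of its loss, so training concentrates on hard samples, which in an imbalanced dataset often come from the rare classes.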
Abstract: The emergence of adversarial examples has revealed inadequacies in the robustness of image classification models based on Convolutional Neural Networks (CNNs). In recent years in particular, the discovery of natural adversarial examples has posed significant challenges, as traditional defense methods against adversarial attacks have proven largely ineffective against them. This paper explores defenses against natural adversarial examples from three perspectives: adversarial examples, model architecture, and dataset. First, it employs Class Activation Mapping (CAM) to visualize how models classify natural adversarial examples, identifying several typical attack patterns. Next, various common CNN models are analyzed to evaluate their susceptibility to these attacks, revealing that different architectures exhibit varying defensive capabilities. The study finds that as the depth of a network increases, its defenses against natural adversarial examples strengthen. Finally, the impact of dataset class distribution on the defense capability of models is examined, focusing on two aspects: the number of classes in the training set and the number of predicted classes. The study investigates how these factors influence the model's ability to defend against natural adversarial examples. Results indicate that reducing the number of training classes enhances the model's defense against natural adversarial examples. Additionally, under a fixed number of training classes, some CNN models show an optimal range of predicted classes for achieving the best defense performance against these adversarial examples.
Funding: National Key Research and Development Program of China, Grant/Award Numbers: 2022YFB3103900, 2023YFB3106504; Major Key Project of PCL, Grant/Award Numbers: PCL2022A03, PCL2023A09; Shenzhen Basic Research, Grant/Award Number: JCYJ20220531095214031; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Grant/Award Number: 2022B1212010005; Shenzhen International Science and Technology Cooperation Project, Grant/Award Number: GJHZ20220913143008015; Natural Science Foundation of Guangdong Province, Grant/Award Number: 2023A1515011959; Shenzhen-Hong Kong Jointly Funded Project, Grant/Award Number: SGDX20230116091246007; Shenzhen Science and Technology Program, Grant/Award Numbers: RCBS20221008093131089, ZDSYS20210623091809029.
Abstract: Real-world data often exhibit an imbalanced, long-tailed distribution, which leads to poor performance for neural network-based classification. Existing methods mainly tackle this problem by reweighting the loss function or rebalancing the classifier. However, one crucial aspect overlooked by previous studies is the imbalanced feature space problem caused by the imbalanced angle distribution. In this paper, the authors shed light on the significance of the angle distribution in achieving a balanced feature space, which is essential for improving model performance under long-tailed distributions. Nevertheless, it is challenging to balance both the classifier norms and the angle distribution effectively because of problems such as the low feature norm. To tackle these challenges, the authors first thoroughly analyse the classifier and feature space by decoupling the classification logits into three key components: the classifier norm (i.e. the magnitude of the classifier vector), the feature norm (i.e. the magnitude of the feature vector), and the cosine similarity between the classifier vector and the feature vector. In this way, the authors analyse the change of each component during training and reveal three critical problems to be solved: the imbalanced angle distribution, the lack of feature discrimination, and the low feature norm. Drawing on this analysis, the authors propose a novel loss function that incorporates hyperspherical uniformity, additive angular margin, and feature norm regularisation. Each component of the loss function addresses a specific problem and contributes to achieving a balanced classifier and feature space. The authors conduct extensive experiments on three popular benchmark datasets: CIFAR-10/100-LT, ImageNet-LT, and iNaturalist 2018. The experimental results demonstrate that the authors' loss function outperforms several previous state-of-the-art methods in addressing the challenges posed by imbalanced and long-tailed datasets, that is, by improving upon the best-performing baselines on CIFAR-100-LT by 1.34, 1.41, 1.41 and 1.33, respectively.
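The three-way decoupling described in this abstract rests on the identity w·f = ‖w‖ ‖f‖ cos θ for a classifier vector w and a feature vector f. The sketch below shows that decomposition for a single class with plain Python lists; the function name `decompose_logit` and the example vectors are hypothetical, a bias-free linear classifier with nonzero vectors is assumed, and the authors' loss function itself is not reproduced.

```python
import math


def decompose_logit(classifier_vec, feature_vec):
    """Split the logit w·f into (classifier norm, feature norm, cosine similarity).

    Assumes a bias-free linear classifier and nonzero vectors (illustrative only).
    """
    w_norm = math.sqrt(sum(x * x for x in classifier_vec))
    f_norm = math.sqrt(sum(x * x for x in feature_vec))
    dot = sum(a * b for a, b in zip(classifier_vec, feature_vec))
    cosine = dot / (w_norm * f_norm)
    return w_norm, f_norm, cosine


# Hypothetical classifier and feature vectors for one class.
w = [3.0, 4.0]
f = [1.0, 0.0]
w_norm, f_norm, cosine = decompose_logit(w, f)

# Multiplying the three components back together recovers the original logit.
logit = w_norm * f_norm * cosine
```

Tracking the three factors separately during training is what lets norm effects (large classifier norms for head classes, small feature norms) be told apart from angular effects.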
Abstract: In a context where urban satellite image processing technologies are evolving rapidly, this article presents an innovative and rigorous approach to satellite image classification applied to urban planning. This research proposes an integrated methodological framework, based on the principles of model-driven engineering (MDE), to transform a generic meta-model into a meta-model specifically dedicated to urban satellite image classification. We implemented this transformation using the Atlas Transformation Language (ATL), guaranteeing a smooth and consistent transition from platform-independent model (PIM) to platform-specific model (PSM), according to the principles of model-driven architecture (MDA). The application of this MDE methodology enables advanced structuring of satellite data for targeted urban planning analyses, making it possible to classify various urban zones such as built-up, cultivated, arid, and water areas. The novelty of this approach lies in the automation and standardization of the classification process, which significantly reduces the need for manual intervention and thus improves the reliability, reproducibility, and efficiency of urban data analysis. By adopting this method, decision-makers and urban planners are provided with a powerful tool for systematically and consistently analyzing and interpreting satellite images, facilitating decision-making in critical areas such as urban space management, infrastructure planning, and environmental preservation.
Abstract: Medical image classification is crucial in disease diagnosis, treatment planning, and clinical decision-making. We introduce a novel medical image classification approach that integrates Bayesian Random Semantic Data Augmentation (BSDA) with a Vision Mamba-based model for medical image classification (MedMamba), enhanced by residual connection blocks. We name the model BSDA-Mamba. BSDA augments medical image data semantically, enhancing the model's generalization ability and classification performance. MedMamba, a deep learning-based state space model, excels at capturing long-range dependencies in medical images. By incorporating residual connections, BSDA-Mamba further improves feature extraction capabilities. Through comprehensive experiments on eight medical image datasets, we demonstrate that BSDA-Mamba outperforms existing models in accuracy, area under the curve, and F1-score. Our results highlight BSDA-Mamba's potential as a reliable tool for medical image analysis, particularly in handling diverse imaging modalities from X-rays to MRI. The open-sourcing of our model's code and datasets will facilitate the reproduction and extension of our work.
Funding: Institute of Information & Communications Technology Planning & Evaluation, Grant/Award Number: 2022-0-00074.
Abstract: Few-shot image classification is the task of classifying novel classes using extremely limited labelled samples. To perform classification with such limited samples, one solution is to learn the feature alignment (FA) information between the labelled and unlabelled sample features. Most FA methods use the feature mean as the class prototype and calculate the correlation between the prototype and unlabelled features to learn an alignment strategy. However, mean prototypes tend to degrade informative features because spatial features at the same position may not be equally important for the final classification, leading to inaccurate correlation calculations. Therefore, the authors propose an effective intraclass FA strategy that aggregates semantically similar spatial features from an adaptive reference prototype in a low-dimensional feature space to obtain an informative prototype feature map for precise correlation computation. Moreover, the authors develop a dual correlation module to learn the hard and soft correlations. This module combines the correlation information between the prototype and unlabelled features in both the original and learnable feature spaces, aiming to produce a comprehensive cross-correlation between the prototypes and unlabelled features. Using both the FA and cross-attention modules, the model can maintain informative class features and capture important shared features for classification. Experimental results on three few-shot classification benchmarks show that the proposed method outperforms related methods and yields a 3% performance boost in the 1-shot setting when the proposed module is inserted into those methods.
Funding: Major Program of National Natural Science Foundation of China (NSFC12292980, NSFC12292984); National Key R&D Program of China (2023YFA1009000, 2023YFA1009004, 2020YFA0712203, 2020YFA0712201); Major Program of National Natural Science Foundation of China (NSFC12031016); Beijing Natural Science Foundation (BNSFZ210003); Department of Science, Technology and Information of the Ministry of Education (8091B042240).
Abstract: Gliomas have the highest mortality rate of all brain tumors. Correctly classifying the glioma risk period can help doctors make reasonable treatment plans and improve patients' survival rates. This paper proposes a hierarchical multi-scale attention feature fusion medical image classification network (HMAC-Net), which effectively combines global and local features. The network framework consists of three parallel layers: the global feature extraction layer, the local feature extraction layer, and the multi-scale feature fusion layer. A linear sparse attention mechanism is designed in the global feature extraction layer to reduce information redundancy. In the local feature extraction layer, a bilateral local attention mechanism is introduced to improve the extraction of relevant information between adjacent slices. In the multi-scale feature fusion layer, a channel fusion block combining a convolutional attention mechanism and a residual inverse multi-layer perceptron is proposed to prevent vanishing gradients and network degradation and to improve feature representation capability. A double-branch iterative multi-scale classification block is used to improve classification performance. On the brain glioma risk grading dataset, the results of the ablation and comparison experiments show that the proposed HMAC-Net performs best in both qualitative analysis of heat maps and quantitative analysis of evaluation indicators. On the skin cancer classification dataset, the generalization experiment results show that the proposed HMAC-Net generalizes well.
Abstract: Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To balance complexity against model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining network performance compared to the MobileNetV1 baseline.
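The parameter saving that motivates depthwise separable convolution in this abstract can be illustrated with a simple count. The sketch below compares a standard convolution with a depthwise convolution followed by a 1×1 pointwise convolution; the layer sizes and function names are hypothetical, biases are ignored, and the numbers are not taken from the paper. Dilation widens the receptive field without adding weights, so the same count applies to a dilated depthwise layer.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise kernel x kernel convolution plus a 1x1 pointwise one."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 32 input channels, 64 output channels.
standard = standard_conv_params(3, 32, 64)
separable = separable_conv_params(3, 32, 64)
ratio = separable / standard
```

For this layer the separable version needs roughly one eighth of the weights; in general the ratio equals 1/c_out + 1/kernel².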
Abstract: Research has shown that chest radiography images of patients with different diseases, such as pneumonia, COVID-19, SARS, and pneumothorax, all exhibit some form of abnormality. Several deep learning techniques can be used to identify each of these anomalies in chest X-ray images. Convolutional neural networks (CNNs) have shown great success in image recognition and image classification because numerous large-scale annotated image datasets are available. The classification of medical images, particularly radiographic images, remains one of the biggest hurdles in medical diagnosis because of the restricted availability of annotated medical images. However, this difficulty can be addressed with several deep learning strategies, including data augmentation and transfer learning. The aim was to build a model that detects abnormalities in chest X-ray images with the highest probability. To do that, different models were built with different features. When building a CNN model, one of the main tasks is to tune the model by changing the hyperparameters and layers so that it gives good training and testing results. In our case, three different models were built, and the last one gave the best predictions. That last model reached 98% training accuracy, 84% validation accuracy, and 81% testing accuracy. The final model gave the best evaluation scores because it was well fitted, with no overfitting or underfitting issues. Our aim with this project was to build a tool using a CNN model in the R language to help detect abnormalities in radiography images. The tool will be able to detect conditions such as pneumonia, COVID-19, effusions, infiltration, pneumothorax, and others. Because of their high accuracy, this research chose supervised multi-class classification techniques and CNNs to classify different chest X-ray images. CNNs are extremely efficient and successful at reducing the number of parameters while maintaining the quality of the primary model. CNNs are also trained to recognize the edges of various objects in any batch of images. CNNs automatically discover the relevant aspects in labeled data and learn the distinguishing features for each class by themselves.
Funding: supported by CONAHCYT (Consejo Nacional de Humanidades, Ciencias y Tecnologías).
Abstract: The use of Explainable Artificial Intelligence (XAI) models is becoming increasingly important for making decisions in smart healthcare environments. The goal is to make sure that decisions are based on trustworthy algorithms and that healthcare workers understand the decisions made by these algorithms. These models can potentially enhance interpretability and explainability in decision-making processes that rely on artificial intelligence. Nevertheless, the intricate nature of the healthcare field necessitates the use of sophisticated models to classify cancer images. This research presents an advanced investigation of XAI models for classifying cancer images. It describes the different levels of explainability and interpretability associated with XAI models and the challenges faced in deploying them in healthcare applications. In addition, this study proposes a novel framework for cancer image classification that combines XAI models with deep learning and advanced medical imaging techniques. The proposed model integrates several techniques, including end-to-end explainable evaluation, rule-based explanation, and user-adaptive explanation. The proposed XAI reaches 97.72% accuracy, 90.72% precision, 93.72% recall, 96.72% F1-score, 9.55% FDR, 9.66% FOR, and 91.18% DOR. The study also discusses potential applications of the proposed XAI models in smart healthcare environments, where they can help ensure trust and accountability in AI-based decisions, which is essential for achieving a safe and reliable smart healthcare environment.
Funding: This research was funded by Prince Sattam bin Abdulaziz University (Project Number PSAU/2023/01/25387).
Abstract: The research aims to improve the performance of image recognition methods based on a description in the form of a set of keypoint descriptors. The main focus is on increasing the speed of establishing the relevance of object and etalon descriptions while maintaining the required level of classification efficiency. The class to be recognized is represented by an infinite set of images obtained from the etalon by applying arbitrary geometric transformations. It is proposed to reduce the descriptions in the etalon database by selecting the most significant descriptor components according to an information content criterion. The informativeness of an etalon descriptor is estimated by the difference between the closest distances to its own description and to the other descriptions. The developed method determines the relevance of the full description of the recognized object to the reduced description of the etalons. Several practical classifier models with different options for establishing the correspondence between object descriptors and etalons are considered. The results of experimental modeling of the proposed methods on a database of museum jewelry images are presented. The test sample is formed as a set of images from inside and outside the etalon database, with geometric transformations of scale and rotation applied in the field of view. The practical problem of determining the threshold on the number of votes used to make a classification decision has been investigated. Modeling has revealed that descriptions can be reduced tenfold with full preservation of classification accuracy. Reducing the descriptions twentyfold in the experiment leads to slightly decreased accuracy. The speed of the analysis increases in proportion to the degree of reduction. Reduction by the informativeness criterion confirmed that the most significant subset of features for classification can be obtained, which guarantees a decent level of accuracy.
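One possible reading of the informativeness criterion described above can be sketched as follows. This is not the authors' implementation: the Euclidean metric, the sign convention (distance to other etalons minus distance within the own etalon) and the function names are assumptions made for illustration.

```python
import math


def informativeness(index, own, others):
    """Closest distance to other etalons minus closest distance within
    the descriptor's own etalon (assumed sign convention)."""
    descriptor = own[index]
    d_own = min(
        math.dist(descriptor, d) for i, d in enumerate(own) if i != index
    )
    d_other = min(math.dist(descriptor, d) for d in others)
    return d_other - d_own


def reduce_description(own, others, keep):
    """Keep the `keep` most informative descriptors of one etalon."""
    ranked = sorted(
        range(len(own)),
        key=lambda i: informativeness(i, own, others),
        reverse=True,
    )
    return [own[i] for i in ranked[:keep]]


# Toy 2-D descriptors; real keypoint descriptors have many more components.
own_etalon = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
other_etalons = [(5.0, 6.0), (9.0, 9.0)]

reduced = reduce_description(own_etalon, other_etalons, keep=2)
print(reduced)
```

Under this reading, the descriptor at (5, 5) is dropped first because it lies close to another etalon's descriptor and far from the rest of its own description.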
Abstract: Total shoulder arthroplasty is a standard restorative procedure practiced by orthopedists to treat shoulder arthritis, in which a prosthesis replaces the whole joint or a part of the joint. It is often challenging for doctors to identify the exact model and manufacturer of the prosthesis when it is unknown. This paper proposes a transfer learning-based, class imbalance-aware prosthesis detection method to detect the implant's manufacturer automatically from shoulder X-ray images. The framework proposes a novel training approach and a new set of batch-normalization, dropout, and fully convolutional layers in the head network. It employs cyclical learning rates and a weighting-based loss calculation mechanism. These modifications aid faster convergence, avoid local-minima stagnation, and remove the training bias caused by the imbalanced dataset. The proposed method is evaluated using seven well-known pre-trained models from the VGGNet, ResNet, and DenseNet families. Experimentation is performed on a shoulder implant benchmark dataset consisting of 597 shoulder X-ray images. The proposed method improves the classification performance of all pre-trained models by 10–12%. The DenseNet-201-based variant achieved the highest classification accuracy of 89.5%, which is 10% higher than existing methods. Further, to validate and generalize the proposed method, the existing baseline dataset is extended to six classes by adding samples from two more implant manufacturers. Experimental results show an average accuracy of 86.7% on the extended dataset and demonstrate the preeminence of the proposed method.
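The two training ingredients named above, cyclical learning rates and weighted loss, can be sketched in a few lines. The triangular schedule and inverse-frequency weighting shown here are common choices and are assumptions; the abstract does not say which variants the authors used, and the class counts below are hypothetical.

```python
def triangular_lr(step, base_lr, max_lr, half_cycle):
    """Triangular cyclical learning rate: rises linearly from base_lr to
    max_lr over half_cycle steps, then falls back, and repeats."""
    position = step % (2 * half_cycle)
    if position <= half_cycle:
        fraction = position / half_cycle
    else:
        fraction = 2 - position / half_cycle
    return base_lr + (max_lr - base_lr) * fraction


def inverse_frequency_weights(class_counts):
    """Per-class loss weights so that rare classes contribute more."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: total / (n_classes * count)
        for label, count in class_counts.items()
    }


# Hypothetical class counts for an imbalanced two-manufacturer dataset.
weights = inverse_frequency_weights({"maker_a": 300, "maker_b": 100})
print(weights)

for step in (0, 50, 100, 150, 200):
    print(step, triangular_lr(step, base_lr=1e-4, max_lr=1e-2, half_cycle=100))
```

The weights would typically multiply each sample's loss term by the weight of its true class, so misclassifying a rare manufacturer costs more than misclassifying a common one.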
Funding: Project supported by the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021MF049) and the Joint Fund of Natural Science Foundation of Shandong Province (Grant Nos. ZR2022LLZ012 and ZR2021LLZ001).
Abstract: We redesign the parameterized quantum circuit in the quantum deep neural network, construct a three-layer structure as the hidden layer, and then use classical optimization algorithms to train the parameterized quantum circuit, thereby proposing a novel hybrid quantum deep neural network (HQDNN) for image classification. After bilinear interpolation reduces the original image to a suitable size, an improved novel enhanced quantum representation (INEQR) is used to encode it into quantum states as the input of the HQDNN. Multi-layer parameterized quantum circuits are used as the main structure to implement feature extraction and classification. The outputs of the parameterized quantum circuits are converted into classical data through quantum measurements and then optimized on a classical computer. To verify the performance of the HQDNN, we conduct binary classification and three-class classification experiments on the MNIST (Modified National Institute of Standards and Technology) data set. In the first binary classification, the accuracy for the digits 0 and 4 exceeds 98%. We then compare the three-class classification performance with other algorithms; the results on two datasets show that the classification accuracy is higher than that of the quantum deep neural network and the general quantum convolutional neural network.
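The classical preprocessing step mentioned above, bilinear interpolation to a smaller image, can be sketched as follows. This is a minimal pure-Python version that assumes corner-aligned sampling; the abstract does not specify the sampling convention or target size, and the INEQR encoding and quantum circuit are not shown.

```python
def bilinear_resize(image, out_h, out_w):
    """Resize a 2-D list of pixel values with bilinear interpolation,
    mapping output corners onto input corners (assumed convention)."""
    in_h, in_w = len(image), len(image[0])
    result = []
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        result.append(row)
    return result


# Toy 3x3 "image" reduced to 2x2.
small = bilinear_resize([[0, 1, 2], [3, 4, 5], [6, 7, 8]], 2, 2)
print(small)
```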
Funding: Funded by the Researchers Supporting Program at King Saud University (RSPD2024R809).
Abstract: Leukemia is a form of cancer of the blood or bone marrow. A person with leukemia has an expansion of white blood cells (WBCs). It primarily affects children and rarely affects adults. Treatment depends on the type of leukemia and the extent to which the cancer has spread throughout the body. Identifying leukemia at the initial stage is vital to providing timely patient care. Approaches based on medical image analysis offer safer, quicker, and less costly solutions while avoiding the difficulties of invasive procedures. Computer vision (CV)-based and image-processing techniques can be simple to generalize and can eradicate human error. Many researchers have implemented computer-aided diagnostic methods and machine learning (ML) for laboratory image analysis, hoping to overcome the limitations of late leukemia detection and to determine its subgroups. This study establishes a Marine Predators Algorithm with Deep Learning Leukemia Cancer Classification (MPADL-LCC) algorithm on medical images. The proposed MPADL-LCC system uses a bilateral filtering (BF) technique to pre-process medical images. The MPADL-LCC system uses Faster SqueezeNet for feature extraction, with the Marine Predators Algorithm (MPA) as a hyperparameter optimizer. Lastly, the denoising autoencoder (DAE) methodology is applied to accurately detect and classify leukemia cancer. The hyperparameter tuning process using MPA helps enhance leukemia cancer classification performance. Simulation results are compared with other recent approaches across various measures, and the MPADL-LCC algorithm exhibits the best results.
Funding: Institutional Fund Projects under Grant No. (IFPIP: 638-830-1443).
Abstract: The utilization of visual attention enhances the performance of image classification tasks. Previous attention-based models have demonstrated notable performance, but many of these models exhibit reduced accuracy when confronted with inter-class and intra-class similarities and differences. Neural Controlled Differential Equations (N-CDEs) and Neural Ordinary Differential Equations (NODEs) are extensively utilized within this context. N-CDEs possess the capacity to illustrate both inter-class and intra-class similarities and differences with enhanced clarity. To this end, an attentive neural network has been proposed to generate attention maps, which uses two different types of N-CDEs, one for adopting hidden layers and the other for generating attention values. Two distinct attention techniques are implemented: time-wise attention, also referred to as bottom N-CDEs, and element-wise attention, called top N-CDEs. Additionally, a training methodology is proposed to guarantee that the training problem is sufficiently presented. Two classification tasks, fine-grained visual classification and multi-label classification, are used to evaluate the proposed model. The proposed methodology is applied to five publicly available datasets: CUB-200-2011, ImageNet-1K, PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO. The obtained visualizations demonstrate that N-CDEs are better suited to attention-based activities than conventional NODEs.