Image-based similar trademark retrieval is a time-consuming and labor-intensive task in the trademark examination process. This paper aims to support trademark examiners by training Deep Convolutional Neural Network (DCNN) models for effective Trademark Image Retrieval (TIR). To achieve this goal, we first develop a novel labeling method that automatically generates hundreds of thousands of labeled similar and dissimilar trademark image pairs using accompanying data fields such as citation lists, Vienna classification (VC) codes, and trademark ownership information. This approach eliminates the need for manual labeling and provides a large-scale dataset suitable for training deep learning models. We then train DCNN models based on Siamese and Triplet architectures, evaluating various feature extractors to determine the most effective configuration. Furthermore, we present an Adapted Contrastive Loss Function (ACLF) for the trademark retrieval task, specifically engineered to mitigate the influence of noisy labels found in automatically created datasets. Experimental results indicate that our proposed model (Efficient-Net_v21_Siamese) performs best at both True Negative Rate (TNR) threshold levels, TNR 0.9 and TNR 0.95, with respective True Positive Rates (TPRs) of 77.7% and 70.8% and accuracies of 83.9% and 80.4%. Additionally, when testing on the public trademark dataset METU_v2, our model achieves a normalized average rank (NAR) of 0.0169, outperforming the current state-of-the-art (SOTA) model. Based on these findings, we estimate that considering only approximately 10% of the returned trademarks would be sufficient, significantly reducing the review time. Therefore, the paper highlights the potential of utilizing national trademark data to enhance the accuracy and efficiency of trademark retrieval systems, ultimately supporting trademark examiners in their evaluation tasks.
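The ACLF described above adapts the standard contrastive loss used to train Siamese networks. The abstract does not give the adapted form, so the following is a minimal sketch of the conventional contrastive loss it builds on; the margin value is an illustrative assumption.

```python
def contrastive_loss(dist, label, margin=1.0):
    """Standard contrastive loss for one Siamese pair.

    dist:  Euclidean distance between the two embeddings
    label: 1 for a similar pair, 0 for a dissimilar pair
    margin: illustrative value; not taken from the paper
    """
    if label == 1:
        return dist ** 2                    # pull similar pairs together
    return max(0.0, margin - dist) ** 2     # push dissimilar pairs beyond the margin

# A similar pair that is far apart is penalized;
# a dissimilar pair already beyond the margin is not.
print(contrastive_loss(0.8, 1))   # ~0.64
print(contrastive_loss(1.5, 0))   # 0.0
```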
Content-Based Image Retrieval (CBIR) and image mining are becoming more important study fields in computer vision due to their wide range of applications in healthcare, security, and various other domains. The image retrieval system mainly relies on the efficiency and accuracy of the classification models. This research addresses the challenge of enhancing the image retrieval system by developing a novel approach, EfficientNet-Convolutional Neural Network (EffNet-CNN). The key objective of this research is to evaluate the proposed EffNet-CNN model's performance in image classification, image mining, and CBIR. The novelty of the proposed EffNet-CNN model lies in its integration of different techniques and modifications. The model includes the Mahalanobis distance metric for feature matching, which enhances the similarity measurements, and extends the EfficientNet architecture by incorporating additional convolutional layers, batch normalization, dropout, and pooling layers for improved hierarchical feature extraction. The approach further comprises systematic hyperparameter optimization using SGD, performance evaluation on three datasets, and data normalization to improve feature representations. The EffNet-CNN is assessed using precision, accuracy, F-measure, and recall metrics across the MS-COCO, CIFAR-10, and CIFAR-100 datasets. The model achieved accuracy values ranging from 90.60% to 95.90% for the MS-COCO dataset, 96.8% to 98.3% for the CIFAR-10 dataset, and 92.9% to 98.6% for the CIFAR-100 dataset. A validation of the EffNet-CNN model's results against other models reveals the proposed model's superior performance. The results highlight the potential of the proposed EffNet-CNN model for image classification and its usefulness in image mining and CBIR.
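The Mahalanobis matching step named above can be sketched as follows. This is a generic pure-Python illustration, not the paper's implementation; in practice the inverse covariance matrix would be estimated from the extracted feature set.

```python
def mahalanobis(x, y, inv_cov):
    """Mahalanobis distance between feature vectors x and y,
    given the inverse covariance matrix inv_cov (list of rows)."""
    d = [a - b for a, b in zip(x, y)]
    # (x - y)^T * S^-1 * (x - y)
    s = sum(d[i] * inv_cov[i][j] * d[j]
            for i in range(len(d)) for j in range(len(d)))
    return s ** 0.5

# With the identity inverse covariance this reduces to Euclidean distance.
identity = [[1, 0], [0, 1]]
print(mahalanobis([0, 0], [3, 4], identity))  # 5.0
```

Unlike Euclidean distance, the metric down-weights directions in which the features vary strongly, which is the stated motivation for using it in feature matching.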
Medical institutions frequently utilize cloud servers for storing digital medical imaging data, aiming to lower both storage and computational expenses. Nevertheless, the reliability of cloud servers as third-party providers is not always guaranteed. To safeguard against the exposure and misuse of personal privacy information, and to achieve secure and efficient retrieval, a secure medical image retrieval based on a multi-attention mechanism and triplet deep hashing (abbreviated as MATDH) is proposed in this paper. Specifically, this method first utilizes the contrast-limited adaptive histogram equalization method applicable to color images to enhance chest X-ray images. Next, a designed multi-attention mechanism focuses on important local features during the feature extraction stage. Moreover, a triplet loss function is utilized to learn discriminative hash codes, constructing a compact and efficient triplet deep hashing. Finally, upsampling is used to restore the original resolution of the images during retrieval, thereby enabling more accurate matching. To ensure the security of medical image data, a lightweight image encryption method based on frequency domain encryption is designed to encrypt the chest X-ray images. The findings of the experiment indicate that, in comparison to various advanced image retrieval techniques, the suggested approach improves the precision of feature extraction and retrieval on the COVIDx dataset. Additionally, it offers enhanced protection for the confidentiality of medical images stored in cloud settings and demonstrates strong practicality.
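The triplet loss mentioned above is a standard formulation; a minimal sketch follows, with an assumed margin of 0.2 and Euclidean distance standing in for whatever embedding distance the paper uses before hashing.

```python
def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on the distance gap: the positive should be closer
    to the anchor than the negative by at least the margin."""
    return max(0.0, euclidean(anchor, positive)
                    - euclidean(anchor, negative) + margin)

# The negative is already farther than the margin, so the loss is zero.
print(triplet_loss([0, 0], [0.1, 0], [1.0, 0]))  # 0.0
```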
This paper introduces a novel method for medical image retrieval and classification by integrating a multi-scale encoding mechanism with Vision Transformer (ViT) architectures and a dynamic multi-loss function. The multi-scale encoding significantly enhances the model's ability to capture both fine-grained and global features, while the dynamic loss function adapts during training to optimize classification accuracy and retrieval performance. Our approach was evaluated on the ISIC-2018 and ChestX-ray14 datasets, yielding notable improvements. Specifically, on the ISIC-2018 dataset, our method achieves an F1-Score improvement of +4.84% compared to the standard ViT, with a precision increase of +5.46% for melanoma (MEL). On the ChestX-ray14 dataset, the method delivers an F1-Score improvement of +5.3% over the conventional ViT, with precision gains of +5.0% for pneumonia (PNEU) and +5.4% for fibrosis (FIB). Experimental results demonstrate that our approach outperforms traditional CNN-based models and existing ViT variants, particularly in retrieving relevant medical cases and enhancing diagnostic accuracy. These findings highlight the potential of the proposed method for large-scale medical image analysis, offering improved tools for clinical decision-making through superior classification and case comparison.
Crime scene investigation (CSI) images are key evidence carriers during criminal investigation, and CSI image retrieval can assist the police in obtaining criminal clues. Moreover, with the rapid development of deep learning, the data-driven paradigm has become the mainstream method of CSI image feature extraction and representation, and in this process, datasets provide effective support for CSI retrieval performance. However, there is a lack of systematic research on CSI image retrieval methods and datasets. Therefore, we present an overview of the existing works on one-class and multi-class CSI image retrieval based on deep learning. Based on their technical functionalities and implementation methods, CSI image retrieval approaches are roughly classified into five categories: feature representation, metric learning, generative adversarial networks, autoencoder networks, and attention networks. Furthermore, we analyze the remaining challenges and discuss future work directions in this field.
Background: A medical content-based image retrieval (CBIR) system is designed to retrieve images from large imaging repositories that are visually similar to a user's query image. CBIR is widely used in evidence-based diagnosis, teaching, and research. Although retrieval accuracy has largely improved, there has been limited development toward visualizing the important image features that indicate the similarity of retrieved images. Despite the prevalence of 3D volumetric data in medical imaging such as computed tomography (CT), current CBIR systems still rely on 2D cross-sectional views for the visualization of retrieved images. Such 2D visualization requires users to browse through the image stacks to confirm the similarity of the retrieved images and often involves mental reconstruction of 3D information, including the size, shape, and spatial relations of multiple structures. This process is time-consuming and reliant on users' experience. Methods: In this study, we proposed an importance-aware 3D volume visualization method. The rendering parameters were automatically optimized to maximize the visibility of important structures that were detected and prioritized in the retrieval process. We then integrated the proposed visualization into a CBIR system, thereby complementing the 2D cross-sectional views for relevance feedback and further analyses. Results: Our preliminary results demonstrate that 3D visualization can provide additional information using multimodal positron emission tomography and computed tomography (PET-CT) images of a non-small cell lung cancer dataset.
This paper presents an approach to improve medical image retrieval, particularly for brain tumors, by addressing the gap between low-level visual and high-level perceived contents in MRI, X-ray, and CT scans. Traditional methods based on color, shape, or texture are less effective. The proposed solution uses machine learning to handle high-dimensional image features, reducing computational complexity and mitigating issues caused by artifacts or noise. It employs a genetic algorithm for feature reduction and a hybrid residual UNet (HResUNet) model for Region-of-Interest (ROI) segmentation and classification, with enhanced image preprocessing. The study examines various loss functions, finding that a hybrid loss function yields superior results and that the GA-HResUNet model outperforms the HResUNet. Comparative analysis with state-of-the-art models shows a 4% improvement in retrieval accuracy.
Indoor visual localization, which uses camera visual information to compute the user's pose, is a core component of Augmented Reality (AR) and Simultaneous Localization and Mapping (SLAM). Existing indoor localization technologies generally use scene-specific 3D representations or are trained on specific datasets, making it challenging to balance accuracy and cost when applied to new scenes. To address this issue, this paper proposes a universal indoor visual localization method based on efficient image retrieval. Initially, a Multi-Layer Perceptron (MLP) is employed to aggregate features from intermediate layers of a convolutional neural network, obtaining a global representation of the image. This ensures accurate and rapid retrieval of reference images. Subsequently, a new mechanism using Random Sample Consensus (RANSAC) is designed to resolve the relative pose ambiguity caused by essential matrix decomposition based on the five-point method. Finally, the absolute pose of the queried user image is computed, thereby achieving indoor user pose estimation. The proposed method is characterized by its simplicity, flexibility, and excellent cross-scene generalization. Experimental results demonstrate a positioning error of 0.09 m and 2.14° on the 7Scenes dataset, and 0.15 m and 6.37° on the 12Scenes dataset, convincingly illustrating the outstanding performance of the proposed indoor localization method.
In order to narrow the semantic gap in content-based image retrieval (CBIR), a novel retrieval technique called auto-extended multi-query examples (AMQE) is proposed. It expands the single query image used in traditional image retrieval into multiple query examples so as to include more image features related to semantics. By retrieving images for each of the multiple query examples and integrating the retrieval results, more relevant images can be obtained. The property of the recall-precision curve of a general retrieval algorithm and the K-means clustering method are used to realize the expansion according to the distances between image features of the initially retrieved images. The experimental results demonstrate that the AMQE technique can greatly improve the recall and precision of the original algorithms.
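The abstract does not specify how the per-example retrieval results are integrated. One simple possibility, shown here purely for illustration, is to keep each database image's best distance across the expanded query set and re-rank:

```python
def fuse_multi_query(rankings):
    """Merge per-query lists of (image_id, distance) pairs by keeping
    each image's best (smallest) distance, then re-rank ascending."""
    best = {}
    for ranking in rankings:
        for img, dist in ranking:
            if img not in best or dist < best[img]:
                best[img] = dist
    return sorted(best, key=best.get)

# Two expanded query examples retrieve overlapping candidate sets.
q1 = [("a", 0.2), ("b", 0.5), ("c", 0.9)]
q2 = [("c", 0.1), ("b", 0.6), ("d", 0.4)]
print(fuse_multi_query([q1, q2]))  # ['c', 'a', 'd', 'b']
```

Image "c", ranked last by the first query alone, rises to the top because a second query example matches it closely, which is the intended effect of query expansion.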
This paper describes a new method for active learning in content-based image retrieval. The proposed method first uses support vector machine (SVM) classifiers to learn an initial query concept. The active learning scheme then employs a similarity measure to check the current version space and selects the images with maximum expected information gain to solicit the user's labels. Finally, the learned query is refined based on the user's further feedback. With the combination of the SVM classifier and the similarity measure, the proposed method can alleviate the model bias existing in each of them. Our experiments on several query concepts show that the proposed method can learn the user's query concept quickly and effectively within only a few iterations.
Pill image recognition is an important field in computer vision. It has become a vital technology in healthcare and pharmaceuticals due to the necessity for precise medication identification to prevent errors and ensure patient safety. This survey examines the current state of pill image recognition, focusing on advancements, methodologies, and the challenges that remain unresolved. It provides a comprehensive overview of traditional image processing-based, machine learning-based, deep learning-based, and hybrid methods, and aims to explore the ongoing difficulties in the field. We summarize and classify the methods used in each article, compare the strengths and weaknesses of these four families of methods, and review benchmark datasets for pill image recognition. Additionally, we compare the performance of proposed methods on popular benchmark datasets. This survey draws on recent advancements, such as Transformer models and cutting-edge technologies like Augmented Reality (AR), to discuss potential research directions and conclude the review. By offering a holistic perspective, this paper aims to serve as a valuable resource for researchers and practitioners striving to advance the field of pill image recognition.
This paper presents an efficient image feature representation method, namely the angle structure descriptor (ASD), which is built on the angle structures of images. According to the diversity in directions, angle structures are defined in local blocks. Combining color information in the HSV color space, we use angle structures to describe images. The internal correlations between neighboring pixels in angle structures are explored to form a feature vector. With angle structures as bridges, ASD extracts image features by integrating multiple kinds of information as a whole, such as color, texture, shape, and spatial layout. In addition, the proposed algorithm is efficient for image retrieval without any clustering implementation or model training. Experimental results demonstrate that ASD outperforms the other related algorithms.
Flower image retrieval is a very important step for computer-aided plant species recognition. In this paper, we propose an efficient segmentation method based on color clustering and domain knowledge to extract flower regions from flower images. For flower retrieval, we use the color histogram of a flower region to characterize the color features of flower and two shape-based features sets, Centroid-Contour Distance (CCD) and Angle Code Histogram (ACH), to characterize the shape features of a flower contour. Experimental results showed that our flower region extraction method based on color clustering and domain knowledge can produce accurate flower regions. Flower retrieval results on a database of 885 flower images collected from 14 plant species showed that our Region-of-Interest (ROI) based retrieval approach using both color and shape features can perform better than a method based on the global color histogram proposed by Swain and Ballard (1991) and a method based on domain knowledge-driven segmentation and color names proposed by Das et al. (1999).
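The Centroid-Contour Distance feature can be sketched directly from its definition: the distance from the region centroid to each sampled contour point. The histogram/binning step used to build the final CCD descriptor is omitted here.

```python
import math

def centroid_contour_distance(contour):
    """CCD signature: distance from the shape centroid
    to each (x, y) point sampled along the contour."""
    cx = sum(p[0] for p in contour) / len(contour)
    cy = sum(p[1] for p in contour) / len(contour)
    return [math.hypot(x - cx, y - cy) for x, y in contour]

# The four corners of a square are equidistant from its centroid (1, 1).
square = [(0, 0), (0, 2), (2, 2), (2, 0)]
print(centroid_contour_distance(square))  # four equal values, each sqrt(2)
```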
Deep convolutional neural networks (DCNNs) are widely used in content-based image retrieval (CBIR) because of their advantages in image feature extraction. However, training deep neural networks requires a large amount of labeled data, which limits their application. Self-supervised learning is a more general approach in unlabeled scenarios. A method of fine-tuning feature extraction networks based on masked learning is proposed. Masked autoencoders (MAE) are used to fine-tune the vision transformer (ViT) model, and the scheme for extracting image descriptors is discussed. The encoder of the MAE uses the ViT to extract global features and performs self-supervised fine-tuning by reconstructing the pixels of masked areas. The method works well on category-level image retrieval datasets, with marked improvements on instance-level datasets. For the instance-level datasets Oxford5k and Paris6k, the retrieval accuracy of the base model is improved by 7% and 17%, respectively, compared to that of the original model.
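The random patch masking at the heart of MAE fine-tuning can be sketched as follows; the 75% mask ratio is the default from the original MAE work and is an assumption here, not a value taken from this abstract.

```python
import random

def random_mask(num_patches, mask_ratio=0.75, seed=0):
    """Split patch indices into masked (hidden from the encoder)
    and visible sets, as in MAE-style pretraining."""
    rng = random.Random(seed)
    masked = set(rng.sample(range(num_patches), int(num_patches * mask_ratio)))
    visible = [i for i in range(num_patches) if i not in masked]
    return sorted(masked), visible

# A 4x4 patch grid: 12 patches are masked, 4 remain visible to the encoder.
masked, visible = random_mask(16)
print(len(masked), len(visible))  # 12 4
```

The encoder sees only the visible patches; the decoder must reconstruct the masked ones, which is the self-supervised signal used for fine-tuning.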
The implementation of content-based image retrieval (CBIR) mainly depends on two key technologies: image feature extraction and image feature matching. In this paper, we extract color features based on the Global Color Histogram (GCH) and texture features based on the Gray Level Co-occurrence Matrix (GLCM). To obtain effective and representative image features, we adopt fuzzy mathematical algorithms in both the color and texture feature extraction processes, and combine the fuzzy color feature vector with the fuzzy texture feature vector to form a comprehensive fuzzy feature vector for the image. Image feature matching mainly depends on the similarity between two image feature vectors. In this paper, we propose a novel similarity measure based on k-Nearest Neighbors (kNN) and a fuzzy mathematical algorithm (SBkNNF). The k nearest neighbor images of the query image are first found in the image dataset according to an appropriate similarity measure. The k similarity values between the query image and its k neighbor images constitute a new k-dimensional fuzzy feature vector for the query image, and the k similarity values between a retrieved image and those same k neighbor images constitute the corresponding k-dimensional fuzzy feature vector for the retrieved image. The similarity between the two k-dimensional fuzzy feature vectors, computed by a fuzzy similarity algorithm, then measures the similarity between the query image and the retrieved image. Extensive experiments are carried out on three datasets: WANG, Corel-5k, and Corel-10k. The experimental results show that our proposed CBIR system outperforms the other CBIR systems in retrieval performance.
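The kNN-based comparison described above can be sketched as follows. The specific fuzzy similarity and fuzzy comparison algorithms are not given in the abstract, so the `similarity` and `compare` functions below are illustrative stand-ins.

```python
def similarity(a, b):
    """Toy similarity in (0, 1] from Euclidean distance
    (an assumption; the paper uses a fuzzy similarity measure)."""
    d = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return 1.0 / (1.0 + d)

def knn_fuzzy_vector(image, neighbors):
    """k-dimensional vector of similarities between one image
    and the query's k nearest neighbor images."""
    return [similarity(image, n) for n in neighbors]

def compare(query, candidate, neighbors):
    """Score the candidate by comparing the two k-dimensional vectors;
    a min/max overlap ratio stands in for the paper's fuzzy algorithm."""
    q = knn_fuzzy_vector(query, neighbors)
    c = knn_fuzzy_vector(candidate, neighbors)
    return (sum(min(a, b) for a, b in zip(q, c))
            / sum(max(a, b) for a, b in zip(q, c)))

neighbors = [[0.1, 0.1], [0.2, 0.0], [0.0, 0.3]]
print(compare([0.1, 0.1], [0.1, 0.1], neighbors))  # 1.0 for identical images
```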
With the massive growth of image data and the rise of cloud computing, which provides cheap storage space and convenient access, more and more users store data on cloud servers. However, quickly querying the expected data while preserving privacy remains a challenge in encrypted image retrieval. Toward this goal, this paper proposes a ciphertext image retrieval method based on SimHash in cloud computing. Firstly, we extract local features of images and cluster the features by K-means. Based on this, a visual word codebook is introduced to represent the feature information of images, and the codebook is hashed to the corresponding fingerprint. Finally, the image feature vector is generated by a SimHash searchable encryption feature algorithm for similarity retrieval. Extensive experiments on two public datasets validate the effectiveness of our method. Besides, the proposed method outperforms one popular searchable encryption scheme, and the results are competitive with the state-of-the-art.
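A minimal SimHash over a bag of visual words can be sketched as below. This shows the fingerprinting idea only, not the paper's searchable-encryption pipeline; the MD5 token hash and 64-bit fingerprint width are illustrative choices.

```python
import hashlib

def simhash(tokens, bits=64):
    """SimHash fingerprint of a bag of visual words: each token votes
    +1/-1 per bit position; positive totals set the output bit."""
    weights = [0] * bits
    for token in tokens:
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i, w in enumerate(weights) if w > 0)

def hamming(a, b):
    """Hamming distance between two fingerprints."""
    return bin(a ^ b).count("1")

# Similar bags of visual words tend to yield nearby fingerprints.
f1 = simhash(["edge", "corner", "blob", "texture"])
f2 = simhash(["edge", "corner", "blob", "color"])
print(hamming(f1, f2))  # distance between the two fingerprints
```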
Digital image collections have rapidly increased along with the development of computer networks. Image retrieval systems were developed to provide an efficient tool for finding, within a database collection, a set of images that matches the user's requirements under similarity evaluations such as image content, edge, and color similarity. Retrieving images based on content, namely color, texture, and shape, is called content-based image retrieval (CBIR). The content is the set of features of an image; these features are extracted and used as the basis for a similarity check between images, with algorithms calculating the similarity between the extracted features. There are two kinds of content-based image retrieval: general image retrieval and application-specific image retrieval. For general image retrieval, the goal of the query is to obtain images containing the same object as the query; such CBIR imitates web search engines for images rather than for text. For application-specific retrieval, the purpose is to match a query image to a collection of images of a specific type, such as fingerprint or X-ray images. In this paper, the general architecture, various functional components, and techniques of CBIR systems are discussed. The CBIR techniques discussed are categorized as CBIR using color, texture, and shape features. This paper also describes a comparative study of color features, texture features, shape features, and combined features (hybrid techniques) in terms of several parameters: precision, recall, and response time.
A novel image retrieval approach based on color features and anisotropic directional information is proposed for content-based image retrieval (CBIR) systems. The color feature is described by the color histogram (CH), which is translation and rotation invariant. However, the CH does not contain spatial information, which is very important for image retrieval. To overcome this shortcoming, the subband energy of the lifting directionlet transform (L-DT) is proposed to describe the directional information; L-DT is characterized by multi-directional and anisotropic basis functions compared with the wavelet transform. A global similarity measure is designed to fuse both the color feature and the anisotropic directionality in the retrieval process. Retrieval experiments using a set of COREL images demonstrate that higher query precision and better visual effects can be achieved.
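The color histogram component, together with a common way to compare two histograms, can be sketched as follows. Histogram intersection is used here as an illustrative similarity; the paper's global measure additionally fuses the L-DT subband energies, which are not shown.

```python
def color_histogram(pixels, bins=4):
    """Normalized histogram over quantized intensity values in [0, 256).
    (A grayscale sketch; a color version would bin each channel.)"""
    hist = [0] * bins
    for v in pixels:
        hist[v * bins // 256] += 1
    total = len(pixels)
    return [h / total for h in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# One pixel falls in each of the four bins.
h = color_histogram([0, 64, 128, 192])
print(histogram_intersection(h, h))  # 1.0
```

Because the histogram ignores pixel positions, it is invariant to translation and rotation, which is exactly the property (and the limitation) the abstract notes.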
In content-based image retrieval (CBIR), primitive image signatures are critical because they represent visual characteristics. Image signatures, which are algorithmically descriptive and accurately recognized visual components, are used to appropriately index and retrieve comparable results. To differentiate an image among qualifying contenders in a category, feature vectors must carry image information such as color, objects, shape, and spatial viewpoints. Previous methods such as sketch-based image retrieval by salient contour (SBIR) and greedy learning of deep Boltzmann machines (GDBM) used spatial information to distinguish between image categories; these require interest points, and their feature analysis gives rise to image detection problems. A model that overcomes this issue and predicts repeating patterns, as well as the series of pixels that establish similarity, is therefore necessary. In this study, a technique called CBIR-similarity measure via artificial neural network interpolation (CBIR-SMANN) is presented. After collecting datasets, the images are resized and subjected to Gaussian filtering in the pre-processing stage; they are then passed to a Hessian detector, which gathers the interest points. Features based on skewness, mean, kurtosis, and standard deviation are extracted and given to an ANN for interpolation, and the interpolated results are stored in a database for retrieval. In the testing stage, the query image is input, subjected to pre-processing and feature extraction, and then fed to the similarity measurement function. Thus, the ANN helps retrieve similar images from the database. CBIR-SMANN has been implemented in Python and evaluated for its performance. Results show that CBIR-SMANN exhibits a high recall value of 78% with a minimum retrieval time of 980 ms, indicating that the proposed model outperforms previous ones.
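The four statistical features named above (mean, standard deviation, skewness, kurtosis) can be computed directly from a pixel sample. This is a generic sketch using population moments, not the CBIR-SMANN implementation.

```python
import math

def moment_features(values):
    """[mean, std, skewness, kurtosis] of a pixel intensity sample,
    using population (biased) moment estimators."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    kurt = sum((v - mean) ** 4 for v in values) / (n * std ** 4)
    return [mean, std, skew, kurt]

# A symmetric sample: mean 3.0, std sqrt(2), skewness 0.0, kurtosis ~1.7.
print(moment_features([1, 2, 3, 4, 5]))
```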
AIM: To present a content-based image retrieval (CBIR) system that supports the classification of breast tissue density and can be used in the processing chain to adapt parameters for lesion segmentation and classification. METHODS: Breast density is characterized by image texture using singular value decomposition (SVD) and histograms. Pattern similarity is computed by a support vector machine (SVM) to separate the four BI-RADS tissue categories. The crucial number of remaining singular values is varied (SVD), and linear, radial, and polynomial kernels are investigated (SVM). The system is supported by a large reference database for training and evaluation, and experiments are based on 5-fold cross-validation. RESULTS: Adopted from the DDSM, MIAS, LLNL, and RWTH datasets, the reference database is composed of over 10000 various mammograms with unified and reliable ground truth. An average precision of 82.14% is obtained using 25 singular values (SVD), a polynomial kernel, and the one-against-one scheme (SVM). CONCLUSION: Breast density characterization using SVD allied with SVM for image retrieval enables the development of a CBIR system that can effectively aid radiologists in their diagnosis.
Funding: This work was funded by the Institute of Information Technology, Vietnam Academy of Science and Technology (project number CSCL02.02/22-23), "Research and Development of Methods for Searching Similar Trademark Images Using Machine Learning to Support Trademark Examination in Vietnam".
Funding: The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University, Kingdom of Saudi Arabia, for funding this work through the Small Research Group Project under Grant Number RGP.1/316/45.
Abstract: Content-Based Image Retrieval (CBIR) and image mining are becoming more important study fields in computer vision due to their wide range of applications in healthcare, security, and other domains. The image retrieval system mainly relies on the efficiency and accuracy of the classification models. This research addresses the challenge of enhancing the image retrieval system by developing a novel approach, EfficientNet-Convolutional Neural Network (EffNet-CNN). The key objective of this research is to evaluate the proposed EffNet-CNN model's performance in image classification, image mining, and CBIR. The novelty of the proposed EffNet-CNN model lies in the integration of different techniques and modifications. The model includes the Mahalanobis distance metric for feature matching, which enhances the similarity measurements, and extends the EfficientNet architecture by incorporating additional convolutional layers, batch normalization, dropout, and pooling layers for improved hierarchical feature extraction. The approach also includes systematic hyperparameter optimization using SGD, performance evaluation on three datasets, and data normalization to improve feature representations. EffNet-CNN is assessed using precision, accuracy, F-measure, and recall metrics across the MS-COCO, CIFAR-10, and CIFAR-100 datasets. The model achieved accuracy values ranging from 90.60% to 95.90% for the MS-COCO dataset, 96.8% to 98.3% for the CIFAR-10 dataset, and 92.9% to 98.6% for the CIFAR-100 dataset. A comparison of the EffNet-CNN model's results with other models reveals the proposed model's superior performance. The results highlight the potential of the proposed EffNet-CNN model for image classification and its usefulness in image mining and CBIR.
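Mahalanobis-distance feature matching, as mentioned above, can be sketched as below. Estimating the covariance from the gallery itself and adding a small ridge term for invertibility are both assumptions; the abstract does not say how EffNet-CNN estimates the covariance:

```python
import numpy as np

def mahalanobis_rank(query, gallery):
    """Rank gallery feature vectors by Mahalanobis distance to the query.

    The covariance is estimated from the gallery features; the 1e-6 ridge
    keeps the matrix invertible (both are illustrative assumptions).
    """
    X = np.asarray(gallery, dtype=float)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    inv_cov = np.linalg.inv(cov)
    diffs = X - np.asarray(query, dtype=float)
    # squared Mahalanobis distance per gallery item
    d2 = np.einsum('ij,jk,ik->i', diffs, inv_cov, diffs)
    order = np.argsort(d2)          # best match first
    return order, np.sqrt(d2)
```

Unlike the Euclidean distance, this metric discounts directions in feature space with high variance, which is why it can sharpen similarity measurements on correlated deep features.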
Funding: Supported by the National Natural Science Foundation of China (No. 61862041).
Abstract: Medical institutions frequently utilize cloud servers for storing digital medical imaging data, aiming to lower both storage and computational expenses. Nevertheless, the reliability of cloud servers as third-party providers is not always guaranteed. To safeguard against the exposure and misuse of personal privacy information, and to achieve secure and efficient retrieval, a secure medical image retrieval method based on a multi-attention mechanism and triplet deep hashing is proposed in this paper (abbreviated as MATDH). Specifically, this method first utilizes the contrast-limited adaptive histogram equalization method applicable to color images to enhance chest X-ray images. Next, a designed multi-attention mechanism focuses on important local features during the feature extraction stage. Moreover, a triplet loss function is utilized to learn discriminative hash codes to construct a compact and efficient triplet deep hashing. Finally, upsampling is used to restore the original resolution of the images during retrieval, thereby enabling more accurate matching. To ensure the security of medical image data, a lightweight image encryption method based on frequency domain encryption is designed to encrypt the chest X-ray images. The findings of the experiment indicate that, in comparison to various advanced image retrieval techniques, the suggested approach improves the precision of feature extraction and retrieval on the COVIDx dataset. Additionally, it offers enhanced protection for the confidentiality of medical images stored in cloud settings and demonstrates strong practicality.
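The triplet-hashing idea above combines two standard ingredients, which can be sketched as follows. The margin value and the sign-threshold quantization are common practice, not details given for MATDH:

```python
def triplet_loss(d_ap, d_an, margin=0.5):
    """Hinge on (anchor-positive) vs (anchor-negative) distances:
    the positive should be closer than the negative by at least `margin`."""
    return max(0.0, d_ap - d_an + margin)

def binarize(continuous_codes):
    """Sign-threshold the network's continuous outputs into hash bits
    (a common relaxation; the abstract does not state MATDH's exact
    quantization step)."""
    return [1 if v >= 0 else 0 for v in continuous_codes]
```

Once binarized, retrieval reduces to Hamming-distance lookups over compact codes, which is what makes deep hashing efficient at scale.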
Funding: Funded by the Deanship of Research and Graduate Studies at King Khalid University through small group research under grant number RGP1/278/45.
Abstract: This paper introduces a novel method for medical image retrieval and classification by integrating a multi-scale encoding mechanism with Vision Transformer (ViT) architectures and a dynamic multi-loss function. The multi-scale encoding significantly enhances the model's ability to capture both fine-grained and global features, while the dynamic loss function adapts during training to optimize classification accuracy and retrieval performance. Our approach was evaluated on the ISIC-2018 and ChestX-ray14 datasets, yielding notable improvements. Specifically, on the ISIC-2018 dataset, our method achieves an F1-Score improvement of +4.84% compared to the standard ViT, with a precision increase of +5.46% for melanoma (MEL). On the ChestX-ray14 dataset, the method delivers an F1-Score improvement of 5.3% over the conventional ViT, with precision gains of +5.0% for pneumonia (PNEU) and +5.4% for fibrosis (FIB). Experimental results demonstrate that our approach outperforms traditional CNN-based models and existing ViT variants, particularly in retrieving relevant medical cases and enhancing diagnostic accuracy. These findings highlight the potential of the proposed method for large-scale medical image analysis, offering improved tools for clinical decision-making through superior classification and case comparison.
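A "dynamic multi-loss" of the kind described above can take many forms; one minimal sketch is a schedule that shifts weight from the classification term to the retrieval term over training. The linear schedule is purely an assumption; the paper's actual adaptation rule is not given in the abstract:

```python
def dynamic_loss(cls_loss, ret_loss, epoch, total_epochs):
    """Blend classification and retrieval losses with a weight that moves
    from 0 to 1 over training (illustrative linear schedule only)."""
    w = epoch / max(1, total_epochs - 1)
    return (1 - w) * cls_loss + w * ret_loss
```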
Abstract: Crime scene investigation (CSI) images are key evidence carriers during criminal investigation, and CSI image retrieval can assist the police in obtaining criminal clues. Moreover, with the rapid development of deep learning, the data-driven paradigm has become the mainstream method of CSI image feature extraction and representation, and in this process, datasets provide effective support for CSI retrieval performance. However, there is a lack of systematic research on CSI image retrieval methods and datasets. Therefore, we present an overview of existing work on one-class and multi-class CSI image retrieval based on deep learning. Based on their technical functionalities and implementation methods, CSI image retrieval approaches are roughly classified into five categories: feature representation, metric learning, generative adversarial networks, autoencoder networks, and attention networks. Furthermore, we analyze the remaining challenges and discuss future work directions in this field.
Abstract: Background: A medical content-based image retrieval (CBIR) system is designed to retrieve images from large imaging repositories that are visually similar to a user's query image. CBIR is widely used in evidence-based diagnosis, teaching, and research. Although retrieval accuracy has largely improved, there has been limited development toward visualizing important image features that indicate the similarity of retrieved images. Despite the prevalence of 3D volumetric data in medical imaging such as computed tomography (CT), current CBIR systems still rely on 2D cross-sectional views for the visualization of retrieved images. Such 2D visualization requires users to browse through the image stacks to confirm the similarity of the retrieved images and often involves mental reconstruction of 3D information, including the size, shape, and spatial relations of multiple structures. This process is time-consuming and reliant on users' experience. Methods: In this study, we proposed an importance-aware 3D volume visualization method. The rendering parameters were automatically optimized to maximize the visibility of important structures that were detected and prioritized in the retrieval process. We then integrated the proposed visualization into a CBIR system, thereby complementing the 2D cross-sectional views for relevance feedback and further analyses. Results: Our preliminary results demonstrate that 3D visualization can provide additional information using multimodal positron emission tomography and computed tomography (PET-CT) images of a non-small cell lung cancer dataset.
Abstract: This paper presents an approach to improve medical image retrieval, particularly for brain tumors, by addressing the gap between low-level visual and high-level perceived content in MRI, X-ray, and CT scans. Traditional methods based on color, shape, or texture are less effective. The proposed solution uses machine learning to handle high-dimensional image features, reducing computational complexity and mitigating issues caused by artifacts or noise. It employs a genetic algorithm for feature reduction and a hybrid residual UNet (HResUNet) model for Region-of-Interest (ROI) segmentation and classification, with enhanced image preprocessing. The study examines various loss functions, finding that a hybrid loss function yields superior results and that the GA-HResUNet model outperforms the HResUNet. Comparative analysis with state-of-the-art models shows a 4% improvement in retrieval accuracy.
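Genetic-algorithm feature reduction, as used above, searches over 0/1 masks selecting feature subsets. The following is a generic sketch with truncation selection, one-point crossover, and bit-flip mutation; the paper's actual operators and fitness function are not specified in the abstract:

```python
import random

def ga_feature_select(fitness, n_features, pop=20, gens=30, p_mut=0.05, seed=0):
    """Tiny genetic algorithm for feature-subset selection.

    fitness: callable mapping a 0/1 mask (list) -> score to maximize
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[: pop // 2]                    # truncation selection
        children = []
        while len(children) < pop - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # mutation
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

In the retrieval setting, the fitness would typically be validation accuracy of a classifier trained on the masked features, penalized by the number of features kept.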
Abstract: The task of indoor visual localization, utilizing camera visual information for user pose calculation, is a core component of Augmented Reality (AR) and Simultaneous Localization and Mapping (SLAM). Existing indoor localization technologies generally use scene-specific 3D representations or are trained on specific datasets, making it challenging to balance accuracy and cost when applied to new scenes. Addressing this issue, this paper proposes a universal indoor visual localization method based on efficient image retrieval. Initially, a Multi-Layer Perceptron (MLP) is employed to aggregate features from intermediate layers of a convolutional neural network, obtaining a global representation of the image. This approach ensures accurate and rapid retrieval of reference images. Subsequently, a new mechanism using Random Sample Consensus (RANSAC) is designed to resolve the relative pose ambiguity caused by essential matrix decomposition based on the five-point method. Finally, the absolute pose of the queried user image is computed, thereby achieving indoor user pose estimation. The proposed indoor localization method is characterized by its simplicity, flexibility, and excellent cross-scene generalization. Experimental results demonstrate a positioning error of 0.09 m and 2.14° on the 7Scenes dataset, and 0.15 m and 6.37° on the 12Scenes dataset. These results convincingly illustrate the outstanding performance of the proposed indoor localization method.
Funding: The National High Technology Research and Development Program of China (863 Program) (No. 2002AA413420).
Abstract: In order to narrow the semantic gap existing in content-based image retrieval (CBIR), a novel retrieval technology called auto-extended multi-query examples (AMQE) is proposed. It expands the single query image used in traditional image retrieval into multiple query examples so as to include more image features related to semantics. By retrieving images for each of the multiple query examples and integrating the retrieval results, more relevant images can be obtained. The property of the recall-precision curve of a general retrieval algorithm and the K-means clustering method are used to realize the expansion according to the distance of the image features of the initially retrieved images. The experimental results demonstrate that the AMQE technology can greatly improve the recall and precision of the original algorithms.
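The K-means step in the expansion above can be sketched minimally: cluster the feature vectors of the top-ranked results, and the cluster centers then serve as the auto-extended query examples. The deterministic first-k initialization is a simplification, and the recall-precision-curve criterion from the abstract is omitted:

```python
def kmeans(points, k, iters=20):
    """Minimal k-means over feature tuples; the returned centers can serve
    as additional query examples in the AMQE idea."""
    centers = [tuple(p) for p in points[:k]]   # simple deterministic init
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[nearest].append(p)
        centers = [tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers
```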
Abstract: This paper describes a new method for active learning in content-based image retrieval. The proposed method first uses support vector machine (SVM) classifiers to learn an initial query concept. Then the proposed active learning scheme employs a similarity measure to check the current version space and selects the images with maximum expected information gain to solicit the user's labels. Finally, the learned query is refined based on the user's further feedback. With the combination of the SVM classifier and the similarity measure, the proposed method can alleviate the model bias existing in each of them. Our experiments on several query concepts show that the proposed method can learn the user's query concept quickly and effectively with only several iterations.
Abstract: Pill image recognition is an important field in computer vision. It has become a vital technology in healthcare and pharmaceuticals due to the necessity for precise medication identification to prevent errors and ensure patient safety. This survey examines the current state of pill image recognition, focusing on advancements, methodologies, and the challenges that remain unresolved. It provides a comprehensive overview of traditional image processing-based, machine learning-based, deep learning-based, and hybrid methods, and aims to explore the ongoing difficulties in the field. We summarize and classify the methods used in each article, compare the strengths and weaknesses of these four families of methods, and review benchmark datasets for pill image recognition. Additionally, we compare the performance of proposed methods on popular benchmark datasets. The survey also draws on recent advancements, such as Transformer models, and cutting-edge technologies such as Augmented Reality (AR), to discuss potential research directions and conclude the review. By offering a holistic perspective, this paper aims to serve as a valuable resource for researchers and practitioners striving to advance the field of pill image recognition.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61170145, 61373081, 61402268, 61401260, 61572298), the Technology and Development Project of Shandong (No. 2013GGX10125), the Natural Science Foundation of Shandong, China (Nos. BS2014DX006, ZR2014FM012), and the Taishan Scholar Project of Shandong, China.
Abstract: This paper presents an efficient image feature representation method, namely the angle structure descriptor (ASD), which is built on the angle structures of images. According to the diversity in directions, angle structures are defined in local blocks. Combining color information in HSV color space, we use angle structures to describe images. The internal correlations between neighboring pixels in angle structures are explored to form a feature vector. With angle structures as bridges, ASD extracts image features by integrating multiple kinds of information as a whole, such as color, texture, shape, and spatial layout. In addition, the proposed algorithm is efficient for image retrieval without any clustering implementation or model training. Experimental results demonstrate that ASD outperforms the other related algorithms.
Funding: Projects (Nos. 60302012, 60202002) supported by the National Natural Science Foundation of China and the Research Grants Council of the Hong Kong Special Administrative Region, China (No. PolyU 5119.01E).
Abstract: Flower image retrieval is a very important step for computer-aided plant species recognition. In this paper, we propose an efficient segmentation method based on color clustering and domain knowledge to extract flower regions from flower images. For flower retrieval, we use the color histogram of a flower region to characterize the color features of the flower and two shape-based feature sets, Centroid-Contour Distance (CCD) and Angle Code Histogram (ACH), to characterize the shape features of the flower contour. Experimental results showed that our flower region extraction method based on color clustering and domain knowledge can produce accurate flower regions. Flower retrieval results on a database of 885 flower images collected from 14 plant species showed that our Region-of-Interest (ROI) based retrieval approach using both color and shape features can perform better than a method based on the global color histogram proposed by Swain and Ballard (1991) and a method based on domain knowledge-driven segmentation and color names proposed by Das et al. (1999).
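The Centroid-Contour Distance signature mentioned above is straightforward to compute: measure the distance from the region centroid to each contour point. Normalizing by the maximum distance for scale invariance is a common convention, assumed here rather than taken from the paper:

```python
import math

def centroid_contour_distance(contour):
    """CCD shape signature: distances from the centroid of the contour
    points to each contour point, normalized by the maximum distance."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    d = [math.hypot(x - cx, y - cy) for x, y in contour]
    m = max(d)
    return [v / m for v in d]   # scale-normalized signature
```

For a circle the signature is flat; petals produce periodic peaks, which is what makes CCD discriminative for flower contours.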
基金the Project of Introducing Urgently Needed Talents in Key Supporting Regions of Shandong Province,China(No.SDJQP20221805)。
Abstract: Deep convolutional neural networks (DCNNs) are widely used in content-based image retrieval (CBIR) because of their advantages in image feature extraction. However, the training of deep neural networks requires a large amount of labeled data, which limits their application. Self-supervised learning is a more general approach in unlabeled scenarios. A method of fine-tuning feature extraction networks based on masked learning is proposed. Masked autoencoders (MAE) are used to fine-tune the vision transformer (ViT) model. In addition, the scheme for extracting image descriptors is discussed. The encoder of the MAE uses the ViT to extract global features and performs self-supervised fine-tuning by reconstructing the pixels of masked areas. The method works well on category-level image retrieval datasets, with marked improvements on instance-level datasets. For the instance-level datasets Oxford5k and Paris6k, the retrieval accuracy of the base model is improved by 7% and 17%, respectively, compared to that of the original model.
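The core of MAE-style masked learning is hiding most image patches and reconstructing them. The masking step can be sketched as below; the 0.75 ratio follows the original MAE paper and is an assumption here, since this abstract does not state it:

```python
import random

def random_mask(num_patches, mask_ratio=0.75, seed=0):
    """Pick which patches to hide: the encoder sees only the kept patches
    and the decoder reconstructs the pixels of the masked ones."""
    rng = random.Random(seed)
    n_mask = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), n_mask))
    kept = [i for i in range(num_patches) if i not in masked]
    return kept, sorted(masked)
```

Because the encoder processes only the kept 25% of patches, pre-training is also substantially cheaper than running the full ViT on every patch.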
Funding: This research was supported by the National Natural Science Foundation of China (Grant Nos. 61702310 and 61401260).
Abstract: The implementation of content-based image retrieval (CBIR) mainly depends on two key technologies: image feature extraction and image feature matching. In this paper, we extract color features based on the Global Color Histogram (GCH) and texture features based on the Gray Level Co-occurrence Matrix (GLCM). In order to obtain effective and representative image features, we adopt fuzzy mathematical algorithms in both the color and texture feature extraction processes, and combine the fuzzy color feature vector with the fuzzy texture feature vector to form a comprehensive fuzzy feature vector for the image. Image feature matching mainly depends on the similarity between two image feature vectors. We propose a novel similarity measure method based on k-Nearest Neighbors (kNN) and fuzzy mathematics (SBkNNF). First, the k nearest neighbor images of the query image are found in the image dataset according to an appropriate similarity measure. The k similarity values between the query image and its k neighborhood images then constitute a new k-dimensional fuzzy feature vector for the query image, and the k similarity values between a retrieved image and the same k neighborhood images constitute the corresponding k-dimensional fuzzy feature vector for the retrieved image. Finally, the similarity between these two k-dimensional fuzzy feature vectors, computed with a fuzzy similarity algorithm, measures the similarity between the query image and the retrieved image. Extensive experiments are carried out on three datasets: the WANG, Corel-5k, and Corel-10k datasets. The experimental results show that our proposed CBIR system outperforms the other CBIR systems in retrieval performance.
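The GLCM texture features mentioned above count how often pairs of gray levels co-occur at a given pixel offset. A minimal sketch, with contrast and energy as two classic derived descriptors, is below; the offsets and number of gray levels used in the paper are not stated, so the values here are illustrative:

```python
import numpy as np

def glcm(image, levels=8, dx=1, dy=0):
    """Gray level co-occurrence matrix for one offset (dx, dy), normalized
    to a probability table, plus the classic contrast and energy stats."""
    img = np.asarray(image)
    m = np.zeros((levels, levels))
    h, w = img.shape
    for r in range(h - dy):
        for c in range(w - dx):
            m[img[r, c], img[r + dy, c + dx]] += 1
    m /= m.sum()
    i, j = np.indices((levels, levels))
    contrast = float(((i - j) ** 2 * m).sum())   # large for abrupt level changes
    energy = float((m ** 2).sum())               # large for uniform textures
    return m, contrast, energy
```

In practice several offsets (horizontal, vertical, diagonal) are computed and their statistics concatenated into the texture feature vector.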
Funding: This work is supported by the National Natural Science Foundation of China (No. 61772561), the Key Research & Development Plan of Hunan Province (No. 2018NK2012), the Science Research Projects of Hunan Provincial Education Department (Nos. 18A174, 18C0262), and the Science & Technology Innovation Platform and Talent Plan of Hunan Province (2017TP1022). This work is implemented at the 2011 Collaborative Innovation Center for Development and Utilization of Finance and Economics Big Data Property, Universities of Hunan Province, Open Project (No. 20181901CRP04).
Abstract: With the massive growth of image data and the rise of cloud computing, which can provide cheap storage space and convenient access, more and more users store data on cloud servers. However, quickly querying the expected data while preserving privacy is still challenging in encrypted image retrieval. Toward this goal, this paper proposes a ciphertext image retrieval method based on SimHash in cloud computing. Firstly, we extract local features of images and then cluster the features by K-means. Based on this, a visual word codebook is introduced to represent the feature information of images, and the codebook is hashed to the corresponding fingerprint. Finally, the image feature vector is generated by a SimHash searchable encryption feature algorithm for similarity retrieval. Extensive experiments on two public datasets validate the effectiveness of our method. Besides, the proposed method outperforms one popular searchable encryption scheme, and the results are competitive with the state-of-the-art.
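SimHash, as used above, turns a bag of visual word IDs into a short fingerprint such that similar bags yield fingerprints with small Hamming distance. A minimal sketch follows; the bit width and the exact hashing of the codebook are assumptions, not details from the paper:

```python
def simhash(tokens, bits=16):
    """SimHash fingerprint of a bag of visual word IDs: each token votes
    +1/-1 on every bit position, and the sign of the tally sets the bit."""
    v = [0] * bits
    for tok in tokens:
        h = hash(tok)
        for i in range(bits):
            v[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if v[i] > 0)

def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count('1')
```

Because the dominant visual words decide each bit, two histograms differing in a single rare word typically map to the same (or a nearby) fingerprint, which is what makes Hamming-distance lookup a usable similarity proxy.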
Abstract: Digital image collections have rapidly increased along with the development of computer networks. Image retrieval systems were developed to provide an efficient tool for retrieving, from a collection of images in a database, a set of images that matches the user's requirements in similarity evaluations such as image content, edge, and color similarity. Retrieving images based on content such as color, texture, and shape is called content-based image retrieval (CBIR). The content is actually a feature of an image, and these features are extracted and used as the basis for a similarity check between images; algorithms are used to calculate the similarity between the extracted features. There are two kinds of content-based image retrieval: general image retrieval and application-specific image retrieval. For general image retrieval, the goal of the query is to obtain images with the same object as the query; such CBIR imitates web search engines for images rather than for text. For application-specific retrieval, the purpose is to match a query image to a collection of images of a specific type, such as fingerprint and X-ray images. In this paper, the general architecture, various functional components, and techniques of CBIR systems are discussed. The CBIR techniques discussed in this paper are categorized as CBIR using color, CBIR using texture, and CBIR using shape features. This paper also describes a comparison study of color, texture, shape, and combined features (hybrid techniques) in terms of several parameters: precision, recall, and response time.
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2007AA12Z136, 2007AA12Z223), the National Basic Research Program of China (973 Program) (2006CB705707), the National Natural Science Foundation of China (60672126, 60607010), and the Program for Cheung Kong Scholars and Innovative Research Team in University (IRT0645).
Abstract: A novel image retrieval approach based on color features and anisotropic directional information is proposed for content-based image retrieval (CBIR) systems. The color feature is described by the color histogram (CH), which is translation and rotation invariant. However, the CH does not contain spatial information, which is very important for image retrieval. To overcome this shortcoming, the subband energy of the lifting directionlet transform (L-DT) is proposed to describe the directional information, in which L-DT is characterized by multi-direction and anisotropic basis functions compared with the wavelet transform. A global similarity measure is designed to implement the fusion of both the color feature and the anisotropic directionality for the retrieval process. Retrieval experiments using a set of COREL images demonstrate that higher query precision and better visual effects can be achieved.
Abstract: In content-based image retrieval (CBIR), primitive image signatures are critical because they represent the visual characteristics. Image signatures, which are algorithmically descriptive and accurately recognized visual components, are used to appropriately index and retrieve comparable results. To differentiate an image within the category of qualifying contenders, feature vectors must carry image information such as color, objects, shape, and spatial viewpoints. Previous methods such as sketch-based image retrieval by salient contour (SBIR) and greedy learning of deep Boltzmann machines (GDBM) used spatial information to distinguish between image categories; these approaches require interest points, and their feature analysis raises image detection problems. Thus, a model that overcomes this issue by predicting repeating patterns and the series of pixels that indicate similarity is necessary. In this study, a technique called CBIR via a similarity measure with artificial neural network interpolation (CBIR-SMANN) is presented. After collecting the dataset, images are resized and subjected to Gaussian filtering in the pre-processing stage, and interest points are then gathered with a Hessian detector. Features based on skewness, mean, kurtosis, and standard deviation are extracted and given to an ANN for interpolation, and the interpolated results are stored in a database for retrieval. In the testing stage, the query image is input, subjected to pre-processing and feature extraction, and then fed to the similarity measurement function. Thus, the ANN helps retrieve similar images from the database. CBIR-SMANN has been implemented in Python and evaluated for its performance. Results show that CBIR-SMANN exhibited a high recall value of 78% with a minimum retrieval time of 980 ms, showing that the proposed model outperforms previous ones.
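The four statistical features named above (mean, standard deviation, skewness, kurtosis) can be computed as below. Population moments and excess kurtosis (Fisher's definition, subtracting 3) are assumed, since the abstract does not specify the conventions:

```python
import math

def stat_features(values):
    """Mean, std, skewness, and excess kurtosis of a list of pixel values:
    the four-element feature vector fed to the ANN in CBIR-SMANN."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    if std == 0:
        return mean, 0.0, 0.0, 0.0      # constant region: no higher moments
    skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    kurt = sum((v - mean) ** 4 for v in values) / (n * std ** 4) - 3
    return mean, std, skew, kurt
```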
Funding: Supported by CNPq, Brazil (Grants 306193/2007-8, 471518/2007-7, 307373/2006-1, and 484893/2007-6), by FAPEMIG (Grant PPM 347/08), and by CAPES. The IRMA project is funded by the German Research Foundation (DFG), Le 1108/4 and Le 1108/9.
Abstract: AIM: To present a content-based image retrieval (CBIR) system that supports the classification of breast tissue density and can be used in the processing chain to adapt parameters for lesion segmentation and classification. METHODS: Breast density is characterized by image texture using singular value decomposition (SVD) and histograms. Pattern similarity is computed by a support vector machine (SVM) to separate the four BI-RADS tissue categories. The crucial number of remaining singular values is varied (SVD), and linear, radial, and polynomial kernels are investigated (SVM). The system is supported by a large reference database for training and evaluation. Experiments are based on 5-fold cross-validation. RESULTS: Adopted from the DDSM, MIAS, LLNL, and RWTH datasets, the reference database is composed of over 10,000 mammograms with unified and reliable ground truth. An average precision of 82.14% is obtained using 25 singular values (SVD), a polynomial kernel, and the one-against-one strategy (SVM). CONCLUSION: Breast density characterization using SVD allied with SVM for image retrieval enables the development of a CBIR system that can effectively aid radiologists in their diagnosis.
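Using the largest singular values of an image patch as a texture descriptor, as in the METHODS section above, can be sketched as follows. Normalizing the spectrum to sum to 1 is one common convention and an assumption here; the paper's contribution is precisely in varying the number k of values kept:

```python
import numpy as np

def svd_texture(patch, k=5):
    """Texture descriptor: the k largest singular values of a patch,
    normalized to sum to 1. Smooth patches concentrate energy in the
    first value; busy textures spread it across the spectrum."""
    s = np.linalg.svd(np.asarray(patch, dtype=float), compute_uv=False)
    s = s[:k]
    return s / s.sum()
```

These descriptors would then be the inputs to the SVM that separates the four BI-RADS density categories.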