A crime scene investigation (CSI) image is a key evidence carrier during criminal investigation, and CSI image retrieval can assist the police in obtaining criminal clues. Moreover, with the rapid development of deep learning, the data-driven paradigm has become the mainstream method of CSI image feature extraction and representation, and in this process, datasets provide effective support for CSI retrieval performance. However, there is a lack of systematic research on CSI image retrieval methods and datasets. Therefore, we present an overview of existing works on one-class and multi-class CSI image retrieval based on deep learning. Based on their technical functionalities and implementation methods, CSI image retrieval approaches are roughly classified into five categories: feature representation, metric learning, generative adversarial networks, autoencoder networks, and attention networks. Furthermore, we analyze the remaining challenges and discuss future work directions in this field.
Purpose: To detect small diagnostic signals such as lung nodules in chest radiographs, radiologists magnify a region of interest using linear interpolation methods. However, such methods tend to generate over-smoothed images with artifacts that can make interpretation difficult. The purpose of this study was to investigate the effectiveness of super-resolution methods for improving the image quality of magnified chest radiographs. Materials and Methods: A total of 247 chest X-rays were sampled from the JSRT database, then divided into 93 training cases without nodules and 154 test cases with lung nodules. We first trained two types of super-resolution methods, sparse-coding super-resolution (ScSR) and the super-resolution convolutional neural network (SRCNN). With the trained super-resolution methods, a high-resolution image was then reconstructed from a low-resolution image that was down-sampled from the original test image. We compared the image quality of the super-resolution methods and the linear interpolations (nearest-neighbor and bilinear interpolation). For quantitative evaluation, we measured two image quality metrics: peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). For comparative evaluation of the super-resolution methods, we measured the computation time per image. Results: The PSNRs and SSIMs for the ScSR and SRCNN schemes were significantly higher than those of the linear interpolation methods. Conclusion: Super-resolution methods provide significantly better image quality than linear interpolation methods for magnified chest radiograph images. Of the two tested schemes, the SRCNN scheme processed the images fastest; thus, SRCNN could be clinically superior for processing radiographs in terms of both image quality and processing speed.
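PSNR, one of the two quality metrics reported above, has a simple closed form. The following is an illustrative sketch, not the study's actual evaluation code; the 8-bit maximum of 255 and the toy images are assumptions:

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy example: an 8-bit image and a copy offset by a constant error of 10.
ref = np.zeros((4, 4), dtype=np.uint8)
degraded = np.full((4, 4), 10, dtype=np.uint8)
print(round(psnr(ref, degraded), 2))  # 28.13
```

A higher PSNR means the magnified image is closer to the reference; SSIM additionally accounts for local structure.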
The Advanced Geosynchronous Radiation Imager (AGRI) is a mission-critical instrument for the Fengyun series of satellites. AGRI acquires full-disk images every 15 min and views East Asia every 5 min through 14 spectral bands, enabling the detection of highly variable aerosol optical depth (AOD). Quantitative retrieval of AOD has hitherto been challenging, especially over land. In this study, an AOD retrieval algorithm is proposed that combines deep learning and transfer learning. The algorithm uses core concepts from both the Dark Target (DT) and Deep Blue (DB) algorithms to select features for the machine learning (ML) algorithm, allowing for AOD retrieval at 550 nm over both dark and bright surfaces. The algorithm consists of two steps: ① a baseline deep neural network (DNN) with skip connections is developed using 10 min Advanced Himawari Imager (AHI) AODs as the target variable, and ② sun-photometer AODs from 89 ground-based stations are used to fine-tune the DNN parameters. Out-of-station validation shows that the retrieved AOD attains high accuracy, characterized by a coefficient of determination (R²) of 0.70, a mean bias error (MBE) of 0.03, and a percentage of data within the expected error (EE) of 70.7%. A sensitivity study reveals that the top-of-atmosphere reflectance at 650 and 470 nm and the surface reflectance at 650 nm are the two largest sources of uncertainty impacting the retrieval. In a case study of monitoring an extreme aerosol event, the AGRI AOD is found to capture the detailed temporal evolution of the event. This work demonstrates the superiority of the transfer-learning technique in satellite AOD retrievals and the applicability of the retrieved AGRI AOD in monitoring extreme pollution events.
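The skip connections in the baseline DNN can be illustrated with a minimal NumPy forward pass. This is only a sketch: the layer width, ReLU activation, and random weights are invented for illustration and are not the paper's architecture or training schedule:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    return np.maximum(x @ w + b, 0.0)  # fully connected layer with ReLU

# A minimal residual (skip-connection) block: output = ReLU(layer(x)) + x.
# The skip connection adds the block's input back to its output, which eases
# gradient flow in deep networks.
def skip_block(x, w, b):
    return dense(x, w, b) + x

features = rng.normal(size=(8, 16))      # 8 samples, 16 input features (assumed)
w = rng.normal(size=(16, 16)) * 0.1
b = np.zeros(16)
out = skip_block(features, w, b)
print(out.shape)  # (8, 16)
```

Transfer learning would then keep these weights as initialization and continue training on the smaller sun-photometer dataset.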
Fine-grained image classification is a challenging research topic because of the high degree of similarity among categories and the high degree of dissimilarity within a specific category caused by different poses and scales. A cultural heritage image is a fine-grained image because, in most cases, the images share a high degree of similarity, so distinguishing cultural heritage architecture using classification techniques alone may be difficult. This study proposes a cultural heritage content retrieval method using adaptive deep learning for fine-grained image retrieval. The key contribution of this research is a retrieval model that can handle incremental streams of new categories while maintaining its past performance on old categories and not losing the old categorization of a cultural heritage image. The goal of the proposed method is to perform a retrieval task across classes. Incremental learning for new classes was conducted to reduce the re-training process; in this step, the original classes are not needed for re-training, which we call an adaptive deep learning technique. Cultural heritage, in the case of Thai archaeological site architecture, was retrieved through machine learning and image processing. We analyze the experimental results of incremental learning for fine-grained images using images of Thai archaeological site architecture from world heritage provinces in Thailand, which have similar architecture. Using a fine-grained image retrieval technique for this group of cultural heritage images in a database can solve the problem of a high degree of similarity among categories and a high degree of dissimilarity within a specific category. The proposed method for retrieving the correct image from a database delivers an average accuracy of 85 percent. Adaptive deep learning for fine-grained image retrieval was used to retrieve cultural heritage content, and it outperformed state-of-the-art methods in fine-grained image retrieval.
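One simple way to realize retrieval that absorbs new categories without re-training on the old ones is a nearest-class-mean index over feature vectors. This sketch is an assumption for illustration (class names and 2-D features are invented), not the paper's adaptive deep learning method:

```python
import numpy as np

class NearestMeanRetriever:
    """Adds a new class by storing its feature mean; existing class statistics
    are untouched, so no re-training over the original data is required."""
    def __init__(self):
        self.means = {}

    def add_class(self, label, features):
        self.means[label] = np.mean(features, axis=0)

    def query(self, feature):
        # Return the class whose mean is closest to the query feature.
        return min(self.means, key=lambda c: np.linalg.norm(feature - self.means[c]))

retriever = NearestMeanRetriever()
retriever.add_class("stupa", np.array([[1.0, 0.0], [0.9, 0.1]]))
retriever.add_class("prang", np.array([[0.0, 1.0], [0.1, 0.9]]))
# A new class arrives later; old classes are not recomputed.
retriever.add_class("wihan", np.array([[1.0, 1.0]]))
print(retriever.query(np.array([0.95, 0.05])))  # stupa
```

The design choice matters for incremental learning: adding a class is O(new samples) and never touches the stored means of earlier classes.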
In recent days, image retrieval has become a tedious process as image databases have grown very large. The introduction of Machine Learning (ML) and Deep Learning (DL) has made this process more comfortable. In these approaches, pair-wise label similarity is used to find matching images in the database, but this method suffers from limited expressiveness of the learned codes and weak handling of misclassified images. To get rid of the above problems, a novel triplet-based label that incorporates a context-spatial similarity measure is proposed. A Point Attention Based Triplet Network (PABTN) is introduced to learn codes that give maximum discriminative ability. To improve ranking performance, correlating resolutions for the classification, triplet labels based on findings, a spatial-attention mechanism with Region Of Interest (ROI), and a new triplet cross-entropy loss containing small trial information loss are used. The experimental results show that the proposed technique exhibits better results in terms of mean Reciprocal Rank (mRR) and mean Average Precision (mAP) on the CIFAR-10 and NUS-WIDE datasets.
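Triplet labels of the kind described above are typically trained with a hinge-style triplet loss: pull the positive closer to the anchor than the negative by at least a margin. A minimal sketch (the margin value and vectors are assumptions, and the paper's triplet cross-entropy loss differs in detail):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge triplet loss on Euclidean distances: zero once the positive is
    closer to the anchor than the negative by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # same class: distance 0.1
n = np.array([1.0, 0.0])   # different class: distance 1.0
print(triplet_loss(a, p, n))  # 0.0 -> this triplet already satisfies the margin
print(round(triplet_loss(a, n, p), 1))  # 1.1 -> a violating triplet is penalized
```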
The use of massive image databases has increased drastically over the past few years due to the evolution of multimedia technology. Image retrieval has become one of the vital tools in image processing applications, and Content-Based Image Retrieval (CBIR) has been widely used in varied applications. However, the results produced by a single image feature are not satisfactory, so multiple image features are often used to attain better results; fast and effective searching for relevant images in a database then becomes a challenging task. A previous CBIR system used a combined feature extraction technique based on the color auto-correlogram, Rotation-Invariant Uniform Local Binary Patterns (RULBP), and local energy. However, that system does not provide significant results in terms of recall and precision, and the computational complexity of existing CBIR systems is high. To handle the above-mentioned issues, the Gray Level Co-occurrence Matrix (GLCM) with a Deep Learning based Enhanced Convolution Neural Network (DLECNN) is proposed in this work. The proposed framework includes noise reduction using histogram equalization, feature extraction using the GLCM, similarity matching using the Hierarchical and Fuzzy c-Means (HFCM) algorithm, and image retrieval using the DLECNN algorithm. Histogram equalization is used for image enhancement, yielding an enhanced image with a uniform histogram. The GLCM method is then used to extract features such as shape, texture, colour, annotations, and keywords. The HFCM similarity measure computes the query image vector's similarity index against every database image. To enhance the performance of this image retrieval approach, the DLECNN algorithm is proposed to retrieve more accurate features of the image. The proposed GLCM+DLECNN algorithm provides better results with high accuracy, precision, recall, and F-measure, and lower complexity. The experimental results clearly show that the proposed system provides efficient image retrieval for a given query image.
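The GLCM step can be sketched directly: the matrix counts how often pairs of gray levels co-occur at a fixed pixel offset. This toy implementation (two gray levels, right-neighbour offset) is illustrative, not the proposed system's code:

```python
import numpy as np

def glcm(image, levels, offset=(0, 1)):
    """Gray-level co-occurrence counts for one pixel offset (default: right neighbour)."""
    dr, dc = offset
    m = np.zeros((levels, levels), dtype=np.int64)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[image[r, c], image[r2, c2]] += 1  # count the (level, level) pair
    return m

img = np.array([[0, 1, 1],
                [0, 0, 1]])
print(glcm(img, levels=2))
# [[1 2]
#  [0 1]]
```

Texture statistics such as contrast, energy, and homogeneity are then computed from the normalized matrix.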
Classifying the visual features in images to retrieve a specific image is a significant problem within the computer vision field, especially when dealing with historical faded color images. There have therefore been many efforts to automate the classification operation and retrieve similar images accurately. To reach this goal, we developed a VGG19 deep convolutional neural network to extract the visual features from the images automatically. Then, the distances among the extracted feature vectors are measured and a similarity score is generated using a Siamese deep neural network. The Siamese model was first built and trained from scratch, but it did not generate high evaluation metrics; thus, we rebuilt it from the VGG19 pre-trained deep learning model to generate higher evaluation metrics. Afterward, three different distance metrics combined with the Sigmoid activation function were tested to find the most accurate method for measuring the similarities among the retrieved images; the highest evaluation parameters were generated using the Cosine distance metric. Moreover, the Graphics Processing Unit (GPU) was utilized to run the code instead of the Central Processing Unit (CPU), which further optimized execution by expediting both the training and the retrieval time. After extensive experimentation, we reached a satisfactory solution, recording F-scores of 0.98 for classification and 0.99 for retrieval.
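The cosine metric that performed best here has a one-line definition over feature vectors; a minimal sketch with invented feature values (not the study's VGG19 embeddings):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors; 1.0 means identical direction."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

query_features = np.array([0.2, 0.8, 0.4])
candidate = np.array([0.1, 0.9, 0.35])
print(round(cosine_similarity(query_features, query_features), 4))  # 1.0
print(cosine_similarity(query_features, candidate) > 0.95)  # True
```

Because it depends only on direction, cosine similarity is robust to overall brightness or contrast scaling of the feature vector, which may be why it suited faded historical images.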
Apricot has a long history of cultivation and has many varieties and types. Traditional variety identification methods are time-consuming and labor-intensive, posing grand challenges to apricot resource management. Tool development in this regard will help researchers quickly identify variety information. This study photographed apricot fruits outdoors and indoors and constructed a dataset that can precisely classify the fruits using a U-net model (F-score: 99%), which helps to obtain the fruit's size, shape, and color features. Meanwhile, a variety search engine was constructed, which can search for and identify a variety from the database according to the above features. Besides, a mobile and web application (ApricotView) was developed, and the construction mode can also be applied to other varieties of fruit trees. Additionally, we collected four difficult-to-identify seed datasets and used the VGG16 model for training, with an accuracy of 97%, which provided an important basis for ApricotView. To address the difficulties in data collection bottlenecking apricot phenomics research, we developed the first apricot database platform of its kind (ApricotDIAP, http://apricotdiap.com/) to accumulate, manage, and publicize scientific data on apricot.
The recent developments in Multimedia Internet of Things (MIoT) devices, empowered with Natural Language Processing (NLP) models, seem to be a promising future for smart devices. NLP plays an important role in industrial applications such as speech understanding, emotion detection, home automation, and so on. If an image needs to be captioned, then the objects in that image, their actions and connections, and any salient feature that remains under-projected or missing from the image should be identified. The aim of the image captioning process is to generate a caption for an image: the image should be provided with one of the most significant and detailed descriptions possible, one that is syntactically as well as semantically correct. In this scenario, a computer vision model is used to identify the objects and NLP approaches are followed to describe the image. The current study develops a Natural Language Processing with Optimal Deep Learning Enabled Intelligent Image Captioning System (NLPODL-IICS). The aim of the presented NLPODL-IICS model is to produce a proper description for an input image. To attain this, the proposed NLPODL-IICS follows two stages: encoding and decoding. Initially, at the encoding side, the proposed NLPODL-IICS model makes use of Hunger Games Search (HGS) with the Neural Search Architecture Network (NASNet) model, which represents the input data appropriately by embedding it into a vector of predefined length. During the decoding phase, the Chimp Optimization Algorithm (COA) with a deeper Long Short Term Memory (LSTM) approach is followed to concatenate the description sentences produced by the method. The application of the HGS and COA algorithms helps accomplish proper parameter tuning for the NASNet and LSTM models, respectively. The proposed NLPODL-IICS model was experimentally validated with the help of two benchmark datasets. A widespread comparative analysis confirmed the superior performance of the NLPODL-IICS model over other models.
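The decoding stage of an encoder-decoder captioner is usually a step-by-step loop over predicted words. The sketch below replaces the LSTM decoder with a hypothetical stub (`toy_decoder`) purely to show the loop's shape; it is not the NLPODL-IICS model:

```python
def greedy_caption(decode_step, max_len=5, start="<s>", end="</s>"):
    """Greedy decoding loop: at each step the decoder proposes the most likely
    next word given the words generated so far (decode_step stands in for the LSTM)."""
    words = [start]
    for _ in range(max_len):
        nxt = decode_step(words)
        if nxt == end:
            break
        words.append(nxt)
    return " ".join(words[1:])

# A hypothetical decoder that walks through a fixed caption, for illustration only.
CAPTION = ["a", "dog", "on", "grass", "</s>"]
def toy_decoder(words):
    return CAPTION[len(words) - 1]

print(greedy_caption(toy_decoder))  # a dog on grass
```

A real decoder would return an argmax (or beam-search candidate) over a vocabulary distribution conditioned on the encoder's image vector.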
Given one specific image, it would be quite significant if we could simply retrieve all pictures that fall into a similar category. However, traditional methods tend to achieve high-quality retrieval by utilizing adequate learning instances, ignoring the extraction of the image's essential information, which makes it difficult to retrieve similar-category images from just one reference image. Aiming to solve this problem, we propose in this paper a refined sparse-representation-based similar-category image retrieval model. On the one hand, saliency detection and multi-level decomposition help take salient and spatial information into consideration more fully. On the other hand, the cross mutual sparse coding model aims to extract the image's essential features to the maximum extent possible. Finally, we set up a database containing a large number of multi-source images. Adequate groups of comparative experiments show that our method retrieves similar-category images effectively. Moreover, adequate groups of ablation experiments show that nearly all procedures play their respective roles.
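Sparse coding of the kind underlying this model is commonly solved by iterative shrinkage-thresholding (ISTA). This generic sketch, with a trivial identity dictionary chosen purely for illustration, is not the paper's cross mutual sparse coding model:

```python
import numpy as np

def ista(dictionary, signal, lam=0.1, steps=200):
    """Iterative shrinkage-thresholding for min_a 0.5*||D a - x||^2 + lam*||a||_1."""
    step = 1.0 / np.linalg.norm(dictionary, 2) ** 2  # safe step from the spectral norm
    a = np.zeros(dictionary.shape[1])
    for _ in range(steps):
        grad = dictionary.T @ (dictionary @ a - signal)  # gradient of the data term
        a = a - step * grad
        a = np.sign(a) * np.maximum(np.abs(a) - step * lam, 0.0)  # soft threshold
    return a

D = np.eye(3)                     # trivial dictionary, for illustration only
x = np.array([1.0, 0.05, 0.0])    # a signal dominated by a single atom
code = ista(D, x)
print(np.round(code, 2))  # the small coefficient is shrunk to zero
```

The soft-threshold step is what makes the code sparse: coefficients smaller than the threshold are zeroed out, leaving only the essential atoms.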
The exponential increase in data over the past few years, particularly in images, has led to more complex content, since visual representation became the new norm. E-commerce and similar platforms maintain large image catalogues of their products. In image databases, searching for and retrieving similar images is still a challenge, even though several image retrieval techniques have been proposed over the decade. Most of these techniques work well when querying general image databases. However, they often fail in domain-specific image databases, especially for datasets with low intraclass variance. This paper proposes a domain-specific image similarity search engine based on a fused deep learning network. The network comprises an improved object localization module, a classification module to narrow down search options, and finally a feature extraction and similarity calculation module. The network features both an offline stage for indexing the dataset and an online stage for querying. The dataset used to evaluate the performance of the proposed network is a custom domain-specific dataset related to cosmetics packaging gathered from various online platforms. The proposed method addresses the intraclass variance problem with more precise object localization and the introduction of top-result reranking based on object contours. Finally, quantitative and qualitative experiment results are presented, showing improved image similarity search performance.
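The offline-indexing/online-querying split described above can be reduced to its simplest form: precompute the catalogue's feature vectors once, then rank them by distance to the query's features at search time. The feature values here are invented for illustration:

```python
import numpy as np

# Offline stage: extract and store the catalogue's feature vectors once.
catalogue = np.array([[0.9, 0.1],
                      [0.1, 0.9],
                      [0.85, 0.2]])

# Online stage: rank catalogue entries by Euclidean distance to the query features.
def search(query, index, top_k=2):
    d = np.linalg.norm(index - query, axis=1)
    return np.argsort(d)[:top_k].tolist()

print(search(np.array([1.0, 0.0]), catalogue))  # [0, 2]
```

The proposed system adds localization, classification-based pruning, and contour-based reranking on top of this basic pipeline.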
3D medical image reconstruction has significantly enhanced diagnostic accuracy, yet the reliance on densely sampled projection data remains a major limitation in clinical practice. Sparse-angle X-ray imaging, though safer and faster, poses challenges for accurate volumetric reconstruction due to limited spatial information. This study proposes a 3D reconstruction neural network based on adaptive weight fusion (AdapFusionNet) to achieve high-quality 3D medical image reconstruction from sparse-angle X-ray images. To address the issue of spatial inconsistency in multi-angle image reconstruction, an innovative adaptive fusion module was designed to score initial reconstruction results during the inference stage and perform weighted fusion, thereby improving the final reconstruction quality. The reconstruction network is built on an autoencoder (AE) framework and uses orthogonal-angle X-ray images (frontal and lateral projections) as inputs. The encoder extracts 2D features, which the decoder maps into 3D space. This study utilizes a lung CT dataset to obtain complete three-dimensional volumetric data, from which digitally reconstructed radiographs (DRRs) are generated at various angles to simulate X-ray images. Since real-world clinical X-ray images rarely come with perfectly corresponding 3D "ground truth," using CT scans as the three-dimensional reference effectively supports the training and evaluation of deep networks for sparse-angle X-ray 3D reconstruction. Experiments conducted on the LIDC-IDRI dataset with simulated X-ray (DRR) images as training data demonstrate the superior performance of AdapFusionNet compared to other fusion methods. Quantitative results show that AdapFusionNet achieves SSIM, PSNR, and MAE values of 0.332, 13.404, and 0.163, respectively, outperforming the other methods (SingleViewNet: 0.289, 12.363, 0.182; AvgFusionNet: 0.306, 13.384, 0.159). Qualitative analysis further confirms that AdapFusionNet significantly enhances the reconstruction of lung and chest contours while effectively reducing noise during the reconstruction process. These findings demonstrate that AdapFusionNet offers significant advantages in the 3D reconstruction of sparse-angle X-ray images.
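The adaptive weighted fusion idea (score each candidate reconstruction, then blend) can be sketched with softmax weights over the scores. The scoring network itself is omitted here, the volumes are toy arrays, and equal scores reduce the fusion to a plain average, which is exactly how AvgFusionNet-style fusion differs from the adaptive version:

```python
import numpy as np

def softmax(scores):
    e = np.exp(scores - np.max(scores))  # subtract max for numerical stability
    return e / e.sum()

def fuse(volumes, scores):
    """Weight each candidate 3D reconstruction by a softmax over its quality score."""
    w = softmax(np.asarray(scores, dtype=float))
    return sum(wi * v for wi, v in zip(w, volumes))

v1 = np.full((2, 2, 2), 1.0)   # e.g. a reconstruction from the frontal view
v2 = np.full((2, 2, 2), 3.0)   # e.g. a reconstruction from the lateral view
fused = fuse([v1, v2], scores=[0.0, 0.0])  # equal scores -> simple average
print(fused[0, 0, 0])  # 2.0
```

With unequal scores, the better-scored volume dominates the fused result, which is the adaptive behaviour the paper's module learns.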
(Aim) COVID-19 is an ongoing infectious disease. It had caused more than 107.45 million confirmed cases and 2.35 million deaths as of 11 February 2021. Traditional computer vision methods have achieved promising results on automatic smart diagnosis. (Method) This study aims to propose a novel deep learning method that can obtain better performance. We use the pseudo-Zernike moment (PZM), derived from the Zernike moment, as the extracted features. Two settings are introduced: (i) image plane over unit circle; and (ii) image plane inside the unit circle. Afterward, we use a deep-stacked sparse autoencoder (DSSAE) as the classifier. Besides, multiple-way data augmentation is chosen to overcome overfitting. The multiple-way data augmentation is based on Gaussian noise, salt-and-pepper noise, speckle noise, horizontal and vertical shear, rotation, Gamma correction, and random translation and scaling. (Results) 10 runs of 10-fold cross-validation show that our PZM-DSSAE method achieves a sensitivity of 92.06% ± 1.54%, a specificity of 92.56% ± 1.06%, a precision of 92.53% ± 1.03%, and an accuracy of 92.31% ± 1.08%. Its F1 score, MCC, and FMI arrive at 92.29% ± 1.10%, 84.64% ± 2.15%, and 92.29% ± 1.10%, respectively. The AUC of our model is 0.9576. (Conclusion) We demonstrate that "image plane over unit circle" gets better results than "image plane inside a unit circle." Besides, the proposed PZM-DSSAE model is better than eight state-of-the-art approaches.
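A few of the listed augmentations can be sketched in NumPy; the noise levels, shift amount, and gamma value below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(image):
    """Produce several independent variants of one image, as in multiple-way
    data augmentation. Parameters are illustrative only."""
    noisy = image + rng.normal(0.0, 5.0, image.shape)          # Gaussian noise
    salted = image.copy()
    mask = rng.random(image.shape) < 0.02                      # salt-and-pepper noise
    salted[mask] = 255
    shifted = np.roll(image, shift=3, axis=1)                  # translation (fixed here)
    gamma = np.clip(255.0 * (image / 255.0) ** 0.8, 0, 255)    # Gamma correction
    return [noisy, salted, shifted, gamma]

chest_image = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
variants = augment(chest_image)
print(len(variants), all(v.shape == chest_image.shape for v in variants))  # 4 True
```

Each variant is presented to the classifier as an additional training sample, which is what counters overfitting on a small dataset.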
Traditional hyperspectral imaging (HI) systems are constrained by a limited depth of field (DoF), necessitating refocusing for any out-of-focus objects. This requirement not only slows down the imaging speed but also complicates the system architecture. It is challenging to trade off among speed, resolution, and DoF within an ultra-simple system. While some studies have reported advancements in extending DoF, the improvements remain insufficient. To address this challenge, we propose a novel, to our knowledge, differentiable framework that integrates an extended-DoF (E-DoF) wave propagation model and an achromatic hyperspectral reconstructor powered by deep learning. Through rigorous experimental validation, we have demonstrated that the compact HI system is capable of snapshot capture of high-fidelity images with an exceptional DoF reaching approximately 5 m, marking a significant improvement of over three orders of magnitude. Additionally, the system achieves over 90% spectral accuracy without aberration, nearly doubling the accuracy of existing methods. An asymmetric freeform surface design is introduced for the diffractive optical elements, enabling dual functionality with design freedom and E-DoF. The sparse prior conditions for the spatial texture and spectral features of hyperspectral cubic data are integrated into the reconstruction network, effectively mitigating texture blurring and chromatic aberration. We foresee that the optimal strategy for achromatic E-DoF can be adopted in other optical systems such as polarization imaging and depth measurement.
In this paper, a discriminative structured dictionary learning algorithm is presented. To enhance the dictionary's discriminative power, the reconstruction error, classification error, and inhomogeneous representation error are integrated into the objective function. The proposed approach learns a single structured dictionary and a linear classifier jointly. The learned dictionary encourages samples from the same class to have similar sparse codes, and samples from different classes to have dissimilar sparse codes. The objective function is solved by employing a feature-sign search algorithm and the Lagrange dual method. Experimental results on three public databases demonstrate that the proposed approach outperforms several recently proposed dictionary learning techniques for classification.
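A common form of such a joint objective (the notation here is assumed for illustration, not copied from the paper) combines the three error terms with a sparsity penalty:

```latex
\min_{D,\,W,\,A}\;
\underbrace{\lVert X - DA \rVert_F^2}_{\text{reconstruction error}}
\;+\; \alpha \underbrace{\lVert Q - A \rVert_F^2}_{\text{inhomogeneous representation error}}
\;+\; \beta \underbrace{\lVert H - WA \rVert_F^2}_{\text{classification error}}
\;+\; \lambda \lVert A \rVert_1
```

where $X$ holds the training samples, $D$ is the structured dictionary, $A$ the sparse codes, $Q$ a target discriminative code pattern that is block-structured by class, $H$ the class-label matrix, and $W$ the jointly learned linear classifier; $\alpha$, $\beta$, and $\lambda$ balance the terms.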
In this paper, a two-level Bregman method with graph-regularized sparse coding is presented for highly undersampled magnetic resonance image reconstruction. The graph-regularized sparse coding is incorporated into a two-level Bregman iterative procedure that enforces the sampled-data constraints in the outer level and updates the dictionary and sparse representation in the inner level. Graph-regularized sparse coding and simple dictionary updating in the inner minimization make the proposed algorithm converge within a relatively small number of iterations. Experimental results demonstrate that the proposed algorithm can consistently reconstruct both simulated MR images and real MR data efficiently, and outperforms current state-of-the-art approaches in terms of visual comparisons and quantitative measures.
Funding (AGRI AOD study): supported by the National Natural Science Foundation of China (41825011, 42030608, 42105128, and 42075079) and the Opening Foundation of the Key Laboratory of Atmospheric Sounding, the CMA, and the CMA Research Center on Meteorological Observation Engineering Technology (U2021Z03).
Funding: This research was funded by King Mongkut's University of Technology North Bangkok (Contract no. KMUTNB-62-KNOW-026).
Abstract: Fine-grained image classification is a challenging research topic because of the high degree of similarity among categories and the high degree of dissimilarity within a specific category caused by different poses and scales. Cultural heritage images are fine-grained images because, in most cases, the images share a high degree of similarity; using classification techniques alone, distinguishing cultural heritage architecture may be difficult. This study proposes a cultural heritage content retrieval method using adaptive deep learning for fine-grained image retrieval. The key contribution of this research is a retrieval model that can handle incremental streams of new categories while maintaining its past performance on old categories and without losing the old categorization of a cultural heritage image. The goal of the proposed method is to perform a retrieval task across classes. Incremental learning for new classes was conducted to reduce the re-training process; in this step, the original classes are not required for re-training, which we call an adaptive deep learning technique. Cultural heritage, in the case of Thai archaeological site architecture, was retrieved through machine learning and image processing. We analyze the experimental results of incremental learning for fine-grained images using images of Thai archaeological site architecture from world heritage provinces in Thailand, which share a similar architecture. Using a fine-grained image retrieval technique for this group of cultural heritage images in a database can solve the problem of a high degree of similarity among categories and a high degree of dissimilarity within a specific category. The proposed method for retrieving the correct image from a database delivers an average accuracy of 85 percent. Adaptive deep learning for fine-grained image retrieval was used to retrieve cultural heritage content, and it outperformed state-of-the-art methods in fine-grained image retrieval.
Abstract: In recent years, image retrieval has become a tedious process as image databases have grown very large. The introduction of Machine Learning (ML) and Deep Learning (DL) has made this process more manageable. In these approaches, pair-wise label similarity is used to find matching images in the database, but this method suffers from limited discriminative power of the learned codes and weak handling of misclassified images. To address these problems, a novel triplet-based label that incorporates a context-spatial similarity measure is proposed. A Point Attention Based Triplet Network (PABTN) is introduced to learn codes that give maximum discriminative ability. To improve the ranking performance, correlating resolutions for the classification, triplet labels based on findings, a spatial-attention mechanism with Region Of Interest (ROI), and a new triplet cross-entropy loss containing a small trial information loss are used. The experimental results show that the proposed technique exhibits better results in terms of mean Reciprocal Rank (mRR) and mean Average Precision (mAP) on the CIFAR-10 and NUS-WIDE datasets.
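The triplet-based supervision underlying PABTN rests on the standard triplet loss, which pulls an anchor embedding toward a same-class (positive) example and pushes it away from a different-class (negative) one. A minimal sketch, assuming squared-Euclidean distances and a hinge margin (the paper's full loss also includes cross-entropy and attention terms not shown here):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Squared-Euclidean triplet loss with a hinge margin: the loss is zero
    # once the negative is at least `margin` farther from the anchor than
    # the positive is.
    d_pos = float(np.sum((anchor - positive) ** 2))
    d_neg = float(np.sum((anchor - negative) ** 2))
    return max(0.0, d_pos - d_neg + margin)
```

Minimizing this loss over many (anchor, positive, negative) triplets shapes an embedding space in which same-class images cluster together, which is what makes nearest-neighbor retrieval effective.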
Abstract: The use of massive image databases has increased drastically over the past few years due to the evolution of multimedia technology. Image retrieval has become one of the vital tools in image processing applications. Content-Based Image Retrieval (CBIR) has been widely used in varied applications, but the results produced by the use of a single image feature are not satisfactory, so multiple image features are often combined to attain better results. However, fast and effective searching for relevant images in a database remains a challenging task. An existing CBIR system used a combined feature extraction technique based on the color auto-correlogram, Rotation-Invariant Uniform Local Binary Patterns (RULBP) and local energy. However, that system does not provide significant results in terms of recall and precision, and its computational complexity is high. To handle these issues, the Gray Level Co-occurrence Matrix (GLCM) with a Deep Learning based Enhanced Convolution Neural Network (DLECNN) is proposed in this work. The proposed framework includes noise reduction using histogram equalization, feature extraction using the GLCM, similarity matching using the Hierarchical and Fuzzy c-Means (HFCM) algorithm, and image retrieval using the DLECNN algorithm. Histogram equalization is used for image enhancement, yielding an enhanced image with a uniform histogram. The GLCM method is then used to extract features such as shape, texture, colour, annotations and keywords. The HFCM similarity measure computes the query image vector's similarity index with every database image. To enhance the performance of this image retrieval approach, the DLECNN algorithm is proposed to retrieve more accurate features of the image. The proposed GLCM+DLECNN algorithm provides better results with high accuracy, precision, recall and f-measure, and lower complexity. From the experimental results, it is clearly observed that the proposed system provides efficient image retrieval for the given query image.
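The GLCM features at the core of the proposed pipeline can be sketched in plain NumPy. This is a generic illustration of co-occurrence statistics (here contrast and energy, two classic Haralick descriptors), not the authors' exact feature set:

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    # Gray-level co-occurrence matrix for one pixel displacement (dx, dy),
    # normalized so it forms a joint probability table over gray-level pairs.
    m = np.zeros((levels, levels), dtype=np.float64)
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m / m.sum()

def glcm_features(p):
    # Two classic Haralick texture descriptors derived from the GLCM:
    # contrast (local intensity variation) and energy (uniformity).
    i, j = np.indices(p.shape)
    contrast = float(np.sum(p * (i - j) ** 2))
    energy = float(np.sum(p ** 2))
    return contrast, energy
```

In practice one computes the GLCM at several displacements and angles and concatenates the resulting descriptors into the feature vector handed to the retrieval stage.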
Funding: The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work (Grant Code: 22UQU4400271DSR01).
Abstract: Classifying the visual features in images to retrieve a specific image is a significant problem within the computer vision field, especially when dealing with historical, faded color images. Many efforts have therefore been made to automate the classification operation and retrieve similar images accurately. To reach this goal, we developed a VGG19 deep convolutional neural network to extract the visual features from the images automatically. Then, the distances among the extracted feature vectors are measured and a similarity score is generated using a Siamese deep neural network. The Siamese model was first built and trained from scratch but did not achieve high evaluation metrics, so we rebuilt it on top of the pre-trained VGG19 deep learning model to obtain better results. Afterward, three different distance metrics combined with the Sigmoid activation function were evaluated to find the most accurate method for measuring the similarities among the retrieved images; the highest evaluation scores were obtained with the cosine distance metric. Moreover, a Graphics Processing Unit (GPU) was used to run the code instead of a Central Processing Unit (CPU), which further optimized execution by expediting both training and retrieval. After extensive experimentation, we reached a satisfactory solution, recording F-scores of 0.98 for classification and 0.99 for retrieval.
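The winning cosine metric from this study amounts to comparing the directions of feature vectors. A minimal sketch of cosine scoring and ranking over a gallery of extracted feature vectors (illustrative only; the paper feeds such distances through a Siamese network with a Sigmoid head rather than ranking raw scores):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors: 1.0 means identical
    # direction, 0.0 means orthogonal.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_by_cosine(query, gallery):
    # Return gallery indices sorted most-similar-first to the query vector.
    scores = [cosine_similarity(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: -scores[i])
```

Because cosine similarity ignores vector magnitude, it is robust to overall brightness or activation-scale differences between images, which may be part of why it outperformed the other distance metrics here.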
Funding: Supported by the Fundamental Research Funds for the Central Non-profit Research Institution of the Chinese Academy of Forestry (Grant No. CAFYBB2020ZY003), the Key S&T Project of Inner Mongolia (Grant No. 2021ZD0041-001-002), and the Central Public-interest Scientific Institution Basal Research Fund (Grant No. 11024316000202300001).
Abstract: Apricot has a long history of cultivation and has many varieties and types. Traditional variety identification methods are time-consuming and labor-intensive, posing grand challenges to apricot resource management. Tool development in this regard will help researchers quickly identify variety information. This study photographed apricot fruits outdoors and indoors and constructed a dataset that can precisely classify the fruits using a U-net model (F-score: 99%), which helps to obtain the fruit's size, shape, and color features. Meanwhile, a variety search engine was constructed, which can search for and identify varieties in the database according to the above features. Besides, a mobile and web application (ApricotView) was developed, and its construction mode can also be applied to other varieties of fruit trees. Additionally, we collected four difficult-to-identify seed datasets and used the VGG16 model for training, achieving an accuracy of 97%, which provided an important basis for ApricotView. To address the data-collection difficulties bottlenecking apricot phenomics research, we developed the first apricot database platform of its kind (ApricotDIAP, http://apricotdiap.com/) to accumulate, manage, and publicize scientific data on apricot.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R161), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work (Grant Code: 22UQU4310373DSR33).
Abstract: The recent developments in Multimedia Internet of Things (MIoT) devices, empowered with Natural Language Processing (NLP) models, seem to be a promising future for smart devices. They play an important role in industrial applications such as speech understanding, emotion detection, home automation, and so on. If an image is to be captioned, then the objects in that image, their actions and connections, and any salient feature that remains under-projected or missing from the image should be identified. The aim of the image captioning process is to generate a caption for the image that is among the most significant and detailed descriptions possible, being both syntactically and semantically correct. In this scenario, a computer vision model is used to identify the objects, and NLP approaches are followed to describe the image. The current study develops a Natural Language Processing with Optimal Deep Learning Enabled Intelligent Image Captioning System (NLPODL-IICS). The aim of the presented NLPODL-IICS model is to produce a proper description for an input image. To attain this, the proposed NLPODL-IICS follows two stages, namely encoding and decoding. Initially, on the encoding side, the proposed NLPODL-IICS model makes use of Hunger Games Search (HGS) with the Neural Search Architecture Network (NASNet) model, which represents the input data appropriately by embedding it into a vector of predefined length. During the decoding phase, the Chimp Optimization Algorithm (COA) with a deeper Long Short Term Memory (LSTM) approach is followed to concatenate the description sentences produced by the method. The application of the HGS and COA algorithms helps in accomplishing proper parameter tuning for the NASNet and LSTM models, respectively. The proposed NLPODL-IICS model was experimentally validated with the help of two benchmark datasets. A widespread comparative analysis confirmed the superior performance of the NLPODL-IICS model over other models.
Funding: Sponsored by the National Natural Science Foundation of China (Grants 62002200, 61772319) and the Shandong Natural Science Foundation of China (Grant ZR2020QF012).
Abstract: Given one specific image, it would be quite significant if we could simply retrieve all the pictures that fall into a similar category. However, traditional methods tend to achieve high-quality retrieval by utilizing adequate learning instances, ignoring the extraction of the image's essential information, which makes it difficult to retrieve similar-category images from just one reference image. To solve this problem, we propose in this paper a refined sparse representation based similar-category image retrieval model. On the one hand, saliency detection and multi-level decomposition help take salient and spatial information into consideration more fully. On the other hand, the cross mutual sparse coding model aims to extract the image's essential features to the maximum extent possible. Finally, we set up a database comprising a large number of multi-source images. Adequate groups of comparative experiments show that our method retrieves similar-category images effectively. Moreover, adequate groups of ablation experiments show that nearly all procedures play their respective roles.
Abstract: The exponential increase in data over the past few years, particularly in images, has led to more complex content, since visual representation has become the new norm. E-commerce and similar platforms maintain large image catalogues of their products. In image databases, searching for and retrieving similar images is still a challenge, even though several image retrieval techniques have been proposed over the past decade. Most of these techniques work well when querying general image databases. However, they often fail on domain-specific image databases, especially for datasets with low intraclass variance. This paper proposes a domain-specific image similarity search engine based on a fused deep learning network. The network comprises an improved object localization module, a classification module to narrow down search options, and finally a feature extraction and similarity calculation module. The network features both an offline stage for indexing the dataset and an online stage for querying. The dataset used to evaluate the performance of the proposed network is a custom domain-specific dataset of cosmetics packaging gathered from various online platforms. The proposed method addresses the intraclass variance problem with more precise object localization and the introduction of top-result re-ranking based on object contours. Finally, quantitative and qualitative experimental results are presented, showing improved image similarity search performance.
Funding: Supported by the Sichuan Science and Technology Program (2023YFSY0026, 2023YFH0004).
Abstract: 3D medical image reconstruction has significantly enhanced diagnostic accuracy, yet the reliance on densely sampled projection data remains a major limitation in clinical practice. Sparse-angle X-ray imaging, though safer and faster, poses challenges for accurate volumetric reconstruction due to limited spatial information. This study proposes a 3D reconstruction neural network based on adaptive weight fusion (AdapFusionNet) to achieve high-quality 3D medical image reconstruction from sparse-angle X-ray images. To address the issue of spatial inconsistency in multi-angle image reconstruction, an innovative adaptive fusion module was designed to score initial reconstruction results during the inference stage and perform weighted fusion, thereby improving the final reconstruction quality. The reconstruction network is built on an autoencoder (AE) framework and uses orthogonal-angle X-ray images (frontal and lateral projections) as inputs. The encoder extracts 2D features, which the decoder maps into 3D space. This study utilizes a lung CT dataset to obtain complete three-dimensional volumetric data, from which digitally reconstructed radiographs (DRR) are generated at various angles to simulate X-ray images. Since real-world clinical X-ray images rarely come with perfectly corresponding 3D "ground truth," using CT scans as the three-dimensional reference effectively supports the training and evaluation of deep networks for sparse-angle X-ray 3D reconstruction. Experiments conducted on the LIDC-IDRI dataset with simulated X-ray images (DRR images) as training data demonstrate the superior performance of AdapFusionNet compared to other fusion methods. Quantitative results show that AdapFusionNet achieves SSIM, PSNR, and MAE values of 0.332, 13.404, and 0.163, respectively, outperforming other methods (SingleViewNet: 0.289, 12.363, 0.182; AvgFusionNet: 0.306, 13.384, 0.159). Qualitative analysis further confirms that AdapFusionNet significantly enhances the reconstruction of lung and chest contours while effectively reducing noise during the reconstruction process. The findings demonstrate that AdapFusionNet offers significant advantages in the 3D reconstruction of sparse-angle X-ray images.
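The adaptive weighted fusion step can be illustrated generically: given several candidate reconstructions and a quality score for each, a softmax over the scores yields fusion weights. This is a hedged sketch of score-weighted fusion, not AdapFusionNet's actual learned scoring module:

```python
import numpy as np

def adaptive_fusion(volumes, scores):
    # Softmax over per-candidate quality scores gives fusion weights;
    # the fused result is the weighted sum of the candidate volumes.
    s = np.asarray(scores, dtype=np.float64)
    w = np.exp(s - s.max())      # subtract max for numerical stability
    w /= w.sum()
    return np.tensordot(w, np.stack(volumes), axes=1)
```

With equal scores this reduces to plain averaging (the AvgFusionNet baseline above); as one candidate's score dominates, the fusion smoothly approaches selecting that candidate alone.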
Funding: This study was supported by the Royal Society International Exchanges Cost Share Award, UK (RP202G0230); the Medical Research Council Confidence in Concept Award, UK (MC_PC_17171); the Hope Foundation for Cancer Research, UK (RM60G0680); and the Global Challenges Research Fund (GCRF), UK (P202PF11).
Abstract: (Aim) COVID-19 is an ongoing infectious disease that had caused more than 107.45 million confirmed cases and 2.35 million deaths as of 11 February 2021. Traditional computer vision methods have achieved promising results in automatic smart diagnosis. (Method) This study aims to propose a novel deep learning method that can obtain better performance. We use the pseudo-Zernike moment (PZM), derived from the Zernike moment, as the extracted feature. Two settings are introduced: (i) image plane over the unit circle; and (ii) image plane inside the unit circle. Afterward, we use a deep-stacked sparse autoencoder (DSSAE) as the classifier. Besides, multiple-way data augmentation is chosen to overcome overfitting; it is based on Gaussian noise, salt-and-pepper noise, speckle noise, horizontal and vertical shear, rotation, Gamma correction, and random translation and scaling. (Results) 10 runs of 10-fold cross validation show that our PZM-DSSAE method achieves a sensitivity of 92.06% ± 1.54%, a specificity of 92.56% ± 1.06%, a precision of 92.53% ± 1.03%, and an accuracy of 92.31% ± 1.08%. Its F1 score, MCC, and FMI arrive at 92.29% ± 1.10%, 84.64% ± 2.15%, and 92.29% ± 1.10%, respectively. The AUC of our model is 0.9576. (Conclusion) We demonstrate that "image plane over unit circle" gives better results than "image plane inside a unit circle." Besides, the proposed PZM-DSSAE model is better than eight state-of-the-art approaches.
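Two of the multiple-way augmentations listed above, Gaussian noise and salt-and-pepper noise, can be sketched as follows for images scaled to [0, 1]. This is a generic illustration, not the authors' exact augmentation parameters:

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng):
    # Additive zero-mean Gaussian noise, clipped back to the valid [0, 1] range.
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def add_salt_and_pepper(img, amount, rng):
    # Randomly force a fraction `amount` of pixels to pure black or white,
    # split evenly between the two.
    out = img.copy()
    u = rng.random(img.shape)
    out[u < amount / 2] = 0.0        # pepper
    out[u > 1 - amount / 2] = 1.0    # salt
    return out
```

Applying several such independent corruptions to each training image multiplies the effective dataset size, which is the mechanism by which multiple-way augmentation mitigates overfitting on small medical datasets.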
Funding: Youth Innovation Promotion Association of the Chinese Academy of Sciences (2022246); Shanghai Sailing Program (22YF1454800, 20YF145480).
Abstract: Traditional hyperspectral imaging (HI) systems are constrained by a limited depth of field (DoF), necessitating refocusing for any out-of-focus objects. This requirement not only slows down the imaging speed but also complicates the system architecture. It is challenging to trade off among speed, resolution, and DoF within an ultra-simple system. While some studies have reported advancements in extending the DoF, the improvements remain insufficient. To address this challenge, we propose a novel, to our knowledge, differentiable framework that integrates an extended-DoF (E-DoF) wave propagation model and an achromatic hyperspectral reconstructor powered by deep learning. Through rigorous experimental validation, we have demonstrated that the compact HI system is capable of snapshot capture of high-fidelity images with an exceptional DoF reaching approximately 5 m, marking a significant improvement of over three orders of magnitude. Additionally, the system achieves over 90% spectral accuracy without aberration, nearly doubling the accuracy of existing methods. An asymmetric freeform surface design is introduced for the diffractive optical elements, enabling dual functionality with design freedom and E-DoF. Sparse priors on the spatial texture and spectral features of the hyperspectral data cube are integrated into the reconstruction network, effectively mitigating texture blurring and chromatic aberration. We foresee that this strategy for achromatic E-DoF can be adopted in other optical systems such as polarization imaging and depth measurement.
Funding: Manuscript received February 13, 2016; accepted December 7, 2016. This work was supported by the National Natural Science Foundation of China (61362001, 61661031), Jiangxi Province Innovation Projects for Postgraduate Funds (YC2016-S006), the International Postdoctoral Exchange Fellowship Program, and the Jiangxi Advanced Project for Post-Doctoral Research Fund (2014KY02).
Funding: Supported by the National Natural Science Foundation of China (No. 61379014).
Abstract: In this paper, a discriminative structured dictionary learning algorithm is presented. To enhance the dictionary's discriminative power, the reconstruction error, classification error and inhomogeneous representation error are integrated into the objective function. The proposed approach learns a single structured dictionary and a linear classifier jointly. The learned dictionary encourages the samples from the same class to have similar sparse codes, and the samples from different classes to have dissimilar sparse codes. The solution to the objective function is achieved by employing a feature-sign search algorithm and the Lagrange dual method. Experimental results on three public databases demonstrate that the proposed approach outperforms several recently proposed dictionary learning techniques for classification.
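The sparse-coding subproblem that such dictionary learning alternates with (finding codes for a fixed dictionary) is commonly solved by iterative shrinkage-thresholding. The paper itself uses a feature-sign search algorithm, so the ISTA sketch below is a stand-in shown purely for illustration:

```python
import numpy as np

def ista(D, x, lam, steps=200):
    # Iterative shrinkage-thresholding for the lasso-style sparse coding
    # problem: minimize 0.5*||x - D a||^2 + lam*||a||_1 over the code a.
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(steps):
        a = a - D.T @ (D @ a - x) / L    # gradient step on the data term
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft threshold
    return a
```

The soft-thresholding step is what drives most coefficients exactly to zero, producing the sparse codes whose per-class similarity the structured dictionary is trained to encourage.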
Funding: Supported by the National Natural Science Foundation of China (Nos. 61261010, 61362001, 61365013, 61262084, 51165033), the Technology Foundation of the Department of Education in Jiangxi Province (GJJ13061, GJJ14196), the Young Scientists Training Plan of Jiangxi Province (Nos. 20133ACB21007, 20142BCB23001), the National Post-Doctoral Research Fund (No. 2014M551867), and the Jiangxi Advanced Project for Post-Doctoral Research Fund (No. 2014KY02).
Abstract: In this paper, a two-level Bregman method is presented with graph regularized sparse coding for highly undersampled magnetic resonance image reconstruction. The graph regularized sparse coding is incorporated into the two-level Bregman iterative procedure, which enforces the sampled-data constraints in the outer level and updates the dictionary and sparse representation in the inner level. Graph regularized sparse coding and simple dictionary updating applied in the inner minimization make the proposed algorithm converge within a relatively small number of iterations. Experimental results demonstrate that the proposed algorithm can consistently reconstruct both simulated MR images and real MR data efficiently, and outperforms the current state-of-the-art approaches in terms of visual comparisons and quantitative measures.