Journal Articles
694 articles found
1. Active learning based on maximizing information gain for content-based image retrieval
Authors: 徐杰, 施鹏飞. 《Journal of Southeast University (English Edition)》 EI CAS, 2004, No. 4, pp. 431-435.
This paper describes a new method for active learning in content-based image retrieval. The proposed method first uses support vector machine (SVM) classifiers to learn an initial query concept. The active learning scheme then employs a similarity measure to check the current version space and selects the images with maximum expected information gain to solicit the user's labels. Finally, the learned query is refined based on the user's further feedback. By combining the SVM classifier with the similarity measure, the proposed method can alleviate the model bias present in each of them. Experiments on several query concepts show that the proposed method can learn the user's query concept quickly and effectively within only a few iterations.
Keywords: active learning; content-based image retrieval; relevance feedback; support vector machines; similarity measure
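The selection step described in the abstract can be approximated with a margin-based heuristic: pool images whose SVM decision values lie closest to the boundary are the ones whose labels carry the most expected information gain. A minimal sketch (function name and toy decision values are hypothetical, not from the paper):

```python
import numpy as np

def select_most_informative(decision_values, k=1):
    """Return indices of the k pool samples closest to the SVM
    decision boundary (smallest |f(x)|).  Near-boundary samples
    shrink the version space fastest, approximating the paper's
    maximum-expected-information-gain criterion."""
    order = np.argsort(np.abs(np.asarray(decision_values, dtype=float)))
    return order[:k].tolist()

# Toy pool: decision values from a previously trained SVM (hypothetical).
f = [2.3, -0.1, 0.8, -1.7, 0.05]
print(select_most_informative(f, k=2))  # samples 4 and 1 sit nearest the margin
```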
2. Feature Extraction of Kernel Regress Reconstruction for Fault Diagnosis Based on Self-organizing Manifold Learning (Cited by 3)
Authors: CHEN Xiaoguang, LIANG Lin, XU Guanghua, LIU Dan. 《Chinese Journal of Mechanical Engineering》 SCIE EI CAS CSCD, 2013, No. 5, pp. 1041-1049.
The feature space extracted from vibration signals with various faults is often nonlinear and of high dimension. Nonlinear dimensionality reduction methods such as manifold learning are available for extracting low-dimensional embeddings, but they all rely on manual intervention and have shortcomings in stability and in suppressing disturbance noise. To extract features automatically, a manifold learning method with self-organizing mapping is introduced for the first time. Under the non-uniform sample distribution reconstructed by the phase space, the expectation maximization (EM) iteration algorithm is used to divide the local neighborhoods adaptively, without manual intervention. After that, the local tangent space alignment (LTSA) algorithm is adopted to compress the high-dimensional phase space into a more truthful low-dimensional representation. Finally, the signal is reconstructed by kernel regression. Several typical cases, including the Lorenz system, an engine fault with a piston-pin defect, and a bearing fault with an outer-race defect, are analyzed. Compared with LTSA and the continuous wavelet transform, the results show that the background noise can be fully restrained and the entire periodic repetition of impact components is well separated and identified. A new way to automatically and precisely extract the impulsive components from mechanical signals is proposed.
Keywords: feature extraction; manifold learning; self-organizing mapping; kernel regression; local tangent space alignment
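The tangent-space estimation at the heart of LTSA can be illustrated in isolation: for one local neighborhood, center the points and take the leading right-singular vectors. A toy sketch under that simplification (data and function name hypothetical, not the paper's implementation):

```python
import numpy as np

def local_tangent(neighborhood, d=1):
    """Estimate the d-dimensional tangent space of one local
    neighborhood (rows = points), the core per-neighborhood step of
    LTSA: center the points and keep the top right-singular vectors."""
    X = np.asarray(neighborhood, dtype=float)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d]  # (d, ambient_dim) orthonormal basis

# Points sampled along the x-axis with slight noise in y (hypothetical data).
pts = [[0.0, 0.00], [1.0, 0.01], [2.0, -0.01], [3.0, 0.00]]
basis = local_tangent(pts, d=1)
print(basis)  # dominant direction is (close to) the x-axis
```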
3. Toward Fine-grained Image Retrieval with Adaptive Deep Learning for Cultural Heritage Image (Cited by 2)
Authors: Sathit Prasomphan. 《Computer Systems Science & Engineering》 SCIE EI, 2023, No. 2, pp. 1295-1307.
Fine-grained image classification is a challenging research topic because of the high degree of similarity among categories and the high degree of dissimilarity within a specific category caused by different poses and scales. Cultural heritage images are fine-grained images because, in most cases, the images closely resemble one another, so distinguishing cultural heritage architecture with a classification technique may be difficult. This study proposes a cultural heritage content retrieval method using adaptive deep learning for fine-grained image retrieval. The key contribution of this research is a retrieval model that can handle incremental streams of new categories while maintaining its past performance on old categories and without losing the old categorization of a cultural heritage image. The goal of the proposed method is to perform a retrieval task for classes. Incremental learning for new classes is conducted to reduce the re-training process; in this step the original classes are not needed for re-training, which we call an adaptive deep learning technique. Cultural heritage, in the case of Thai archaeological site architecture, was retrieved through machine learning and image processing. We analyze the experimental results of incremental learning for fine-grained images of Thai archaeological site architecture from world heritage provinces in Thailand, which share a similar architecture. Using a fine-grained image retrieval technique for this group of cultural heritage images in a database can solve the problem of a high degree of similarity among categories and a high degree of dissimilarity within a specific category. The proposed method retrieves the correct image from the database with an average accuracy of 85 percent. Adaptive deep learning for fine-grained image retrieval was used to retrieve cultural heritage content, and it outperformed state-of-the-art methods in fine-grained image retrieval.
Keywords: fine-grained image; adaptive deep learning; cultural heritage; image retrieval
4. Learning Noise-Assisted Robust Image Features for Fine-Grained Image Retrieval (Cited by 1)
Authors: Vidit Kumar, Hemant Petwal, Ajay Krishan Gairola, Pareshwar Prasad Barmola. 《Computer Systems Science & Engineering》 SCIE EI, 2023, No. 9, pp. 2711-2724.
Fine-grained image search is one of the most challenging tasks in computer vision; it aims to retrieve similar images at the fine-grained level for a given query image. The key objective is to learn discriminative fine-grained features by training deep models such that similar images are clustered and dissimilar images are separated in the low-dimensional embedding space. Previous works primarily focused on defining local structure loss functions such as triplet loss and pairwise loss. However, training with these approaches takes a long time and yields poor accuracy; additionally, the representations learned through them tend to tighten up in the embedded space and lose generalizability to unseen classes. This paper proposes a noise-assisted representation learning method for fine-grained image retrieval to mitigate these issues. In the proposed work, class manifold learning is performed in which positive pairs are created with a noise-insertion operation instead of tightening class clusters, and other instances within the same cluster are treated as negatives. A loss function is then defined to penalize cases where the distance between instances of the same class becomes too small relative to the noise pair of that class in the embedded space. The proposed approach is validated on the CARS-196 and CUB-200 datasets and achieves better retrieval results (85.38% recall@1 for CARS-196 and 70.13% recall@1 for CUB-200) than other existing methods.
Keywords: convolutional network; zero-shot learning; fine-grained image retrieval; image representation; image retrieval; intra-class diversity; feature learning
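The noise-assisted pairing idea can be sketched as follows: the positive is a noise-perturbed copy of the anchor, and same-class instances are penalized only when they come closer to the anchor than that noise pair. This is an illustrative reading of the abstract, not the paper's exact loss (all names and the noise scale are assumptions):

```python
import numpy as np

def noise_pair_margin_loss(anchor, same_class, noise_std=0.1, seed=0):
    """Illustrative sketch: the positive is a noise-inserted copy of the
    anchor; other same-class instances incur a hinge penalty when they
    sit closer to the anchor than the noise pair does."""
    rng = np.random.default_rng(seed)
    a = np.asarray(anchor, dtype=float)
    pos = a + rng.normal(0.0, noise_std, size=a.shape)  # noise-inserted positive
    d_pos = np.linalg.norm(a - pos)
    # Hinge: penalize same-class instances that are closer than the noise pair.
    losses = [max(0.0, d_pos - np.linalg.norm(a - np.asarray(x, dtype=float)))
              for x in same_class]
    return float(np.mean(losses))

print(noise_pair_margin_loss([1.0, 0.0], [[1.0, 0.0]]))    # collapsed cluster: penalized
print(noise_pair_margin_loss([1.0, 0.0], [[100.0, 0.0]]))  # far instance: no penalty
```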
5. Triplet Label Based Image Retrieval Using Deep Learning in Large Database (Cited by 1)
Authors: K. Nithya, V. Rajamani. 《Computer Systems Science & Engineering》 SCIE EI, 2023, No. 3, pp. 2655-2666.
In recent years, image retrieval has become a tedious process as image databases have grown much larger; the introduction of Machine Learning (ML) and Deep Learning (DL) has made this process more manageable. In these approaches, pair-wise label similarity is used to find matching images in the database, but this method is limited by its proposed codes and handles misclassified images poorly. To get rid of the above problems, a novel triplet-based label that incorporates a context-spatial similarity measure is proposed. A Point Attention Based Triplet Network (PABTN) is introduced to learn codes that give maximum discriminative ability. To improve the ranking performance, correlating resolutions for the classification, triplet labels based on findings, a spatial-attention mechanism with Region Of Interest (ROI), and a new triplet cross-entropy loss that limits small-trial information loss are used. The experimental results show that the proposed technique exhibits better results in terms of mean Reciprocal Rank (mRR) and mean Average Precision (mAP) on the CIFAR-10 and NUS-WIDE datasets.
Keywords: image retrieval; deep learning; point attention based triplet network; correlating resolutions; classification; region of interest
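The triplet labels discussed above rest on the standard triplet loss, which the following minimal sketch illustrates (the margin value is arbitrary; this is not the paper's full PABTN loss):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: push the anchor-positive distance to be at
    least `margin` smaller than the anchor-negative distance."""
    a, p, n = (np.asarray(v, dtype=float) for v in (anchor, positive, negative))
    return float(max(0.0, np.linalg.norm(a - p) - np.linalg.norm(a - n) + margin))

print(triplet_loss([0, 0], [0, 1], [3, 0]))  # negative is far enough: 0.0
print(triplet_loss([0, 0], [0, 1], [0, 1]))  # tie: only the margin remains
```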
6. An Efficient Deep Learning-based Content-based Image Retrieval Framework (Cited by 1)
Authors: M. Sivakumar, N. M. Saravana Kumar, N. Karthikeyan. 《Computer Systems Science & Engineering》 SCIE EI, 2022, No. 11, pp. 683-700.
The use of massive image databases has increased drastically over the past few years due to the evolution of multimedia technology. Image retrieval has become one of the vital tools in image processing applications, and Content-Based Image Retrieval (CBIR) has been widely used in varied applications. However, the results produced using a single image feature are not satisfactory, so multiple image features are often combined for better results; fast and effective searching for relevant images in a database then becomes a challenging task. A previous CBIR system used a combined feature extraction technique based on the color auto-correlogram, Rotation-Invariant Uniform Local Binary Patterns (RULBP), and local energy. However, that system does not provide significant results in terms of recall and precision, and the computational complexity of existing CBIR systems is high. To handle these issues, the Gray Level Co-occurrence Matrix (GLCM) with a Deep Learning based Enhanced Convolution Neural Network (DLECNN) is proposed in this work. The proposed framework includes noise reduction using histogram equalization, feature extraction using the GLCM, similarity matching using the Hierarchal and Fuzzy c-Means (HFCM) algorithm, and image retrieval using the DLECNN algorithm. Histogram equalization is used for image enhancement, giving the enhanced image a uniform histogram. The GLCM method is then used to extract features such as shape, texture, colour, annotations, and keywords. The HFCM similarity measure computes the query image vector's similarity index against every database image. To enhance the performance of this image retrieval approach, the DLECNN algorithm is proposed to retrieve more accurate features of the image. The proposed GLCM+DLECNN algorithm provides better results with high accuracy, precision, recall, and F-measure at lower complexity. The experimental results clearly show that the proposed system provides efficient image retrieval for a given query image.
Keywords: content-based image retrieval (CBIR); improved gray level co-occurrence matrix (GLCM); hierarchal and fuzzy c-means (HFCM) algorithm; deep learning based enhanced convolution neural network (DLECNN)
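The GLCM at the core of the proposed pipeline is easy to state: it counts how often gray level i co-occurs with gray level j at a fixed pixel offset. A hand-rolled sketch on a toy image (not the paper's implementation; texture statistics such as contrast and energy are then derived from this matrix):

```python
import numpy as np

def glcm(img, levels=4, dx=1, dy=0):
    """Minimal Gray Level Co-occurrence Matrix: count how often gray
    level i is followed by gray level j at offset (dy, dx)."""
    img = np.asarray(img)
    M = np.zeros((levels, levels), dtype=int)
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            M[img[y, x], img[y + dy, x + dx]] += 1
    return M

tiny = np.array([[0, 0, 1],
                 [1, 2, 2],
                 [2, 3, 3]])
print(glcm(tiny))  # 6 horizontal neighbor pairs tallied into a 4x4 matrix
```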
7. Alternating minimization for data-driven computational elasticity from experimental data: kernel method for learning constitutive manifold
Authors: Yoshihiro Kanno. 《Theoretical & Applied Mechanics Letters》 CSCD, 2021, No. 5, pp. 260-265.
Data-driven computing in elasticity attempts to use experimental material data directly, without constructing an empirical model of the constitutive relation, to predict the equilibrium state of a structure subjected to a specified external load. Provided that a data set comprising stress-strain pairs of a material is available, a data-driven method using the kernel method and regularized least-squares was developed to extract a manifold on which the points in the data set approximately lie (Kanno 2021, Jpn. J. Ind. Appl. Math.). From the perspective of physical experiments, the stress field cannot be measured directly, while the displacement and force fields are measurable. In this study, we extend the previous kernel method to the situation where pairs of displacement and force, instead of pairs of stress and strain, are available as the input data set. A new regularized least-squares problem is formulated in this setting, and an alternating minimization algorithm is proposed to solve it.
Keywords: alternating minimization; regularized least-squares; kernel method; manifold learning; data-driven computing
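The alternating minimization strategy can be illustrated on a simpler problem than the paper's elasticity setting, e.g. low-rank matrix fitting, where each half-step is an ordinary least-squares solve with the other block of variables held fixed:

```python
import numpy as np

def alternating_ls(M, rank=1, iters=50, seed=0):
    """Generic alternating minimization (illustrative, not the paper's
    formulation): fit M ~= U @ V by alternately solving the
    least-squares problem in U with V fixed, then in V with U fixed."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.normal(size=(m, rank))
    V = rng.normal(size=(rank, n))
    for _ in range(iters):
        U = np.linalg.lstsq(V.T, M.T, rcond=None)[0].T  # fix V, solve for U
        V = np.linalg.lstsq(U, M, rcond=None)[0]        # fix U, solve for V
    return U, V

M = np.outer([1.0, 2.0], [3.0, 4.0, 5.0])  # exact rank-1 target
U, V = alternating_ls(M, rank=1)
print(np.abs(M - U @ V).max())  # residual shrinks to (near) zero
```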
8. Historical Arabic Images Classification and Retrieval Using Siamese Deep Learning Model
Authors: Manal M. Khayyat, Lamiaa A. Elrefaei, Mashael M. Khayyat. 《Computers, Materials & Continua》 SCIE EI, 2022, No. 7, pp. 2109-2125.
Classifying the visual features in images to retrieve a specific image is a significant problem within the computer vision field, especially when dealing with historical faded color images, and there have been many efforts to automate the classification operation and retrieve similar images accurately. To reach this goal, we developed a VGG19 deep convolutional neural network to extract the visual features from the images automatically. The distances among the extracted feature vectors are then measured, and a similarity score is generated using a Siamese deep neural network. The Siamese model was first built and trained from scratch but did not produce high evaluation metrics, so we rebuilt it from the VGG19 pre-trained deep learning model to obtain better ones. Afterward, three different distance metrics combined with the sigmoid activation function were tested to find the most accurate method for measuring the similarities among the retrieved images; the highest evaluation metrics were obtained with the cosine distance metric. Moreover, a Graphics Processing Unit (GPU) was used to run the code instead of the Central Processing Unit (CPU), which sped up both training and retrieval. After extensive experimentation, we reached a satisfactory solution, recording F-scores of 0.98 for classification and 0.99 for retrieval.
Keywords: visual feature vectors; deep learning models; distance methods; similar image retrieval
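The scoring head can be sketched as cosine distance squashed through a sigmoid; the exact composition used in the paper is not given in the abstract, so treat this arrangement as an assumption:

```python
import numpy as np

def cosine_similarity(u, v):
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def match_score(u, v):
    """Cosine distance of two feature vectors mapped through a sigmoid;
    higher score = more similar (monotone in similarity)."""
    d = 1.0 - cosine_similarity(u, v)  # cosine distance in [0, 2]
    return 1.0 / (1.0 + np.exp(d))

print(match_score([1, 0], [1, 0]))  # identical vectors: 0.5 (maximum of this map)
print(match_score([1, 0], [0, 1]))  # orthogonal vectors: lower score
```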
9. Semi-Supervised Dimensionality Reduction of Hyperspectral Image Based on Sparse Multi-Manifold Learning
Authors: Hong Huang, Fulin Luo, Zezhong Ma, Hailiang Feng. 《Journal of Computer and Communications》, 2015, No. 11, pp. 33-39.
In this paper, we propose a new semi-supervised multi-manifold learning method, called semi-supervised sparse multi-manifold embedding (S3MME), for dimensionality reduction of hyperspectral image data. S3MME exploits both labeled and unlabeled data to adaptively find neighbors of each sample from the same manifold using an optimization program based on sparse representation, and naturally gives relative importance to the labeled samples through a graph-based methodology. It then extracts discriminative features on each manifold such that the data points on the same manifold become closer. The effectiveness of the proposed multi-manifold learning algorithm is demonstrated and compared through experiments on real hyperspectral images.
Keywords: hyperspectral image classification; dimensionality reduction; multiple-manifold structure; sparse representation; semi-supervised learning
10. A Survey of Crime Scene Investigation Image Retrieval Using Deep Learning
Authors: Ying Liu, Aodong Zhou, Jize Xue, Zhijie Xu. 《Journal of Beijing Institute of Technology》 EI CAS, 2024, No. 4, pp. 271-286.
Crime scene investigation (CSI) images are key evidence carriers during criminal investigation, and CSI image retrieval can assist the police in obtaining criminal clues. Moreover, with the rapid development of deep learning, the data-driven paradigm has become the mainstream method for CSI image feature extraction and representation, and in this process datasets provide effective support for CSI retrieval performance. However, there is a lack of systematic research on CSI image retrieval methods and datasets. We therefore present an overview of the existing work on one-class and multi-class CSI image retrieval based on deep learning. Based on their technical functionalities and implementation methods, CSI image retrieval approaches are roughly classified into five categories: feature representation, metric learning, generative adversarial networks, autoencoder networks, and attention networks. Furthermore, we analyze the remaining challenges and discuss future directions for this field.
Keywords: crime scene investigation (CSI) image; image retrieval; deep learning
11. High Precision Self-learning Hashing for Image Retrieval
Authors: Jia-run Fu, Ling-yu Yan, Lu Yuan, Yan Zhou, Hong-xin Zhang, Chun-zhi Wang. 《国际计算机前沿大会会议论文集》, 2018, No. 1, p. 57.
12. Multi-perception large kernel convnet for efficient image super-resolution
Authors: MIAO Xuan, LI Zheng, XU Wen-Zheng. 《四川大学学报(自然科学版)》 北大核心, 2025, No. 1, pp. 67-78.
Significant advancements have been achieved in Single Image Super-Resolution (SISR) through the use of Convolutional Neural Networks (CNNs) to attain state-of-the-art performance. Recent efforts have explored incorporating Transformers to augment network performance in SISR, but the high computational cost of Transformers makes them less suitable for deployment on lightweight devices. Moreover, most enhancements for CNNs rely predominantly on small spatial convolutions, neglecting the potential advantages of large kernel convolution. In this paper, the authors propose a Multi-Perception Large Kernel convNet (MPLKN) that explores large kernel convolution. Specifically, the authors architect a Multi-Perception Large Kernel (MPLK) module aimed at extracting multi-scale features and employ a stepwise feature fusion strategy to seamlessly integrate these features. In addition, to enhance the network's capacity for nonlinear spatial information processing, the authors design a Spatial-Channel Gated Feed-forward Network (SCGFN) that adapts to feature interactions across both spatial and channel dimensions. Experimental results demonstrate that MPLKN outperforms other lightweight image super-resolution models while maintaining a minimal number of parameters and FLOPs.
Keywords: single image super-resolution; lightweight model; deep learning; large kernel
13. Semi-supervised learning based probabilistic latent semantic analysis for automatic image annotation (Cited by 1)
Authors: Tian Dongping. 《High Technology Letters》 EI CAS, 2017, No. 4, pp. 367-374.
In recent years, the multimedia annotation problem has attracted significant research attention in the multimedia and computer vision areas, especially automatic image annotation, whose purpose is to provide an efficient and effective searching environment for users to query their images more easily. In this paper, a semi-supervised learning based probabilistic latent semantic analysis (PLSA) model for automatic image annotation is presented. Since it is often hard to obtain or create labeled images in large quantities while unlabeled ones are easier to collect, a transductive support vector machine (TSVM) is exploited to enhance the quality of the training image data. Because image features with different magnitudes result in different annotation performance, a Gaussian normalization method is utilized to normalize the features extracted from effective image regions segmented by the normalized cuts algorithm, so as to preserve the intrinsic content of the images as completely as possible. Finally, a PLSA model with asymmetric modalities is constructed based on the expectation maximization (EM) algorithm to predict a candidate set of annotations with confidence scores. Extensive experiments on the general-purpose Corel5k dataset demonstrate that the proposed model significantly improves the performance of traditional PLSA for automatic image annotation.
Keywords: automatic image annotation; semi-supervised learning; probabilistic latent semantic analysis (PLSA); transductive support vector machine (TSVM); image segmentation; image retrieval
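The Gaussian normalization step mentioned above is ordinary per-dimension z-score scaling, sketched below on toy data so that features with different magnitudes contribute comparably:

```python
import numpy as np

def gaussian_normalize(features):
    """Per-dimension zero-mean, unit-variance (z-score) scaling, the
    usual reading of 'Gaussian normalization' for feature vectors."""
    X = np.asarray(features, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Two feature dimensions on wildly different scales (toy data).
X = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
Z = gaussian_normalize(X)
print(Z.mean(axis=0), Z.std(axis=0))  # each column now has mean 0, std 1
```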
14. Natural Language Processing with Optimal Deep Learning-Enabled Intelligent Image Captioning System (Cited by 1)
Authors: Radwa Marzouk, Eatedal Alabdulkreem, Mohamed K. Nour, Mesfer Al Duhayyim, Mahmoud Othman, Abu Sarwar Zamani, Ishfaq Yaseen, Abdelwahed Motwakel. 《Computers, Materials & Continua》 SCIE EI, 2023, No. 2, pp. 4435-4451.
The recent developments in Multimedia Internet of Things (MIoT) devices, empowered with Natural Language Processing (NLP) models, seem to be a promising future for smart devices. NLP plays an important role in industrial applications such as speech understanding, emotion detection, home automation, and so on. If an image needs to be captioned, then the objects in that image, their actions and connections, and any salient feature that remains under-projected or missing from the image should be identified. The aim of the image captioning process is to generate a caption for an image, and the image should then be given one of the most significant and detailed descriptions that is both syntactically and semantically correct. In this scenario, a computer vision model is used to identify the objects and NLP approaches are followed to describe the image. The current study develops a Natural Language Processing with Optimal Deep Learning Enabled Intelligent Image Captioning System (NLPODL-IICS), whose aim is to produce a proper description for an input image. To attain this, the proposed NLPODL-IICS follows two stages: encoding and decoding. Initially, on the encoding side, the proposed NLPODL-IICS model makes use of Hunger Games Search (HGS) with the Neural Search Architecture Network (NASNet) model, which represents the input data appropriately by inserting it into a vector of predefined length. During the decoding phase, a Chimp Optimization Algorithm (COA) with a deeper Long Short Term Memory (LSTM) approach is followed to concatenate the description sentences produced by the method. The HGS and COA algorithms accomplish the parameter tuning of the NASNet and LSTM models, respectively. The proposed NLPODL-IICS model was experimentally validated with the help of two benchmark datasets, and a widespread comparative analysis confirmed the superior performance of the NLPODL-IICS model over other models.
Keywords: natural language processing; information retrieval; image captioning; deep learning; metaheuristics
15. Online Learning a Binary Classifier for Improving Google Image Search Results (Cited by 1)
Authors: WAN Yu-Chai, LIU Xia-Bi, HAN Fei-Fei, TONG Kun-Qi, LIU Yu. 《自动化学报》 EI CSCD 北大核心, 2014, No. 8, pp. 1699-1708.
Keywords: search results; online learning; binary classification; Bayesian classifier; algorithm framework; training data; images; support vector machine
16. Efficient Method for Trademark Image Retrieval: Leveraging Siamese and Triplet Networks with Examination-Informed Loss Adjustment
Authors: Thanh Bui-Minh, Nguyen Long Giang, Luan Thanh Le. 《Computers, Materials & Continua》, 2025, No. 7, pp. 1203-1226.
Image-based similar trademark retrieval is a time-consuming and labor-intensive task in the trademark examination process. This paper aims to support trademark examiners by training Deep Convolutional Neural Network (DCNN) models for effective Trademark Image Retrieval (TIR). To achieve this goal, we first develop a novel labeling method that automatically generates hundreds of thousands of labeled similar and dissimilar trademark image pairs using accompanying data fields such as citation lists, Vienna classification (VC) codes, and trademark ownership information. This approach eliminates the need for manual labeling and provides a large-scale dataset suitable for training deep learning models. We then train DCNN models based on Siamese and Triplet architectures, evaluating various feature extractors to determine the most effective configuration. Furthermore, we present an Adapted Contrastive Loss Function (ACLF) for the trademark retrieval task, specifically engineered to mitigate the influence of noisy labels found in automatically created datasets. Experimental results indicate that our proposed model (Efficient-Net_v21_Siamese) performs best at both True Negative Rate (TNR) threshold levels, TNR = 0.9 and TNR = 0.95, with respective True Positive Rates (TPRs) of 77.7% and 70.8% and accuracies of 83.9% and 80.4%. Additionally, when tested on the public trademark dataset METU_v2, our model achieves a normalized average rank (NAR) of 0.0169, outperforming the current state-of-the-art (SOTA) model. Based on these findings, we estimate that considering only approximately 10% of the returned trademarks would be sufficient, significantly reducing review time. The paper therefore highlights the potential of utilizing national trademark data to enhance the accuracy and efficiency of trademark retrieval systems, ultimately supporting trademark examiners in their evaluation tasks.
Keywords: trademark; image retrieval; similar search; similar retrieval; content-based image retrieval; similar ranking; contrastive learning; Siamese; triplet; citation list
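The ACLF builds on the classic contrastive loss; the sketch below shows that base loss only, with the paper's noise-label reweighting deliberately omitted (margin value arbitrary):

```python
def contrastive_loss(d, is_similar, margin=1.0):
    """Classic contrastive loss on an embedding distance d: pull
    similar pairs together, push dissimilar pairs beyond `margin`."""
    if is_similar:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

print(round(contrastive_loss(0.2, True), 4))   # similar pair, small distance: 0.02
print(round(contrastive_loss(0.2, False), 4))  # dissimilar pair inside the margin: 0.32
```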
17. Approximate-Guided Representation Learning in Vision Transformer
Authors: Kaili Wang, Xinwei Sun, Huijie He, Fenhua Bai, Tao Shen. 《CAAI Transactions on Intelligence Technology》, 2025, No. 5, pp. 1459-1477.
In recent years, the transformer model has demonstrated excellent performance in computer vision (CV) applications. The key lies in its attention mechanism for guided representation, which uses the dot-product to depict complex feature relationships and comprehensively understands the context semantics to obtain feature weights; feature enhancement is then implemented by guiding the target matrix through these feature weights. However, uncertainty and inconsistency of features are widespread and tend to confuse the description of relationships within dot-product attention mechanisms. To solve this problem, this paper proposes a novel approximate-guided representation learning methodology for the vision transformer. A kernelised matroid fuzzy rough set is defined, wherein the closed sets inside the kernelised fuzzy information granules of matroid structures constitute the subspace of the lower approximation in rough sets. The kernel relation is thus employed to characterise image feature granules, which are reconstructed according to the independent sets of matroid theory. Then, exploiting the characteristics of closed sets within matroids, the feature attention weight is formed by using the lower approximation to realise the approximate guidance of features. The approximate-guided representation mechanism can be flexibly deployed as a plug-and-play component in a wide range of CV tasks. Extensive empirical results demonstrate that the proposed method outperforms the majority of advanced prevalent models, especially in terms of robustness.
Keywords: computer vision; deep learning; image representation; kernel methods; rough sets
18. UniTrans: Unified Parameter-Efficient Transfer Learning and Multimodal Alignment for Large Multimodal Foundation Model
Authors: Jiakang Sun, Ke Chen, Xinyang He, Xu Liu, Ke Li, Cheng Peng. 《Computers, Materials & Continua》, 2025, No. 4, pp. 219-238.
With advancements in parameter-efficient transfer learning techniques, it has become feasible to leverage large pre-trained language models for downstream tasks under low-cost and low-resource conditions. However, applying this technique to multimodal knowledge transfer introduces a significant challenge: ensuring alignment across modalities while minimizing the number of additional parameters required for downstream task adaptation. This paper introduces UniTrans, a framework aimed at facilitating efficient knowledge transfer across multiple modalities. UniTrans leverages Vector-based Cross-modal Random Matrix Adaptation to enable fine-tuning with minimal parameter overhead. To further enhance modality alignment, we introduce two key components: the Multimodal Consistency Alignment Module and the Query-Augmentation Side Network, specifically optimized for scenarios with extremely limited trainable parameters. Extensive evaluations on various cross-modal downstream tasks demonstrate that our approach surpasses state-of-the-art methods while using just 5% of their trainable parameters, and it achieves superior performance compared to fully fine-tuned models on certain benchmarks.
Keywords: parameter-efficient transfer learning; multimodal alignment; image captioning; image-text retrieval; visual question answering
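Vector-based Random Matrix Adaptation belongs to the low-rank-adapter family; a generic sketch of that family (not UniTrans's exact formulation, and the toy dimensions here are too small to show real savings) illustrates why the trainable-parameter count stays small when the rank is much less than the layer width:

```python
import numpy as np

def adapted_forward(x, W, A, B, scale=1.0):
    """Low-rank-adapter forward pass: the frozen pre-trained weight W is
    augmented by a trainable update B @ A of rank r.  With W of shape
    (d, d) and r << d, only A (r*d) and B (d*r) are trained."""
    W, A, B = (np.asarray(m, dtype=float) for m in (W, A, B))
    return np.asarray(x, dtype=float) @ (W + scale * (B @ A)).T

W = np.eye(2)               # frozen pre-trained weight (toy)
A = np.array([[1.0, 0.0]])  # rank-1 trainable factors (hypothetical values)
B = np.array([[0.0], [1.0]])
print(adapted_forward([1.0, 0.0], W, A, B))  # low-rank update shifts the output
```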
A Deep-Learning and Transfer-Learning Hybrid Aerosol Retrieval Algorithm for FY4-AGRI: Development and Verification over Asia
19
Authors: Disong Fu, Hongrong Shi, Christian A. Gueymard, Dazhi Yang, Yu Zheng, Huizheng Che, Xuehua Fan, Xinlei Han, Lin Gao, Jianchun Bian, Minzheng Duan, Xiangao Xia. Engineering (SCIE, EI, CAS, CSCD), 2024, Issue 7, pp. 164-174 (11 pages)
The Advanced Geosynchronous Radiation Imager (AGRI) is a mission-critical instrument for the Fengyun series of satellites. AGRI acquires full-disk images every 15 min and views East Asia every 5 min through 14 spectral bands, enabling the detection of highly variable aerosol optical depth (AOD). Quantitative retrieval of AOD has hitherto been challenging, especially over land. In this study, an AOD retrieval algorithm is proposed that combines deep learning and transfer learning. The algorithm uses core concepts from both the Dark Target (DT) and Deep Blue (DB) algorithms to select features for the machine-learning (ML) algorithm, allowing for AOD retrieval at 550 nm over both dark and bright surfaces. The algorithm consists of two steps: ① a baseline deep neural network (DNN) with skip connections is developed using 10 min Advanced Himawari Imager (AHI) AODs as the target variable, and ② sunphotometer AODs from 89 ground-based stations are used to fine-tune the DNN parameters. Out-of-station validation shows that the retrieved AOD attains high accuracy, characterized by a coefficient of determination (R2) of 0.70, a mean bias error (MBE) of 0.03, and a percentage of data within the expected error (EE) of 70.7%. A sensitivity study reveals that the top-of-atmosphere reflectance at 650 and 470 nm, as well as the surface reflectance at 650 nm, are the two largest sources of uncertainty impacting the retrieval. In a case study of monitoring an extreme aerosol event, the AGRI AOD is found to be able to capture the detailed temporal evolution of the event. This work demonstrates the superiority of the transfer-learning technique in satellite AOD retrievals and the applicability of the retrieved AGRI AOD in monitoring extreme pollution events.
Keywords: aerosol optical depth, retrieval algorithm, deep learning, transfer learning, Advanced Geosynchronous Radiation Imager
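The two-step scheme in the abstract (pretrain on abundant proxy labels, then fine-tune on sparse ground truth) can be illustrated with a toy regression. This stands in for the AHI-pretraining and sunphotometer fine-tuning only schematically: the model, data, and learning rates below are synthetic assumptions, not the paper's DNN:

```python
import numpy as np

rng = np.random.default_rng(2)

def train(X, y, w, lr=0.01, steps=500):
    """Plain gradient descent on mean squared error."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# Step 1: pretrain on a large, slightly biased proxy target
# (stand-in for the abundant AHI AOD labels).
true_w = np.array([0.5, -0.3, 0.8])
X_big = rng.standard_normal((1000, 3))
y_big = X_big @ (true_w + 0.2) + 0.05 * rng.standard_normal(1000)
w_pre = train(X_big, y_big, np.zeros(3))

# Step 2: fine-tune from the pretrained weights on a small, accurate
# target (stand-in for the 89-station sunphotometer AODs).
X_small = rng.standard_normal((50, 3))
y_small = X_small @ true_w
w_ft = train(X_small, y_small, w_pre, lr=0.05, steps=200)

err_pre = np.linalg.norm(w_pre - true_w)
err_post = np.linalg.norm(w_ft - true_w)
print(err_pre > err_post)  # True: fine-tuning removes the proxy bias
```

The point the toy makes is the same as the paper's: the proxy-trained model inherits the proxy's bias, and a small amount of high-quality ground truth is enough to correct it when training starts from the pretrained weights.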
Multimodal Fused Deep Learning Networks for Domain Specific Image Similarity Search
20
Authors: Umer Waqas, Jesse Wiebe Visser, Hana Choe, Donghun Lee. Computers, Materials & Continua (SCIE, EI), 2023, Issue 4, pp. 243-258 (16 pages)
The exponential increase in data over the past few years, particularly in images, has led to more complex content, since visual representation has become the new norm. E-commerce and similar platforms maintain large image catalogues of their products. In image databases, searching for and retrieving similar images is still a challenge, even though several image retrieval techniques have been proposed over the past decade. Most of these techniques work well when querying general image databases. However, they often fail in domain-specific image databases, especially for datasets with low intraclass variance. This paper proposes a domain-specific image similarity search engine based on a fused deep learning network. The network comprises an improved object localization module, a classification module to narrow down search options, and finally a feature extraction and similarity calculation module. The network features both an offline stage for indexing the dataset and an online stage for querying. The dataset used to evaluate the performance of the proposed network is a custom domain-specific dataset related to cosmetics packaging, gathered from various online platforms. The proposed method addresses the intraclass variance problem with more precise object localization and the introduction of top-result reranking based on object contours. Finally, quantitative and qualitative experiment results are presented, showing improved image similarity search performance.
Keywords: image search, classification, image retrieval, deep learning
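The offline-indexing / online-querying split described in the abstract is commonly realised as cosine similarity over L2-normalised feature vectors. A generic sketch follows, with random vectors standing in for the deep-network embeddings (the dimensions and seed are illustrative assumptions):

```python
import numpy as np

def build_index(features):
    """Offline stage: L2-normalise the catalogue's feature vectors once."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / norms

def query(index, q, top_k=3):
    """Online stage: cosine similarity reduces to one matrix-vector product."""
    q = q / np.linalg.norm(q)
    sims = index @ q
    order = np.argsort(-sims)[:top_k]     # highest similarity first
    return order, sims[order]

rng = np.random.default_rng(3)
catalogue = rng.standard_normal((100, 16))  # stand-in for extracted features
index = build_index(catalogue)

# Querying with a slightly perturbed copy of item 42 should rank it first.
q = catalogue[42] + 0.01 * rng.standard_normal(16)
ids, scores = query(index, q)
print(ids[0])  # 42
```

In a full system such as the one described, the candidate set would first be narrowed by the classification module, and the top results reranked (here, by object contours) before being returned.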