Tomato Growth Height Prediction Method by Phenotypic Feature Extraction Using Multi-modal Data
1
Authors: GONG Yu, WANG Ling, ZHAO Rongqiang, YOU Haibo, ZHOU Mo, LIU Jie. Source: 《智慧农业(中英文)》, 2025, No. 1, pp. 97-110 (14 pages)
[Objective] Accurate prediction of tomato growth height is crucial for optimizing production environments in smart farming. However, current prediction methods predominantly rely on empirical, mechanistic, or learning-based models that use either image data or environmental data, and so fail to fully leverage multi-modal data to capture the diverse aspects of plant growth. [Methods] To address this limitation, a two-stage phenotypic feature extraction (PFE) model based on the deep learning algorithms of recurrent neural network (RNN) and long short-term memory (LSTM) was developed. The model integrated environment and plant information to provide a holistic view of the growth process, and employed phenotypic and temporal feature extractors to capture both types of features. This enabled a deeper understanding of the interaction between tomato plants and their environment, ultimately leading to highly accurate predictions of growth height. [Results and Discussions] The experimental results showed the model's effectiveness: when predicting the next two days based on the past five days, the PFE-based RNN and LSTM models achieved mean absolute percentage errors (MAPE) of 0.81% and 0.40%, respectively, significantly lower than the 8.00% MAPE of the large language model (LLM) and the 6.72% MAPE of the Transformer-based model. In the longer-term predictions (the 10-day prediction for 4 days ahead and the 30-day prediction for 12 days ahead), the PFE-RNN model continued to outperform the other two baseline models, with MAPE of 2.66% and 14.05%, respectively. [Conclusions] The proposed method, which leverages phenotypic-temporal collaboration, shows great potential for intelligent, data-driven management of tomato cultivation, making it a promising approach for enhancing the efficiency and precision of smart tomato planting management.
Keywords: tomato growth prediction; deep learning; phenotypic feature extraction; multi-modal data; recurrent neural network; long short-term memory; large language model
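The abstract above reports its results as mean absolute percentage error (MAPE). As a reminder of what that metric measures, here is a minimal sketch of the standard definition; the function name and the sample heights are illustrative assumptions and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Average of |actual - predicted| / |actual| over all samples.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical plant heights (cm) and forecasts, for illustration only.
heights_cm = [100.0, 200.0]
forecast_cm = [99.0, 202.0]
error = mape(heights_cm, forecast_cm)
print(f"MAPE = {error:.2f}%")
```

Each sample here is off by 1% of its true value, so the metric is insensitive to the absolute scale of the plant height.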
Multi-modal face parts fusion based on Gabor feature for face recognition (cited by: 1)
2
Author: 相燕. Source: High Technology Letters (EI, CAS), 2009, No. 1, pp. 70-74 (5 pages)
A novel face recognition method, which is a fusion of multi-modal face parts based on Gabor feature (MMP-GF), is proposed in this paper. Firstly, the bare face image detached from the normalized image was convolved with a family of Gabor kernels, and then, according to the face structure and the key-point locations, the calculated Gabor images were divided into five parts: Gabor face, Gabor eyebrow, Gabor eye, Gabor nose and Gabor mouth. After that, the multi-modal Gabor features were spatially partitioned into non-overlapping regions, and the averages of the regions were concatenated into a low-dimensional feature vector, whose dimension was further reduced by principal component analysis (PCA). In the decision-level fusion, match results calculated separately from the five parts were combined according to linear discriminant analysis (LDA), and a normalized matching algorithm was used to improve the performance. Experiments on the FERET database show that the proposed MMP-GF method achieves good robustness to expression and age variations.
Keywords: Gabor filter; multi-modal Gabor features; principal component analysis (PCA); linear discriminant analysis (LDA); normalized matching algorithm
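The partition-and-average step described in the abstract above can be sketched as follows. This is only an illustration of averaging non-overlapping regions of a 2D response map into a feature vector; the block size, the function name and the toy input are assumptions, and the Gabor filtering and PCA stages are not shown.

```python
def block_average_features(response, block_h, block_w):
    """Average each non-overlapping block of a 2D map into one feature value."""
    rows, cols = len(response), len(response[0])
    if rows % block_h or cols % block_w:
        raise ValueError("map size must be a multiple of the block size")
    features = []
    for top in range(0, rows, block_h):
        for left in range(0, cols, block_w):
            block = [
                response[r][c]
                for r in range(top, top + block_h)
                for c in range(left, left + block_w)
            ]
            features.append(sum(block) / len(block))
    return features


# Toy 4x4 "Gabor response" split into four 2x2 regions (row-major order).
toy_response = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
vector = block_average_features(toy_response, 2, 2)
print(vector)
```

Averaging per region shrinks a 16-value map to a 4-value vector here, which is the kind of low-dimensional representation the abstract then reduces further with PCA.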
Unsupervised multi-modal image translation based on the squeeze-and-excitation mechanism and feature attention module (cited by: 1)
3
Authors: 胡振涛, HU Chonghao, YANG Haoran, SHUAI Weiwei. Source: High Technology Letters (EI, CAS), 2024, No. 1, pp. 23-30 (8 pages)
Unsupervised multi-modal image translation is an emerging domain of computer vision whose goal is to transform an image from the source domain into many diverse styles in the target domain. However, the advanced approaches available employ a multi-generator mechanism to model different domain mappings, which results in inefficient training of neural networks and mode collapse, and in turn limits the diversity of the generated images. To address this issue, this paper introduces a multi-modal unsupervised image translation framework that uses a single generator to perform multi-modal image translation. Specifically, a domain code is first introduced to explicitly control the different generation tasks. Secondly, the squeeze-and-excitation (SE) mechanism and a feature attention (FA) module are brought in. Finally, the model integrates multiple optimization objectives to ensure efficient multi-modal translation. Qualitative and quantitative experiments on multiple unpaired benchmark image translation datasets demonstrate the benefits of the proposed method over existing technologies. Overall, the experimental results show that the proposed method is versatile and scalable.
Keywords: multi-modal image translation; generative adversarial network (GAN); squeeze-and-excitation (SE) mechanism; feature attention (FA) module
Robust Symmetry Prediction with Multi-Modal Feature Fusion for Partial Shapes
4
Authors: Junhua Xi, Kouquan Zheng, Yifan Zhong, Longjiang Li, Zhiping Cai, Jinjing Chen. Source: Intelligent Automation & Soft Computing (SCIE), 2023, No. 3, pp. 3099-3111 (13 pages)
In geometry processing, symmetry research benefits from global geometric features of complete shapes, but the shape of an object captured in real-world applications is often incomplete due to limited sensor resolution, a single viewpoint, and occlusion. Unlike existing works that predict symmetry from the complete shape, we propose a learning approach for symmetry prediction based on a single RGB-D image. Instead of directly predicting the symmetry from incomplete shapes, our method consists of two modules: the multi-modal feature fusion module and the detection-by-reconstruction module. Firstly, we build a channel-transformer network (CTN) to extract cross-fusion features from the RGB-D image as the multi-modal feature fusion module, which helps us aggregate features from the color and the depth separately. Then, our self-reconstruction network based on a 3D variational auto-encoder (3D-VAE) takes the global geometric features as input, followed by a symmetry prediction network that detects the symmetry. Experiments conducted on three public datasets (ShapeNet, YCB, and ScanNet) demonstrate that our method can produce reliable and accurate results.
Keywords: symmetry prediction; multi-modal feature fusion; partial shapes
Adaptive multi-modal feature fusion for far and hard object detection
5
Authors: LI Yang, GE Hongwei. Source: Journal of Measurement Science and Instrumentation (CAS, CSCD), 2021, No. 2, pp. 232-241 (10 pages)
To address the difficulty of detecting far and hard objects caused by the sparseness and insufficient semantic information of LiDAR point clouds, a 3D object detection network with adaptive multi-modal data fusion is proposed, which makes use of multi-neighborhood information of voxels and image information. Firstly, an improved ResNet is designed that maintains the structure information of far and hard objects in low-resolution feature maps, which makes it more suitable for the detection task. Meanwhile, the semantics of each image feature map are enhanced by semantic information from all subsequent feature maps. Secondly, multi-neighborhood context information with different receptive field sizes is extracted to compensate for the sparseness of the point cloud, which improves the ability of voxel features to represent the spatial structure and semantic information of objects. Finally, a multi-modal feature adaptive fusion strategy is proposed, which uses learnable weights to express the contribution of different modal features to the detection task, and voxel attention further enhances the fused feature expression of effective target objects. The experimental results on the KITTI benchmark show that this method outperforms VoxelNet by remarkable margins, increasing the AP by 8.78% and 5.49% on the medium and hard difficulty levels. Meanwhile, the method achieves better detection performance than many mainstream multi-modal methods, improving the AP by 1% over MVX-Net on the medium and hard difficulty levels.
Keywords: 3D object detection; adaptive fusion; multi-modal data fusion; attention mechanism; multi-neighborhood features
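The abstract above mentions learnable weights that express each modality's contribution to the fused feature. One common way to realise that idea is a softmax-normalised weighted sum, sketched below. The softmax normalisation, the function names and the toy vectors are assumptions made for illustration; the paper's actual fusion layer may differ, and no training loop is shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(features_by_modality, modality_logits):
    """Weighted sum of equally sized feature vectors, one weight per modality."""
    length = len(features_by_modality[0])
    if any(len(f) != length for f in features_by_modality):
        raise ValueError("all modality features must have the same length")
    weights = softmax(modality_logits)  # the logits would be learned in practice
    return [
        sum(w * f[i] for w, f in zip(weights, features_by_modality))
        for i in range(length)
    ]


# Hypothetical voxel and image features; equal logits give equal weights.
voxel_feat = [1.0, 0.0]
image_feat = [0.0, 1.0]
fused = fuse([voxel_feat, image_feat], [0.0, 0.0])
print(fused)
```

Raising one logit during training would shift the fused vector towards that modality, which is the sense in which the weights "express the contribution" of each input.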
MMGC-Net: Deep neural network for classification of mineral grains using multi-modal polarization images (cited by: 1)
6
Authors: Jun Shu, Xiaohai He, Qizhi Teng, Pengcheng Yan, Haibo He, Honggang Chen. Source: Journal of Rock Mechanics and Geotechnical Engineering, 2025, No. 6, pp. 3894-3909 (16 pages)
The multi-modal characteristics of mineral particles play a pivotal role in enhancing classification accuracy, which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation and utilization of its resources. However, the existing methods for classifying mineral particles do not fully utilize these multi-modal features, thereby limiting the classification accuracy. Furthermore, when conventional multi-modal image classification methods are applied to plane-polarized and cross-polarized sequence images of mineral particles, they encounter issues such as information loss, misaligned features, and challenges in spatiotemporal feature extraction. To address these challenges, we propose a multi-modal mineral particle polarization image classification network (MMGC-Net) for precise mineral particle classification. Initially, MMGC-Net employs a two-dimensional (2D) backbone network with shared parameters to extract features from the two types of polarized images to ensure feature alignment. Subsequently, a cross-polarized intra-modal feature fusion module is designed to refine the spatiotemporal features from the extracted features of the cross-polarized sequence images. Ultimately, the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision. Quantitative and qualitative experimental results indicate that, compared with the current state-of-the-art multi-modal image classification methods, MMGC-Net demonstrates marked superiority in mineral particle multi-modal feature learning and on four classification evaluation metrics. It also demonstrates better stability than the existing models.
Keywords: mineral particles; multi-modal image classification; shared parameters; feature fusion; spatiotemporal feature
Joint Feature Encoding and Task Alignment Mechanism for Emotion-Cause Pair Extraction
7
Authors: Shi Li, Didi Sun. Source: Computers, Materials & Continua (SCIE, EI), 2025, No. 1, pp. 1069-1086 (18 pages)
With the rapid expansion of social media, analyzing emotions and their causes in texts has gained significant importance. Emotion-cause pair extraction enables the identification of causal relationships between emotions and their triggers within a text, facilitating a deeper understanding of expressed sentiments and their underlying reasons. This comprehension is crucial for making informed strategic decisions in various business and societal contexts. However, recent research approaches employing multi-task learning frameworks often face challenges such as the inability to simultaneously model extracted features and their interactions, or inconsistencies in label prediction between emotion-cause pair extraction and independent assistant tasks like emotion and cause extraction. To address these issues, this study proposes an emotion-cause pair extraction methodology that incorporates joint feature encoding and task alignment mechanisms. The model consists of two primary components. First, joint feature encoding simultaneously generates features for emotion-cause pairs and clauses, enhancing feature interactions between emotion clauses, cause clauses, and emotion-cause pairs. Second, the task alignment technique is applied to reduce the labeling distance between emotion-cause pair extraction and the two assistant tasks, capturing deep semantic information interactions among tasks. The proposed method is evaluated on a Chinese benchmark corpus using 10-fold cross-validation, assessing key performance metrics such as precision, recall, and F1 score. Experimental results demonstrate that the model achieves an F1 score of 76.05%, surpassing the state of the art by 1.03%. The proposed model exhibits significant improvements in emotion-cause pair extraction (ECPE) and cause extraction (CE) compared to existing methods, validating its effectiveness. This research introduces a novel approach based on joint feature encoding and task alignment mechanisms, contributing to advancements in emotion-cause pair extraction. However, the study's limitation lies in the data sources, potentially restricting the generalizability of the findings.
Keywords: emotion-cause pair extraction; interactive information enhancement; joint feature encoding; label consistency; task alignment mechanisms
Multi-Modal Pre-Synergistic Fusion Entity Alignment Based on Mutual Information Strategy Optimization
8
Authors: Huayu Li, Xinxin Chen, Lizhuang Tan, Konstantin I. Kostromitin, Athanasios V. Vasilakos, Peiying Zhang. Source: Computers, Materials & Continua, 2025, No. 11, pp. 4133-4153 (21 pages)
To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising from modal heterogeneity during fusion, while also capturing shared information across modalities, this paper proposes a Multi-modal Pre-synergistic Entity Alignment model based on Cross-modal Mutual Information Strategy Optimization (MPSEA). The model first employs independent encoders to process multi-modal features, including text, images, and numerical values. Next, a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information. This pre-fusion strategy enables unified perception of heterogeneous modalities at the model's initial stage, reducing discrepancies during the fusion process. Finally, using cross-modal deep perception reinforcement learning, the model achieves adaptive multilevel feature fusion between modalities, supporting the learning of more effective alignment strategies. Extensive experiments on multiple public datasets show that the MPSEA method achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset, and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset, compared to existing state-of-the-art methods. These results confirm the effectiveness of the proposed model.
Keywords: knowledge graph; multi-modal; entity alignment; feature fusion; pre-synergistic fusion
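The entity alignment results above are reported as Hits@1 and MRR. Both are standard ranking metrics computed from the rank at which each correct counterpart entity appears; the sketch below shows the usual definitions. The function names and the sample ranks are illustrative assumptions, not data from the paper.

```python
def hits_at_k(ranks, k):
    """Fraction of queries whose correct entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1 / rank of the correct entity (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks of the true match for four source entities.
sample_ranks = [1, 2, 4, 1]
hits1 = hits_at_k(sample_ranks, 1)
mrr = mean_reciprocal_rank(sample_ranks)
print(f"Hits@1 = {hits1:.4f}, MRR = {mrr:.4f}")
```

Hits@1 only rewards exact top-ranked matches, while MRR also gives partial credit to near misses, which is why the two gains quoted in the abstract need not move together.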
Tri-M2MT: Multi-modalities based effective acute bilirubin encephalopathy diagnosis through multi-transformer using neonatal Magnetic Resonance Imaging
9
Authors: Kumar Perumal, Rakesh Kumar Mahendran, Arfat Ahmad Khan, Seifedine Kadry. Source: CAAI Transactions on Intelligence Technology, 2025, No. 2, pp. 434-449 (16 pages)
Acute Bilirubin Encephalopathy (ABE) is a significant threat to neonates and leads to disability and high mortality rates. Detecting and treating ABE promptly is important to prevent further complications and long-term issues. Recent studies have explored ABE diagnosis. However, they often face limitations in classification due to reliance on a single modality of Magnetic Resonance Imaging (MRI). To tackle this problem, the authors propose a Tri-M2MT model for precise ABE detection using tri-modality MRI scans. The scans include T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), and apparent diffusion coefficient maps to obtain in-depth information. Initially, the tri-modality MRI scans are collected and preprocessed using an Advanced Gaussian Filter for noise reduction and Z-score normalisation for data standardisation. An Advanced Capsule Network is then used to extract relevant features, with the Snake Optimization Algorithm selecting optimal features based on feature correlation, with the aim of minimising complexity and enhancing detection accuracy. Furthermore, a multi-transformer approach is used to fuse features and identify feature correlations effectively. Finally, ABE diagnosis is performed by a SoftMax layer. The performance of the proposed Tri-M2MT model is evaluated across various metrics, including accuracy, specificity, sensitivity, F1-score, and ROC curve analysis, and the proposed methodology provides better performance than existing methodologies.
Keywords: Acute Bilirubin Encephalopathy (ABE); diagnosis; feature extraction; MRI; multi-modality; multi-transformer; neonatal
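The preprocessing described above includes Z-score normalisation. The sketch below shows the standard operation on a flat list of intensities: subtract the mean and divide by the standard deviation. The function name and the sample values are assumptions for illustration; the Gaussian filtering step and any per-scan or per-modality handling used by the authors are not shown.

```python
import statistics


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot normalise a constant signal")
    return [(v - mean) / std for v in values]


# Hypothetical voxel intensities; their mean is 5.0 and population std is 2.0.
intensities = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
normalised = z_score(intensities)
print(normalised)
```

Standardising each scan this way puts T1WI, T2WI and diffusion maps on a comparable numeric scale before features are extracted.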
Advanced Feature Selection Techniques in Medical Imaging--A Systematic Literature Review
10
Authors: Sunawar Khan, Tehseen Mazhar, Naila Sammar Naz, Fahed Ahmed, Tariq Shahzad, Atif Ali, Muhammad Adnan Khan, Habib Hamam. Source: Computers, Materials & Continua, 2025, No. 11, pp. 2347-2401 (55 pages)
Feature selection (FS) plays a crucial role in medical imaging by reducing dimensionality, improving computational efficiency, and enhancing diagnostic accuracy. Traditional FS techniques, including filter, wrapper, and embedded methods, have been widely used but often struggle with high-dimensional and heterogeneous medical imaging data. Deep learning-based FS methods, particularly Convolutional Neural Networks (CNNs) and autoencoders, have demonstrated superior performance but lack interpretability. Hybrid approaches that combine classical and deep learning techniques have emerged as a promising solution, offering improved accuracy and explainability. Furthermore, integrating multi-modal imaging data (e.g., Magnetic Resonance Imaging (MRI), Computed Tomography (CT), Positron Emission Tomography (PET), and Ultrasound (US)) poses additional challenges in FS, necessitating advanced feature fusion strategies. Multi-modal feature fusion combines information from different imaging modalities to improve diagnostic accuracy. Recently, quantum computing has gained attention as a revolutionary approach for FS, providing the potential to handle high-dimensional medical data more efficiently. This systematic literature review comprehensively examines classical, Deep Learning (DL), hybrid, and quantum-based FS techniques in medical imaging. Key outcomes include a structured taxonomy of FS methods, a critical evaluation of their performance across modalities, and identification of core challenges such as computational burden, interpretability, and ethical considerations. Future research directions, such as explainable AI (XAI), federated learning, and quantum-enhanced FS, are also emphasized to bridge the current gaps. This review provides actionable insights for developing scalable, interpretable, and clinically applicable FS methods in the evolving landscape of medical imaging.
Keywords: feature selection; medical imaging; deep learning; hybrid approaches; multi-modal imaging; quantum computing; explainable AI; computational efficiency; dimensionality reduction
The Role and Features of Student Peer Feedback in Interpreting Training: Preliminary Findings of a Survey of Both Trainers and Students
11
Author: 万宏瑜. Source: 《海外英语》, 2016, No. 5, pp. 234-238, 240 (6 pages)
The project reports the preliminary findings of a survey of both trainers and students on the practice of using student peer feedback in interpreting practice. It first explains the theoretical foundation that justifies the use of peer feedback in interpreting practice, the research methodology, and the data collection. It then presents specific findings concerning the implementation of peer feedback in the interpreting class, followed by discussion of the role and features of student peer feedback as a means of helping students get ready for the booth. Analysis of the results shows that peer feedback in interpreting practice keeps students on-task and attentive, and helps them spot their own problems. Trainers and students alike point to similar features of student peer feedback, namely a focus on comprehension of the original, word choice, and numbers. The preliminary findings of the survey demonstrate the roles and features of student peer feedback in interpreting practice and point to a possible way of enhancing students' learning curve through more effective peer feedback.
Keywords: interpreting training; peer feedback; role and features; trainers and students; consistent
Multi-modality hierarchical fusion network for lumbar spine segmentation with magnetic resonance images (cited by: 1)
12
Authors: Han Yan, Guangtao Zhang, Wei Cui, Zhuliang Yu. Source: Control Theory and Technology (EI, CSCD), 2024, No. 4, pp. 612-622 (11 pages)
For the analysis of spinal and disc diseases, automated tissue segmentation of the lumbar spine is vital. Due to the continuous and concentrated location of the target, the abundance of edge features, and individual differences, conventional automatic segmentation methods perform poorly. Since deep learning has proved successful in medical image segmentation in the past few years, it has been applied to this task in a number of ways. The multi-scale and multi-modal features of lumbar tissues, however, are rarely explored by deep learning methods. Because the availability of medical images is limited, it is crucial to effectively fuse data from various acquisition modes for model training to alleviate the problem of insufficient samples. In this paper, we propose a novel multi-modality hierarchical fusion network (MHFN) for improving lumbar spine segmentation by learning robust feature representations from multi-modality magnetic resonance images. An adaptive group fusion module (AGFM) is introduced to fuse features from various modes and extract potentially valuable cross-modality features. Furthermore, to combine cross-modality features from low to high levels, we design a hierarchical fusion structure based on AGFM. Experimental results on multi-modality MR images of the lumbar spine show that AGFM is more effective than the other feature fusion methods. We also compare our network with baseline fusion structures: compared to these baselines (input-level: 76.27%, layer-level: 78.10%, decision-level: 79.14%), our network was able to segment fractured vertebrae more accurately (85.05%).
Keywords: lumbar spine segmentation; deep learning; multi-modality fusion; feature fusion
An Effective Image Retrieval Mechanism Using Family-based Spatial Consistency Filtration with Object Region (cited by: 1)
13
Authors: Jing Sun, Ying-Jie Xing. Source: International Journal of Automation and Computing (EI), 2010, No. 1, pp. 23-30 (8 pages)
Constructing an appropriate spatial consistency measure is the key to improving image retrieval performance. To address this problem, this paper introduces a novel image retrieval mechanism based on family filtration in an object region. First, an object region is supplied by selecting a rectangle in a query image, and the system returns a ranked list of images that contain the same object, retrieved from a corpus of 100 images, as the result of the first ranking. To further improve retrieval performance, we add an efficient spatial consistency stage, named family-based spatial consistency filtration, to re-rank the results returned by the first ranking. We evaluate the performance of the retrieval system through experiments on a dataset selected from the key frames of "TREC Video Retrieval Evaluation 2005 (TRECVID2005)". The experimental results show that the proposed retrieval mechanism has a major effect on retrieval quality. The paper also verifies the stability of the retrieval mechanism by increasing the number of images from 100 to 2000, and realizes generalized retrieval with objects from outside the dataset.
Keywords: content-based image retrieval; object region; family-based spatial consistency filtration; local affine invariant feature; spatial relationship
Improving VQA via Dual-Level Feature Embedding Network (cited by: 1)
14
Authors: Yaru Song, Huahu Xu, Dikai Fang. Source: Intelligent Automation & Soft Computing, 2024, No. 3, pp. 397-416 (20 pages)
Visual Question Answering (VQA) has sparked widespread interest as a crucial task in integrating vision and language. VQA primarily uses attention mechanisms to associate relevant visual regions with input questions in order to answer them effectively. The detection-based features extracted by an object detection network aim to acquire the visual attention distribution on predetermined detection frames and provide object-level insights, which helps answer questions about foreground objects more effectively. However, owing to the lack of fine-grained details, they cannot answer questions about background regions without detection boxes, which is where grid-based features have the advantage. In this paper, we propose a Dual-Level Feature Embedding (DLFE) network, which integrates grid-based and detection-based image features in a unified architecture to realize the complementary advantages of both. Specifically, in DLFE, a novel Dual-Level Self-Attention (DLSA) module is first proposed to mine the intrinsic properties of the two features, where Positional Relation Attention (PRA) is designed to model the position information. Then, we propose a Feature Fusion Attention (FFA) to address the semantic noise caused by the fusion of the two features, and construct an alignment graph to enhance and align the grid and detection features. Finally, we use co-attention to learn the interactive features of the image and the question, and so answer questions more accurately. Our method improves significantly on the baseline, increasing accuracy from 66.01% to 70.63% on the test-std dataset of VQA 1.0 and from 66.24% to 70.91% on the test-std dataset of VQA 2.0.
Keywords: visual question answering; multi-modal feature processing; attention mechanisms; cross-model fusion
Feature Selection for SVM Classifiers Based on Discretization
15
Authors: 李烨, 蔡云泽, 许晓鸣. Source: Journal of Shanghai Jiaotong University (Science) (EI), 2005, No. 3, pp. 268-273 (6 pages)
The rough sets and Boolean reasoning based discretization approach (RSBRA) is not suitable for feature selection for machine learning algorithms such as neural networks or SVM, because the information loss due to discretization is large. A modified RSBRA for feature selection was proposed and evaluated with SVM classifiers. In the presented algorithm, the level of consistency, a notion from rough sets theory, is introduced to replace the stop criterion of the RSBRA loop, which maintains the fidelity of the training set after discretization. The experimental results show that the modified algorithm has better predictive accuracy and less training time than the original RSBRA.
Keywords: feature selection; discretization; rough sets; SVM classification; level of consistency
Attentive Neighborhood Feature Augmentation for Semi-supervised Learning
16
Authors: Qi Liu, Jing Li, Xianmin Wang, Wenpeng Zhao. Source: Intelligent Automation & Soft Computing (SCIE), 2023, No. 8, pp. 1753-1771 (19 pages)
Recent state-of-the-art semi-supervised learning(SSL)methods usually use data augmentations as core components.Such methods,however,are limited to simple transformations such as the augmentations under the instance’s... Recent state-of-the-art semi-supervised learning(SSL)methods usually use data augmentations as core components.Such methods,however,are limited to simple transformations such as the augmentations under the instance’s naive representations or the augmentations under the instance’s semantic representations.To tackle this problem,we offer a unique insight into data augmentations and propose a novel data-augmentation-based semi-supervised learning method,called Attentive Neighborhood Feature Aug-mentation(ANFA).The motivation of our method lies in the observation that the relationship between the given feature and its neighborhood may contribute to constructing more reliable transformations for the data,and further facilitating the classifier to distinguish the ambiguous features from the low-dense regions.Specially,we first project the labeled and unlabeled data points into an embedding space and then construct a neighbor graph that serves as a similarity measure based on the similar representations in the embedding space.Then,we employ an attention mechanism to transform the target features into augmented ones based on the neighbor graph.Finally,we formulate a novel semi-supervised loss by encouraging the predictions of the interpolations of augmented features to be consistent with the corresponding interpolations of the predictions of the target features.We carried out exper-iments on SVHN and CIFAR-10 benchmark datasets and the experimental results demonstrate that our method outperforms the state-of-the-art methods when the number of labeled examples is limited. 展开更多
Keywords: Semi-supervised learning; attention mechanism; feature augmentation; consistency regularization
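The neighbor-graph and attention steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' ANFA implementation: the cosine similarity measure, the softmax attention over neighbor similarities, the `temperature` parameter, and all function names are assumptions chosen for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def knn_graph(features, k):
    """For each point, list the indices of its k most similar other points."""
    graph = []
    for i, f in enumerate(features):
        sims = [(cosine(f, g), j) for j, g in enumerate(features) if j != i]
        sims.sort(reverse=True)
        graph.append([j for _, j in sims[:k]])
    return graph

def attentive_augment(features, graph, temperature=1.0):
    """Build each augmented feature as a softmax-weighted mix of its neighbors.

    Assumption: attention scores are cosine similarities divided by a
    temperature; the paper may define them differently.
    """
    augmented = []
    for i, f in enumerate(features):
        scores = [cosine(f, features[j]) / temperature for j in graph[i]]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        augmented.append([
            sum(w * features[j][d] for w, j in zip(weights, graph[i]))
            for d in range(len(f))
        ])
    return augmented

# Toy embeddings: two loose clusters in 2-D (made-up values).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = knn_graph(features, k=2)
augmented = attentive_augment(features, graph)
```

Because the attention weights sum to one, each augmented feature is a convex combination of its neighbors, so it stays inside the region spanned by the neighborhood.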
Motion estimation based feature selection for visual SLAM
17
Authors: 孟旭炯, Jiang Rongxin, Zhou Fan, Chen Yaowu 《High Technology Letters》 EI CAS 2011, No. 4, pp. 433-438 (6 pages)
Feature selection is always an important issue in the visual SLAM (simultaneous localization and mapping) literature. Considering that location estimation can be improved by tracking features with longer visible time, a new feature selection method based on motion estimation is proposed. First, a k-step iteration algorithm is presented for visible time estimation using an affine motion model; then a delayed feature detection method is introduced for efficiently detecting features with the maximum visible time. To validate the proposed method, both simulation and real data experiments are carried out. Results show that the proposed method can improve both the estimation performance and the computational performance compared with the existing random feature selection method.
Keywords: visual SLAM; feature selection; motion estimation; computational efficiency; consistency; extended Kalman filter (EKF)
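The idea of estimating visible time by iterating an affine motion model for k steps can be sketched as follows. This is a hypothetical illustration rather than the paper's algorithm: the motion parameters `A` and `b`, the image bounds, and the rule of counting steps until the point leaves the frame are all assumptions.

```python
def visible_time(point, A, b, width, height, k):
    """Count how many predicted steps (at most k) the point stays in the image.

    Assumption: image motion follows x_next = A @ x + b with a fixed 2x2
    matrix A and offset b for every step.
    """
    x, y = point
    for step in range(k):
        x, y = (A[0][0] * x + A[0][1] * y + b[0],
                A[1][0] * x + A[1][1] * y + b[1])
        if not (0 <= x < width and 0 <= y < height):
            return step  # left the frame on this step
    return k

# Made-up example: pure translation of 10 px per frame to the right.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [10.0, 0.0]
candidates = [(75.0, 50.0), (5.0, 50.0)]
times = [visible_time(p, A, b, width=100, height=100, k=10) for p in candidates]

# Select the candidate predicted to remain visible the longest.
best = max(zip(times, candidates))[1]
```

Under this toy motion, a feature near the edge the camera is moving away from would be selected, since it is predicted to stay in view longest.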
Speed-up Multi-modal Near Duplicate Image Detection
18
Authors: Chunlei Yang, Jinye Peng, Jianping Fan 《Open Journal of Applied Sciences》 2013, No. 1, pp. 16-21 (6 pages)
Near-duplicate image detection is a necessary operation to refine image search results for efficient user exploration. The existence of large numbers of near duplicates requires fast and accurate automatic near-duplicate detection methods. We have designed a coarse-to-fine near-duplicate detection framework to speed up the process and a multi-modal integration scheme for accurate detection. The duplicate pairs are detected with both a global feature (partition-based color histogram) and local features (CPAM and SIFT Bag-of-Words model). The experimental results on a large-scale data set demonstrate the effectiveness of the proposed design.
Keywords: Near-Duplicate Detection; Coarse-To-Fine Framework; Multi-Modal Feature Integration
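A coarse-to-fine pipeline of the kind this abstract describes can be sketched as a cheap global filter followed by a costlier local check. The sketch below is an assumption-laden illustration, not the authors' system: histogram intersection stands in for the global color comparison, Jaccard overlap of visual-word IDs stands in for the local Bag-of-Words comparison, and the thresholds and data are made up.

```python
def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def jaccard(words1, words2):
    """Overlap of two sets of visual-word IDs."""
    s1, s2 = set(words1), set(words2)
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 0.0

def find_near_duplicates(images, coarse_thresh=0.8, fine_thresh=0.5):
    """Return image pairs that pass the coarse filter and then the fine check."""
    pairs = []
    names = sorted(images)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # Coarse stage: discard pairs whose global histograms differ.
            if histogram_intersection(images[a]["hist"], images[b]["hist"]) < coarse_thresh:
                continue
            # Fine stage: compare local features only for surviving pairs.
            if jaccard(images[a]["words"], images[b]["words"]) >= fine_thresh:
                pairs.append((a, b))
    return pairs

# Hypothetical precomputed features for four images.
images = {
    "a.jpg": {"hist": [0.5, 0.3, 0.2], "words": [1, 2, 3, 4]},
    "b.jpg": {"hist": [0.5, 0.25, 0.25], "words": [1, 2, 3, 5]},
    "c.jpg": {"hist": [0.1, 0.1, 0.8], "words": [1, 2, 3, 4]},
    "d.jpg": {"hist": [0.5, 0.3, 0.2], "words": [7, 8, 9]},
}
pairs = find_near_duplicates(images)
```

The speed-up comes from the coarse stage: most pairs are rejected by the inexpensive histogram comparison, so the local-feature comparison runs on only a small fraction of candidates.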
Multi-Branch Contrastive Learning Algorithm Based on Relational Consistency
19
Authors: 冯慧敏, 吕巧莉, 陈俊芬 《河北大学学报(自然科学版)》 PKU Core 2026, No. 1, pp. 104-112 (9 pages)
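Traditional contrastive learning algorithms tend to introduce false negative samples during instance discrimination, which causes the model to converge to a suboptimal solution and degrades downstream task performance. To address this, a multi-branch contrastive learning algorithm based on relational consistency is proposed. The algorithm mines nearest-neighbor sets in the branch networks to provide semantically consistent positive samples and avoid producing false negatives. Combined with a data-augmented multi-branch network, it minimizes the KL divergence to pull semantically consistent positive samples together and push negative samples apart, improving the network's feature representation capability. The temperatures of the different branches control the smoothness of the output distributions, ensuring that the feature representations are faithful and reliable. Finally, the proposed algorithm is tested on 5 datasets and compared with other state-of-the-art methods, obtaining satisfactory results in all cases.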
Keywords: contrastive learning; relational consistency; feature representation; false negative pairs; data augmentation
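The roles of the per-branch temperature and the KL divergence mentioned in this abstract can be illustrated with a small sketch. It is not the authors' loss: the similarity values, the two temperatures, and the choice of which branch serves as the target are assumptions for illustration.

```python
import math

def softmax(logits, temperature):
    """Temperature-scaled softmax; a lower temperature gives a sharper distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Made-up similarities of one sample to three reference samples, as seen by two branches.
sims_a = [0.9, 0.2, 0.1]
sims_b = [0.8, 0.3, 0.1]

p = softmax(sims_a, temperature=0.1)  # sharper relation distribution (assumed target branch)
q = softmax(sims_b, temperature=0.2)  # smoother relation distribution (assumed online branch)
loss = kl_divergence(p, q)
```

Minimizing `loss` with respect to the second branch would push its relation distribution toward the first branch's, which is one common way to enforce consistency between branches.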
Multi-modal Fusion Open-Set Foreign Object Detection Method for Raw Coal Separation Scenarios
20
Authors: 曹现刚, 刘航, 刘家辉, 吴旭东, 王鹏 《煤炭科学技术》 PKU Core 2026, No. 1, pp. 464-474 (11 pages)
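The raw coal separation process first requires identifying and sorting out foreign objects such as large gangue blocks, iron wire, and woven bags, to avoid affecting subsequent process stages or causing safety accidents. Current coal foreign object detection algorithms mainly target known objects and have insufficient ability to detect unknown targets, especially those with complex appearance and semantic uncertainty such as various anchor bolts and new support materials; a detection model capable of detecting both known and unknown foreign objects is urgently needed. A coal foreign object open-set detection method based on multi-modal fusion is proposed. First, based on the DINO network, a dual-modal feature extraction architecture for text and images is designed to obtain text and visual features with stronger class discriminability; a path aggregation feature pyramid network is introduced, and a multi-layer feature extraction strategy combines deep semantic features with shallow spatial details, strengthening the perception of small-scale coal foreign objects and improving detection accuracy. Second, a multi-modal feature fusion module based on self-attention and cross-attention mechanisms is constructed to achieve deep interaction and efficient fusion of text and visual features, and a language-guided query selection mechanism is introduced so that text descriptions of arbitrary categories are matched with visual queries, improving feature semantic consistency and cross-category generalization. Finally, a vision-text multi-modal decoding module is designed, in which a text guidance mechanism is inserted at each layer's query update stage so that the learnable queries are aligned with the language features before interacting with the image features, improving the accuracy and robustness of multi-modal feature alignment. An open dynamic environment with multi-category combinations was constructed from a self-built coal foreign object dataset, and systematic experiments were carried out. The results show that, for known-class detection across tasks of different openness, the mAP@0.5 of the proposed method is better than that of the other compared methods; for unknown-class detection across tasks of different openness, the unknown-class recall reaches 41.24%, 52.26%, and 57.13%, respectively, verifying its effectiveness under zero-shot conditions. The proposed method is capable of detecting unknown categories of coal foreign objects and provides effective technical support for open-set detection of coal foreign objects.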
Keywords: coal foreign objects; multi-modal fusion; open-set detection; feature pyramid; feature semantic consistency