152 articles found
Long Short-Term Memory Recurrent Neural Network-Based Acoustic Model Using Connectionist Temporal Classification on a Large-Scale Training Corpus (cited by: 9)
1
Authors: Donghyun Lee, Minkyu Lim, Hosung Park, Yoseb Kang, Jeong-Sik Park, Gil-Jin Jang, Ji-Hwan Kim. 《China Communications》, SCIE, CSCD, 2017, No. 9, pp. 23-31 (9 pages)
Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) have driven tremendous improvements over acoustic models based on the Gaussian Mixture Model (GMM). However, models based on the hybrid method require a forced-aligned Hidden Markov Model (HMM) state sequence obtained from a GMM-based acoustic model, so training both the GMM-based acoustic model and the deep learning-based acoustic model takes a long computation time. To solve this problem, an acoustic model using the Connectionist Temporal Classification (CTC) algorithm is proposed. The CTC algorithm does not require the GMM-based acoustic model because it does not use the forced-aligned HMM state sequence. However, previous work on LSTM RNN-based acoustic models using CTC used small-scale training corpora. In this paper, the LSTM RNN-based acoustic model using CTC is trained on a large-scale training corpus and its performance is evaluated. The implemented acoustic model achieves a Word Error Rate (WER) of 6.18% on clean speech and 15.01% on noisy speech, which is similar to the performance of the acoustic model based on the hybrid method.
Keywords: acoustic model; connectionist temporal classification; large-scale training corpus; long short-term memory; recurrent neural network
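As a rough illustration of why CTC needs no forced alignment, the sketch below shows only the collapsing rule that maps a per-frame label path to an output sequence (merge consecutive repeats, then drop blanks). The blank symbol `_` and the example path are assumptions made for this illustration; the paper's model, training loss, and decoder are not reproduced here.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol, chosen only for this illustration


def ctc_collapse(frame_labels, blank=BLANK):
    """Map a per-frame label path to an output sequence.

    Step 1: merge runs of identical consecutive labels.
    Step 2: remove the blank symbol.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


if __name__ == "__main__":
    # One of many frame-level paths that collapse to the same word.
    path = list("__hh_e_ll_ll__oo")
    print("".join(ctc_collapse(path)))
```

Because many different frame-level paths collapse to the same output, training can sum over all of them rather than depend on one pre-computed alignment.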
Computer vision-based limestone rock-type classification using probabilistic neural network (cited by: 20)
2
Authors: Ashok Kumar Patel, Snehamoy Chatterjee. 《Geoscience Frontiers》, SCIE, CAS, CSCD, 2016, No. 1, pp. 53-60 (8 pages)
Proper quality planning of limestone raw materials is essential for maintaining the desired feed in a cement plant, and rock-type identification is an integral part of quality planning for a limestone mine. In this paper, a computer vision-based rock-type classification algorithm is proposed for fast and reliable identification without human intervention. A laboratory-scale vision-based model was developed using a probabilistic neural network (PNN) with color histogram features as input. The histogram-based features, which comprise the weighted mean, skewness, and kurtosis, are extracted for each of the three color channels (red, green, and blue), giving a total of nine input features for the PNN classification model. The smoothing parameter of the PNN is selected judiciously to obtain an optimal or near-optimal classification model. The developed PNN is validated on the test data set, and the results show that the proposed vision-based model classifies limestone rock types satisfactorily, with an overall misclassification error below 6%. The proposed method also performs substantially better than the three other classification algorithms it was compared with.
Keywords: supervised classification; probabilistic neural network; histogram-based features; smoothing parameter; limestone
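The role of the smoothing parameter in a PNN can be sketched as follows. This is a minimal, generic Gaussian-kernel (Parzen window) classifier, not the authors' implementation; the function name, the two-dimensional toy features, and the class labels are all hypothetical, whereas the paper uses nine histogram features.

```python
import math


def pnn_predict(x, train_by_class, sigma):
    """Return the class with the largest Parzen (Gaussian-kernel) density at x.

    train_by_class maps a class label to a list of feature vectors.
    sigma is the smoothing parameter: small values give spiky densities,
    large values blur the classes together.
    """
    best_label, best_score = None, -1.0
    for label, samples in train_by_class.items():
        # Average Gaussian kernel response over this class's training samples.
        score = sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, s)) / (2.0 * sigma ** 2))
            for s in samples
        ) / len(samples)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Hypothetical 2-D toy features standing in for the nine histogram features.
    train = {"type_A": [(0.0, 0.0), (0.0, 1.0)], "type_B": [(5.0, 5.0), (6.0, 5.0)]}
    print(pnn_predict((0.2, 0.4), train, sigma=1.0))
```

In practice the smoothing parameter would be tuned on held-out data, which is the selection step the abstract refers to.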
Clustering-based temporal deep neural network denoising method for event-based sensors
3
Authors: LI Jianing, XU Jiangtao, GAO Jiandong. 《Optoelectronics Letters》, 2025, No. 7, pp. 441-448 (8 pages)
To enhance the denoising performance of event-based sensors, we introduce a clustering-based temporal deep neural network denoising method (CBTDNN). First, density-based spatial clustering of applications with noise (DBSCAN) is combined with K-means++ to cluster the sensor output data and obtain the cluster centers. Next, long short-term memory (LSTM) is used to fit the data and yield optimized cluster centers that carry temporal information. Finally, a radius threshold is set based on the new cluster centers and the denoising ratio, and noise points beyond this threshold are removed. CBTDNN achieves F1 scores, the comprehensive denoising metric, of 0.8931, 0.7735, and 0.9215 on the traffic sequences dataset, the pedestrian detection dataset, and the turntable dataset, respectively. These scores are improvements of 49.90%, 33.07%, 19.31%, and 22.97% over four comparison algorithms: nearest neighbor (NNb), nearest neighbor with polarity (NNp), Autoencoder, and the multilayer perceptron denoising filter (MLPF). The results demonstrate that the proposed method enhances the denoising performance of event-based sensors.
Keywords: cluster centers; denoising; K-means; temporal deep neural network; clustering; event-based sensors; DBSCAN
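The final step of the pipeline, removing points that lie beyond a radius threshold around the cluster centers, might look roughly like the sketch below. It assumes the cluster centers and the radius are already given; the DBSCAN, K-means++, and LSTM stages are not shown, and the function name and sample coordinates are hypothetical.

```python
import math


def remove_far_events(events, centers, radius):
    """Keep only events within `radius` of their nearest cluster center."""
    kept = []
    for event in events:
        nearest = min(math.dist(event, center) for center in centers)
        if nearest <= radius:
            kept.append(event)
    return kept


if __name__ == "__main__":
    # Hypothetical (x, y) event coordinates and two assumed cluster centers.
    events = [(0.0, 0.0), (1.0, 1.0), (10.0, 10.0), (5.0, 5.5)]
    centers = [(0.0, 0.0), (5.0, 5.0)]
    print(remove_far_events(events, centers, radius=2.0))
```

A larger radius keeps more signal and more noise, so in the paper's setting the threshold is tied to the desired denoising ratio rather than fixed by hand.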
HLR-Net: A Hybrid Lip-Reading Model Based on Deep Convolutional Neural Networks (cited by: 2)
4
Authors: Amany M. Sarhan, Nada M. Elshennawy, Dina M. Ibrahim. 《Computers, Materials & Continua》, SCIE, EI, 2021, No. 8, pp. 1531-1549 (19 pages)
Lip reading is typically regarded as visually interpreting a speaker's lip movements during speech, that is, decoding text from the movement of the speaker's mouth. This paper proposes a lip-reading model that helps deaf people and people with hearing problems understand a speaker: a video of the speaker is captured and fed into the model to obtain the corresponding subtitles. Deep learning makes it easier to extract a large number of different features, which can then be converted to letter probabilities to obtain accurate results. Recently proposed lip-reading methods are based on sequence-to-sequence architectures designed for machine translation and audio speech recognition. In this paper, however, a deep convolutional neural network model called the hybrid lip-reading (HLR-Net) model is developed for lip reading from video. The proposed model has three stages, namely preprocessing, encoder, and decoder, which together produce the output subtitle. The encoder is built from inception, gradient, and bidirectional GRU layers, and the decoder, which performs connectionist temporal classification (CTC), is built from attention, fully-connected, and activation function layers. Compared with three recent models on the GRID corpus dataset, namely LipNet, the lip-reading model with cascaded attention (LCANet), and the attention-CTC (A-ACA) model, HLR-Net achieves significant improvements: a CER of 4.9%, a WER of 9.7%, and a BLEU score of 92% for unseen speakers, and a CER of 1.4%, a WER of 3.3%, and a BLEU score of 99% for overlapped speakers.
Keywords: lip-reading; visual speech recognition; deep neural network; connectionist temporal classification
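The CER and WER figures quoted above are both edit-distance ratios. The sketch below shows one common way to compute them, assuming the standard definition (Levenshtein distance divided by reference length); the paper's exact scoring script is not known, and the example sentences are made up.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Made-up reference and hypothesis, for illustration only.
    print(wer("bin blue at f two now", "bin blue at f too now"))
    print(cer("bin blue", "bin blew"))
```

Both rates can exceed 1.0 when the hypothesis contains many insertions, since the denominator is the reference length only.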
Integrating deep learning and logging data analytics for lithofacies classification and 3D modeling of tight sandstone reservoirs (cited by: 4)
5
Authors: Jing-Jing Liu, Jian-Chao Liu. 《Geoscience Frontiers》, SCIE, CAS, CSCD, 2022, No. 1, pp. 350-363 (14 pages)
Lithofacies classification is essential for oil and gas reservoir exploration and development. The traditional method is based on "core calibration logging" and the experience of geologists. This approach is highly subjective, inefficient, and uncertain, and the uncertainty may be one of the key factors affecting the results of 3D modeling of tight sandstone reservoirs. In recent years, deep learning, a cutting-edge artificial intelligence technology, has attracted attention from many fields, but it has not been studied sufficiently for lithofacies classification. This paper therefore proposes a novel hybrid deep-learning model for lithofacies classification experiments that combines the efficient feature extraction of convolutional neural networks (CNN) with the ability of long short-term memory networks (LSTM) to describe time-dependent features. A series of experiments shows that the hybrid CNN-LSTM model reaches an average accuracy of 87.3% and classifies best when compared with the CNN, the LSTM, and three commonly used machine learning models (support vector machine, random forest, and gradient boosting decision tree). In addition, the borderline synthetic minority oversampling technique (BSMOTE) is introduced to address the class imbalance of the raw data, and the results show that balancing the data significantly improves lithofacies classification accuracy. Furthermore, under the fine lithofacies constraints, sequential indicator simulation is used to build a three-dimensional lithofacies model, which provides a fine description of the spatial distribution of tight sandstone reservoirs in the study area. According to this comprehensive analysis, the proposed CNN-LSTM model with class imbalance eliminated can be applied effectively to lithofacies classification and is expected to improve the realism of geological models of tight sandstone reservoirs.
Keywords: deep learning; convolutional neural networks; LSTM; lithological-facies classification; 3D modeling; class imbalance
Continuous Sign Language Recognition Based on Graph Convolutional Network and CTC/Attention (cited by: 1)
6
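Authors: 边辉, 孟畅乾, 李子涵, 陈子豪, 谢雪雷. 《计算机科学》, PKU Core, 2025, No. S1, pp. 550-558 (9 pages)
Sign language is an important means of communication among people with hearing impairments, and sign language recognition allows them to communicate without barriers with hearing people. Sign language recognition techniques have advanced along with deep learning, but existing techniques often cannot recognize sign language continuously. This paper therefore proposes a continuous sign language recognition method based on a Graph Convolution Network (GCN) and Connectionist Temporal Classification/Attention (CTC/Attention). The method extracts features along the spatial and temporal dimensions and incorporates a spatial attention mechanism that assigns weights to skeleton points, highlighting effective spatial features and enabling continuous recognition. It supports sequence alignment and contextual semantic modeling for continuous sign language sentence translation. First, skeleton point data of sign language gestures are collected with the MediaPipe framework and used to build a dataset of Chinese sign language skeleton keypoint coordinates. Based on these coordinates, a dynamic sign language word recognition method is designed using Spatio-Temporal Graph Convolutional Networks (ST-GCN). An encoder-decoder network based on GCN and CTC/Attention is then proposed for continuous sign language sentence recognition. With limited data, the proposed method is evaluated on the self-built skeleton point dataset SSLD. The experimental results show an average character accuracy of 94.41% for continuous sign language recognition, demonstrating that the proposed model recognizes sign language well.
Keywords: continuous sign language recognition; graph convolutional network; connectionist temporal classification; MediaPipe framework; skeleton keypoints; spatio-temporal graph convolutional network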
Fault Detection, Classification, and Location Based on Empirical Wavelet Transform-Teager Energy Operator and ANN for Hybrid Transmission Lines in VSC-HVDC Systems
7
Authors: Jalal Sahebkar Farkhani, Özgür Çelik, Kaiqi Ma, Claus Leth Bak, Zhe Chen. 《Journal of Modern Power Systems and Clean Energy》, 2025, No. 3, pp. 840-851 (12 pages)
Traditional protection methods are not suitable for hybrid (cable and overhead) transmission lines in voltage source converter based high-voltage direct current (VSC-HVDC) systems. Accordingly, this paper presents robust fault detection, classification, and location based on the empirical wavelet transform-Teager energy operator (EWT-TEO) and an artificial neural network (ANN) for hybrid transmission lines in VSC-HVDC systems. The operational scheme of the proposed protection method consists of two loops: (1) an EWT-TEO based feature extraction loop and (2) an ANN-based fault detection, classification, and location loop. The voltage and current signals are decomposed into several low- and high-frequency sub-passbands using the empirical wavelet transform (EWT) method. The energy content extracted by the EWT is fed into the ANN for fault detection, classification, and location. Various fault cases, including high-impedance faults (HIFs) and noise, are simulated to train the ANN, which has two hidden layers. The test system is built in PSCAD/EMTDC and the signal decomposition is performed in MATLAB. The performance of the proposed protection method is compared with that of the traditional non-pilot traveling wave (TW) based protection method. The results confirm the high accuracy of the proposed protection method for hybrid transmission lines in VSC-HVDC systems, with a mean percentage error of approximately 0.1%.
Keywords: voltage source converter based high-voltage direct current (VSC-HVDC); protection; fault detection; fault classification; fault location; empirical wavelet transform (EWT); artificial neural network (ANN); hybrid transmission line
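For readers unfamiliar with the Teager energy operator, the standard discrete form is psi[n] = x[n]^2 - x[n-1]*x[n+1]. The sketch below applies it directly to a sampled signal; in the paper it is applied to EWT sub-bands, which are not reproduced here, and the test signal is an assumption chosen for illustration.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator for the interior samples of x.

    psi[n] = x[n]^2 - x[n-1] * x[n+1], for n = 1 .. len(x) - 2.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


if __name__ == "__main__":
    # For a pure sinusoid A*cos(w*n), the operator returns the constant
    # A^2 * sin(w)^2, so abrupt changes in a signal stand out against it.
    amplitude, omega = 2.0, 0.3
    signal = [amplitude * math.cos(omega * n) for n in range(8)]
    print(teager_energy(signal))
```

The operator uses only three neighbouring samples, which is why it responds quickly to the sudden transients that accompany a fault.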
Deep learning and machine learning neural network approaches for multi class leather texture defect classification and segmentation (cited by: 1)
8
Authors: Praveen Kumar Moganam, Denis Ashok Sathia Seelan. 《Journal of Leather Science and Engineering》, 2022, No. 1, pp. 90-110 (21 pages)
Modern leather industries focus on producing high-quality leather products to sustain market competitiveness. However, various leather defects are introduced at different stages of the manufacturing process, such as material handling, tanning, and dyeing. Manual inspection of leather surfaces is subjective and inconsistent, so machine vision systems have been widely adopted for automated inspection of leather defects. Suitable image processing algorithms need to be developed to localize leather defects such as folding marks, growth marks, grain off, loose grain, and pinholes, because of the ambiguous texture pattern and tiny size of the defects in localized regions of the leather. This paper presents a deep learning neural network-based approach for automatic localization and classification of leather defects using a machine vision system. Popular convolutional neural networks are trained on images of different leather defects, and a class activation mapping technique is used to locate the region of interest for each class of defect. Convolutional neural networks such as GoogLeNet, SqueezeNet, and ResNet are found to provide better classification accuracy than the state-of-the-art neural network architectures, and the results are presented.
Keywords: convolution neural networks; machine learning classifier; leather defects; multi-class classification; class activation map; segmentation
Handwritten Chinese Archive Text Recognition Method Based on the Inception Structure (cited by: 2)
9
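Authors: 刘明忠, 贾永红. 《武汉大学学报(信息科学版)》, EI, CAS, CSCD, PKU Core, 2022, No. 4, pp. 632-638 (7 pages)
To address the low accuracy of handwritten Chinese text recognition, an end-to-end method combining a convolutional neural network and a recurrent neural network is proposed. First, a convolutional neural network built from Inception modules extracts basic features from the text image. Then, a recurrent neural network makes predictions from the extracted features and outputs a probability distribution over the Chinese character set. Finally, the connectionist temporal classification algorithm is used to compute the recognition result and construct the loss function. Experiments on a handwritten Chinese text dataset show that the Inception modules and data augmentation effectively improve performance, yielding a recognition accuracy of 71.2% and a text edit distance of 0.060, an improvement over existing methods that demonstrates the effectiveness of the proposed method.
Keywords: handwritten Chinese text recognition; Inception structure; convolutional neural network; recurrent neural network; connectionist temporal classification; handwritten Chinese text dataset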
Recent Progresses in Deep Learning Based Acoustic Models (cited by: 11)
10
Authors: Dong Yu, Jinyu Li. 《IEEE/CAA Journal of Automatica Sinica》, SCIE, EI, CSCD, 2017, No. 3, pp. 396-409 (14 pages)
In this paper, we summarize recent progress in deep learning based acoustic models and the motivation and insights behind the surveyed techniques. We first discuss models such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs) that can effectively exploit variable-length contextual information, and their various combinations with other models. We then describe models that are optimized end-to-end, with emphasis on feature representations learned jointly with the rest of the system, the connectionist temporal classification (CTC) criterion, and the attention-based sequence-to-sequence translation model. We further illustrate robustness issues in speech recognition systems and discuss acoustic model adaptation, speech enhancement and separation, and robust training strategies. We also cover modeling techniques that lead to more efficient decoding and discuss possible future directions in acoustic model research.
Keywords: attention model; convolutional neural network (CNN); connectionist temporal classification (CTC); deep learning (DL); long short-term memory (LSTM); permutation invariant training; speech adaptation; speech processing; speech recognition; speech separation
Memristive Neural Network Circuit Based on the Temporal Rule
11
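Authors: 黄成龙, 郝栋栋, 方粮. 《计算机工程与科学》, CSCD, PKU Core, 2019, No. 3, pp. 409-416 (8 pages)
A memristor is a resistor with dynamic characteristics: its resistance changes with the applied external field and is retained after the field is removed. This behavior resembles the connection strength of a biological synapse, so a memristor can be used to store synaptic weights. On this basis, to implement recognition learning on the IRIS dataset based on the Temporal rule, a SPICE simulation circuit of a neural network with bridge memristors as synapses is built. A single-pulse encoding scheme is used, in which the timing of a pulse represents the data. The neural network circuit consists of 48 pulse input ports, 144 synapses, and 3 output ports. Synaptic weights are modified according to the Temporal rule learning rule. In simulation, the circuit reaches a maximum classification accuracy of 93.33% on the IRIS dataset, indicating that this neural system structure is usable in brain-inspired spiking neural networks.
Keywords: memristor; Temporal rule; neural network circuit; bridge memristor synapse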
Surrounding rock classification from onsite images with deep transfer learning based on EfficientNet
12
Authors: Xiaoying ZHUANG, Wenjie FAN, Hongwei GUO, Xuefeng CHEN, Qimin WANG. 《Frontiers of Structural and Civil Engineering》, SCIE, EI, CSCD, 2024, No. 9, pp. 1311-1320 (10 pages)
This paper proposes an accurate, efficient, and explainable method for classifying surrounding rock based on a convolutional neural network (CNN). The state-of-the-art robust CNN model EfficientNet is applied to tunnel wall image recognition. Gaussian filtering, data augmentation, and other pre-processing techniques are used to improve data quality and quantity. Combined with transfer learning, the generality, accuracy, and efficiency of the deep learning (DL) model are further improved, reaching 89.96% accuracy. Compared with other state-of-the-art CNN architectures, such as ResNet and Inception-ResNet-V2 (IRV2), the presented deep transfer learning model is more stable, accurate, and efficient. To reveal the rock classification mechanism of the proposed model, Gradient-weight Class Activation Map (Grad-CAM) visualizations are integrated into the model to make it explainable and accountable. The developed deep transfer learning model has been applied to support the tunneling of the Xingyi City Bypass in the high mountain area of Guizhou, China, with great results.
Keywords: surrounding rock classification; convolutional neural network; EfficientNet; Gradient-weight Class Activation Map
An Efficient Hybrid Model for Arabic Text Recognition
13
Authors: Hicham Lamtougui, Hicham El Moubtahij, Hassan Fouadi, Khalid Satori. 《Computers, Materials & Continua》, SCIE, EI, 2023, No. 2, pp. 2871-2888 (18 pages)
In recent years, deep learning models have become indispensable in several fields, such as computer vision, automatic object recognition, and automatic natural language processing. Implementing a robust and efficient handwritten text recognition system remains a challenge for the research community, especially for Arabic, which has a dearth of published work compared with other languages. In this work, we present a new and efficient system for offline Arabic handwritten text recognition. Our approach combines a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (BLSTM) network, followed by a Connectionist Temporal Classification (CTC) layer. Moreover, during the training phase we introduce a data augmentation algorithm to increase the quality of the data. The proposed approach can recognize Arabic handwritten text without segmenting the characters, thus overcoming several problems related to segmentation. To train and evaluate our approach, we used two Arabic handwritten text recognition databases, IFN/ENIT and KHATT. The experimental results show that our approach gives better results than other methods in the literature.
Keywords: deep learning; Arabic handwritten text recognition; convolutional neural network (CNN); bidirectional long short-term memory (BLSTM); connectionist temporal classification (CTC)
An Efficient Cyber Security and Intrusion Detection System Using CRSR with PXORP-ECC and LTH-CNN
14
Authors: Nouf Saeed Alotaibi. 《Computers, Materials & Continua》, SCIE, EI, 2023, No. 8, pp. 2061-2078 (18 pages)
An Intrusion Detection System (IDS) is a network security mechanism that analyses all users' and applications' traffic and detects malicious activities in real time. Existing IDS methods suffer from lower accuracy and lack the level of security required to prevent sophisticated attacks. This can leave the system vulnerable to attacks, which can lead to the loss of sensitive data and potential system failure. Therefore, this paper proposes an Intrusion Detection System using Logistic Tanh-based Convolutional Neural Network Classification (LTH-CNN). The Correlation Coefficient based Mayfly Optimization (CC-MA) algorithm is used to extract the input characteristics for the IDS from the input data. The optimized features are then used by the LTH-CNN, which separates attacked from non-attacked data. The attacked data is stored in the log file, and the non-attacked data is passed to the cyber security and data security phases. To protect the system from cyber-attacks, the source and destination IP addresses are converted into a complex binary format named 1's Complement Reverse Shift Right (CRSR), and in the data security phase the sensed data is encrypted using the Senders Public key Exclusive OR Receivers Public Key-Elliptic Curve Cryptography (PXORP-ECC) algorithm to improve data security. The Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) dataset and a real-time sensor are used to train and evaluate the proposed LTH-CNN. The suggested model is evaluated on accuracy, sensitivity, and specificity, and according to the experimental results it outperforms existing IDS methods.
Keywords: intrusion detection system; logistic tanh-based convolutional neural network classification (LTH-CNN); correlation coefficient based mayfly optimization (CC-MA); cyber security
Imbalanced Node Classification Based on Minority-Class Augmentation and Long-Distance Connectivity (cited by: 1)
15
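Authors: 韩忠明, 张舒群, 刘燕, 杨伟杰. 《计算机应用研究》, PKU Core, 2025, No. 9, pp. 2683-2689 (7 pages)
Class-imbalanced distributions are common in graph data in real-world applications. Existing generative methods propose generation strategies that synthesize minority-class nodes to augment the original class-imbalanced graph. However, these methods focus mainly on compensating for quantity, and when minority classes are compensated according to their counts, some nodes may significantly degrade the performance of other classes. This paper therefore considers minority-class generation from both the quantity and topology perspectives to handle imbalance on graphs, and proposes an imbalanced node classification method based on minority-class augmentation and long-distance connectivity. When generating new minority-class nodes to balance the training data, neighbor sampling based on node importance is used to find distant nodes that potentially belong to the same class. This mitigates the topological imbalance caused by highly heterophilous node neighborhoods and weak connections to labeled nodes of the same class, and augments the imbalanced graph in a principled way. Experiments on three benchmark datasets show that the proposed method outperforms baseline methods in accuracy, balanced accuracy, and F1 score on imbalanced node classification, and ablation experiments and an application case analysis verify its effectiveness and practicality.
Keywords: graph neural network; node classification; minority-class augmentation; topological imbalance; quantity imbalance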
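Balanced accuracy is reported alongside plain accuracy because the two diverge sharply on imbalanced data. The sketch below assumes the usual definition (the mean of per-class recall); the labels are invented to show the effect and are not taken from the paper.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    correct, total = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


if __name__ == "__main__":
    # Invented labels: 8 majority nodes, 2 minority nodes, and a classifier
    # that always predicts the majority class.
    y_true = [0] * 8 + [1] * 2
    y_pred = [0] * 10
    print(accuracy(y_true, y_pred), balanced_accuracy(y_true, y_pred))
```

A classifier that ignores the minority class still scores well on plain accuracy here, while balanced accuracy exposes the failure.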
Dual-Intent Session-Based Recommendation Fusing Multi-Layer Graphs and Category Information (cited by: 2)
16
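Authors: 刘超, 王中迪, 余岩化, 朱军. 《计算机应用研究》, PKU Core, 2025, No. 4, pp. 1058-1064 (7 pages)
To address insufficient mining of inter-session information, redundant aggregated inter-session information, and auxiliary information that is not combined with session features in existing session-based recommender systems, a dual-intent session-based recommendation model fusing multi-layer graphs and category information (SRIMC) is proposed. First, a local session graph, a session relation graph, and a global item graph are built from the session sequences. A graph neural network (GNN) learns local session features, session relation features, and global item session features, which are combined to obtain the α intent. Second, category information and session length information are integrated using a Bayesian distribution whose prior is replaced by a β distribution, yielding the β intent. Finally, the α and β intents are fused for prediction. Experiments on five public datasets show that SRIMC improves P@20 by 1.23% to 51.78% and MRR@20 by 2.87% to 80.87%, demonstrating that the model effectively captures user intent from multi-layer session information and category information.
Keywords: session-based recommendation; multi-layer information; graph neural network; category information; dual intent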
Class Activation Map Interpretability Method with a Multi-Layer Hybrid Attention Mechanism
17
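Authors: 张剑, 张一然, 王梓聪. 《中国图象图形学报》, PKU Core, 2025, No. 7, pp. 2468-2483 (16 pages)
Objective: Deep convolutional neural networks are widely used in vision tasks, and their complexity and opacity as black-box models have drawn attention to their decision mechanisms. Class activation maps have been shown to improve the interpretability of image classification and thus the understanding of those mechanisms, but when highlighting target regions, existing methods often produce blurred boundaries, overly large regions, and insufficient granularity. A class activation map method with a multi-layer hybrid attention mechanism (spatial attention-based multi-layer fusion for high-quality class activation maps, SAMLCAM) is proposed to address these limitations. Method: Previous class activation map methods ignore spatial position information and attend only to channel-level weights, which degrades localization of the target object. SAMLCAM introduces a hybrid attention mechanism that combines channel attention and spatial attention to strengthen target localization and reduce invalid position information. After an effective localization result is obtained, the fusion of multi-layer feature maps is improved, based on the characteristics of a network's multiple convolutional layers, into a multi-layer weighted fusion mechanism. This alleviates the overly large boundary extent and insufficient granularity of class activation maps and strengthens their visual interpretability. Results: The method is evaluated on the ILSVRC 2012 (ImageNet Large-scale Visual Recognition Challenge 2012) and MS COCO2017 (Microsoft common objects in context 2017) datasets, which are widely used benchmarks for computer vision models, across several convolutional network models to be explained, through ablation experiments, qualitative evaluation, and quantitative evaluation. The ablation experiments confirm the effectiveness of each module. The qualitative evaluation visually demonstrates the improvement in interpretability. In the quantitative evaluation, SAMLCAM improves the Loc1 and Loc5 metrics by more than 8% over the lowest result and the energy-based localization decision metric by more than 7% over the lowest result. Because the improved method reduces the contextual background around the target region, it has a negative effect on the confidence of the result; even so, on the credibility metrics it stays within a small gap of the other methods and maintains high performance. Conclusion: The proposed method shows excellent interpretability performance across multiple convolutional neural network architectures. By enlarging the response coverage of the target region and effectively suppressing responses from background or irrelevant regions, it improves the precision and reliability of the interpretability results.
Keywords: class activation map (CAM); interpretability; attention mechanism; image classification; convolutional neural network (CNN)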
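The channel-weighted baseline that this abstract contrasts itself with can be sketched as below: a class activation map is a weighted sum of feature maps, followed by ReLU and normalization. This is the generic CAM idea only, not SAMLCAM; the spatial attention and multi-layer fusion steps are omitted, and the tiny feature maps and weights are invented.

```python
def class_activation_map(feature_maps, channel_weights):
    """Weighted sum of K feature maps (each H x W), then ReLU and max-normalize."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, channel_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    # ReLU keeps only locations that support the class.
    cam = [[max(value, 0.0) for value in row] for row in cam]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


if __name__ == "__main__":
    # Two invented 2 x 2 feature maps with one positive and one negative weight.
    maps = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 2.0], [0.0, 0.0]]]
    print(class_activation_map(maps, [1.0, -1.0]))
```

Because each channel receives a single scalar weight, every spatial position in that channel is scaled identically, which is the limitation that motivates adding spatial attention.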
Brain Tumor Classification Algorithm Based on Transfer Learning and Improved EfficientNet-B0
18
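Authors: 王勇, 杨义龙, 范晓晖, 周雷, 孔祥勇. 《电子科技》, 2025, No. 4, pp. 46-51 (6 pages)
To address the high complexity and low recognition rate of existing brain tumor classification models and methods, a model based on an improved EfficientNet-B0 is proposed for classifying three types of brain tumors. In data preprocessing, ROI (Region of Interest) features are used to crop the key feature regions of the brain tumor images, and the dataset is augmented by tumor type. The MBConv (Mobile Inverted Bottleneck Convolution) module of EfficientNet is redesigned following convolutional network design principles, and the convolutional attention module CBAM (Convolutional Block Attention Module) is introduced after the first convolution. To make the transfer learning more complete, three neurons are appended for three-class brain tumor classification without modifying the original output structure. The improved network has lower complexity and is better suited to recognizing tumor lesions. The model is fine-tuned with transfer learning on the public figshare-Brain Tumor Dataset. Experimental results show a classification accuracy of 99.67% on this public dataset, about 3.1 percentage points higher than the original EfficientNet-B0 network.
Keywords: brain tumor classification; deep learning; convolutional neural network; thresholding; class balance; EfficientNet; ECA attention mechanism; CBAM attention mechanism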
Ground-Based Cloud Image Classification with Multimodal Information Fusion
19
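Authors: 项洪印, 孔文迪. 《信息与电脑》, 2025, No. 13, pp. 40-42 (3 pages)
Ground-based cloud image classification is central to meteorological observation and analysis, and is critical both for accurate weather forecasting and for aviation safety. This article discusses the principles of neural network structure design and then examines the feature information fusion process from several angles. Using a neural network model with multimodal information fusion, it classifies and integrates ground-based cloud images and improves classification accuracy across the board.
Keywords: multimodal information fusion; ground-based cloud image classification; neural network; feature information fusion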
Multi-Class Vulnerability Detection Based on Structure-Aware Graph Neural Network (cited by: 1)
20
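Authors: 曹思聪, 孙小兵, 薄莉莉, 吴潇雪, 李斌, 陈厅, 罗夏朴, 张涛, 刘维. 《软件学报》, PKU Core, 2025, No. 11, pp. 5045-5061 (17 pages)
Software vulnerabilities threaten the security of real-world systems. In recent years, learning-based vulnerability detection methods, especially those based on deep learning, have been widely studied because they can mine implicit vulnerability features from large numbers of vulnerable samples. However, because of feature differences between vulnerability types and imbalanced data distributions, existing deep learning-based methods struggle to identify specific vulnerability types accurately. A deep learning-based multi-type vulnerability detection method, MulVD, is therefore proposed. MulVD builds a new structure-aware graph neural network (SA-GNN) that adaptively extracts locally typical vulnerability patterns for different vulnerability types and rebalances the data distribution without introducing noise. The method is evaluated on binary and multi-class vulnerability detection tasks. Experimental results show that MulVD significantly improves on the performance of existing deep learning-based vulnerability detection techniques.
Keywords: vulnerability detection; attention mechanism; graph neural network; multi-class classification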