Abstract: The rapid expansion of online content and big data has created an urgent need for efficient summarization techniques that let readers quickly comprehend large textual documents without compromising their original integrity. Current approaches in Extractive Text Summarization (ETS) leverage the modeling of inter-sentence relationships, a task of paramount importance in producing coherent summaries. This study introduces an innovative model that integrates Graph Attention Networks (GATs) with the Transformer-based Bidirectional Encoder Representations from Transformers (BERT) and Latent Dirichlet Allocation (LDA), further enhanced by Term Frequency-Inverse Document Frequency (TF-IDF) values, to improve sentence selection by capturing comprehensive topical information. Our approach constructs a graph with nodes representing sentences, words, and topics, thereby increasing interconnectivity and enabling a more refined understanding of text structures. The model is extended from Single-Document Summarization to Multi-Document Summarization (MDS), offering significant improvements over existing models such as THGS-GMM and Topic-GraphSum, as demonstrated by empirical evaluations on benchmark news datasets such as Cable News Network (CNN)/Daily Mail (DM) and Multi-News. The results consistently demonstrate superior performance, showcasing the model's robustness in handling complex summarization tasks across single- and multi-document contexts. This research not only advances the integration of BERT and LDA within a GAT but also emphasizes our model's capacity to effectively manage global information and adapt to diverse summarization challenges.
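The abstract above mentions TF-IDF values and a graph linking sentences and words. As a minimal sketch of that idea only, the following computes TF-IDF weights for sentence-word edges. It assumes each sentence is treated as a "document" for the IDF term and uses whitespace tokenization; the function name `tfidf_edges` and the example sentences are hypothetical and not taken from the paper, and topic nodes, BERT embeddings, and the GAT itself are omitted.

```python
import math
from collections import Counter

def tfidf_edges(sentences):
    """Return {(sentence_index, word): weight} edges for a sentence-word graph.

    Assumption: each sentence counts as one document when computing IDF.
    """
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Document frequency: the number of sentences in which each word appears.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    edges = {}
    for i, tokens in enumerate(tokenized):
        counts = Counter(tokens)
        for word, count in counts.items():
            tf = count / len(tokens)
            idf = math.log(n / df[word])
            if idf > 0:  # a word present in every sentence gets no edge
                edges[(i, word)] = tf * idf
    return edges

# Hypothetical example sentences, for illustration only.
sentences = [
    "graph attention networks score sentences",
    "topic models group words into topics",
    "graph nodes represent sentences words and topics",
]
edges = tfidf_edges(sentences)
```

In a fuller pipeline these weights would typically serve as edge features between word nodes and sentence nodes; how the paper actually uses them is not specified in the abstract.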
Funding: Supported by the National Key R&D Program of China under Project No. 2021YFB2802300 and the National Natural Science Foundation of China under Grant Nos. 12271362 and 12061059.
Abstract: Drug-target interaction (DTI) is a widely explored topic in bioinformatics and plays a pivotal role in drug discovery. However, the traditional bio-experimental process of identifying drug-target interactions requires a large investment of time and labor. To address this challenge, graph neural network (GNN) approaches in deep learning are becoming a prominent trend in DTI research, characterized by multimodal data processing, feature learning, and interpretability. Nevertheless, some methods are still limited by homogeneous graphs and single features. To address these problems, we mechanistically analyze graph convolutional neural networks (GCNs) and graph attention neural networks (GATs) and propose a new GNN-based model for predicting drug-target interactions, named BiTGNN [Bidirectional Transformer (Bi-Transformer)-graph neural network]. The method first establishes drug-target pairs through the pseudo-position specificity scoring matrix (PsePSSM) and drug fingerprint data, and constructs a heterogeneous network by utilizing the relationship between the drug and the target. Then, drug and target attributes are extracted computationally using GCNs and GATs to extend the model's information flow and enhance graph information. We collect interaction data using the proposed Bi-Transformer architecture, in which we design a bidirectional cross-attention mechanism that calculates the effects of drug-target interactions to simulate realistic biological interactions. Finally, a feed-forward neural network is used to obtain the feature matrices of the drug and the target, and DTI prediction is performed by fusing the two feature matrices. The Enzyme, Ion Channel (IC), G Protein-coupled Receptor (GPCR), and Nuclear Receptor (NR) datasets are used in the experiments. Compared with several existing mainstream models, our model performs better in Area Under the ROC Curve (AUC), specificity, accuracy, and Area Under the Precision-Recall Curve (AUPR).
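The bidirectional cross-attention mentioned in the DTI abstract above can be illustrated with a small sketch in which drug features attend over target features and vice versa. This is plain scaled dot-product attention without learned projections, written in pure Python; the function names and the toy feature matrices are assumptions for illustration and do not reproduce the BiTGNN architecture.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    """Each query row attends over the rows of keys_values.

    Assumption: keys and values are the same vectors and no learned
    projection matrices are applied.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted sum of the value rows, one output coordinate at a time.
        out.append([sum(w * v[j] for w, v in zip(weights, keys_values))
                    for j in range(d)])
    return out

def bidirectional_cross_attention(drug, target):
    """Return (drug rows enriched by target, target rows enriched by drug)."""
    return cross_attention(drug, target), cross_attention(target, drug)

# Hypothetical toy features: 2 drug tokens and 3 target tokens, dimension 2.
drug = [[1.0, 0.0], [0.0, 1.0]]
target = [[1.0, 1.0], [0.0, 2.0], [2.0, 0.0]]
drug_ctx, target_ctx = bidirectional_cross_attention(drug, target)
```

Each output row is a convex combination of the other side's rows, so the two outputs keep the row counts of their respective query sides.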
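Abstract: In the ship-repair settlement systems and electronic price databases built by some ship-repair enterprises, the manual matching of settlement codes is error-prone and time-consuming, which directly affects settlement efficiency. To solve this problem, a composite model for intelligent matching of ship-repair settlement codes based on multi-feature fusion is proposed. The Bidirectional Encoder Representations from Transformers (BERT) model is used to represent the text of the work content as word vectors, a Convolutional Neural Network (CNN) model is used to extract local features of the text, and a Bidirectional Long Short-Term Memory with Attention Mechanism (BiLSTM-Attention) model is used to extract contextual features, yielding the corresponding settlement code. Experimental results show that the proposed composite model achieves a significant improvement in overall accuracy, demonstrating its advantages in complex text classification tasks.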
Funding: This study was supported by the National Key R&D Program of China (2017YFC1700303).
Abstract: Background: The medical records of traditional Chinese medicine (TCM) contain numerous synonymous terms with different descriptions, which is not conducive to computer-aided data mining of TCM. However, there is a lack of models available to normalize synonymous TCM terms. Therefore, construction of a synonymous term conversion (STC) model for normalizing synonymous TCM terms is necessary. Methods: Based on the neural networks of bidirectional encoder representations from transformers (BERT), four types of TCM STC models were designed: models based on BERT and text classification, text sequence generation, named entity recognition, and text matching. The superior STC model was selected on the basis of its performance in converting synonymous terms. Moreover, three inconsistency-based misjudgment inspection methods for the conversion results of the STC model were proposed to find incorrect term conversions: neuron random deactivation, output comparison of multiple isomorphic models, and output comparison of multiple heterogeneous models (OCMH). Results: The classification-based STC model outperformed the other STC task models. It achieved F1 scores of 0.91, 0.91, and 0.83 on the symptoms, patterns, and treatments STC tasks, respectively. The OCMH method showed the best performance in misjudgment inspection, with wrong detection rates of 0.80, 0.84, and 0.90 in the term conversion results for symptoms, patterns, and treatments, respectively. Conclusion: The TCM STC model based on classification achieved superior performance in converting synonymous terms for symptoms, patterns, and treatments. The misjudgment inspection method based on OCMH showed superior performance in identifying incorrect outputs.
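The inconsistency-based inspection described in the TCM abstract above can be sketched as a comparison of outputs from several models, flagging any input on which they disagree. This is only an illustration of the general idea: the abstract does not give the comparison rule, so the majority-vote rule, the function name `flag_inconsistent`, the model names, and the example terms are all assumptions.

```python
from collections import Counter

def flag_inconsistent(predictions_by_model):
    """Compare per-input outputs across models.

    predictions_by_model maps a model name to its list of predicted
    standard terms, one per input. Returns a list of
    (majority_label, flagged) pairs, where flagged is True whenever
    the models do not all agree on that input.
    """
    outputs = list(predictions_by_model.values())
    results = []
    for labels in zip(*outputs):
        top_label, votes = Counter(labels).most_common(1)[0]
        results.append((top_label, votes < len(labels)))
    return results

# Hypothetical outputs of three heterogeneous models on three input terms.
predictions = {
    "classifier": ["headache", "insomnia", "cough"],
    "generator": ["headache", "insomnia", "asthma"],
    "matcher": ["headache", "dizziness", "cough"],
}
checked = flag_inconsistent(predictions)
```

Flagged conversions would then be candidates for manual review rather than being accepted automatically.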
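Abstract: To address the lack of maneuvering datasets for aircraft targets, this paper uses kinematic modeling to generate a rich trajectory dataset, providing the necessary data support for network training. To address the difficulty of building kinematic models for trajectory prediction and the difficulty that time-series prediction methods have in extracting spatiotemporal features, an aircraft target trajectory prediction method combining a Transformer encoder with a Long Short-Term Memory (LSTM) network is proposed, namely the Transformer-Encoder-LSTM model. The new model provides both the complementary historical information of the LSTM module and the attention-based information representation of the Transformer encoder module, improving model capability. Comparison with several classical neural network models shows that, on the dataset, the average displacement error of the new method is reduced to 0.22, significantly better than the 0.35 of the CNN-LSTM-Attention model. Compared with other networks, the algorithm can extract hidden features in complex trajectories, maintains robustness on complex trajectories such as continuous turns and high-maneuver turns, and improves the accuracy of complex trajectory prediction.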
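The average displacement error (ADE) reported in the trajectory abstract above is commonly defined as the mean Euclidean distance between predicted and true positions over all time steps. The sketch below assumes that common definition; the abstract does not state the exact formula or units, and the sample trajectories are hypothetical.

```python
import math

def average_displacement_error(predicted, actual):
    """Mean Euclidean distance between matching trajectory points."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = sum(math.dist(p, a) for p, a in zip(predicted, actual))
    return total / len(predicted)

# Hypothetical 3D trajectories with three time steps each.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
actual = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0), (2.0, 0.0, 1.0)]
ade = average_displacement_error(predicted, actual)  # (0 + 5 + 1) / 3 = 2.0
```

Lower values indicate predictions that stay closer to the true path on average.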