Adapter Based on Pre-Trained Language Models for Classification of Medical Text
1
Authors: Quan Li. Source: Journal of Electronic Research and Application, 2024, Issue 3, pp. 129-134 (6 pages)
We present an approach for automatically classifying medical text at the sentence level. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
Keywords: classification of medical text; adapter; pre-trained language model
Enhancing Relational Triple Extraction in Specific Domains: Semantic Enhancement and Synergy of Large Language Models and Small Pre-Trained Language Models (cited by: 1)
2
Authors: Jiakai Li, Jianpeng Hu, Geng Zhang. Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 5, pp. 2481-2503 (23 pages)
In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of semantic interaction information between entities and relations, difficulties in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to improve the model’s performance on domain-specific relational triple extraction tasks. First, we propose an attention interaction module that enhances the semantic interaction between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model’s adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, aiming to generate domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several aspects, with F1 scores exceeding those of the state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
Keywords: relational triple extraction; semantic interaction; large language models; data augmentation; specific domains
Classification of Conversational Sentences Using an Ensemble Pre-Trained Language Model with the Fine-Tuned Parameter
3
Authors: R. Sujatha, K. Nimala. Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 2, pp. 1669-1686 (18 pages)
Sentence classification is the process of categorizing a sentence based on the context of the sentence. Sentence categorization requires more semantic highlights than other tasks, such as dependency parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress and comparing impacts. Here, an ensemble pre-trained language model is used to classify the sentences of a conversation corpus into four categories: information, question, directive, and commission. These classification label sequences are used to analyze the progress of the conversation and to predict its pecking order. An ensemble of Bidirectional Encoder for Representation of Transformer (BERT), Robustly Optimized BERT pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus, and hyperparameter tuning is carried out for better performance on sentence classification. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT and XLNet transformer models. The proposed ensemble model with the fine-tuned parameters achieved an F1 score of 0.88.
Keywords: bidirectional encoder for representation of transformer; conversation; ensemble model; fine-tuning; generalized autoregressive pretraining for language understanding; generative pre-trained transformer; hyperparameter tuning; natural language processing; robustly optimized BERT pretraining approach; sentence classification; transformer models
GPT2-ICC: A data-driven approach for accurate ion channel identification using pre-trained large language models (cited by: 1)
4
Authors: Zihan Zhou, Yang Yu, Chengji Yang, Leyan Cao, Shaoying Zhang, Junnan Li, Yingnan Zhang, Huayun Han, Guoliang Shi, Qiansen Zhang, Juwen Shen, Huaiyu Yang. Source: Journal of Pharmaceutical Analysis, 2025, Issue 8, pp. 1800-1809 (10 pages)
Current experimental and computational methods have limitations in accurately and efficiently classifying ion channels within vast protein spaces. Here we have developed a deep learning algorithm, GPT2 Ion Channel Classifier (GPT2-ICC), which effectively distinguishes ion channels from a test set containing approximately 239 times more non-ion-channel proteins. GPT2-ICC integrates representation learning with a large language model (LLM)-based classifier, enabling highly accurate identification of potential ion channels. Several potential ion channels were predicted from the unannotated human proteome, further demonstrating GPT2-ICC’s generalization ability. This study marks a significant advancement in artificial-intelligence-driven ion channel research, highlighting the adaptability and effectiveness of combining representation learning with LLMs to address the challenges of imbalanced protein sequence data. Moreover, it provides a valuable computational tool for uncovering previously uncharacterized ion channels.
Keywords: ion channel; artificial intelligence; representation learning; GPT2; protein language model
Multilingual Text Summarization in Healthcare Using Pre-Trained Transformer-Based Language Models
5
Authors: Josua Käser, Thomas Nagy, Patrick Stirnemann, Thomas Hanne. Source: Computers, Materials & Continua, 2025, Issue 4, pp. 201-217 (17 pages)
We analyze the suitability of existing pre-trained transformer-based language models (PLMs) for abstractive text summarization on German technical healthcare texts. The study focuses on the multilingual capabilities of these models and their ability to perform abstractive text summarization in the healthcare field. The research hypothesis was that large language models could perform high-quality abstractive text summarization on German technical healthcare texts, even if the model is not specifically trained in that language. Through experiments, the research questions explore the performance of transformer language models in dealing with complex syntax constructs, the difference in performance between models trained in English and German, and the impact of translating the source text to English before conducting the summarization. We evaluated four PLM-based approaches (GPT-3, a translation-based approach also utilizing GPT-3, a German language model, and a domain-specific biomedical model). The evaluation considered informativeness, using three types of metrics based on Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and the quality of the results, which was evaluated manually on five aspects. The results show that text summarization models can be used in the German healthcare domain and that domain-independent language models achieved the best results. The study indicates that text summarization models can simplify the search for pre-existing German knowledge in various domains.
Keywords: text summarization; pre-trained transformer-based language models; large language models; technical healthcare texts; natural language processing
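For readers unfamiliar with the ROUGE family named in this abstract, the following is a minimal sketch of one common variant, ROUGE-1 recall (clipped unigram overlap divided by the number of reference unigrams). The abstract does not say which three ROUGE variants were used, so the choice of variant, the whitespace tokenization, and the function name `rouge1_recall` are assumptions made for illustration only.

```python
from collections import Counter


def rouge1_recall(reference: str, candidate: str) -> float:
    """Share of reference unigrams that the candidate summary recovers.

    Tokenization is a plain lowercase whitespace split (an assumption);
    real evaluations usually apply stemming and language-specific rules.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Clip each token's overlap by how often it occurs in the candidate.
    overlap = sum(min(n, cand_counts[tok]) for tok, n in ref_counts.items())
    return overlap / total
```

Called as `rouge1_recall("the cat sat on the mat", "the cat lay on a mat")`, four of the six reference tokens are matched, so the score is 4/6.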
Agri-Eval:Multi-level Large Language Model Valuation Benchmark for Agriculture
6
Authors: WANG Yaojun, GE Mingliang, XU Guowei, ZHANG Qiyu, BIE Yuhui. Source: 农业机械学报 (Transactions of the Chinese Society for Agricultural Machinery; PKU Core), 2026, Issue 1, pp. 290-299 (10 pages)
Model evaluation using benchmark datasets is an important method for measuring the capability of large language models (LLMs) in specific domains, and it is mainly used to assess the knowledge and reasoning abilities of LLMs. To better assess the capability of LLMs in the agricultural domain, Agri-Eval was proposed as a benchmark for assessing the knowledge and reasoning ability of LLMs in agriculture. The assessment dataset used in Agri-Eval covered seven major disciplines in the agricultural domain (crop science, horticulture, plant protection, animal husbandry, forest science, aquaculture science, and grass science) and contained a total of 2283 questions. Among domestic general-purpose LLMs, DeepSeek R1 performed best, with an accuracy rate of 75.49%. Among international general-purpose LLMs, Gemini 2.0 pro exp 0205 stood out as the top performer, achieving an accuracy rate of 74.28%. As an LLM for the agricultural vertical, Shennong V2.0 outperformed all the LLMs in China, and its answer accuracy on agricultural knowledge exceeded that of all the existing general-purpose LLMs. Agri-Eval helps LLM developers comprehensively evaluate a model's capability in the field of agriculture through a variety of tasks and tests, promoting the development of LLMs in the field of agriculture.
Keywords: large language models; assessment systems; agricultural knowledge; agricultural datasets
CIT-Rec:Enhancing Sequential Recommendation System with Large Language Models
7
Authors: Ziyu Li, Zhen Chen, Xuejing Fu, Tong Mo, Weiping Li. Source: Computers, Materials & Continua, 2026, Issue 3, pp. 2328-2343 (16 pages)
Recommendation systems are key to boosting user engagement, satisfaction, and retention, particularly on media platforms where personalized content is vital. Sequential recommendation systems learn from user-item interactions to predict future items of interest. However, many current methods rely on unique user and item IDs, limiting their ability to represent users and items effectively, especially in zero-shot learning scenarios where training data is scarce. With the rapid development of Large Language Models (LLMs), researchers are exploring their potential to enhance recommendation systems. However, there is a semantic gap between the linguistic semantics of LLMs and the collaborative semantics of recommendation systems, where items are typically indexed by IDs. Moreover, most research focuses on item representations, neglecting personalized user modeling. To address these issues, we propose a sequential recommendation framework using LLMs, called CIT-Rec, a model that integrates Collaborative semantics for user representation and Image and Text information for item representation to enhance Recommendations. Specifically, by aligning intuitive image information with text containing semantic features, we can more accurately represent items, improving item representation quality. We focus not only on item representations but also on user representations. To more precisely capture users’ personalized preferences, we use traditional sequential recommendation models to train on users’ historical interaction data, effectively capturing behavioral patterns. Finally, by combining LLMs and traditional sequential recommendation models, we allow the LLM to understand linguistic semantics while capturing collaborative semantics. Extensive evaluations on real-world datasets show that our model outperforms baseline methods, effectively combining user interaction history with item visual and textual modalities to provide personalized recommendations.
Keywords: large language models; vision language models; sequential recommendation; instruction tuning
When Large Language Models and Machine Learning Meet Multi-Criteria Decision Making: Fully Integrated Approach for Social Media Moderation
8
Authors: Noreen Fuentes, Janeth Ugang, Narcisan Galamiton, Suzette Bacus, Samantha Shane Evangelista, Fatima Maturan, Lanndon Ocampo. Source: Computers, Materials & Continua, 2026, Issue 1, pp. 2137-2162 (26 pages)
This study demonstrates a novel integration of large language models, machine learning, and multicriteria decision-making to investigate self-moderation in small online communities, a topic under-explored compared to user behavior and platform-driven moderation on social media. The proposed methodological framework (1) utilizes large language models for social media post analysis and categorization, (2) employs k-means clustering for content characterization, and (3) incorporates the TODIM (Tomada de Decisão Interativa Multicritério) method to determine moderation strategies based on expert judgments. In general, the fully integrated framework leverages the strengths of these intelligent systems for a more systematic evaluation of large-scale decision problems. When applied to social media moderation, this approach promotes nuanced and context-sensitive self-moderation by taking into account factors such as cultural background and geographic location. The application of this framework is demonstrated within Facebook groups. Eight distinct content clusters encompassing safety, harassment, diversity, and misinformation are identified. The analysis revealed a preference for content removal across all clusters, suggesting a cautious approach towards potentially harmful content. However, the framework also highlights the use of other moderation actions, such as account suspension, depending on the content category. These findings contribute to the growing body of research on self-moderation and offer valuable insights for creating safer and more inclusive online spaces within smaller communities.
Keywords: self-moderation; user-generated content; k-means clustering; TODIM; large language models
Task-Structured Curriculum Learning for Multi-Task Distillation:Enhancing Step-by-Step Knowledge Transfer in Language Models
9
Authors: Ahmet Ezgi, Aytug Onan. Source: Computers, Materials & Continua, 2026, Issue 3, pp. 1647-1673 (27 pages)
Knowledge distillation has become a standard technique for compressing large language models into efficient student models, but existing methods often struggle to balance prediction accuracy with explanation quality. Recent approaches such as Distilling Step-by-Step (DSbS) introduce explanation supervision, yet they apply it in a uniform manner that may not fully exploit the different learning dynamics of prediction and explanation. In this work, we propose a task-structured curriculum learning (TSCL) framework that structures training into three sequential phases: (i) prediction-only, to establish stable feature representations; (ii) joint prediction-explanation, to align task outputs with rationale generation; and (iii) explanation-only, to refine the quality of rationales. This design provides a simple but effective modification to DSbS, requiring no architectural changes and adding negligible training cost. We justify the phase scheduling with ablation studies and convergence analysis, showing that an initial prediction-heavy stage followed by a balanced joint phase improves both stability and explanation alignment. Extensive experiments on five datasets (e-SNLI, ANLI, CommonsenseQA, SVAMP, and MedNLI) demonstrate that TSCL consistently outperforms strong baselines, achieving gains of +1.7-2.6 points in accuracy and 0.8-1.2 in ROUGE-L, corresponding to relative error reductions of up to 21%. Beyond lexical metrics, human evaluation and ERASER-style faithfulness diagnostics confirm that TSCL produces more faithful and informative explanations. Comparative training curves further reveal faster convergence and lower variance across seeds. Efficiency analysis shows less than 3% overhead in wall-clock training time and no additional inference cost, making the approach practical for real-world deployment. This study demonstrates that a simple task-structured curriculum can significantly improve the effectiveness of knowledge distillation. By separating and sequencing objectives, TSCL achieves a better balance between accuracy, stability, and explanation quality. The framework generalizes across domains, including medical NLI, and offers a principled recipe for future applications in multimodal reasoning and reinforcement learning.
Keywords: knowledge distillation; curriculum learning; language models; multi-task learning; step-by-step learning
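The three-phase schedule described in this abstract can be pictured as a switch on the training epoch that decides which loss terms are active. The sketch below is only an illustration under stated assumptions: the abstract does not give phase lengths or loss weights, so `pred_epochs`, `joint_epochs`, the equal joint weighting, and both function names are hypothetical.

```python
def tscl_phase(epoch: int, pred_epochs: int, joint_epochs: int) -> str:
    """Map a zero-based epoch index to one of the three curriculum phases.

    pred_epochs and joint_epochs are hypothetical phase lengths; every
    later epoch falls into the explanation-only phase.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if epoch < pred_epochs:
        return "prediction"
    if epoch < pred_epochs + joint_epochs:
        return "joint"
    return "explanation"


def tscl_loss(phase: str, pred_loss: float, expl_loss: float,
              joint_pred_weight: float = 0.5) -> float:
    """Combine the two task losses according to the active phase.

    The 0.5 default is an assumed balanced weighting, not a value
    reported in the abstract.
    """
    if phase == "prediction":
        return pred_loss
    if phase == "explanation":
        return expl_loss
    if phase == "joint":
        return joint_pred_weight * pred_loss + (1.0 - joint_pred_weight) * expl_loss
    raise ValueError(f"unknown phase: {phase}")
```

With `pred_epochs=2` and `joint_epochs=3`, epochs 0-1 train on the prediction loss alone, epochs 2-4 on the weighted sum, and every later epoch on the explanation loss alone.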
Detection of Maliciously Disseminated Hate Speech in Spanish Using Fine-Tuning and In-Context Learning Techniques with Large Language Models
10
Authors: Tomás Bernal-Beltrán, Ronghao Pan, José Antonio García-Díaz, María del Pilar Salas-Zárate, Mario Andrés Paredes-Valverde, Rafael Valencia-García. Source: Computers, Materials & Continua, 2026, Issue 4, pp. 353-390 (38 pages)
The malicious dissemination of hate speech via compromised accounts, automated bot networks and malware-driven social media campaigns has become a growing cybersecurity concern. Automatically detecting such content in Spanish is challenging due to linguistic complexity and the scarcity of annotated resources. In this paper, we compare two predominant AI-based approaches for the forensic detection of malicious hate speech: (1) fine-tuning encoder-only models that have been trained in Spanish and (2) In-Context Learning techniques (Zero- and Few-Shot Learning) with large-scale language models. Our approach goes beyond binary classification, proposing a comprehensive, multidimensional evaluation that labels each text by: (1) type of speech, (2) recipient, (3) level of intensity (ordinal) and (4) targeted group (multi-label). Performance is evaluated on an annotated Spanish corpus using standard metrics such as precision, recall and F1-score, together with stability-oriented metrics that assess the transition from zero-shot to few-shot prompting (Zero-to-Few Shot Retention and Zero-to-Few Shot Gain). The results indicate that fine-tuned encoder-only models (notably MarIA and BETO variants) consistently deliver the strongest and most reliable performance: in our experiments their macro F1-scores lie roughly in the range 46%–66%, depending on the task. Zero-shot approaches are much less stable and typically yield substantially lower performance (observed F1-scores range approximately 0%–39%), often producing invalid outputs in practice. Few-shot prompting (e.g., Qwen3 8B, Mistral 7B) generally improves stability and recall relative to pure zero-shot, bringing F1-scores into a moderate range of approximately 20%–51% but still falling short of fully fine-tuned models. These findings highlight the importance of supervised adaptation and discuss the potential of both paradigms as components in AI-powered cybersecurity and malware forensics systems designed to identify and mitigate coordinated online hate campaigns.
Keywords: hate speech detection; malicious communication campaigns; AI-driven cybersecurity; social media analytics; large language models; prompt-tuning; fine-tuning; in-context learning; natural language processing
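Because this abstract reports macro F1-scores and notes that zero-shot prompting often yields invalid outputs, a small sketch of how macro F1 behaves in that situation may help. It is an illustration, not the authors' evaluation code: the function name `macro_f1`, the label strings, and the choice to pass the valid label set explicitly (so that an unparseable model output counts as an error without creating a new class) are all assumptions.

```python
def macro_f1(y_true: list, y_pred: list, labels: list) -> float:
    """Unweighted mean of per-class F1 over an explicit label set.

    A prediction outside `labels` (for example an invalid generation) is
    a false negative for the true class and a false positive for no class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

For example, with true labels `["hate", "hate", "none", "none"]` and predictions `["hate", "none", "none", "none"]`, the per-class F1 values are 2/3 and 4/5, and the macro average is their mean.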
Command-agent:Reconstructing warfare simulation and command decision-making using large language models
11
Authors: Mengwei Zhang, Minchi Kuang, Heng Shi, Jihong Zhu, Jingyu Zhu, Xiao Jiang. Source: Defence Technology (防务技术), 2026, Issue 2, pp. 294-313 (20 pages)
War rehearsals have become increasingly important in national security due to the growing complexity of international affairs. However, traditional rehearsal methods, such as military chess simulations, are inefficient and inflexible, with particularly pronounced limitations in command and decision-making. The overwhelming volume of information and high decision complexity hinder the realization of autonomous and agile command and control. To address this challenge, an intelligent warfare simulation framework named Command-Agent is proposed, which deeply integrates large language models (LLMs) with digital twin battlefields. By constructing a highly realistic battlefield environment through real-time simulation and multi-source data fusion, the natural language interaction capabilities of LLMs are leveraged to lower the command threshold and to enable autonomous command through the Observe-Orient-Decide-Act (OODA) feedback loop. Within the Command-Agent framework, a multi-model collaborative architecture is further adopted to decouple the decision-generation and command-execution functions of LLMs. By combining specialized models such as DeepSeek-R1 and MCTool, the limitations of single-model capabilities are overcome. MCTool is a lightweight execution model fine-tuned for military Function Calling tasks. The framework also introduces a Vector Knowledge Base to mitigate hallucinations commonly exhibited by LLMs. Experimental results demonstrate that Command-Agent not only enables natural language-driven simulation and control but also deeply understands commander intent. Leveraging the multi-model collaborative architecture, during red-blue UAV confrontations involving 2 to 8 UAVs, the integrated score is improved by an average of 41.8% compared to the single-agent system (MCTool), accompanied by a 161.8% optimization in the battle loss ratio. Furthermore, when compared with multi-agent systems lacking the knowledge base, the inclusion of the Vector Knowledge Base further improves overall performance by 16.8%. In comparison with the general model (Qwen2.5-7B), the fine-tuned MCTool leads by 5% in execution efficiency. Therefore, the proposed Command-Agent introduces a novel perspective to the military command system and offers a feasible solution for intelligent battlefield decision-making.
Keywords: digital twin battlefield; large language models; multi-agent system; military command
Prompt Injection Attacks on Large Language Models:A Survey of Attack Methods,Root Causes,and Defense Strategies
12
Authors: Tongcheng Geng, Zhiyuan Xu, Yubin Qu, W. Eric Wong. Source: Computers, Materials & Continua, 2026, Issue 4, pp. 134-185 (52 pages)
Large language models (LLMs) have revolutionized AI applications across diverse domains. However, their widespread deployment has introduced critical security vulnerabilities, particularly prompt injection attacks that manipulate model behavior through malicious instructions. Following Kitchenham’s guidelines, this systematic review synthesizes 128 peer-reviewed studies from 2022 to 2025 to provide a unified understanding of this rapidly evolving threat landscape. Our findings reveal a swift progression from simple direct injections to sophisticated multimodal attacks, achieving over 90% success rates against unprotected systems. In response, defense mechanisms show varying effectiveness: input preprocessing achieves 60%–80% detection rates and advanced architectural defenses demonstrate up to 95% protection against known patterns, though significant gaps persist against novel attack vectors. We identified 37 distinct defense approaches across three categories, but standardized evaluation frameworks remain limited. Our analysis attributes these vulnerabilities to fundamental LLM architectural limitations, such as the inability to distinguish instructions from data and attention mechanism vulnerabilities. This highlights critical research directions such as formal verification methods, standardized evaluation protocols, and architectural innovations for inherently secure LLM designs.
Keywords: prompt injection attacks; large language models; defense mechanisms; security evaluation
OPOR-Bench:Evaluating Large Language Models on Online Public Opinion Report Generation
13
Authors: Jinzheng Yu, Yang Xu, Haozhen Li, Junqi Li, Ligu Zhu, Hao Shen, Lei Shi. Source: Computers, Materials & Continua, 2026, Issue 4, pp. 1403-1427 (25 pages)
Online Public Opinion Reports consolidate news and social media for timely crisis management by governments and enterprises. While large language models (LLMs) enable automated report generation, this specific domain lacks formal task definitions and corresponding benchmarks. To bridge this gap, we define the Automated Online Public Opinion Report Generation (OPOR-Gen) task and construct OPOR-Bench, an event-centric dataset with 463 crisis events across 108 countries (comprising 8.8K news articles and 185K tweets). To evaluate report quality, we propose OPOR-Eval, a novel agent-based framework that simulates human expert evaluation. Validation experiments show OPOR-Eval achieves a high Spearman’s correlation (ρ = 0.70) with human judgments, though challenges in temporal reasoning persist. This work establishes an initial foundation for advancing automated public opinion reporting research.
Keywords: online public opinion reports; crisis management; large language models; agent-based evaluation
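The Spearman correlation quoted in this abstract measures how well an automatic evaluator's ranking of reports agrees with the human ranking. As a rough illustration, the sketch below computes Spearman's ρ as the Pearson correlation of average ranks, which is the standard definition when ties are present. The function names are hypothetical and nothing here reflects how OPOR-Eval itself was implemented.

```python
import math


def average_ranks(values: list) -> list:
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: list, y: list) -> float:
    """Spearman's rho: Pearson correlation computed on the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one sequence is constant")
    return cov / (sx * sy)
```

Without ties this agrees with the textbook formula 1 - 6Σd²/(n(n²-1)); for instance, scores `[1, 2, 3, 4, 5]` against `[1, 3, 2, 5, 4]` give Σd² = 4 and ρ = 0.8.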
LLMKB:Large Language Models with Knowledge Base Augmentation for Conversational Recommendation
14
Authors: FANG Xiu, QIU Sijia, SUN Guohao, LU Jinhu. Source: Journal of Donghua University (English Edition), 2026, Issue 1, pp. 91-103 (13 pages)
Conversational recommender systems (CRSs) focus on refining preferences and providing personalized recommendations through natural language interactions and dialogue history. Large language models (LLMs) have shown outstanding performance across various domains, thereby prompting researchers to investigate their applicability in recommendation systems. However, due to the lack of task-specific knowledge and an inefficient feature extraction process, LLMs still have suboptimal performance in recommendation tasks. Therefore, external knowledge sources, such as knowledge graphs (KGs) and knowledge bases (KBs), are often introduced to address the issue of data sparsity. Compared to KGs, KBs possess higher retrieval efficiency, making them more suitable for scenarios where LLMs serve as recommenders. To this end, we introduce a novel framework integrating LLMs with KBs for enhanced retrieval generation, namely LLMKB. LLMKB initially leverages structured knowledge to create mapping dictionaries, extracting entity-relation information from heterogeneous knowledge to construct KBs. Then, LLMKB achieves the embedding calibration between user information representations and documents in KBs through retrieval model fine-tuning. Finally, LLMKB employs retrieval-augmented generation to produce recommendations based on fused text inputs, followed by post-processing. Experimental results on two public CRS datasets demonstrate the effectiveness of our framework. Our code is publicly available at https://anonymous.4open.science/r/LLMKB-6FD0.
Keywords: recommender system; large language model (LLM); knowledge base (KB)
Vision Enhanced Generative Pre-trained Language Model for Multimodal Sentence Summarization (cited by: 2)
15
Authors: Liqiang Jing, Yiren Li, Junhao Xu, Yongcan Yu, Pei Shen, Xuemeng Song. Source: Machine Intelligence Research (EI, CSCD), 2023, Issue 2, pp. 289-298 (10 pages)
Multimodal sentence summarization (MMSS) is a new yet challenging task that aims to generate a concise summary of a long sentence and its corresponding image. Although existing methods have gained promising success in MMSS, they overlook the powerful generation ability of generative pre-trained language models (GPLMs), which have been shown to be effective in many text generation tasks. To fill this research gap, we propose using GPLMs to promote the performance of MMSS. Notably, adopting GPLMs to solve MMSS inevitably faces two challenges: 1) What fusion strategy should we use to inject visual information into GPLMs properly? 2) How can the GPLM's generation ability be kept intact to the utmost extent when the visual feature is injected into the GPLM? To address these two challenges, we propose a vision enhanced generative pre-trained language model for MMSS, dubbed Vision-GPLM. In Vision-GPLM, we obtain features of visual and textual modalities with two separate encoders and utilize a text decoder to produce a summary. In particular, we utilize multi-head attention to fuse the features extracted from visual and textual modalities to inject the visual feature into the GPLM. Meanwhile, we train Vision-GPLM in two stages: the vision-oriented pre-training stage and the fine-tuning stage. In the vision-oriented pre-training stage, we train only the visual encoder with the masked language model task while the other components are frozen, aiming to obtain homogeneous representations of text and image. In the fine-tuning stage, we train all the components of Vision-GPLM with the MMSS task. Extensive experiments on a public MMSS dataset verify the superiority of our model over existing baselines.
Keywords: multimodal sentence summarization (MMSS); generative pre-trained language model (GPLM); natural language generation; deep learning; artificial intelligence
Unsupervised statistical text simplification using pre-trained language modeling for initialization (Cited by: 1)
16
Authors: Jipeng QIANG, Feng ZHANG, Yun LI, Yunhao YUAN, Yi ZHU, Xindong WU. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 1, pp. 81-90 (10 pages)
Unsupervised text simplification has attracted much attention due to the scarcity of high-quality parallel text simplification corpora. A recent unsupervised statistical text simplification system based on phrase-based machine translation (UnsupPBMT) achieved good performance; it initializes the phrase tables using similar words obtained by word embedding modeling. Since word embedding modeling only considers the relevance between words, the phrase table in UnsupPBMT contains many dissimilar words. In this paper, we propose an unsupervised statistical text simplification method that uses the pre-trained language model BERT for initialization. Specifically, we use BERT as a general linguistic knowledge base for predicting similar words. Experimental results show that our method outperforms the state-of-the-art unsupervised text simplification methods on three benchmarks and even outperforms some supervised baselines.
Keywords: text simplification; pre-trained language modeling; BERT; word embeddings
Satellite and instrument entity recognition using a pre-trained language model with distant supervision (Cited by: 1)
17
Authors: Ming Lin, Meng Jin, Yufu Liu, Yuqi Bai. International Journal of Digital Earth (SCIE, EI), 2022, No. 1, pp. 1290-1304 (15 pages)
Earth observations, especially satellite data, have produced a wealth of methods and results in meeting global challenges, often presented in unstructured texts such as papers or reports. Accurate extraction of satellite and instrument entities from these unstructured texts can help to link and reuse Earth observation resources. The direct use of an existing dictionary to extract satellite and instrument entities suffers from poor matching, which leads to low recall. In this study, we present a named entity recognition model to automatically extract satellite and instrument entities from unstructured texts. Due to the lack of manually labeled data, we apply distant supervision to automatically generate labeled training data. Accordingly, we fine-tune the pre-trained language model with early stopping and a weighted cross-entropy loss function. We propose a dictionary-based self-training method to correct the incomplete annotations caused by the distant supervision method. Experiments demonstrate that our method achieves significant improvements in both precision and recall compared to dictionary matching or standard adaptation of pre-trained language models.
Keywords: Earth observation; named entity recognition; pre-trained language model; distant supervision; dictionary-based self-training
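The distant supervision step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `distant_label`, the BIO tagging scheme, the longest-match rule, and the entity types `SAT` and `INS` are all assumptions chosen for illustration. The idea shown is that an entity dictionary is matched against raw tokens to produce (noisy, possibly incomplete) training labels without manual annotation.

```python
def distant_label(tokens, dictionary):
    """Assign BIO tags by longest-match lookup against an entity dictionary.

    `dictionary` maps a tuple of lowercase tokens to an entity type.
    Both the data layout and the tag names are hypothetical.
    """
    max_len = max((len(key) for key in dictionary), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in dictionary:
                entity_type = dictionary[key]
                tags[i] = "B-" + entity_type
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + entity_type
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Illustrative dictionary: one satellite name and one instrument name.
example_dictionary = {("landsat", "8"): "SAT", ("oli",): "INS"}
example_tokens = ["Data", "from", "Landsat", "8", "OLI", "were", "used"]
example_tags = distant_label(example_tokens, example_dictionary)
```

Names missing from the dictionary stay tagged `O`, which is the kind of incomplete annotation the abstract says the self-training step is meant to correct.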
Improving Extraction of Chinese Open Relations Using Pre-trained Language Model and Knowledge Enhancement (Cited by: 1)
18
Authors: Chaojie Wen, Xudong Jia, Tao Chen. Data Intelligence (EI), 2023, No. 4, pp. 962-989 (28 pages)
Open Relation Extraction (ORE) is the task of extracting semantic relations from a text document. Current ORE systems have significantly improved their efficiency in obtaining Chinese relations compared with conventional systems, which depend heavily on feature engineering or syntactic parsing. However, these ORE systems do not use robust neural networks such as pre-trained language models to take advantage of large-scale unstructured data effectively. In response to this issue, a new system entitled Chinese Open Relation Extraction with Knowledge Enhancement (CORE-KE) is presented in this paper. The CORE-KE system employs a pre-trained language model (with the support of a Bidirectional Long Short-Term Memory (BiLSTM) layer and a Masked Conditional Random Field (Masked CRF) layer) on unstructured data in order to improve Chinese open relation extraction. Entity descriptions in Wikidata and additional knowledge (in terms of triple facts) extracted from Chinese ORE datasets are used to fine-tune the pre-trained language model. In addition, syntactic features are further adopted in the training stage of the CORE-KE system for knowledge enhancement. Experimental results on two large-scale datasets of open Chinese entities and relations demonstrate that the CORE-KE system is superior to other ORE systems. The F1-scores of the CORE-KE system on the two datasets show relative improvements of 20.1% and 1.3%, respectively, over benchmark ORE systems. The source code is available at https://github.com/cjwen15/CORE-KE.
Keywords: Chinese open relation extraction; pre-trained language model; Knowledge enhancement
Medical Named Entity Recognition from Un-labelled Medical Records based on Pre-trained Language Models and Domain Dictionary (Cited by: 1)
19
Authors: Chaojie Wen, Tao Chen, Xudong Jia, Jiang Zhu. Data Intelligence, 2021, No. 3, pp. 402-417 (16 pages)
Medical named entity recognition (NER) is an area in which medical named entities, such as diseases, drugs, surgery reports, anatomical parts, and examination documents, are recognized from medical texts. Conventional medical NER methods do not make full use of un-labelled medical texts embedded in medical documents. To address this issue, we propose a medical NER approach based on pre-trained language models and a domain dictionary. First, we constructed a medical entity dictionary by extracting medical entities from labelled medical texts and collecting medical entities from other resources, such as the Yidu-N4K data set. Second, we employed this dictionary to train domain-specific pre-trained language models using un-labelled medical texts. Third, we employed a pseudo labelling mechanism on un-labelled medical texts to automatically annotate texts and create pseudo labels. Fourth, the BiLSTM-CRF sequence tagging model was used to fine-tune the pre-trained language models. Our experiments on the un-labelled medical texts, which were extracted from Chinese electronic medical records, show that the proposed NER approach achieves strict and relaxed F1 scores of 88.7% and 95.3%, respectively.
Keywords: Medical named entity recognition; pre-trained language model; Domain dictionary; Pseudo labelling; Un-labelled medical data
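The abstract reports both strict and relaxed F1 scores without defining them. The sketch below shows one common convention, which is an assumption here and may differ from the paper's exact criteria: a strict match requires identical span boundaries and entity type, while a relaxed match only requires the same type and overlapping spans. The function name `span_f1`, the half-open `(start, end, type)` span layout, and the example entity types are hypothetical.

```python
def span_f1(gold, pred, relaxed=False):
    """F1 over entity spans given as (start, end, type) with end exclusive.

    strict:  boundaries and type must be identical
    relaxed: type must be identical and the spans must overlap
    (assumed definitions, for illustration only)
    """
    def is_match(a, b):
        if a[2] != b[2]:
            return False
        if relaxed:
            return a[0] < b[1] and b[0] < a[1]
        return a[0] == b[0] and a[1] == b[1]

    # Precision counts predicted spans that match some gold span;
    # recall counts gold spans that are matched by some prediction.
    matched_pred = sum(any(is_match(p, g) for g in gold) for p in pred)
    matched_gold = sum(any(is_match(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative spans: the second prediction has a wrong left boundary.
gold_spans = [(0, 2, "DISEASE"), (5, 6, "DRUG")]
pred_spans = [(0, 2, "DISEASE"), (4, 6, "DRUG")]
strict_score = span_f1(gold_spans, pred_spans)
relaxed_score = span_f1(gold_spans, pred_spans, relaxed=True)
```

Under these definitions the relaxed score can never be lower than the strict one, which is consistent with the ordering of the two figures reported in the abstract.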
Leveraging Vision-Language Pre-Trained Model and Contrastive Learning for Enhanced Multimodal Sentiment Analysis
20
Authors: Jieyu An, Wan Mohd Nazmee Wan Zainon, Binfen Ding. Intelligent Automation & Soft Computing (SCIE), 2023, No. 8, pp. 1673-1689 (17 pages)
Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes, such as text and image, to accurately assess sentiment. However, conventional approaches that rely on unimodal pre-trained models for feature extraction from each modality often overlook the intrinsic connections of semantic information between modalities. This limitation is attributed to their training on unimodal data, and necessitates the use of complex fusion mechanisms for sentiment analysis. In this study, we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method. Our approach harnesses the power of transfer learning by utilizing a vision-language pre-trained model to extract both visual and textual representations in a unified framework. We employ a Transformer architecture to integrate these representations, thereby enabling the capture of rich semantic information in image-text pairs. To further enhance the representation learning of these pairs, we introduce our proposed multimodal contrastive learning method, which leads to improved performance in sentiment analysis tasks. Our approach is evaluated through extensive experiments on two publicly accessible datasets, where we demonstrate its effectiveness. We achieve a significant improvement in sentiment analysis accuracy, indicating the superiority of our approach over existing techniques. These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment.
Keywords: Multimodal sentiment analysis; vision-language pre-trained model; contrastive learning; sentiment classification
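The abstract does not specify its multimodal contrastive objective, so the following is only a generic sketch of a symmetric InfoNCE-style loss over image-text pairs, a widely used formulation that may differ from the paper's method. The function name `info_nce`, the temperature value, and the toy embeddings are assumptions. Matching image-text pairs share an index; the loss is low when each image is most similar to its own text and vice versa.

```python
import math


def _normalize(vector):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def _cross_entropy(logits, target):
    """Numerically stable -log softmax(logits)[target]."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target]


def info_nce(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; pair i is (image_embs[i], text_embs[i])."""
    images = [_normalize(v) for v in image_embs]
    texts = [_normalize(v) for v in text_embs]
    n = len(images)
    # sim[i][j]: temperature-scaled cosine similarity of image i and text j.
    sim = [[sum(a * b for a, b in zip(images[i], texts[j])) / temperature
            for j in range(n)] for i in range(n)]
    image_to_text = sum(_cross_entropy(sim[i], i) for i in range(n)) / n
    text_to_image = sum(
        _cross_entropy([sim[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return (image_to_text + text_to_image) / 2


# Toy embeddings: aligned pairs versus the same texts in swapped order.
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the aligned pairs give a loss close to zero, while swapping the texts gives a much larger one, which is the behaviour a contrastive objective is designed to reward.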