Journal Articles: 128 articles found (results 1-20 shown below).
1. Multilingual Text Summarization in Healthcare Using Pre-Trained Transformer-Based Language Models
Authors: Josua Käser, Thomas Nagy, Patrick Stirnemann, Thomas Hanne. Computers, Materials & Continua, 2025(4): 201-217.
We analyze the suitability of existing pre-trained transformer-based language models (PLMs) for abstractive text summarization of German technical healthcare texts. The study focuses on the multilingual capabilities of these models and their ability to perform abstractive text summarization in the healthcare field. The research hypothesis was that large language models can produce high-quality abstractive summaries of German technical healthcare texts even when not specifically trained in that language. Through experiments, the research questions explore how transformer language models handle complex syntactic constructs, the difference in performance between models trained in English and German, and the impact of translating the source text to English before summarization. We evaluated four PLM approaches (GPT-3, a translation-based approach also utilizing GPT-3, a German-language model, and a domain-specific biomedical model). The evaluation considered informativeness, using three metrics based on Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and the quality of results, which was manually evaluated on five aspects. The results show that text summarization models can be used in the German healthcare domain and that domain-independent language models achieved the best results. The study shows that text summarization models can simplify the search for pre-existing German knowledge in various domains.
Keywords: text summarization; pre-trained transformer-based language models; large language models; technical healthcare texts; natural language processing
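The ROUGE-based informativeness evaluation described above can be reproduced with the `rouge-score` package. A minimal sketch follows; the reference and candidate strings are hypothetical placeholders, not data from the paper:

```python
# Minimal sketch of ROUGE-based summary evaluation, as used in the study.
# The summary strings are hypothetical placeholders, not data from the paper.
from rouge_score import rouge_scorer

reference = "Das Medikament senkt den Blutdruck und wird gut vertragen."
candidate = "Das Medikament wird gut vertragen und senkt den Blutdruck."

# Three metric variants roughly matching the paper's setup: unigram overlap,
# bigram overlap, and longest common subsequence.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=False)
scores = scorer.score(reference, candidate)

for name, result in scores.items():
    print(f"{name}: precision={result.precision:.3f} "
          f"recall={result.recall:.3f} f1={result.fmeasure:.3f}")
```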
2. KitWaSor: Pioneering pre-trained model for kitchen waste sorting with an innovative million-level benchmark dataset
Authors: Leyuan Fang, Shuaiyu Ding, Hao Feng, Junwu Yu, Lin Tang, Pedram Ghamisi. CAAI Transactions on Intelligence Technology, 2025(1): 94-114.
Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste. Object detection based on an ImageNet pre-trained model is an effective way of sorting. However, owing to significant domain gaps between natural images and kitchen waste images, an ImageNet pre-trained model struggles to reflect the diverse scales and dense distribution characteristic of kitchen waste, leading to poor generalisation. In this article, the authors propose the first pre-trained model for kitchen waste sorting, called KitWaSor, which combines contrastive learning (CL) and masked image modelling (MIM) through self-supervised learning (SSL). First, to address the issue of diverse scales, the authors propose a mixed masking strategy by introducing an incomplete masking branch alongside the original random masking branch. It prevents the complete loss of small-scale objects while avoiding excessive leakage of large-scale object pixels. Second, to address the issue of dense distribution, the authors introduce semantic consistency constraints on top of the mixed masking strategy; that is, object semantic reasoning is performed through semantic consistency constraints to compensate for the lack of contextual information. To train KitWaSor, the authors construct the first million-level kitchen waste dataset spanning seasonal and regional distributions, named KWD-Million. Extensive experiments show that KitWaSor achieves state-of-the-art (SOTA) performance on the two downstream tasks most relevant to kitchen waste sorting (image classification and object detection), demonstrating the effectiveness of the proposed KitWaSor.
Keywords: contrastive learning; kitchen waste; masked image modeling; pre-trained model; self-supervised learning
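The mixed masking strategy can be illustrated with a toy patch-mask generator: one branch masks patches fully at random, while an incomplete-masking branch guarantees a few visible patches per region so small objects are never wiped out entirely. This is a minimal sketch of the idea, not the authors' implementation; the patch grid size and ratios are assumptions:

```python
# Toy sketch of a two-branch mixed masking strategy for MIM-style pre-training.
# Grid size and mask ratios are illustrative assumptions, not KitWaSor's values.
import numpy as np

rng = np.random.default_rng(0)

def random_mask(num_patches: int, ratio: float) -> np.ndarray:
    """Standard MAE-style branch: mask a fixed ratio of patches at random."""
    mask = np.zeros(num_patches, dtype=bool)
    idx = rng.choice(num_patches, size=int(num_patches * ratio), replace=False)
    mask[idx] = True
    return mask

def incomplete_mask(num_patches: int, ratio: float,
                    keep_per_region: int, regions: int) -> np.ndarray:
    """Incomplete branch: guarantee a few visible patches inside every region,
    so small-scale objects are never completely masked out."""
    mask = random_mask(num_patches, ratio)
    region_size = num_patches // regions
    for r in range(regions):
        start = r * region_size
        visible = np.flatnonzero(~mask[start:start + region_size])
        if len(visible) < keep_per_region:  # region fully masked: unmask a few
            unmask = rng.choice(region_size, size=keep_per_region, replace=False)
            mask[start + unmask] = False
    return mask

m1 = random_mask(196, ratio=0.75)  # 14x14 patch grid
m2 = incomplete_mask(196, ratio=0.75, keep_per_region=2, regions=14)
print(m1.sum(), m2.sum())  # masked-patch counts of the two branches
```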
3. DPCIPI: A pre-trained deep learning model for predicting cross-immunity between drifted strains of Influenza A/H3N2
Authors: Yiming Du, Zhuotian Li, Qian He, Thomas Wetere Tulu, Kei Hang Katie Chan, Lin Wang, Sen Pei, Zhanwei Du, Zhen Wang, Xiao-Ke Xu, Xiao Fan Liu. Journal of Automation and Intelligence, 2025(2): 115-124.
Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development. Traditional neural network methods, such as BiLSTM, can be ineffective due to the lack of laboratory data for model training and the overshadowing of crucial features within sequence concatenation. The current work proposes a less data-hungry model incorporating a pre-trained gene sequence model and a mutual-information inference operator. Our methodology utilizes gene alignment and deduplication algorithms to preprocess gene sequences, enhancing the model's capacity to discern and focus on distinctions among input gene pairs. The model, the DNA Pretrained Cross-Immunity Protection Inference model (DPCIPI), outperforms state-of-the-art (SOTA) models in predicting hemagglutination-inhibition titer from influenza viral gene sequences alone. For binary cross-immunity prediction, the improvement is 1.58% in F1, 2.34% in precision, 1.57% in recall, and 1.57% in accuracy. For multilevel cross-immunity prediction, the improvement is 2.12% in F1, 3.50% in precision, 2.19% in recall, and 2.19% in accuracy. Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity. With expanding gene data and advancements in pre-trained models, this approach promises significant impacts on vaccine development and public health.
Keywords: cross-immunity prediction; pre-trained model; deep learning; influenza strains; hemagglutination inhibition
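The reported binary cross-immunity gains (F1, precision, recall, accuracy) are standard classification metrics. A minimal sketch with scikit-learn, using made-up labels rather than the paper's HI-titer data:

```python
# Sketch of the evaluation metrics reported for binary cross-immunity prediction.
# Labels below are fabricated placeholders, not data from the DPCIPI study.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = cross-immune strain pair, 0 = not
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("F1:       ", f1_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("Accuracy: ", accuracy_score(y_true, y_pred))
```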
4. A Classification–Detection Approach of COVID-19 Based on Chest X-ray and CT by Using Keras Pre-Trained Deep Learning Models (Cited by 10)
Authors: Xing Deng, Haijian Shao, Liang Shi, Xia Wang, Tongling Xie. Computer Modeling in Engineering & Sciences (SCIE, EI), 2020(11): 579-596.
The Coronavirus Disease 2019 (COVID-19) is wreaking havoc around the world, placing enormous pressure on national health systems and medical staff. One of the most effective and critical steps in the fight against COVID-19 is to examine the patient's lungs using the chest X-ray and CT images produced by radiological imaging. In this paper, five Keras-related deep learning models (ResNet50, InceptionResNetV2, Xception, transfer learning, and pre-trained VGGNet16) are applied to formulate classification-detection approaches for COVID-19. Two benchmark methods, SVM (Support Vector Machine) and CNN (Convolutional Neural Network), are provided for comparison with the classification-detection approaches on performance indicators: precision, recall, F1 score, confusion matrix, classification accuracy, and three types of AUC (Area Under Curve). The highest classification accuracies achieved on 5857 chest X-rays and 767 chest CTs are 84% and 75%, respectively, showing that the Keras-related deep learning approaches facilitate accurate and effective COVID-19-assisted detection.
Keywords: COVID-19 detection; deep learning; transfer learning; pre-trained models
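The transfer-learning setup with a Keras pre-trained backbone typically looks like the sketch below: an ImageNet-pretrained ResNet50 is frozen and a small classification head is trained on the chest images. The input size and head layout are assumptions, not the paper's exact configuration:

```python
# Sketch of transfer learning with a Keras ImageNet-pretrained backbone.
# Image size and classifier head are illustrative choices.
from tensorflow import keras
from tensorflow.keras import layers

base = keras.applications.ResNet50(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained convolutional features

model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(2, activation="softmax"),  # COVID-19 vs. normal
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(train_images, train_labels, epochs=..., validation_data=...)
```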
5. Construction and application of knowledge graph for grid dispatch fault handling based on pre-trained model (Cited by 1)
Authors: Zhixiang Ji, Xiaohui Wang, Jie Zhang, Di Wu. Global Energy Interconnection (EI, CSCD), 2023(4): 493-504.
With the construction of new power systems, the power grid has become extremely large, with an increasing proportion of new energy and AC/DC hybrid connections. The dynamic characteristics and fault patterns of the power grid are complex; in addition, power grid control is difficult, operation risks are high, and fault handling is arduous. Traditional power-grid fault handling relies primarily on human experience, and differences and gaps in the knowledge reserves of control personnel restrict the accuracy and timeliness of fault handling, so this mode of operation is no longer suitable for the requirements of new systems. Based on multi-source heterogeneous power-grid dispatch data, this paper proposes a joint entity-relationship extraction method for power-grid dispatch fault processing based on a pre-trained model, constructs a knowledge graph of power-grid dispatch fault processing, and designs and develops a fault-processing auxiliary decision-making system based on the knowledge graph. Applied in a provincial dispatch control center, the system effectively improved the accident-processing ability and the intelligence level of accident management and control of the power grid.
Keywords: power-grid dispatch fault handling; knowledge graph; pre-trained model; auxiliary decision-making
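Once entity-relation triples have been extracted, assembling them into a queryable fault-handling graph is straightforward. A minimal sketch with networkx, using invented example triples rather than real dispatch records:

```python
# Sketch: assembling extracted (head, relation, tail) triples into a knowledge
# graph. The triples are invented examples, not real dispatch data.
import networkx as nx

triples = [
    ("Line L1 trip", "caused_by", "lightning strike"),
    ("Line L1 trip", "handled_by", "reclose procedure"),
    ("reclose procedure", "requires", "breaker status check"),
]

G = nx.MultiDiGraph()
for head, relation, tail in triples:
    G.add_edge(head, tail, relation=relation)

# Query the graph during fault handling: what follows from an observed fault?
for _, tail, data in G.out_edges("Line L1 trip", data=True):
    print(f"Line L1 trip --{data['relation']}--> {tail}")
```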
6. Investigation of Automatic Speech Recognition Systems via the Multilingual Deep Neural Network Modeling Methods for a Very Low-Resource Language, Chaha (Cited by 1)
Authors: Tessfu Geteye Fantaye, Junqing Yu, Tulu Tilahun Hailu. Journal of Signal and Information Processing, 2020(1): 1-21.
Automatic speech recognition (ASR) is vital for very low-resource languages, helping to mitigate the risk of extinction. Chaha is one such low-resource language; it suffers from resource insufficiency, and some of its phonological, morphological, and orthographic features challenge development and initiatives in the area of ASR. Considering these challenges, this study is the first endeavor to analyze the characteristics of the language, prepare a speech corpus, and develop ASR systems for it. A small 3-hour read-speech corpus was prepared and transcribed. Different basic and rounded phone unit-based speech recognizers were explored using multilingual deep neural network (DNN) modeling methods. The experimental results demonstrated that all the basic and rounded phone unit-based multilingual models outperformed the corresponding unilingual models, with relative performance improvements of 5.47% to 19.87% and 5.74% to 16.77%, respectively. The rounded phone unit-based multilingual models outperformed the equivalent basic phone unit-based models, with relative performance improvements of 0.95% to 4.98%. Overall, we found that multilingual DNN modeling methods are profoundly effective for developing Chaha speech recognizers. Both the basic and rounded phone acoustic units are convenient for building a Chaha ASR system; however, the rounded phone unit-based models are superior in performance and faster in recognition speed. Hence, the rounded phone units are the most suitable acoustic units for developing Chaha ASR systems.
Keywords: automatic speech recognition; multilingual DNN modeling methods; basic phone acoustic units; rounded phone acoustic units; Chaha
7. Leveraging Vision-Language Pre-Trained Model and Contrastive Learning for Enhanced Multimodal Sentiment Analysis
Authors: Jieyu An, Wan Mohd Nazmee Wan Zainon, Binfen Ding. Intelligent Automation & Soft Computing (SCIE), 2023(8): 1673-1689.
Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modalities, such as text and image, to accurately assess sentiment. However, conventional approaches that rely on unimodal pre-trained models for feature extraction from each modality often overlook the intrinsic connections of semantic information between modalities. This limitation is attributed to their training on unimodal data and necessitates complex fusion mechanisms for sentiment analysis. In this study, we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method. Our approach harnesses the power of transfer learning by utilizing a vision-language pre-trained model to extract both visual and textual representations in a unified framework. We employ a Transformer architecture to integrate these representations, thereby enabling the capture of rich semantic information in image-text pairs. To further enhance the representation learning of these pairs, we introduce our proposed multimodal contrastive learning method, which leads to improved performance in sentiment analysis tasks. Our approach is evaluated through extensive experiments on two publicly accessible datasets, where we demonstrate its effectiveness and achieve a significant improvement in sentiment analysis accuracy, indicating the superiority of our approach over existing techniques. These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment.
Keywords: multimodal sentiment analysis; vision-language pre-trained model; contrastive learning; sentiment classification
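The multimodal contrastive objective in work of this kind is commonly an InfoNCE-style loss over matched image-text pairs in a batch. A minimal PyTorch sketch of that standard formulation (the paper's exact loss may differ):

```python
# Sketch of a symmetric InfoNCE-style contrastive loss over image-text pairs,
# the standard formulation behind CLIP-like multimodal contrastive learning.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb: torch.Tensor, txt_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature           # (B, B) similarity matrix
    targets = torch.arange(len(img))               # matched pairs on the diagonal
    loss_i = F.cross_entropy(logits, targets)      # image -> text direction
    loss_t = F.cross_entropy(logits.t(), targets)  # text -> image direction
    return (loss_i + loss_t) / 2

batch_img = torch.randn(8, 512)  # placeholder embeddings
batch_txt = torch.randn(8, 512)
print(contrastive_loss(batch_img, batch_txt))
```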
8. Classification of Conversational Sentences Using an Ensemble Pre-Trained Language Model with the Fine-Tuned Parameter
Authors: R. Sujatha, K. Nimala. Computers, Materials & Continua (SCIE, EI), 2024(2): 1669-1686.
Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic cues than tasks such as dependency parsing, which rely more on syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing its progress, or comparing impacts. Here, an ensemble pre-trained language model is used to classify sentences from a conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commissive. These classification label sequences are used to analyze conversation progress and predict the pecking order of the conversation. An ensemble of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus, and hyperparameter tuning is carried out for better performance on sentence classification. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models, and the ensemble model with fine-tuned parameters achieved an F1 score of 0.88.
Keywords: Bidirectional Encoder Representations from Transformers (BERT); conversation; ensemble model; fine-tuning; Generalized Autoregressive Pretraining for Language Understanding (XLNet); Generative Pre-Trained Transformer (GPT); hyperparameter tuning; natural language processing; Robustly Optimized BERT Pretraining Approach (RoBERTa); sentence classification; transformer models
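A simple way to realize such an ensemble is to average the class logits of several fine-tuned transformer classifiers. A sketch with Hugging Face `transformers`, where the checkpoint names and four-way label set are generic stand-ins for the paper's fine-tuned models (predictions are meaningful only after fine-tuning):

```python
# Sketch: ensembling sentence classifiers by averaging their logits.
# Model names here are generic hub checkpoints standing in for the paper's
# fine-tuned BERT/RoBERTa/GPT/DistilBERT/XLNet models.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

names = ["bert-base-uncased", "distilbert-base-uncased"]
labels = ["information", "question", "directive", "commissive"]

models, tokenizers = [], []
for name in names:
    tokenizers.append(AutoTokenizer.from_pretrained(name))
    models.append(AutoModelForSequenceClassification.from_pretrained(
        name, num_labels=len(labels)))

sentence = "Could you book the meeting room for tomorrow?"
with torch.no_grad():
    logits = torch.stack([
        m(**tok(sentence, return_tensors="pt")).logits.squeeze(0)
        for m, tok in zip(models, tokenizers)
    ])
print(labels[logits.mean(dim=0).argmax().item()])
```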
9. Adapter Based on Pre-Trained Language Models for Classification of Medical Text
Author: Quan Li. Journal of Electronic Research and Application, 2024(3): 129-134.
We present an approach to automatically classify medical text at the sentence level. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
Keywords: classification of medical text; adapter; pre-trained language model
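The adapter idea referenced here is usually a small bottleneck module inserted into a frozen PLM, so only a few parameters are trained. A minimal PyTorch sketch of the classic down-project/up-project design (dimensions are illustrative, not the paper's):

```python
# Sketch of a bottleneck adapter (Houlsby-style): a small trainable module
# added to a frozen pre-trained model. Dimensions are illustrative.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)  # down-projection
        self.up = nn.Linear(bottleneck, hidden_size)    # up-projection
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))      # residual connection

adapter = Adapter()
hidden_states = torch.randn(2, 16, 768)  # (batch, tokens, hidden)
print(adapter(hidden_states).shape)      # torch.Size([2, 16, 768])
print(sum(p.numel() for p in adapter.parameters()))  # trainable params only
```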
10. Fine-Tuning Pre-Trained CodeBERT for Code Search in Smart Contract (Cited by 1)
Authors: JIN Huan, LI Qinying. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2023(3): 237-245.
Smart contracts, which execute automatically on decentralized platforms like Ethereum, require high security and low gas consumption. As a result, developers have a strong demand for semantic code search tools that use natural language queries to efficiently find existing code snippets. However, existing code search models face a semantic gap between code and queries, which requires a large amount of training data to bridge. In this paper, we propose a fine-tuning approach to bridge the semantic gap in code search and improve search accuracy. We collect 80,723 different <comment, code snippet> pairs from Etherscan.io and use them to fine-tune, validate, and test the pre-trained CodeBERT model. Using the fine-tuned model, we develop a code search engine specifically for smart contracts. We evaluate the Recall@k and Mean Reciprocal Rank (MRR) of the fine-tuned CodeBERT model using different proportions of the fine-tuning data. Encouragingly, even a small amount of fine-tuning data produces satisfactory results. In addition, we perform a comparative analysis between the fine-tuned CodeBERT model and two state-of-the-art models. The experimental results show that the fine-tuned CodeBERT model has superior performance in terms of Recall@k and MRR. These findings highlight the effectiveness of our fine-tuning approach and its potential to significantly improve code search accuracy.
Keywords: code search; smart contract; pre-trained code models; program analysis; machine learning
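Recall@k and MRR can be computed directly from the rank of the correct snippet for each query. A short self-contained sketch with made-up ranks:

```python
# Sketch of the code-search metrics Recall@k and Mean Reciprocal Rank (MRR).
# `ranks` holds the 1-based rank of the correct snippet per query (made up).
def recall_at_k(ranks: list[int], k: int) -> float:
    return sum(r <= k for r in ranks) / len(ranks)

def mrr(ranks: list[int]) -> float:
    return sum(1.0 / r for r in ranks) / len(ranks)

ranks = [1, 3, 2, 10, 1, 5]  # hypothetical retrieval results
print("Recall@1:", recall_at_k(ranks, 1))
print("Recall@5:", recall_at_k(ranks, 5))
print("MRR:     ", mrr(ranks))
```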
11. GeoNER: Geological Named Entity Recognition with Enriched Domain Pre-Training Model and Adversarial Training
Authors: MA Kai, HU Xinxin, TIAN Miao, TAN Yongjian, ZHENG Shuai, TAO Liufeng, QIU Qinjun. Acta Geologica Sinica (English Edition) (SCIE, CAS, CSCD), 2024(5): 1404-1417.
As important geological data, a geological report contains rich expert and geological knowledge, but the challenge facing current research into geological knowledge extraction and mining is how to achieve an accurate understanding of geological reports guided by domain knowledge. While generic named entity recognition models and tools can be used to process geoscience reports and documents, their effectiveness is hampered by a dearth of domain-specific knowledge, which leads to a pronounced decline in recognition accuracy. This study summarizes six types of typical geological entities, with reference to the ontological system of geological domains, and builds a high-quality corpus for the task of geological named entity recognition (GNER). In addition, GeoWoBERT-advBGP (Geological Word-based BERT with adversarial training, Bi-directional Long Short-Term Memory, and Global Pointer) is proposed to address the issues of ambiguity, diversity, and nested entities among geological entities. The model first uses the fine-tuned word-granularity pre-training model GeoWoBERT (Geological Word-based BERT) combined with text features extracted by a BiLSTM (Bi-directional Long Short-Term Memory), followed by an adversarial training algorithm to improve the robustness of the model and enhance its resistance to interference; decoding is finally performed using a global association pointer algorithm. The experimental results show that the proposed model achieves high performance on the constructed dataset and is capable of mining rich geological information.
Keywords: geological named entity recognition; geological report; adversarial training; global pointer; pre-training model
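The adversarial training step in NER models of this kind is often FGM-style: perturb the embedding weights along the loss gradient, run a second forward/backward pass, then restore the weights. A minimal PyTorch sketch of that common pattern (not the paper's exact algorithm; epsilon and the loop are illustrative):

```python
# Sketch of FGM-style adversarial training on embeddings, a common robustness
# trick in Chinese NER models. Epsilon and the training loop are illustrative.
import torch

class FGM:
    def __init__(self, model: torch.nn.Module, epsilon: float = 0.5):
        self.model, self.epsilon, self.backup = model, epsilon, {}

    def attack(self, emb_name: str = "embedding"):
        for name, p in self.model.named_parameters():
            if p.requires_grad and emb_name in name and p.grad is not None:
                self.backup[name] = p.data.clone()
                norm = torch.norm(p.grad)
                if norm != 0:
                    p.data.add_(self.epsilon * p.grad / norm)  # perturb weights

    def restore(self):
        for name, p in self.model.named_parameters():
            if name in self.backup:
                p.data = self.backup[name]
        self.backup = {}

# Usage inside a training step:
#   loss.backward()                                  # gradients on clean input
#   fgm.attack(); model(batch).loss.backward()       # adversarial pass
#   fgm.restore(); optimizer.step(); optimizer.zero_grad()
```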
12. ThinGPT: describing sedimentary rock thin section images with a multimodal large language model
Authors: Xin Luo, Jian-Meng Sun, Peng Chi, Ran Zhang, Rui-Kang Cui, Xing-Hua Ci, Wei Liu. Petroleum Science, 2025(12): 5020-5033.
Rock thin section description is an essential method for examining lithology, structure, diagenesis, and sedimentary environment, playing a pivotal role in fields such as geology, geophysics, and petroleum exploration. To overcome the subjectivity, low efficiency, and high expertise requirements of describing rock thin sections, we design a multimodal mapping network, ThinGPT, which aligns the feature spaces of contrastive language-image pre-training (CLIP) and GPT-2 (Generative Pre-trained Transformer 2) through network training. Given the high frequency of keywords and the structured sentence patterns in thin-section descriptions, we introduce a tokenization method tailored for rock thin sections. This approach enhances GPT-2's ability to encode text effectively and produce text feature vectors. We conducted comparative experiments using ThinGPT and other models on common sedimentary rocks. The results demonstrate that ThinGPT exhibits excellent potential in generating thin-section feature descriptions of rocks. Based on the geological expert evaluation criteria proposed in this study, ThinGPT achieved a score of 1.62 on the test set. Regarding model complexity, ThinGPT avoids heavy initial training of large language models (LLMs); this training strategy makes the model lighter and improves the efficiency of rock thin section description. As an innovative application of an LLM within a lightweight architecture for rock thin section description, ThinGPT has significant implications for intelligent geology, geophysics, and petroleum exploration.
Keywords: rock thin section description; large language model; contrastive language-image pre-training; generative pre-training
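Aligning CLIP and GPT-2 feature spaces with a mapping network generally follows the pattern below: a learned projection turns the CLIP image embedding into a short prefix of GPT-2 input embeddings that conditions generation. This is a sketch of that general (ClipCap-style) pattern with assumed dimensions, not ThinGPT's actual architecture:

```python
# Sketch of a mapping network that projects a CLIP image embedding into a
# prefix of GPT-2 input embeddings (ClipCap-style). Dimensions are assumed.
import torch
import torch.nn as nn

class MappingNetwork(nn.Module):
    def __init__(self, clip_dim: int = 512, gpt_dim: int = 768, prefix_len: int = 10):
        super().__init__()
        self.prefix_len, self.gpt_dim = prefix_len, gpt_dim
        self.mlp = nn.Sequential(
            nn.Linear(clip_dim, gpt_dim * prefix_len),
            nn.Tanh(),
            nn.Linear(gpt_dim * prefix_len, gpt_dim * prefix_len),
        )

    def forward(self, clip_emb: torch.Tensor) -> torch.Tensor:
        # (batch, clip_dim) -> (batch, prefix_len, gpt_dim)
        return self.mlp(clip_emb).view(-1, self.prefix_len, self.gpt_dim)

mapper = MappingNetwork()
image_features = torch.randn(4, 512)  # CLIP image embeddings (placeholder)
prefix = mapper(image_features)
print(prefix.shape)                   # torch.Size([4, 10, 768])
# The prefix is concatenated with GPT-2 token embeddings for text generation.
```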
13. Artificial intelligence large model for logging curve reconstruction
Authors: CHEN Zhangxing, ZHANG Yongan, LI Jian, HUI Gang, SUN Youzhuang, LI Yizheng, CHEN Yuntian, ZHANG Dongxiao. Petroleum Exploration and Development, 2025(3): 842-854.
To improve the accuracy and generalization of well-logging curve reconstruction, this paper proposes an artificial intelligence large language model, Gaia, and conducts model evaluation experiments. By fine-tuning a pre-trained large language model, Gaia significantly improves the extraction of sequential patterns and spatial features from well-log curves. Leveraging the adapter method for fine-tuning, the model requires training only about 1/70 of its original parameters, greatly improving training efficiency. Comparative, ablation, and generalization experiments were designed and conducted using well-log data from 250 wells. In the comparative experiment, the Gaia model was benchmarked against cutting-edge small deep learning models and conventional large language models, demonstrating that Gaia reduces the mean absolute error (MAE) by at least 20%. In the ablation experiments, the synergistic effect of the Gaia model's multiple components was validated, with its MAE at least 30% lower than that of single-component models. In the generalization experiments, the superior performance of the Gaia model in blind-well predictions was further confirmed. Compared to traditional models, the Gaia model is significantly superior in accuracy and generalization for logging curve reconstruction, fully showcasing the potential of large language models in the field of well logging and providing a new approach for future intelligent logging data processing.
Keywords: logging curve reconstruction; large language model; adapter; pre-trained model; fine-tuning method
14. Geometry-based BERT: An experimentally validated deep learning model for molecular property prediction in drug discovery
Authors: Xiang Zhang, Chenliang Qian, Bochao Yang, Hongwei Jin, Song Wu, Jie Xia, Fan Yang, Liangren Zhang. Journal of Pharmaceutical Analysis, 2025(12): 2960-2974.
Various deep learning based methods have significantly impacted the realm of drug discovery, and developing deep learning methods for identifying novel structural types of active compounds has become an urgent challenge. In this paper, we introduce a self-supervised representation learning framework, Geometry-based Bidirectional Encoder Representations from Transformers (GEO-BERT). GEO-BERT takes information about the atoms and chemical bonds in chemical structures as input and integrates positional information from the three-dimensional conformation of the molecule during training. Specifically, GEO-BERT enhances its ability to characterize molecular structures by introducing three kinds of positional relationships: atom-atom, bond-bond, and atom-bond. In benchmarking studies, GEO-BERT demonstrated optimal performance on multiple benchmarks. We also performed a prospective study to validate the GEO-BERT model, taking screening for DYRK1A inhibitors as a case study; two potent and novel DYRK1A inhibitors (IC50 < 1 μM) were ultimately discovered. Taken together, we have developed an open-source GEO-BERT model for molecular property prediction (https://github.com/drug-designer/GEO-BERT) and proved its practical utility in early-stage drug discovery.
Keywords: drug discovery; chemical pre-trained model; self-supervised learning; BERT; DYRK1A inhibitor
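The 3D positional information that such a model consumes can be derived from molecular conformers. A minimal RDKit sketch that embeds a toy molecule in 3D and reads out pairwise atom distances; the distance readout is a simplification relative to the paper's featurization:

```python
# Sketch: generating a 3D conformer and simple geometric features with RDKit.
# The distance-matrix readout is a simplification of GEO-BERT's featurization.
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles("CCO")  # ethanol as a toy molecule
mol = Chem.AddHs(mol)
AllChem.EmbedMolecule(mol, randomSeed=42)  # generate a 3D conformation

# Pairwise atom-atom distances (in angstroms): the kind of positional
# relationship the model encodes alongside bond-bond and atom-bond relations.
dm = AllChem.Get3DDistanceMatrix(mol)
print(dm.shape)        # (num_atoms, num_atoms)
print(dm.round(2))
```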
15. Multi-Head Encoder Shared Model Integrating Intent and Emotion for Dialogue Summarization
Authors: Xinlai Xing, Junliang Chen, Xiaochuan Zhang, Shuran Zhou, Runqing Zhang. Computers, Materials & Continua, 2025(2): 2275-2292.
In task-oriented dialogue systems, intent, emotion, and actions are crucial elements of user activity, and analyzing the relationships among these elements to control and manage task-oriented dialogue systems is a challenging task. However, previous work has primarily focused on recognizing user intent and emotion independently, making it difficult to track both aspects simultaneously in the dialogue tracking module and to effectively utilize user emotions in subsequent dialogue strategies. We propose a Multi-Head Encoder Shared Model (MESM) that dynamically integrates features from emotion and intent encoders through a feature fusioner. Addressing the scarcity of datasets containing both emotion and intent labels, we designed a multi-dataset learning approach that enables the model to generate dialogue summaries encompassing both user intent and emotion. Experiments conducted on the MultiWOZ and MELD datasets demonstrate that our model effectively captures user intent and emotion, achieving highly competitive results in dialogue state tracking tasks.
Keywords: dialogue summaries; dialogue state tracking; emotion recognition; task-oriented dialogue system; pre-trained language model
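The shared-encoder-with-fusion idea can be sketched as one encoder feeding separate intent and emotion heads whose features a small fusioner combines. Dimensions, the GRU encoder, and the concatenation-based fusion rule are all assumptions, not MESM's exact design:

```python
# Sketch of a shared encoder with intent and emotion heads plus a feature
# fusioner. Sizes and the concatenation-based fusion are illustrative.
import torch
import torch.nn as nn

class SharedEncoderModel(nn.Module):
    def __init__(self, hidden: int = 256, n_intents: int = 10, n_emotions: int = 7):
        super().__init__()
        self.encoder = nn.GRU(128, hidden, batch_first=True)
        self.intent_head = nn.Linear(hidden, hidden)
        self.emotion_head = nn.Linear(hidden, hidden)
        self.fusioner = nn.Linear(2 * hidden, hidden)  # fuse both feature sets
        self.intent_out = nn.Linear(hidden, n_intents)
        self.emotion_out = nn.Linear(hidden, n_emotions)

    def forward(self, x: torch.Tensor):
        _, h = self.encoder(x)  # shared utterance representation
        h = h.squeeze(0)
        fi, fe = self.intent_head(h), self.emotion_head(h)
        fused = torch.tanh(self.fusioner(torch.cat([fi, fe], dim=-1)))
        return self.intent_out(fused), self.emotion_out(fused)

model = SharedEncoderModel()
utterance = torch.randn(4, 20, 128)  # (batch, tokens, features)
intent_logits, emotion_logits = model(utterance)
print(intent_logits.shape, emotion_logits.shape)
```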
16. Named Entity Recognition for Low-Resource Legal Documents Incorporating Domain Vocabulary Expansion
Authors: 帕尔哈提·吐拉江, 孙媛媛, 蔡艾宸, 王艳华, 林鸿飞. Computer Engineering and Applications (计算机工程与应用, PKU Core), 2026(1): 264-273.
Named entity recognition for legal documents in low-resource languages in the judicial domain faces two major problems: scarce annotated data and poor domain adaptability. To address them, this paper proposes an improved strategy incorporating domain vocabulary expansion. Taking Uyghur as an example, domain-specific vocabulary is dynamically expanded and merged into the vocabulary of a pre-trained model, and the model is fine-tuned on UgLaw-NERD, a manually annotated dataset of Uyghur legal documents, to enhance its performance on named entity recognition in the judicial domain. Several multilingual pre-trained models were used as baselines for comparative evaluation. The results show that the domain vocabulary expansion strategy improved the F1 score by 7.39 percentage points over the baseline model without vocabulary expansion, with precision and recall also improving significantly. In addition, the experiments analyzed the impact of vocabulary size on model performance: as domain-specific vocabulary was gradually added, precision, recall, and F1 scores showed an overall upward trend, and with larger vocabulary expansion the model's performance continued to improve steadily, although the gains in precision and recall slowed relative to the earlier stages. The results demonstrate that the domain vocabulary expansion strategy effectively improves the ability of pre-trained models to recognize named entities in Uyghur legal documents, providing a feasible solution for processing legal texts in low-resource languages.
Keywords: multilingual pre-trained model; judicial domain; domain vocabulary expansion; low-resource language; Uyghur; named entity recognition
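The vocabulary-expansion step maps onto the standard Hugging Face workflow of adding tokens and resizing the embedding matrix. A sketch with a generic multilingual checkpoint and invented placeholder terms; the actual Uyghur legal vocabulary is not reproduced here:

```python
# Sketch: expanding a pre-trained model's vocabulary with domain terms before
# fine-tuning. Checkpoint and the added terms are placeholders.
from transformers import AutoModelForTokenClassification, AutoTokenizer

name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForTokenClassification.from_pretrained(name, num_labels=9)

domain_terms = ["<legal_term_1>", "<legal_term_2>"]  # stand-ins for real vocabulary
added = tokenizer.add_tokens(domain_terms)
model.resize_token_embeddings(len(tokenizer))  # grow the embedding matrix

print(f"added {added} tokens; new vocab size = {len(tokenizer)}")
# Fine-tuning on the annotated legal NER dataset would follow here.
```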
17. Research on the Classification of Digital Cultural Texts Based on ASSC-TextRCNN Algorithm
Authors: Zixuan Guo, Houbin Wang, Sameer Kumar, Yuanfang Chen. Computers, Materials & Continua, 2026(3): 2119-2145.
With the rapid development of digital culture, a large number of cultural texts are presented in digital and networked forms. These texts have significant characteristics such as sparsity, real-time generation, and non-standard expression, which pose serious challenges to traditional classification methods. To cope with these problems, this paper proposes a new ASSC (ALBERT, SVD, Self-Attention, and Cross-Entropy)-TextRCNN model for digital cultural text classification. Based on the TextRCNN framework, the ALBERT pre-trained language model is introduced to improve the depth and accuracy of semantic embedding, and a dual attention mechanism strengthens the model's ability to capture and model potential key information in short texts. Singular Value Decomposition (SVD) is used to replace the traditional max pooling operation, which effectively reduces the feature loss rate and retains more key semantic information, and the cross-entropy loss function is used to optimize the prediction results, making the model more robust in learning class distributions. The experimental results indicate that, on the digital cultural text classification task, the proposed ASSC-TextRCNN method achieves an 11.85% relative improvement in accuracy and an 11.97% relative increase in F1 score over the baseline model, while the relative error rate decreases by 53.18%. This not only validates the effectiveness and novelty of the proposed approach but also offers a new technical route and methodological underpinning for the intelligent analysis and dissemination of digital cultural texts, which is significant for promoting the in-depth exploration and value realization of digital culture.
Keywords: text classification; natural language processing; TextRCNN model; ALBERT pre-training; singular value decomposition; cross-entropy loss function
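The SVD-based replacement for max pooling can be sketched by projecting the token-feature matrix onto its top singular vectors instead of taking per-dimension maxima. The rank choice and shapes here are illustrative, not the paper's configuration:

```python
# Sketch: SVD pooling of a token-feature matrix as an alternative to max
# pooling. The rank-k choice is illustrative.
import torch

def svd_pool(features: torch.Tensor, k: int = 1) -> torch.Tensor:
    """features: (tokens, dim). Returns a (k*dim,) pooled vector built from
    the top-k right singular vectors, weighted by their singular values."""
    _, s, vh = torch.linalg.svd(features, full_matrices=False)
    return (s[:k, None] * vh[:k]).reshape(-1)

tokens = torch.randn(32, 128)       # e.g., TextRCNN outputs for one sentence
pooled_svd = svd_pool(tokens, k=1)  # retains the dominant semantic direction
pooled_max = tokens.max(dim=0).values
print(pooled_svd.shape, pooled_max.shape)  # torch.Size([128]) torch.Size([128])
```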
18. A survey on multilingual large language models: corpora, alignment, and bias (Cited by 1)
Authors: Yuemei XU, Ling HU, Jiayi ZHAO, Zihan QIU, Kexin XU, Yuqi YE, Hanwen GU. Frontiers of Computer Science, 2025(11): 1-25.
Building on the foundation of Large Language Models (LLMs), Multilingual LLMs (MLLMs) have been developed to address the challenges of multilingual natural language processing, in the hope of achieving knowledge transfer from high-resource to low-resource languages. However, significant limitations and challenges still exist, such as language imbalance, multilingual alignment, and inherent bias. In this paper, we provide a comprehensive analysis of MLLMs, delving deeply into these critical issues. First, we present an overview of MLLMs, covering their evolution, key techniques, and multilingual capacities. Second, we explore the multilingual training corpora of MLLMs and the multilingual datasets oriented toward downstream tasks, which are crucial for enhancing the cross-lingual capability of MLLMs. Third, we survey state-of-the-art studies of multilingual representations and investigate whether current MLLMs can learn a universal language representation. Fourth, we discuss bias in MLLMs, including its categories, evaluation metrics, and debiasing techniques. Finally, we discuss existing challenges and point out promising research directions for MLLMs.
Keywords: multilingual large language model; corpora; alignment; bias; survey
19. Multilingual Punctuation Prediction with a Unified Architecture
Authors: 吴海波, 李紫京, 陈宋. Network New Media Technology (网络新媒体技术), 2026(1): 33-39, 65.
To address the high training cost and difficult cross-language transfer of traditional monolingual punctuation prediction schemes, this paper proposes a unified multilingual punctuation prediction framework based on RoBERTa. The framework builds a mixed corpus of three languages (Chinese, Japanese, and Korean) annotated with a unified set of three punctuation labels (COMMA, PERIOD, NONE), enabling a single model to perform synchronous end-to-end punctuation prediction across languages. Experimental results show that, compared with monolingual baselines, the model's average punctuation prediction F1 differs by only 1.7%, with the performance drop for each language staying below 2%, verifying the effectiveness and feasibility of unified multilingual modeling for punctuation restoration.
Keywords: multilingual text processing; punctuation prediction; RoBERTa model; pre-training and fine-tuning; mixed corpus; unified annotation scheme; cross-language transfer; semantic encoding
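Framed as token classification, each token receives one of the three labels predicting the punctuation that follows it. A minimal sketch of that setup; the generic multilingual RoBERTa checkpoint stands in for the paper's model, and predictions are meaningful only after fine-tuning on the annotated corpus:

```python
# Sketch: punctuation prediction as token classification with three labels.
# The checkpoint is a generic stand-in; the paper trains on a CJK mixed corpus.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

labels = ["NONE", "COMMA", "PERIOD"]
name = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForTokenClassification.from_pretrained(name, num_labels=len(labels))

text = "今天天气很好我们去公园散步吧"  # unpunctuated input
enc = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    pred = model(**enc).logits.argmax(dim=-1).squeeze(0)

# Each position predicts the punctuation to insert after that token.
for tok, p in zip(tokenizer.convert_ids_to_tokens(enc["input_ids"][0]), pred):
    print(tok, labels[int(p)])
```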
20. A Survey of Multilingual Neural Machine Translation Based on Sparse Models
Authors: Shaolin Zhu, Dong Jian, Deyi Xiong. Tsinghua Science and Technology, 2025(6): 2399-2418.
Recent research has shown burgeoning interest in exploring sparse models for massively Multilingual Neural Machine Translation (MNMT). In this paper, we present a comprehensive survey of this emerging topic. Massively MNMT based on sparse models offers significant improvements in parameter efficiency and reduced interference compared to its dense counterparts, and various methods have been proposed to leverage sparse models for enhancing translation quality. However, the lack of a thorough survey has hindered the identification and further investigation of the most promising approaches. To address this gap, we provide an exhaustive examination of the current research landscape in massively MNMT, with special emphasis on sparse models. We first categorize the various sparse-model-based approaches into distinct classes, then delve into each category in detail, elucidating their fundamental modeling principles, core issues, and the challenges they face. Wherever possible, we conduct comparative analyses to assess the strengths and weaknesses of different methodologies. Moreover, we explore potential future research avenues for MNMT based on sparse models. This survey serves as a valuable resource for both newcomers and established experts in the field of MNMT, particularly those interested in sparse model applications.
Keywords: neural machine translation; sparse models; multilingual; dense model
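In MNMT, "sparse models" most often means mixture-of-experts layers in which a router activates only a subset of expert feed-forward networks per token. A toy top-1 routing sketch of that building block; real systems add load balancing, capacity limits, and distributed dispatch:

```python
# Toy sketch of a top-1 mixture-of-experts layer, the sparse building block
# behind most sparse MNMT systems. No load balancing or capacity handling.
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    def __init__(self, dim: int = 64, n_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). Each token is routed to exactly one expert.
        gates = torch.softmax(self.router(x), dim=-1)
        top_gate, top_idx = gates.max(dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            sel = top_idx == e
            if sel.any():
                out[sel] = top_gate[sel, None] * expert(x[sel])
        return out

layer = Top1MoE()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```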