期刊文献+
共找到106篇文章
< 1 2 6 >
每页显示 20 50 100
KitWaSor:Pioneering pre-trained model for kitchen waste sorting with an innovative million-level benchmark dataset
1
作者 Leyuan Fang Shuaiyu Ding +3 位作者 Hao Feng Junwu Yu Lin Tang Pedram Ghamisi 《CAAI Transactions on Intelligence Technology》 2025年第1期94-114,共21页
Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste.The existing object detection method based on an ImageNet pre-trained model is an effective... Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste.The existing object detection method based on an ImageNet pre-trained model is an effective way of sorting.Owing to significant domain gaps between natural images and kitchen waste images,it is difficult to reflect the characteristics of diverse scales and dense distribution in kitchen waste based on an ImageNet pre-trained model,leading to poor generalisation.In this article,the authors propose the first pre-trained model for kitchen waste sorting called KitWaSor,which combines both contrastive learning(CL)and masked image modelling(MIM)through self-supervised learning(SSL).First,to address the issue of diverse scales,the authors propose a mixed masking strategy by introducing an incomplete masking branch based on the original random masking branch.It prevents the complete loss of small-scale objects while avoiding excessive leakage of large-scale object pixels.Second,to address the issue of dense distribution,the authors introduce semantic consistency constraints on the basis of the mixed masking strategy.That is,object semantic reasoning is performed through semantic consistency constraints to compensate for the lack of contextual information.To train KitWaSor,the authors construct the first million-level kitchen waste dataset across seasonal and regional distributions,named KWD-Million.Extensive experiments show that KitWaSor achieves state-of-the-art(SOTA)performance on the two most relevant downstream tasks for kitchen waste sorting(i.e.image classification and object detection),demonstrating the effectiveness of the proposed KitWaSor. 展开更多
关键词 contrastive learning kitchen waste masked image modeling pre-trained model self-supervised learning
在线阅读 下载PDF
DPCIPI: A pre-trained deep learning model for predicting cross-immunity between drifted strains of Influenza A/H3N2
2
作者 Yiming Du Zhuotian Li +8 位作者 Qian He Thomas Wetere Tulu Kei Hang Katie Chan Lin Wang Sen Pei Zhanwei Du Zhen Wang Xiao-Ke Xu Xiao Fan Liu 《Journal of Automation and Intelligence》 2025年第2期115-124,共10页
Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development.Traditional neural network methods,such as BiLSTM,could be ineffective due to the lack of lab data for mo... Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development.Traditional neural network methods,such as BiLSTM,could be ineffective due to the lack of lab data for model training and the overshadowing of crucial features within sequence concatenation.The current work proposes a less data-consuming model incorporating a pre-trained gene sequence model and a mutual information inference operator.Our methodology utilizes gene alignment and deduplication algorithms to preprocess gene sequences,enhancing the model’s capacity to discern and focus on distinctions among input gene pairs.The model,i.e.,DNA Pretrained Cross-Immunity Protection Inference model(DPCIPI),outperforms state-of-theart(SOTA)models in predicting hemagglutination inhibition titer from influenza viral gene sequences only.Improvement in binary cross-immunity prediction is 1.58%in F1,2.34%in precision,1.57%in recall,and 1.57%in Accuracy.For multilevel cross-immunity improvements,the improvement is 2.12%in F1,3.50%in precision,2.19%in recall,and 2.19%in Accuracy.Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity.With expanding gene data and advancements in pre-trained models,this approach promises significant impacts on vaccine development and public health. 展开更多
关键词 Cross-immunity prediction pre-trained model Deep learning Influenza strains Hemagglutination inhibition
在线阅读 下载PDF
Multilingual Text Summarization in Healthcare Using Pre-Trained Transformer-Based Language Models
3
作者 Josua Käser Thomas Nagy +1 位作者 Patrick Stirnemann Thomas Hanne 《Computers, Materials & Continua》 2025年第4期201-217,共17页
We analyze the suitability of existing pre-trained transformer-based language models(PLMs)for abstractive text summarization on German technical healthcare texts.The study focuses on the multilingual capabilities of t... We analyze the suitability of existing pre-trained transformer-based language models(PLMs)for abstractive text summarization on German technical healthcare texts.The study focuses on the multilingual capabilities of these models and their ability to perform the task of abstractive text summarization in the healthcare field.The research hypothesis was that large language models could perform high-quality abstractive text summarization on German technical healthcare texts,even if the model is not specifically trained in that language.Through experiments,the research questions explore the performance of transformer language models in dealing with complex syntax constructs,the difference in performance between models trained in English and German,and the impact of translating the source text to English before conducting the summarization.We conducted an evaluation of four PLMs(GPT-3,a translation-based approach also utilizing GPT-3,a German language Model,and a domain-specific bio-medical model approach).The evaluation considered the informativeness using 3 types of metrics based on Recall-Oriented Understudy for Gisting Evaluation(ROUGE)and the quality of results which is manually evaluated considering 5 aspects.The results show that text summarization models could be used in the German healthcare domain and that domain-independent language models achieved the best results.The study proves that text summarization models can simplify the search for pre-existing German knowledge in various domains. 展开更多
关键词 Text summarization pre-trained transformer-based language models large language models technical healthcare texts natural language processing
在线阅读 下载PDF
Big Texture Dataset Synthesized Based on Gradient and Convolution Kernels Using Pre-Trained Deep Neural Networks
4
作者 Farhan A.Alenizi Faten Khalid Karim +1 位作者 Alaa R.Al-Shamasneh Mohammad Hossein Shakoor 《Computer Modeling in Engineering & Sciences》 2025年第8期1793-1829,共37页
Deep neural networks provide accurate results for most applications.However,they need a big dataset to train properly.Providing a big dataset is a significant challenge in most applications.Image augmentation refers t... Deep neural networks provide accurate results for most applications.However,they need a big dataset to train properly.Providing a big dataset is a significant challenge in most applications.Image augmentation refers to techniques that increase the amount of image data.Common operations for image augmentation include changes in illumination,rotation,contrast,size,viewing angle,and others.Recently,Generative Adversarial Networks(GANs)have been employed for image generation.However,like image augmentation methods,GAN approaches can only generate images that are similar to the original images.Therefore,they also cannot generate new classes of data.Texture images presentmore challenges than general images,and generating textures is more complex than creating other types of images.This study proposes a gradient-based deep neural network method that generates a new class of texture.It is possible to rapidly generate new classes of textures using different kernels from pre-trained deep networks.After generating new textures for each class,the number of textures increases through image augmentation.During this process,several techniques are proposed to automatically remove incomplete and similar textures that are created.The proposed method is faster than some well-known generative networks by around 4 to 10 times.In addition,the quality of the generated textures surpasses that of these networks.The proposed method can generate textures that surpass those of someGANs and parametric models in certain image qualitymetrics.It can provide a big texture dataset to train deep networks.A new big texture dataset is created artificially using the proposed method.This dataset is approximately 2 GB in size and comprises 30,000 textures,each 150×150 pixels in size,organized into 600 classes.It is uploaded to the Kaggle site and Google Drive.This dataset is called BigTex.Compared to other texture datasets,the proposed dataset is the largest and can serve as a comprehensive texture dataset for training more powerful deep neural networks and mitigating overfitting. 展开更多
关键词 Big texture dataset data generation pre-trained deep neural network
在线阅读 下载PDF
Classification of Conversational Sentences Using an Ensemble Pre-Trained Language Model with the Fine-Tuned Parameter
5
作者 R.Sujatha K.Nimala 《Computers, Materials & Continua》 SCIE EI 2024年第2期1669-1686,共18页
Sentence classification is the process of categorizing a sentence based on the context of the sentence.Sentence categorization requires more semantic highlights than other tasks,such as dependence parsing,which requir... Sentence classification is the process of categorizing a sentence based on the context of the sentence.Sentence categorization requires more semantic highlights than other tasks,such as dependence parsing,which requires more syntactic elements.Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence,recognizing the progress and comparing impacts.An ensemble pre-trained language model was taken up here to classify the conversation sentences from the conversation corpus.The conversational sentences are classified into four categories:information,question,directive,and commission.These classification label sequences are for analyzing the conversation progress and predicting the pecking order of the conversation.Ensemble of Bidirectional Encoder for Representation of Transformer(BERT),Robustly Optimized BERT pretraining Approach(RoBERTa),Generative Pre-Trained Transformer(GPT),DistilBERT and Generalized Autoregressive Pretraining for Language Understanding(XLNet)models are trained on conversation corpus with hyperparameters.Hyperparameter tuning approach is carried out for better performance on sentence classification.This Ensemble of Pre-trained Language Models with a Hyperparameter Tuning(EPLM-HT)system is trained on an annotated conversation dataset.The proposed approach outperformed compared to the base BERT,GPT,DistilBERT and XLNet transformer models.The proposed ensemble model with the fine-tuned parameters achieved an F1_score of 0.88. 展开更多
关键词 Bidirectional encoder for representation of transformer conversation ensemble model fine-tuning generalized autoregressive pretraining for language understanding generative pre-trained transformer hyperparameter tuning natural language processing robustly optimized BERT pretraining approach sentence classification transformer models
在线阅读 下载PDF
Adapter Based on Pre-Trained Language Models for Classification of Medical Text
6
作者 Quan Li 《Journal of Electronic Research and Application》 2024年第3期129-134,共6页
We present an approach to classify medical text at a sentence level automatically.Given the inherent complexity of medical text classification,we employ adapters based on pre-trained language models to extract informa... We present an approach to classify medical text at a sentence level automatically.Given the inherent complexity of medical text classification,we employ adapters based on pre-trained language models to extract information from medical text,facilitating more accurate classification while minimizing the number of trainable parameters.Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach. 展开更多
关键词 Classification of medical text ADAPTER pre-trained language model
在线阅读 下载PDF
A Classification–Detection Approach of COVID-19 Based on Chest X-ray and CT by Using Keras Pre-Trained Deep Learning Models 被引量:10
7
作者 Xing Deng Haijian Shao +2 位作者 Liang Shi Xia Wang Tongling Xie 《Computer Modeling in Engineering & Sciences》 SCIE EI 2020年第11期579-596,共18页
The Coronavirus Disease 2019(COVID-19)is wreaking havoc around the world,bring out that the enormous pressure on national health and medical staff systems.One of the most effective and critical steps in the fight agai... The Coronavirus Disease 2019(COVID-19)is wreaking havoc around the world,bring out that the enormous pressure on national health and medical staff systems.One of the most effective and critical steps in the fight against COVID-19,is to examine the patient’s lungs based on the Chest X-ray and CT generated by radiation imaging.In this paper,five keras-related deep learning models:ResNet50,InceptionResNetV2,Xception,transfer learning and pre-trained VGGNet16 is applied to formulate an classification-detection approaches of COVID-19.Two benchmark methods SVM(Support Vector Machine),CNN(Conventional Neural Networks)are provided to compare with the classification-detection approaches based on the performance indicators,i.e.,precision,recall,F1 scores,confusion matrix,classification accuracy and three types of AUC(Area Under Curve).The highest classification accuracy derived by classification-detection based on 5857 Chest X-rays and 767 Chest CTs are respectively 84%and 75%,which shows that the keras-related deep learning approaches facilitate accurate and effective COVID-19-assisted detection. 展开更多
关键词 COVID-19 detection deep learning transfer learning pre-trained models
在线阅读 下载PDF
Fine-Tuning Pre-Trained CodeBERT for Code Search in Smart Contract 被引量:1
8
作者 JIN Huan LI Qinying 《Wuhan University Journal of Natural Sciences》 CAS CSCD 2023年第3期237-245,共9页
Smart contracts,which automatically execute on decentralized platforms like Ethereum,require high security and low gas consumption.As a result,developers have a strong demand for semantic code search tools that utiliz... Smart contracts,which automatically execute on decentralized platforms like Ethereum,require high security and low gas consumption.As a result,developers have a strong demand for semantic code search tools that utilize natural language queries to efficiently search for existing code snippets.However,existing code search models face a semantic gap between code and queries,which requires a large amount of training data.In this paper,we propose a fine-tuning approach to bridge the semantic gap in code search and improve the search accuracy.We collect 80723 different pairs of<comment,code snippet>from Etherscan.io and use these pairs to fine-tune,validate,and test the pre-trained CodeBERT model.Using the fine-tuned model,we develop a code search engine specifically for smart contracts.We evaluate the Recall@k and Mean Reciprocal Rank(MRR)of the fine-tuned CodeBERT model using different proportions of the finetuned data.It is encouraging that even a small amount of fine-tuned data can produce satisfactory results.In addition,we perform a comparative analysis between the fine-tuned CodeBERT model and the two state-of-the-art models.The experimental results show that the finetuned CodeBERT model has superior performance in terms of Recall@k and MRR.These findings highlight the effectiveness of our finetuning approach and its potential to significantly improve the code search accuracy. 展开更多
关键词 code search smart contract pre-trained code models program analysis machine learning
原文传递
Construction and application of knowledge graph for grid dispatch fault handling based on pre-trained model 被引量:1
9
作者 Zhixiang Ji Xiaohui Wang +1 位作者 Jie Zhang Di Wu 《Global Energy Interconnection》 EI CSCD 2023年第4期493-504,共12页
With the construction of new power systems,the power grid has become extremely large,with an increasing proportion of new energy and AC/DC hybrid connections.The dynamic characteristics and fault patterns of the power... With the construction of new power systems,the power grid has become extremely large,with an increasing proportion of new energy and AC/DC hybrid connections.The dynamic characteristics and fault patterns of the power grid are complex;additionally,power grid control is difficult,operation risks are high,and the task of fault handling is arduous.Traditional power-grid fault handling relies primarily on human experience.The difference in and lack of knowledge reserve of control personnel restrict the accuracy and timeliness of fault handling.Therefore,this mode of operation is no longer suitable for the requirements of new systems.Based on the multi-source heterogeneous data of power grid dispatch,this paper proposes a joint entity–relationship extraction method for power-grid dispatch fault processing based on a pre-trained model,constructs a knowledge graph of power-grid dispatch fault processing and designs,and develops a fault-processing auxiliary decision-making system based on the knowledge graph.It was applied to study a provincial dispatch control center,and it effectively improved the accident processing ability and intelligent level of accident management and control of the power grid. 展开更多
关键词 Power-grid dispatch fault handling Knowledge graph pre-trained model Auxiliary decision-making
在线阅读 下载PDF
Leveraging Vision-Language Pre-Trained Model and Contrastive Learning for Enhanced Multimodal Sentiment Analysis
10
作者 Jieyu An Wan Mohd Nazmee Wan Zainon Binfen Ding 《Intelligent Automation & Soft Computing》 SCIE 2023年第8期1673-1689,共17页
Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes,such as text and image,to accurately assess sentiment.However,conventional approaches that rely on... Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes,such as text and image,to accurately assess sentiment.However,conventional approaches that rely on unimodal pre-trained models for feature extraction from each modality often overlook the intrinsic connections of semantic information between modalities.This limitation is attributed to their training on unimodal data,and necessitates the use of complex fusion mechanisms for sentiment analysis.In this study,we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method.Our approach harnesses the power of transfer learning by utilizing a vision-language pre-trained model to extract both visual and textual representations in a unified framework.We employ a Transformer architecture to integrate these representations,thereby enabling the capture of rich semantic infor-mation in image-text pairs.To further enhance the representation learning of these pairs,we introduce our proposed multimodal contrastive learning method,which leads to improved performance in sentiment analysis tasks.Our approach is evaluated through extensive experiments on two publicly accessible datasets,where we demonstrate its effectiveness.We achieve a significant improvement in sentiment analysis accuracy,indicating the supe-riority of our approach over existing techniques.These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment. 展开更多
关键词 Multimodal sentiment analysis vision–language pre-trained model contrastive learning sentiment classification
在线阅读 下载PDF
GPT-NAS:Neural Architecture Search Meets Generative Pre-Trained Transformer Model
11
作者 Caiyang Yu Xianggen Liu +5 位作者 Yifan Wang Yun Liu Wentao Feng Xiong Deng Chenwei Tang Jiancheng Lv 《Big Data Mining and Analytics》 2025年第1期45-64,共20页
The pursuit of optimal neural network architectures is foundational to the progression of Neural Architecture Search (NAS). However, the existing NAS methods suffer from the following problem using traditional search ... The pursuit of optimal neural network architectures is foundational to the progression of Neural Architecture Search (NAS). However, the existing NAS methods suffer from the following problem using traditional search strategies, i.e., when facing a large and complex search space, it is difficult to mine more effective architectures within a reasonable time, resulting in inferior search results. This research introduces the Generative Pre-trained Transformer NAS (GPT-NAS), an innovative approach designed to overcome the limitations which are inherent in traditional NAS strategies. This approach improves search efficiency and obtains better architectures by integrating GPT model into the search process. Specifically, we design a reconstruction strategy that utilizes the trained GPT to reorganize the architectures obtained from the search. In addition, to equip the GPT model with the design capabilities of neural architecture, we propose the use of the GPT model for training on a neural architecture dataset. For each architecture, the structural information of its previous layers is utilized to predict the next layer of structure, iteratively traversing the entire architecture. In this way, the GPT model can efficiently learn the key features required for neural architectures. Extensive experimental validation shows that our GPT-NAS approach beats both manually constructed neural architectures and automatically generated architectures by NAS. In addition, we validate the superiority of introducing the GPT model in several ways, and find that the accuracy of the neural architecture on the image dataset obtained from the search after introducing the GPT model is improved by up to about 9%. 展开更多
关键词 Neural Architecture Search(NAS) Generative pre-trained Transformer(GPT)model evolutionary algorithm image classification
原文传递
y-Tuning: an efficient tuning paradigm for large-scale pre-trained models via label representation learning
12
作者 Yitao LIU Chenxin AN Xipeng QIU 《Frontiers of Computer Science》 SCIE EI CSCD 2024年第4期107-116,共10页
With current success of large-scale pre-trained models(PTMs),how efficiently adapting PTMs to downstream tasks has attracted tremendous attention,especially for PTMs with billions of parameters.Previous work focuses o... With current success of large-scale pre-trained models(PTMs),how efficiently adapting PTMs to downstream tasks has attracted tremendous attention,especially for PTMs with billions of parameters.Previous work focuses on designing parameter-efficient tuning paradigms but needs to save and compute the gradient of the whole computational graph.In this paper,we propose y-Tuning,an efficient yet effective paradigm to adapt frozen large-scale PTMs to specific downstream tasks.y-Tuning learns dense representations for labels y defined in a given task and aligns them to fixed feature representation.Without computing the gradients of text encoder at training phrase,y-Tuning is not only parameterefficient but also training-efficient.Experimental results show that for DeBERTaxxL with 1.6 billion parameters,y-Tuning achieves performance more than 96%of full fine-tuning on GLUE Benchmark with only 2%tunable parameters and much fewer training costs. 展开更多
关键词 pre-trained model lightweight fine-tuning paradigms label representation
原文传递
Pre-trained models for natural language processing: A survey 被引量:198
13
作者 QIU XiPeng SUN TianXiang +3 位作者 XU YiGe SHAO YunFan DAI Ning HUANG XuanJing 《Science China(Technological Sciences)》 SCIE EI CAS CSCD 2020年第10期1872-1897,共26页
Recently, the emergence of pre-trained models(PTMs) has brought natural language processing(NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language rep... Recently, the emergence of pre-trained models(PTMs) has brought natural language processing(NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy from four different perspectives. Next,we describe how to adapt the knowledge of PTMs to downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is purposed to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks. 展开更多
关键词 deep learning neural network natural language processing pre-trained model distributed representation word embedding self-supervised learning language modelling
原文传递
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey 被引量:22
14
作者 Xiao Wang Guangyao Chen +5 位作者 Guangwu Qian Pengcheng Gao Xiao-Yong Wei Yaowei Wang Yonghong Tian Wen Gao 《Machine Intelligence Research》 EI CSCD 2023年第4期447-482,共36页
With the urgent demand for generalized deep models,many pre-trained big models are proposed,such as bidirectional encoder representations(BERT),vision transformer(ViT),generative pre-trained transformers(GPT),etc.Insp... With the urgent demand for generalized deep models,many pre-trained big models are proposed,such as bidirectional encoder representations(BERT),vision transformer(ViT),generative pre-trained transformers(GPT),etc.Inspired by the success of these models in single domains(like computer vision and natural language processing),the multi-modal pre-trained big models have also drawn more and more attention in recent years.In this work,we give a comprehensive survey of these models and hope this paper could provide new insights and helps fresh researchers to track the most cutting-edge works.Specifically,we firstly introduce the background of multi-modal pre-training by reviewing the conventional deep learning,pre-training works in natural language process,computer vision,and speech.Then,we introduce the task definition,key challenges,and advantages of multi-modal pre-training models(MM-PTMs),and discuss the MM-PTMs with a focus on data,objectives,network architectures,and knowledge enhanced pre-training.After that,we introduce the downstream tasks used for the validation of large-scale MM-PTMs,including generative,classification,and regression tasks.We also give visualization and analysis of the model parameters and results on representative downstream tasks.Finally,we point out possible research directions for this topic that may benefit future works.In addition,we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models:https://github.com/wangxiao5791509/MultiModal_BigModels_Survey. 展开更多
关键词 Multi-modal(MM) pre-trained model(PTM) information fusion representation learning deep learning
原文传递
Red Alarm for Pre-trained Models:Universal Vulnerability to Neuron-level Backdoor Attacks 被引量:5
15
作者 Zhengyan Zhang Guangxuan Xiao +6 位作者 Yongwei Li Tian Lv Fanchao Qi Zhiyuan Liu Yasheng Wang Xin Jiang Maosong Sun 《Machine Intelligence Research》 EI CSCD 2023年第2期180-193,共14页
The pre-training-then-fine-tuning paradigm has been widely used in deep learning.Due to the huge computation cost for pre-training,practitioners usually download pre-trained models from the Internet and fine-tune them... The pre-training-then-fine-tuning paradigm has been widely used in deep learning.Due to the huge computation cost for pre-training,practitioners usually download pre-trained models from the Internet and fine-tune them on downstream datasets,while the downloaded models may suffer backdoor attacks.Different from previous attacks aiming at a target task,we show that a backdoored pre-trained model can behave maliciously in various downstream tasks without foreknowing task information.Attackers can restrict the output representations(the values of output neurons)of trigger-embedded samples to arbitrary predefined values through additional training,namely neuron-level backdoor attack(NeuBA).Since fine-tuning has little effect on model parameters,the fine-tuned model will retain the backdoor functionality and predict a specific label for the samples embedded with the same trigger.To provoke multiple labels in a specific task,attackers can introduce several triggers with predefined contrastive values.In the experiments of both natural language processing(NLP)and computer vision(CV),we show that NeuBA can well control the predictions for trigger-embedded instances with different trigger designs.Our findings sound a red alarm for the wide use of pre-trained models.Finally,we apply several defense methods to NeuBA and find that model pruning is a promising technique to resist NeuBA by omitting backdoored neurons. 展开更多
关键词 pre-trained language models backdoor attacks transformers natural language processing(NLP) computer vision(CV)
原文传递
Vision Enhanced Generative Pre-trained Language Model for Multimodal Sentence Summarization 被引量:2
16
作者 Liqiang Jing Yiren Li +3 位作者 Junhao Xu Yongcan Yu Pei Shen Xuemeng Song 《Machine Intelligence Research》 EI CSCD 2023年第2期289-298,共10页
Multimodal sentence summarization(MMSS)is a new yet challenging task that aims to generate a concise summary of a long sentence and its corresponding image.Although existing methods have gained promising success in MM... Multimodal sentence summarization(MMSS)is a new yet challenging task that aims to generate a concise summary of a long sentence and its corresponding image.Although existing methods have gained promising success in MMSS,they overlook the powerful generation ability of generative pre-trained language models(GPLMs),which have shown to be effective in many text generation tasks.To fill this research gap,we propose to using GPLMs to promote the performance of MMSS.Notably,adopting GPLMs to solve MMSS inevitably faces two challenges:1)What fusion strategy should we use to inject visual information into GPLMs properly?2)How to keep the GPLM′s generation ability intact to the utmost extent when the visual feature is injected into the GPLM.To address these two challenges,we propose a vision enhanced generative pre-trained language model for MMSS,dubbed as Vision-GPLM.In Vision-GPLM,we obtain features of visual and textual modalities with two separate encoders and utilize a text decoder to produce a summary.In particular,we utilize multi-head attention to fuse the features extracted from visual and textual modalities to inject the visual feature into the GPLM.Meanwhile,we train Vision-GPLM in two stages:the vision-oriented pre-training stage and fine-tuning stage.In the vision-oriented pre-training stage,we particularly train the visual encoder by the masked language model task while the other components are frozen,aiming to obtain homogeneous representations of text and image.In the fine-tuning stage,we train all the components of Vision-GPLM by the MMSS task.Extensive experiments on a public MMSS dataset verify the superiority of our model over existing baselines. 展开更多
关键词 Multimodal sentence summarization(MMSS) generative pre-trained language model(GPLM) natural language generation deep learning artificial intelligence
原文传递
Unsupervised statistical text simplification using pre-trained language modeling for initialization 被引量:1
17
作者 Jipeng QIANG Feng ZHANG +3 位作者 Yun LI Yunhao YUAN Yi ZHU Xindong WU 《Frontiers of Computer Science》 SCIE EI CSCD 2023年第1期81-90,共10页
Unsupervised text simplification has attracted much attention due to the scarcity of high-quality parallel text simplification corpora. Recent an unsupervised statistical text simplification based on phrase-based mach... Unsupervised text simplification has attracted much attention due to the scarcity of high-quality parallel text simplification corpora. Recent an unsupervised statistical text simplification based on phrase-based machine translation system (UnsupPBMT) achieved good performance, which initializes the phrase tables using the similar words obtained by word embedding modeling. Since word embedding modeling only considers the relevance between words, the phrase table in UnsupPBMT contains a lot of dissimilar words. In this paper, we propose an unsupervised statistical text simplification using pre-trained language modeling BERT for initialization. Specifically, we use BERT as a general linguistic knowledge base for predicting similar words. Experimental results show that our method outperforms the state-of-the-art unsupervised text simplification methods on three benchmarks, even outperforms some supervised baselines. 展开更多
关键词 text simplification pre-trained language modeling BERT word embeddings
原文传递
Satellite and instrument entity recognition using a pre-trained language model with distant supervision 被引量:1
18
作者 Ming Lin Meng Jin +1 位作者 Yufu Liu Yuqi Bai 《International Journal of Digital Earth》 SCIE EI 2022年第1期1290-1304,共15页
Earth observations,especially satellite data,have produced a wealth of methods and results in meeting global challenges,often presented in unstructured texts such as papers or reports.Accurate extraction of satellite ... Earth observations,especially satellite data,have produced a wealth of methods and results in meeting global challenges,often presented in unstructured texts such as papers or reports.Accurate extraction of satellite and instrument entities from these unstructured texts can help to link and reuse Earth observation resources.The direct use of an existing dictionary to extract satellite and instrument entities suffers from the problem of poor matching,which leads to low recall.In this study,we present a named entity recognition model to automatically extract satellite and instrument entities from unstructured texts.Due to the lack of manually labeled data,we apply distant supervision to automatically generate labeled training data.Accordingly,we fine-tune the pre-trained language model with early stopping and a weighted cross-entropy loss function.We propose the dictionary-based self-training method to correct the incomplete annotations caused by the distant supervision method.Experiments demonstrate that our method achieves significant improvements in both precision and recall compared to dictionary matching or standard adaptation of pre-trained language models. 展开更多
关键词 Earth observation named entity recognition pre-trained language model distant supervision dictionary-based self-training
原文传递
A comparative analysis of paddy crop biotic stress classification using pre-trained deep neural networks 被引量:1
19
作者 Naveen N.Malvade Rajesh Yakkundimath +2 位作者 Girish Saunshi Mahantesh C.Elemmi Parashuram Baraki 《Artificial Intelligence in Agriculture》 2022年第1期167-175,共9页
The agriculture sector is no exception to the widespread usage of deep learning tools and techniques.In this paper,an automated detection method on the basis of pre-trained Convolutional Neural Network(CNN)models is p... The agriculture sector is no exception to the widespread usage of deep learning tools and techniques.In this paper,an automated detection method on the basis of pre-trained Convolutional Neural Network(CNN)models is proposed to identify and classify paddy crop biotic stresses from the field images.The proposed work also provides the empirical comparison among the leading CNN models with transfer learning from the ImageNet weights namely,Inception-V3,VGG-16,ResNet-50,DenseNet-121 and MobileNet-28.Brown spot,hispa,and leaf blast,three of the most common and destructive paddy crop biotic stresses that occur during the flowering and ripening growth stages are considered for the experimentation.The experimental results reveal that the ResNet-50 model achieves the highest average paddy crop stress classification accuracy of 92.61%outperforming the other considered CNN models.The study explores the feasibility of CNN models for the paddy crop stress identification as well as the applicability of automated methods to non-experts. 展开更多
关键词 Paddy crop Stress classification Biotic stress PlantVillage ImageNet pre-trained CNN models
原文传递
Medical Named Entity Recognition from Un-labelled Medical Records based on Pre-trained Language Models and Domain Dictionary 被引量:1
20
作者 Chaojie Wen Tao Chen +1 位作者 Xudong Jia Jiang Zhu 《Data Intelligence》 2021年第3期402-417,共16页
Medical named entity recognition(NER)is an area in which medical named entities are recognized from medical texts,such as diseases,drugs,surgery reports,anatomical parts,and examination documents.Conventional medical ... Medical named entity recognition(NER)is an area in which medical named entities are recognized from medical texts,such as diseases,drugs,surgery reports,anatomical parts,and examination documents.Conventional medical NER methods do not make full use of un-labelled medical texts embedded in medical documents.To address this issue,we proposed a medical NER approach based on pre-trained language models and a domain dictionary.First,we constructed a medical entity dictionary by extracting medical entities from labelled medical texts and collecting medical entities from other resources,such as the YiduN4 K data set.Second,we employed this dictionary to train domain-specific pre-trained language models using un-labelled medical texts.Third,we employed a pseudo labelling mechanism in un-labelled medical texts to automatically annotate texts and create pseudo labels.Fourth,the BiLSTM-CRF sequence tagging model was used to fine-tune the pre-trained language models.Our experiments on the un-labelled medical texts,which were extracted from Chinese electronic medical records,show that the proposed NER approach enables the strict and relaxed F1 scores to be 88.7%and 95.3%,respectively. 展开更多
关键词 Medical named entity recognition pre-trained language model Domain dictionary Pseudo labelling Un-labelled medical data
原文传递
上一页 1 2 6 下一页 到第
使用帮助 返回顶部