Sentiment analysis, a cornerstone of natural language processing, has witnessed remarkable advancements driven by deep learning models, which have demonstrated impressive accuracy in discerning sentiment from text across various domains. However, the deployment of such models in resource-constrained environments presents a unique set of challenges that require innovative solutions. Resource-constrained environments encompass scenarios where computing resources, memory, and energy availability are restricted. To empower sentiment analysis in such settings, we address this crucial need by leveraging lightweight pre-trained models. These models, derived from popular architectures such as DistilBERT, MobileBERT, ALBERT, TinyBERT, ELECTRA, and SqueezeBERT, offer a promising solution to the resource limitations imposed by these environments. By distilling the knowledge from larger models into smaller ones and employing various optimization techniques, these lightweight models aim to strike a balance between performance and resource efficiency. This paper explores the performance of multiple lightweight pre-trained models on sentiment analysis tasks specific to such environments and provides insights into their viability for practical deployment.
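The knowledge-distillation idea behind these lightweight models can be illustrated with a minimal sketch: the student is trained to match the teacher's temperature-softened output distribution. This is an illustrative toy in plain Python, not the training code of any cited model; the function names and the temperature value are our own assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; a higher temperature gives a softer distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Cross-entropy between the teacher's softened distribution (targets)
    # and the student's softened distribution.
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))
```

The loss is minimized when the student reproduces the teacher's softened probabilities, which is why distillation transfers "dark knowledge" about relative class similarities rather than only hard labels.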
Pneumonia is an acute lung infection that has caused many fatalities globally. Radiologists often employ chest X-rays to identify pneumonia, since they are presently the most effective imaging method for this purpose. Computer-aided diagnosis of pneumonia using deep learning techniques is widely used due to its effectiveness and performance. In the proposed method, the Synthetic Minority Oversampling Technique (SMOTE) is used to eliminate the class imbalance in the X-ray dataset. To compensate for the paucity of accessible data, transfer learning from pre-trained models is used, and an ensemble Convolutional Neural Network (CNN) model is developed. The ensemble model consists of all possible combinations of the MobileNetV2, Visual Geometry Group (VGG16), and DenseNet169 models. MobileNetV2 and DenseNet169 performed well as single classifiers, each with an accuracy of 94%, while the ensemble model (MobileNetV2 + DenseNet169) achieved an accuracy of 96.9%. Using the synchronous data-parallel model in distributed TensorFlow, the training process accelerated performance by 98.6% and outperformed other conventional approaches.
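The ensembling step can be pictured as soft voting over the member models' class probabilities. The sketch below is a hypothetical illustration, not the paper's implementation:

```python
def ensemble_predict(prob_lists):
    # Soft voting: average the class-probability vectors of the member
    # models, then pick the class with the highest mean probability.
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return avg.index(max(avg)), avg
```

For the paper's best pairing (MobileNetV2 + DenseNet169), `prob_lists` would hold the two models' softmax outputs for a single X-ray.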
We analyze the suitability of existing pre-trained transformer-based language models (PLMs) for abstractive text summarization of German technical healthcare texts. The study focuses on the multilingual capabilities of these models and their ability to perform abstractive text summarization in the healthcare field. The research hypothesis was that large language models could perform high-quality abstractive text summarization of German technical healthcare texts, even if the model is not specifically trained in that language. Through experiments, the research questions explore the performance of transformer language models in dealing with complex syntax constructs, the difference in performance between models trained in English and German, and the impact of translating the source text to English before conducting the summarization. We conducted an evaluation of four PLM approaches (GPT-3, a translation-based approach also utilizing GPT-3, a German-language model, and a domain-specific biomedical model). The evaluation considered informativeness, using three types of metrics based on Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and the quality of the results, which was manually evaluated on five aspects. The results show that text summarization models can be used in the German healthcare domain and that domain-independent language models achieved the best results. The study shows that text summarization models can simplify the search for pre-existing German knowledge in various domains.
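As a rough illustration of the informativeness metrics, ROUGE-1 can be computed from clipped unigram overlap between a candidate summary and a reference. This is a simplified re-implementation for intuition only; real evaluations use the official ROUGE tooling with stemming and multiple variants (ROUGE-2, ROUGE-L):

```python
from collections import Counter

def rouge1(candidate, reference):
    # Clipped unigram overlap between candidate and reference summaries.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)
    return {"p": precision, "r": recall, "f1": f1}
```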
Current experimental and computational methods have limitations in accurately and efficiently classifying ion channels within vast protein spaces. Here we have developed a deep learning algorithm, the GPT2 Ion Channel Classifier (GPT2-ICC), which effectively distinguishes ion channels in a test set containing approximately 239 times more non-ion-channel proteins. GPT2-ICC integrates representation learning with a large language model (LLM)-based classifier, enabling highly accurate identification of potential ion channels. Several potential ion channels were predicted from the unannotated human proteome, further demonstrating GPT2-ICC's generalization ability. This study marks a significant advancement in artificial-intelligence-driven ion channel research, highlighting the adaptability and effectiveness of combining representation learning with LLMs to address the challenges of imbalanced protein sequence data. Moreover, it provides a valuable computational tool for uncovering previously uncharacterized ion channels.
The Coronavirus Disease 2019 (COVID-19) is wreaking havoc around the world, putting enormous pressure on national health systems and medical staff. One of the most effective and critical steps in the fight against COVID-19 is to examine a patient's lungs using chest X-ray and CT images generated by radiation imaging. In this paper, five Keras-based deep learning approaches (ResNet50, InceptionResNetV2, Xception, transfer learning, and a pre-trained VGGNet16) are applied to formulate classification-detection approaches for COVID-19. Two benchmark methods, SVM (Support Vector Machine) and CNN (Convolutional Neural Network), are provided for comparison with the classification-detection approaches based on the performance indicators, i.e., precision, recall, F1 score, confusion matrix, classification accuracy, and three types of AUC (Area Under Curve). The highest classification accuracies derived by the classification-detection approaches on 5857 chest X-rays and 767 chest CTs are 84% and 75%, respectively, which shows that the Keras-based deep learning approaches facilitate accurate and effective COVID-19-assisted detection.
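The performance indicators mentioned (precision, recall, F1, accuracy) all derive from confusion-matrix counts; a minimal sketch for the binary case, with our own function and key names:

```python
def binary_metrics(y_true, y_pred):
    # Confusion-matrix counts for the positive class (label 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}
```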
Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy with four different perspectives. Next, we describe how to adapt the knowledge of PTMs to downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is intended to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste. The existing object detection methods based on ImageNet pre-trained models are an effective way of sorting. Owing to significant domain gaps between natural images and kitchen waste images, it is difficult for an ImageNet pre-trained model to reflect the characteristics of diverse scales and dense distribution in kitchen waste, leading to poor generalisation. In this article, the authors propose the first pre-trained model for kitchen waste sorting, called KitWaSor, which combines both contrastive learning (CL) and masked image modelling (MIM) through self-supervised learning (SSL). First, to address the issue of diverse scales, the authors propose a mixed masking strategy by introducing an incomplete masking branch alongside the original random masking branch. It prevents the complete loss of small-scale objects while avoiding excessive leakage of large-scale object pixels. Second, to address the issue of dense distribution, the authors introduce semantic consistency constraints on the basis of the mixed masking strategy. That is, object semantic reasoning is performed through semantic consistency constraints to compensate for the lack of contextual information. To train KitWaSor, the authors construct the first million-level kitchen waste dataset spanning seasonal and regional distributions, named KWD-Million. Extensive experiments show that KitWaSor achieves state-of-the-art (SOTA) performance on the two downstream tasks most relevant to kitchen waste sorting (i.e., image classification and object detection), demonstrating the effectiveness of the proposed KitWaSor.
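The mixed masking idea can be approximated in miniature: one branch masks patches aggressively at random, while an "incomplete" branch masks a smaller fraction so small objects are less likely to disappear entirely. The ratios and function names below are illustrative assumptions, not KitWaSor's actual settings:

```python
import random

def mixed_masks(num_patches, random_ratio=0.75, incomplete_ratio=0.4, seed=0):
    # Two masking branches over image patches: the random branch masks
    # aggressively, while the incomplete branch masks a smaller fraction
    # so that small-scale objects keep at least some visible patches.
    rng = random.Random(seed)
    idx = list(range(num_patches))
    rng.shuffle(idx)
    random_branch = set(idx[: int(num_patches * random_ratio)])
    rng.shuffle(idx)
    incomplete_branch = set(idx[: int(num_patches * incomplete_ratio)])
    return random_branch, incomplete_branch
```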
Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development. Traditional neural network methods, such as BiLSTM, can be ineffective due to the lack of lab data for model training and the overshadowing of crucial features within sequence concatenation. The current work proposes a less data-consuming model incorporating a pre-trained gene sequence model and a mutual information inference operator. Our methodology utilizes gene alignment and deduplication algorithms to preprocess gene sequences, enhancing the model's capacity to discern and focus on distinctions among input gene pairs. The model, i.e., the DNA Pretrained Cross-Immunity Protection Inference model (DPCIPI), outperforms state-of-the-art (SOTA) models in predicting hemagglutination-inhibition titers from influenza viral gene sequences alone. The improvement in binary cross-immunity prediction is 1.58% in F1, 2.34% in precision, 1.57% in recall, and 1.57% in accuracy. For multilevel cross-immunity prediction, the improvement is 2.12% in F1, 3.50% in precision, 2.19% in recall, and 2.19% in accuracy. Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity. With expanding gene data and advancements in pre-trained models, this approach promises significant impacts on vaccine development and public health.
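The preprocessing described (deduplicating aligned gene sequences, then focusing on the distinctions between an input pair) can be sketched as follows; this is a toy illustration, not DPCIPI's pipeline:

```python
def deduplicate(sequences):
    # Drop duplicate gene sequences (case-insensitive), keeping the first occurrence.
    seen, unique = set(), []
    for seq in sequences:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append(seq)
    return unique

def mutation_positions(seq_a, seq_b):
    # Positions where two pre-aligned sequences differ; these distinctions
    # are what a pairwise model should focus on.
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]
```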
Climate model prediction has been improved by enhancing model resolution as well as by the implementation of sophisticated physical parameterization and refinement of data assimilation systems [section 6.1 in Wang et al. (2025)]. In relation to seasonal forecasting and climate projection in the East Asian summer monsoon season, proper simulation of the seasonal migration of rain bands by models is a challenging and limiting factor [section 7.1 in Wang et al. (2025)].
With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT), generative pre-trained transformers (GPT), etc. Inspired by the success of these models in single domains (like computer vision and natural language processing), multi-modal pre-trained big models have also drawn more and more attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper provides new insights and helps fresh researchers track the most cutting-edge works. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training works in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-trained models (MM-PTMs), and discuss the MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give visualization and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future works. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey.
The pre-training-then-fine-tuning paradigm has been widely used in deep learning. Due to the huge computation cost of pre-training, practitioners usually download pre-trained models from the Internet and fine-tune them on downstream datasets, but the downloaded models may carry backdoor attacks. Different from previous attacks aiming at a target task, we show that a backdoored pre-trained model can behave maliciously in various downstream tasks without foreknowing task information. Attackers can restrict the output representations (the values of output neurons) of trigger-embedded samples to arbitrary predefined values through additional training, namely a neuron-level backdoor attack (NeuBA). Since fine-tuning has little effect on model parameters, the fine-tuned model will retain the backdoor functionality and predict a specific label for samples embedded with the same trigger. To provoke multiple labels in a specific task, attackers can introduce several triggers with predefined contrastive values. In experiments on both natural language processing (NLP) and computer vision (CV), we show that NeuBA can well control the predictions for trigger-embedded instances with different trigger designs. Our findings sound a red alarm for the wide use of pre-trained models. Finally, we apply several defense methods to NeuBA and find that model pruning is a promising technique for resisting NeuBA by omitting backdoored neurons.
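The core NeuBA objective, pushing the output representation of a trigger-embedded sample toward a predefined target vector, with mutually contrastive targets for different triggers, can be sketched as a toy loss. The vector shapes and values here are illustrative assumptions, not the paper's actual configuration:

```python
def backdoor_loss(representation, target):
    # NeuBA-style auxiliary objective: mean squared error pushing the
    # output representation of a trigger-embedded sample toward a
    # predefined target vector.
    return sum((r - t) ** 2 for r, t in zip(representation, target)) / len(target)

def contrastive_targets(dim):
    # Two triggers are assigned elementwise-opposite target vectors so
    # that, after fine-tuning, they can elicit different labels.
    v = [1.0 if i % 2 == 0 else -1.0 for i in range(dim)]
    return v, [-x for x in v]
```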
BACKGROUND Non-erosive reflux disease (NERD), the main gastroesophageal reflux subtype, features reflux symptoms without mucosal damage. Anxiety is linked to visceral hypersensitivity in NERD, yet the mechanisms and animal models are unclear. AIM To establish a translational NERD rat model with anxiety comorbidity via tail clamping, and to study corticotropin-releasing hormone (CRH)-mediated neuroimmune pathways in visceral hypersensitivity and esophageal injury. METHODS Sprague-Dawley (SD) and Wistar rats were grouped into sham, model, and modified groups (n = 10 each). The treatments for the modified groups were as follows: SD rats received ovalbumin/aluminum hydroxide suspension + acid perfusion ± tail clamping (40 minutes/day for 7 days), while Wistar rats received fructose water + tail clamping. Esophageal pathology, visceral sensitivity, and behavior were assessed. Serum CRH, calcitonin gene-related peptide (CGRP), 5-hydroxytryptamine (5-HT), and mast cell tryptase (MCT), as well as central amygdala (CeA) CRH mRNA, were measured via ELISA and qRT-PCR. RESULTS Tail clamping induced anxiety, worsening visceral hypersensitivity (lower abdominal withdrawal reflex thresholds, P < 0.05) and esophageal injury (dilated intercellular spaces and mitochondrial edema). Both models showed raised serum CRH, CGRP, 5-HT, and MCT (P < 0.01) and CeA CRH mRNA expression (P < 0.01). Behavioral tests confirmed anxiety-like phenotypes. NERD-anxiety rats showed clinical-like symptom severity without erosion. CONCLUSION Tail clamping induces anxiety in NERD models, worsening visceral hypersensitivity via CRH neuroimmune dysregulation, offering a translational model and highlighting CRH as a treatment target.
Noninvasive brain stimulation techniques offer promising therapeutic and regenerative prospects in neurological diseases by modulating brain activity and improving cognitive and motor functions. Given the paucity of knowledge about the underlying modes of action and optimal treatment modalities, a thorough translational investigation of noninvasive brain stimulation in preclinical animal models is urgently needed. Thus, we reviewed the current literature on the mechanistic underpinnings of noninvasive brain stimulation in models of central nervous system impairment, with a particular emphasis on traumatic brain injury and stroke. Due to the lack of translational models for most proposed noninvasive brain stimulation techniques, we restricted this review to the techniques most relevant in humans, i.e., transcranial magnetic stimulation and transcranial direct current stimulation. We searched the literature in PubMed, encompassing the MEDLINE and PMC databases, for studies published between January 1, 2020 and September 30, 2024. Thirty-five studies were eligible. Transcranial magnetic stimulation and transcranial direct current stimulation demonstrated distinct strengths in augmenting rehabilitation after stroke and traumatic brain injury, with emerging mechanistic evidence. Overall, we identified neuronal, inflammatory, microvascular, and apoptotic pathways highlighted in the literature. This review also highlights a lack of translational surrogate parameters to bridge the gap between preclinical findings and their clinical translation.
Myasthenia gravis is a chronic autoimmune disorder that affects the neuromuscular junction, leading to fluctuating skeletal muscle fatigability. The majority of myasthenia gravis patients have detectable antibodies in their serum targeting the acetylcholine receptor, muscle-specific kinase, or related proteins. Current treatment for myasthenia gravis involves symptomatic therapy; immunosuppressive drugs such as corticosteroids, azathioprine, and mycophenolate mofetil; and thymectomy, which is primarily indicated in patients with thymoma or thymic hyperplasia. However, this condition continues to pose significant challenges, including an unpredictable and variable disease progression, differing responses to individual therapies, and substantial long-term side effects associated with standard treatments (including an increased risk of infections, osteoporosis, and diabetes), underscoring the necessity for a more personalized approach to treatment. Furthermore, about fifteen percent of patients, called “refractory myasthenia gravis patients”, do not respond adequately to standard therapies. In this context, the introduction of molecular therapies has marked a significant advance in myasthenia gravis management. Advances in understanding myasthenia gravis pathogenesis, especially the role of pathogenic antibodies, have driven the development of these biological drugs, which offer more selective, rapid, and safer alternatives to traditional immunosuppressants.
This review aims to provide a comprehensive overview of emerging therapeutic strategies targeting specific immune pathways in myasthenia gravis, with a particular focus on the preclinical evidence, therapeutic rationale, and clinical translation of B-cell depletion therapies, neonatal Fc receptor inhibitors, and complement inhibitors.
The brain is the most complex human organ, and commonly used models, such as two-dimensional cell cultures and animal brains, often lack the sophistication needed for accurate research use. In this context, human cerebral organoids have emerged as valuable tools, offering a more complex, versatile, and human-relevant system than traditional animal models, which are often unable to replicate the intricate architecture and functionality of the human brain. Since human cerebral organoids are a state-of-the-art model for the study of neurodevelopment and of the different pathologies affecting the brain, this field is under constant development, and work in this area is abundant. In this review, we give a complete overview of human cerebral organoid technology, starting from the different types of protocols that exist to generate different human cerebral organoids. We continue with the use of brain organoids for the study of brain pathologies, highlighting neurodevelopmental, psychiatric, neurodegenerative, brain tumor, and infectious diseases. Because of the potential value of human cerebral organoids, we describe their use in transplantation, drug screening, and toxicology assays. We also discuss the technologies available to study the cell diversity and physiological characteristics of organoids. Finally, we summarize the limitations that currently exist in the field, such as the development of vasculature and microglia, and highlight some of the novel approaches being pursued through bioengineering.
With the current success of large-scale pre-trained models (PTMs), how to efficiently adapt PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters. Previous work focuses on designing parameter-efficient tuning paradigms but still needs to save and compute the gradients of the whole computational graph. In this paper, we propose y-Tuning, an efficient yet effective paradigm for adapting frozen large-scale PTMs to specific downstream tasks. y-Tuning learns dense representations for the labels y defined in a given task and aligns them to fixed feature representations. Without computing the gradients of the text encoder during the training phase, y-Tuning is not only parameter-efficient but also training-efficient. Experimental results show that for DeBERTa-XXL with 1.6 billion parameters, y-Tuning achieves more than 96% of the performance of full fine-tuning on the GLUE benchmark with only 2% tunable parameters and much lower training costs.
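The essence of y-Tuning, training only label representations against frozen, precomputed features, can be sketched with a tiny softmax classifier in which the feature extractor never receives gradients. Everything below (dimensions, learning rate, epochs, function names) is an illustrative assumption, not the paper's algorithm in full:

```python
import math
import random

def train_label_embeddings(features, labels, num_classes, dim,
                           epochs=200, lr=0.1, seed=0):
    # Only the label embeddings are updated; the features come from a
    # frozen encoder and never receive gradients.
    rng = random.Random(seed)
    emb = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_classes)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            scores = [sum(e_i * x_i for e_i, x_i in zip(e, x)) for e in emb]
            m = max(scores)
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            probs = [v / z for v in exps]
            for c in range(num_classes):  # softmax cross-entropy gradient
                g = probs[c] - (1.0 if c == y else 0.0)
                for i in range(dim):
                    emb[c][i] -= lr * g * x[i]
    return emb

def predict(emb, x):
    # Classify by the label embedding best aligned with the frozen feature.
    scores = [sum(e_i * x_i for e_i, x_i in zip(e, x)) for e in emb]
    return scores.index(max(scores))
```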
In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of the semantic interaction information between entities and relations, difficulties in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three innovative components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to significantly improve the model's performance on domain-specific relational triple extraction tasks. We first propose an innovative attention interaction module. This method significantly enhances the semantic interaction capabilities between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that effectively combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model's adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, aiming to generate domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several aspects, with F1 scores exceeding the state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
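The routing idea behind such a voting strategy, keep the fine-tuned SLM's answer on confident samples and defer challenging ones to the LLM, might be sketched as follows; the confidence threshold and labels are hypothetical:

```python
def reevaluate(slm_pred, slm_conf, llm_pred, threshold=0.8):
    # Trust the fine-tuned SLM on confident (easy) samples; route
    # low-confidence (challenging) samples to the LLM for a second opinion.
    return slm_pred if slm_conf >= threshold else llm_pred
```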
We present an approach to automatically classify medical text at the sentence level. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
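A bottleneck adapter inserts a small down-project/nonlinearity/up-project block with a residual connection into a frozen model, so only the small projections are trained. A minimal numeric sketch (the weight layout and names are our own convention, not any specific library's API):

```python
def adapter(hidden, w_down, w_up):
    # Bottleneck adapter: down-project to a small dimension, apply ReLU,
    # up-project back to the hidden size, then add the residual connection.
    # w_down: one weight row per bottleneck unit; w_up: one row per output unit.
    down = [max(0.0, sum(h * w for h, w in zip(hidden, row))) for row in w_down]
    up = [sum(d * w for d, w in zip(down, row)) for row in w_up]
    return [h + u for h, u in zip(hidden, up)]
```

With zero weights the adapter reduces to the identity, which is why adapters are typically initialized near zero: the pre-trained model's behavior is preserved at the start of training.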
Named Entity Recognition (NER) is crucial for extracting structured information from text. While traditional methods rely on rules, Conditional Random Fields (CRFs), or deep learning, the advent of large-scale Pre-trained Language Models (PLMs) offers new possibilities. PLMs excel at contextual learning, potentially simplifying many natural language processing tasks. However, their application to NER remains underexplored. This paper investigates leveraging the GPT-3 PLM for NER without fine-tuning. We propose a novel scheme that utilizes carefully crafted templates and context examples selected based on semantic similarity. Our experimental results demonstrate the feasibility of this approach, suggesting a promising direction for harnessing PLMs in NER.
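Selecting in-context examples by semantic similarity typically means ranking a pool of candidate (text, embedding) pairs by cosine similarity to the query embedding and keeping the top k for the prompt. A minimal sketch with made-up two-dimensional embeddings:

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def select_examples(query_vec, pool, k=2):
    # pool: list of (example_text, embedding); return the k examples
    # most similar to the query for inclusion in the prompt template.
    ranked = sorted(pool, key=lambda p: cosine(query_vec, p[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```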
With the construction of new power systems, the power grid has become extremely large, with an increasing proportion of new energy and AC/DC hybrid connections. The dynamic characteristics and fault patterns of the power grid are complex; additionally, power grid control is difficult, operation risks are high, and the task of fault handling is arduous. Traditional power-grid fault handling relies primarily on human experience. Differences and gaps in the knowledge reserves of control personnel restrict the accuracy and timeliness of fault handling, so this mode of operation is no longer suitable for the requirements of new systems. Based on the multi-source heterogeneous data of power grid dispatch, this paper proposes a joint entity–relationship extraction method for power-grid dispatch fault processing based on a pre-trained model, constructs a knowledge graph of power-grid dispatch fault processing, and designs and develops a fault-processing auxiliary decision-making system based on the knowledge graph. The system was applied in a provincial dispatch control center, where it effectively improved the accident-handling ability and the intelligence level of accident management and control of the power grid.
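At its simplest, the extracted (head, relation, tail) triples can be indexed into an adjacency structure to form the knowledge graph that a decision-support system queries. The entity and relation names below are hypothetical:

```python
from collections import defaultdict

def build_graph(triples):
    # Index extracted (head, relation, tail) triples by head entity so a
    # decision-support system can look up an entity's outgoing relations.
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph
```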
Funding: Funded by grants from the National Key Research and Development Program of China (Grant Nos. 2022YFE0205600 and 2022YFC3400504), the National Natural Science Foundation of China (Grant Nos. 82373792 and 82273857), the Fundamental Research Funds for the Central Universities, China, and the East China Normal University Medicine and Health Joint Fund, China (Grant No. 2022JKXYD07001).
Abstract: Current experimental and computational methods have limitations in accurately and efficiently classifying ion channels within vast protein spaces. Here we have developed a deep learning algorithm, the GPT2 Ion Channel Classifier (GPT2-ICC), which effectively distinguishes ion channels in a test set containing approximately 239 times more non-ion-channel proteins. GPT2-ICC integrates representation learning with a large language model (LLM)-based classifier, enabling highly accurate identification of potential ion channels. Several potential ion channels were predicted from the unannotated human proteome, further demonstrating GPT2-ICC's generalization ability. This study marks a significant advancement in artificial-intelligence-driven ion channel research, highlighting the adaptability and effectiveness of combining representation learning with LLMs to address the challenges of imbalanced protein sequence data. Moreover, it provides a valuable computational tool for uncovering previously uncharacterized ion channels.
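The roughly 239:1 class ratio in the test set is worth a back-of-envelope check, since it explains why plain accuracy is uninformative for this kind of imbalanced classification (illustrative numbers only):

```python
# With ~239 non-ion-channel proteins for every ion channel, raw accuracy is
# uninformative: a classifier that always predicts "not an ion channel"
# is almost always right yet never finds a single ion channel.
total = 240                      # 1 ion channel per 239 negatives
always_negative_acc = 239 / total
recall = 0 / 1                   # the one true ion channel is always missed
print(f"{always_negative_acc:.3f}")  # 0.996
```

This is why imbalance-aware metrics (precision, recall, F1) rather than accuracy are the meaningful yardstick for such a classifier.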
Funding: This project is supported by the National Natural Science Foundation of China (NSFC) (Nos. 61902158 and 61806087) and the Graduate Student Innovation Program for Academic Degrees in General Universities in Jiangsu Province (No. KYZZ16-0337).
Abstract: The Coronavirus Disease 2019 (COVID-19) is wreaking havoc around the world, putting enormous pressure on national health systems and medical staff. One of the most effective and critical steps in the fight against COVID-19 is to examine the patient's lungs using the chest X-ray and CT images generated by radiation imaging. In this paper, five Keras-based deep learning models (ResNet50, InceptionResNetV2, Xception, transfer learning, and a pre-trained VGGNet16) are applied to formulate classification-detection approaches for COVID-19. Two benchmark methods, SVM (Support Vector Machine) and CNN (Convolutional Neural Network), are provided for comparison with the classification-detection approaches on the performance indicators, i.e., precision, recall, F1 score, confusion matrix, classification accuracy, and three types of AUC (Area Under Curve). The highest classification accuracies derived by the classification-detection approaches on 5857 chest X-rays and 767 chest CTs are 84% and 75%, respectively, which shows that the Keras-based deep learning approaches facilitate accurate and effective COVID-19-assisted detection.
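The performance indicators listed above can all be computed directly from the confusion matrix; a minimal sketch of the generic formulas, not tied to the paper's data:

```python
def binary_metrics(y_true, y_pred):
    """Compute the confusion matrix and the precision/recall/F1/accuracy
    scores used to compare classification-detection approaches."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    accuracy = (tp + tn) / len(y_true)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}

# Toy labels (1 = COVID-positive, 0 = negative); invented values.
m = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(m["accuracy"])  # 0.6
```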
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61751201 and 61672162), the Shanghai Municipal Science and Technology Major Project (Grant No. 2018SHZDZX01), and ZJLab.
Abstract: Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy with four different perspectives. Next, we describe how to adapt the knowledge of PTMs to downstream tasks. Finally, we outline some potential directions for PTMs in future research. This survey is intended to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
Funding: National Key Research and Development Program of China, Grant/Award Number: 2021YFC1910402.
Abstract: Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste. The existing object detection method based on an ImageNet pre-trained model is an effective way of sorting. Owing to significant domain gaps between natural images and kitchen waste images, it is difficult to reflect the characteristics of diverse scales and dense distribution in kitchen waste based on an ImageNet pre-trained model, leading to poor generalisation. In this article, the authors propose the first pre-trained model for kitchen waste sorting, called KitWaSor, which combines both contrastive learning (CL) and masked image modelling (MIM) through self-supervised learning (SSL). First, to address the issue of diverse scales, the authors propose a mixed masking strategy by introducing an incomplete masking branch alongside the original random masking branch. It prevents the complete loss of small-scale objects while avoiding excessive leakage of large-scale object pixels. Second, to address the issue of dense distribution, the authors introduce semantic consistency constraints on top of the mixed masking strategy. That is, object semantic reasoning is performed through semantic consistency constraints to compensate for the lack of contextual information. To train KitWaSor, the authors construct the first million-level kitchen waste dataset spanning seasonal and regional distributions, named KWD-Million. Extensive experiments show that KitWaSor achieves state-of-the-art (SOTA) performance on the two downstream tasks most relevant to kitchen waste sorting (i.e., image classification and object detection), demonstrating the effectiveness of the proposed KitWaSor.
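The mixed masking strategy can be pictured as two masking branches drawn over the same patch grid: the usual high-ratio random mask, plus an "incomplete" branch that masks far fewer patches so small objects are never entirely hidden. The sketch below is an assumed simplification (illustrative patch count and ratios, not KitWaSor's actual algorithm):

```python
import random

def mixed_mask(num_patches, full_ratio=0.75, partial_ratio=0.4, seed=0):
    """Two-branch mixed masking sketch: a random branch at the usual high
    MIM ratio, and an incomplete branch at a much lower ratio so that
    small-scale objects retain visible patches in at least one branch."""
    rng = random.Random(seed)
    idx = list(range(num_patches))
    rng.shuffle(idx)
    random_branch = set(idx[: int(num_patches * full_ratio)])
    rng.shuffle(idx)
    incomplete_branch = set(idx[: int(num_patches * partial_ratio)])
    return random_branch, incomplete_branch

# Assume a 14x14 ViT patch grid (196 patches) for illustration.
full, partial = mixed_mask(196)
print(len(full), len(partial))  # 147 78
```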
Funding: Supported by the Bill & Melinda Gates Foundation and the Minderoo Foundation.
Abstract: Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development. Traditional neural network methods, such as BiLSTM, can be ineffective due to the lack of lab data for model training and the overshadowing of crucial features within sequence concatenation. The current work proposes a less data-consuming model incorporating a pre-trained gene sequence model and a mutual information inference operator. Our methodology utilizes gene alignment and deduplication algorithms to preprocess gene sequences, enhancing the model's capacity to discern and focus on distinctions among input gene pairs. The model, i.e., the DNA Pretrained Cross-Immunity Protection Inference model (DPCIPI), outperforms state-of-the-art (SOTA) models in predicting hemagglutination-inhibition titer from influenza viral gene sequences alone. For binary cross-immunity prediction, the improvement is 1.58% in F1, 2.34% in precision, 1.57% in recall, and 1.57% in accuracy. For multilevel cross-immunity prediction, the improvement is 2.12% in F1, 3.50% in precision, 2.19% in recall, and 2.19% in accuracy. Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity. With expanding gene data and advancements in pre-trained models, this approach promises significant impacts on vaccine development and public health.
Abstract: Climate model prediction has been improved by enhancing model resolution as well as by implementing sophisticated physical parameterizations and refining data assimilation systems [section 6.1 in Wang et al. (2025)]. In relation to seasonal forecasting and climate projection in the East Asian summer monsoon season, proper simulation of the seasonal migration of rain bands by models is a challenging and limiting factor [section 7.1 in Wang et al. (2025)].
Funding: Supported by the National Natural Science Foundation of China (Nos. 61872256 and 62102205), the Key-Area Research and Development Program of Guangdong Province, China (No. 2021B0101400002), the Peng Cheng Laboratory Key Research Project, China (No. PCL 2021A07), and Multi-source Cross-platform Video Analysis and Understanding for Intelligent Perception in Smart City, China (No. U20B2052).
Abstract: With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT), generative pre-trained transformers (GPT), etc. Inspired by the success of these models in single domains (like computer vision and natural language processing), multi-modal pre-trained big models have also drawn more and more attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper provides new insights and helps fresh researchers track the most cutting-edge works. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training work in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-trained models (MM-PTMs), and discuss MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give visualizations and analyses of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future works. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey.
Funding: Supported by the National Key Research and Development Program of China (No. 2020AAA0106500) and the National Natural Science Foundation of China (NSFC No. 62236004).
Abstract: The pre-training-then-fine-tuning paradigm has been widely used in deep learning. Due to the huge computational cost of pre-training, practitioners usually download pre-trained models from the Internet and fine-tune them on downstream datasets, but the downloaded models may suffer from backdoor attacks. Different from previous attacks aimed at a target task, we show that a backdoored pre-trained model can behave maliciously in various downstream tasks without foreknowing task information. Attackers can restrict the output representations (the values of output neurons) of trigger-embedded samples to arbitrary predefined values through additional training, namely a neuron-level backdoor attack (NeuBA). Since fine-tuning has little effect on model parameters, the fine-tuned model will retain the backdoor functionality and predict a specific label for samples embedded with the same trigger. To provoke multiple labels in a specific task, attackers can introduce several triggers with predefined contrastive values. In experiments on both natural language processing (NLP) and computer vision (CV), we show that NeuBA can well control the predictions for trigger-embedded instances with different trigger designs. Our findings sound a red alarm for the wide use of pre-trained models. Finally, we apply several defense methods to NeuBA and find that model pruning is a promising technique for resisting NeuBA by omitting backdoored neurons.
Funding: Supported by the National Key Specialty of Traditional Chinese Medicine (Spleen and Stomach Diseases), No. 0500004; the National Natural Science Foundation of China, Nos. 82205104 and 82104850; the Hospital Capability Enhancement Project of Xiyuan Hospital, CACMS, No. XYZX0303-07; and the Fundamental Research Funds for the Central Public Welfare Research Institutes, Excellent Young Scientists Training Program of China Academy of Chinese Medical Sciences, No. ZZ16-YQ-002.
Abstract: BACKGROUND Non-erosive reflux disease (NERD), the main gastroesophageal reflux subtype, features reflux symptoms without mucosal damage. Anxiety is linked to visceral hypersensitivity in NERD, yet the mechanisms and animal models are unclear. AIM To establish a translational NERD rat model with anxiety comorbidity via tail clamping, and to study corticotropin-releasing hormone (CRH)-mediated neuroimmune pathways in visceral hypersensitivity and esophageal injury. METHODS Sprague-Dawley (SD) and Wistar rats were grouped into sham, model, and modified groups (n = 10 each). The treatments for the modified groups were as follows: SD rats received ovalbumin/aluminum hydroxide suspension + acid perfusion ± tail clamping (40 minutes/day for 7 days), while Wistar rats received fructose water + tail clamping. Esophageal pathology, visceral sensitivity, and behavior were assessed. Serum CRH, calcitonin gene-related peptide (CGRP), 5-hydroxytryptamine (5-HT), and mast cell tryptase (MCT), as well as central amygdala (CeA) CRH mRNA, were measured via ELISA and qRT-PCR. RESULTS Tail clamping induced anxiety, worsening visceral hypersensitivity (lower abdominal withdrawal reflex thresholds, P < 0.05) and esophageal injury (dilated intercellular spaces and mitochondrial edema). Both models showed raised serum CRH, CGRP, 5-HT, and MCT (P < 0.01) and CeA CRH mRNA expression (P < 0.01). Behavioral tests confirmed anxiety-like phenotypes. NERD-anxiety rats showed clinical-like symptom severity without erosion. CONCLUSION Tail clamping induces anxiety in NERD models, worsening visceral hypersensitivity via CRH neuroimmune dysregulation, offering a translational model and highlighting CRH as a treatment target.
Funding: Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project ID 431549029-SFB 1451; the Marga-und-Walter-Boll-Stiftung (#210-10-15) (to MAR); and a stipend from the 'Gerok Program' (Faculty of Medicine, University of Cologne, Germany).
Abstract: Noninvasive brain stimulation techniques offer promising therapeutic and regenerative prospects in neurological diseases by modulating brain activity and improving cognitive and motor functions. Given the paucity of knowledge about the underlying modes of action and optimal treatment modalities, a thorough translational investigation of noninvasive brain stimulation in preclinical animal models is urgently needed. Thus, we reviewed the current literature on the mechanistic underpinnings of noninvasive brain stimulation in models of central nervous system impairment, with a particular emphasis on traumatic brain injury and stroke. Due to the lack of translational models for most proposed noninvasive brain stimulation techniques, we confined this review to the techniques most relevant in humans, i.e., transcranial magnetic stimulation and transcranial direct current stimulation. We searched the literature in PubMed, encompassing the MEDLINE and PMC databases, for studies published between January 1, 2020 and September 30, 2024. Thirty-five studies were eligible. Transcranial magnetic stimulation and transcranial direct current stimulation demonstrated distinct strengths in augmenting rehabilitation after stroke and traumatic brain injury, with emerging mechanistic evidence. Overall, we identified neuronal, inflammatory, microvascular, and apoptotic pathways highlighted in the literature. This review also highlights a lack of translational surrogate parameters to bridge the gap between preclinical findings and their clinical translation.
Abstract: Myasthenia gravis is a chronic autoimmune disorder that affects the neuromuscular junction, leading to fluctuating skeletal muscle fatigability. The majority of myasthenia gravis patients have detectable antibodies in their serum targeting the acetylcholine receptor, muscle-specific kinase, or related proteins. Current treatment for myasthenia gravis involves symptomatic therapy; immunosuppressive drugs such as corticosteroids, azathioprine, and mycophenolate mofetil; and thymectomy, which is primarily indicated in patients with thymoma or thymic hyperplasia. However, this condition continues to pose significant challenges, including an unpredictable and variable disease progression, differing responses to individual therapies, and substantial long-term side effects associated with standard treatments (including an increased risk of infections, osteoporosis, and diabetes), underscoring the necessity for a more personalized approach to treatment. Furthermore, about fifteen percent of patients, called "refractory myasthenia gravis patients", do not respond adequately to standard therapies. In this context, the introduction of molecular therapies has marked a significant advance in myasthenia gravis management. Advances in understanding myasthenia gravis pathogenesis, especially the role of pathogenic antibodies, have driven the development of these biological drugs, which offer more selective, rapid, and safer alternatives to traditional immunosuppressants. This review aims to provide a comprehensive overview of emerging therapeutic strategies targeting specific immune pathways in myasthenia gravis, with a particular focus on the preclinical evidence, therapeutic rationale, and clinical translation of B-cell depletion therapies, neonatal Fc receptor inhibitors, and complement inhibitors.
基金supported by the Grant PID2021-126715OB-IOO financed by MCIN/AEI/10.13039/501100011033 and"ERDFA way of making Europe"by the Grant PI22CⅢ/00055 funded by Instituto de Salud CarlosⅢ(ISCⅢ)+6 种基金the UFIECPY 398/19(PEJ2018-004965) grant to RGS funded by AEI(Spain)the UFIECPY-396/19(PEJ2018-004961)grant financed by MCIN (Spain)FI23CⅢ/00003 grant funded by ISCⅢ-PFIS Spain) to PMMthe UFIECPY 328/22 (PEJ-2021-TL/BMD-21001) grant to LM financed by CAM (Spain)the grant by CAPES (Coordination for the Improvement of Higher Education Personnel)through the PDSE program (Programa de Doutorado Sanduiche no Exterior)to VSCG financed by MEC (Brazil)
Abstract: The brain is the most complex human organ, and commonly used models, such as two-dimensional cell cultures and animal brains, often lack the sophistication needed for accurate use in research. In this context, human cerebral organoids have emerged as valuable tools, offering a more complex, versatile, and human-relevant system than traditional animal models, which are often unable to replicate the intricate architecture and functionality of the human brain. Since human cerebral organoids are a state-of-the-art model for the study of neurodevelopment and of different pathologies affecting the brain, this field is under constant development, and work in this area is abundant. In this review, we give a complete overview of human cerebral organoid technology, starting from the different types of protocols that exist to generate different human cerebral organoids. We continue with the use of brain organoids for the study of brain pathologies, highlighting neurodevelopmental, psychiatric, neurodegenerative, brain tumor, and infectious diseases. Because of the potential value of human cerebral organoids, we describe their use in transplantation, drug screening, and toxicology assays. We also discuss the technologies available to study the cell diversity and physiological characteristics of organoids. Finally, we summarize the limitations that currently exist in the field, such as the development of vasculature and microglia, and highlight some of the novel approaches being pursued through bioengineering.
Funding: National Key R&D Program of China (No. 2020AAA0108702) and National Natural Science Foundation of China (Grant No. 62022027).
Abstract: With the current success of large-scale pre-trained models (PTMs), how to efficiently adapt PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters. Previous work focuses on designing parameter-efficient tuning paradigms but still needs to save and compute the gradient of the whole computational graph. In this paper, we propose y-Tuning, an efficient yet effective paradigm for adapting frozen large-scale PTMs to specific downstream tasks. y-Tuning learns dense representations for the labels y defined in a given task and aligns them to fixed feature representations. Without computing the gradients of the text encoder at the training phase, y-Tuning is not only parameter-efficient but also training-efficient. Experimental results show that for DeBERTa-XXL with 1.6 billion parameters, y-Tuning achieves more than 96% of the performance of full fine-tuning on the GLUE benchmark with only 2% tunable parameters and much lower training costs.
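The core idea of y-Tuning, learning only dense label representations against frozen encoder features, can be shown in a toy sketch. This is pure-Python squared-error training on made-up 2-D "encoder outputs"; the real method trains label embeddings against a billion-parameter PLM's features:

```python
import random

def y_tuning_sketch(features, labels, dim, classes, lr=0.1, epochs=200, seed=0):
    """Toy sketch of the y-Tuning idea (illustrative, not the paper's model):
    the text encoder is frozen, so each input is a fixed feature vector;
    we only learn one dense vector per label and score an input by its
    dot product with each label vector."""
    rng = random.Random(seed)
    label_vecs = {c: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for c in classes}
    for _ in range(epochs):
        for x, y in zip(features, labels):
            for c in classes:
                score = sum(a * b for a, b in zip(x, label_vecs[c]))
                target = 1.0 if c == y else 0.0
                grad = score - target  # squared-error gradient
                label_vecs[c] = [w - lr * grad * xi
                                 for w, xi in zip(label_vecs[c], x)]
    return label_vecs

# Frozen "encoder outputs" for four sentences, two classes (invented data).
feats = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
labs = ["pos", "pos", "neg", "neg"]
vecs = y_tuning_sketch(feats, labs, dim=2, classes=["pos", "neg"])
query = (0.95, 0.05)
pred = max(vecs, key=lambda c: sum(a * b for a, b in zip(query, vecs[c])))
print(pred)  # pos
```

Only the tiny label vectors are updated; the feature extractor never sees a gradient, which is exactly what makes the paradigm training-efficient.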
Funding: Science and Technology Innovation 2030 Major Project of "New Generation Artificial Intelligence", granted by the Ministry of Science and Technology, Grant Number 2020AAA0109300.
Abstract: In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of the semantic interaction information between entities and relations, difficulties in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three innovative components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to significantly improve the model's performance on domain-specific relational triple extraction tasks. We first propose an innovative attention interaction module. This method significantly enhances the semantic interaction capabilities between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that effectively combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model's adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, aiming to generate domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several aspects, with F1 scores exceeding the state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
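A voting strategy of this kind can be sketched as a simple confidence-based deferral rule between the two model families (an assumed illustrative scheme; the paper's exact voting procedure may differ):

```python
def vote_on_hard_samples(slm_pred, slm_conf, llm_pred, threshold=0.8):
    """Illustrative voting rule: trust the fine-tuned small model (SLM)
    when it is confident, and defer low-confidence ('challenging')
    samples to the large language model (LLM). If both agree, keep
    the shared answer regardless of confidence."""
    if slm_pred == llm_pred:
        return slm_pred
    return slm_pred if slm_conf >= threshold else llm_pred

# Hypothetical relation labels for two extracted triples.
print(vote_on_hard_samples("works_for", 0.95, "located_in"))  # works_for
print(vote_on_hard_samples("works_for", 0.45, "located_in"))  # located_in
```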
Abstract: We present an approach to automatically classify medical text at the sentence level. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
Abstract: Named Entity Recognition (NER) is crucial for extracting structured information from text. While traditional methods rely on rules, Conditional Random Fields (CRFs), or deep learning, the advent of large-scale Pre-trained Language Models (PLMs) offers new possibilities. PLMs excel at contextual learning, potentially simplifying many natural language processing tasks. However, their application to NER remains underexplored. This paper investigates leveraging the GPT-3 PLM for NER without fine-tuning. We propose a novel scheme that utilizes carefully crafted templates and context examples selected based on semantic similarity. Our experimental results demonstrate the feasibility of this approach, suggesting a promising direction for harnessing PLMs in NER.
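Selecting in-context examples by semantic similarity can be sketched as follows. Bag-of-words cosine similarity stands in here for the embedding-based similarity a real system would use, and the example sentences are invented:

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_examples(query, pool, k=2):
    """Pick the k labelled examples most similar to the query sentence
    to place before it in the NER prompt template."""
    qv = Counter(query.lower().split())
    scored = sorted(pool,
                    key=lambda ex: cosine(qv, Counter(ex.lower().split())),
                    reverse=True)
    return scored[:k]

pool = ["Paris is the capital of France",
        "Apple hired a new engineer",
        "Berlin is the capital of Germany"]
print(select_examples("Madrid is the capital of Spain", pool, k=2))
```

The chosen examples then fill the prompt template ahead of the query sentence, so the PLM sees demonstrations structurally close to the input it must tag.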
Funding: Supported by the Science and Technology Project of the State Grid Corporation "Research on Key Technologies of Power Artificial Intelligence Open Platform" (5700-202155260A-0-0-00).
Abstract: With the construction of new power systems, the power grid has become extremely large, with an increasing proportion of new energy sources and AC/DC hybrid connections. The dynamic characteristics and fault patterns of the power grid are complex; additionally, power grid control is difficult, operational risks are high, and the task of fault handling is arduous. Traditional power-grid fault handling relies primarily on human experience. Differences and gaps in the knowledge reserves of control personnel restrict the accuracy and timeliness of fault handling. Therefore, this mode of operation is no longer suitable for the requirements of new systems. Based on the multi-source heterogeneous data of power grid dispatch, this paper proposes a joint entity-relationship extraction method for power-grid dispatch fault processing based on a pre-trained model, constructs a knowledge graph of power-grid dispatch fault processing, and designs and develops a fault-processing auxiliary decision-making system based on the knowledge graph. The system was applied in a provincial dispatch control center, where it effectively improved the accident-handling capability of the power grid and the intelligence level of accident management and control.
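A knowledge graph of extracted (head, relation, tail) triples can be stored as a simple adjacency map, so the decision-support system can look up everything known about a device. A minimal sketch with hypothetical entity and relation names:

```python
def build_knowledge_graph(triples):
    """Store (head, relation, tail) triples extracted from dispatch fault
    reports as an adjacency map keyed by the head entity."""
    graph = {}
    for head, rel, tail in triples:
        graph.setdefault(head, []).append((rel, tail))
    return graph

# Hypothetical triples a joint entity-relation extractor might emit.
triples = [
    ("Transformer_T1", "located_in", "Substation_A"),
    ("Transformer_T1", "fault_type", "winding_overheat"),
    ("winding_overheat", "handled_by", "load_transfer_procedure"),
]
kg = build_knowledge_graph(triples)
print(kg["Transformer_T1"])
```

Given a faulted device, chaining lookups through the map (device, then fault type, then handling procedure) yields the decision-support path the system surfaces to the operator.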