Sentiment analysis, a cornerstone of natural language processing, has witnessed remarkable advancements driven by deep learning models, which have demonstrated impressive accuracy in discerning sentiment from text across various domains. However, the deployment of such models in resource-constrained environments, where computing resources, memory, and energy availability are restricted, presents a unique set of challenges that require innovative solutions. To enable sentiment analysis in such environments, we leverage lightweight pre-trained models. These models, derived from popular architectures such as DistilBERT, MobileBERT, ALBERT, TinyBERT, ELECTRA, and SqueezeBERT, offer a promising solution to the resource limitations imposed by these environments. By distilling knowledge from larger models into smaller ones and employing various optimization techniques, these lightweight models aim to strike a balance between performance and resource efficiency. This paper explores the performance of multiple lightweight pre-trained models on sentiment analysis tasks specific to such environments and provides insights into their viability for practical deployment.
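The distillation step behind models like DistilBERT trains the smaller student to match the teacher's temperature-softened output distribution. A minimal, framework-free sketch of that soft-target loss follows; the temperature value and the toy logits are illustrative assumptions, not values from the paper:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax: higher T flattens the distribution."""
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """Cross-entropy of the student against the teacher's soft targets,
    scaled by T^2 as in Hinton et al.'s knowledge distillation."""
    p = softmax(teacher_logits, T)   # soft teacher targets
    q = softmax(student_logits, T)   # student predictions
    return -T * T * sum(pi * math.log(qi) for pi, qi in zip(p, q))

# Toy example: a student whose logits mirror the teacher's incurs a lower
# loss than one that disagrees.
teacher = [3.0, 1.0, -1.0]
close_student = [2.8, 1.1, -0.9]
far_student = [-1.0, 0.0, 3.0]
```

In a real training loop this term is combined with the ordinary hard-label cross-entropy, and the gradients flow only into the student.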
Pneumonia is an acute lung infection that has caused many fatalities globally. Radiologists often employ chest X-rays to identify pneumonia, since they are presently the most effective imaging method for this purpose. Computer-aided diagnosis of pneumonia using deep learning techniques is widely used due to its effectiveness and performance. In the proposed method, the Synthetic Minority Oversampling Technique (SMOTE) is used to eliminate the class imbalance in the X-ray dataset. To compensate for the paucity of accessible data, pre-trained transfer learning is used, and an ensemble Convolutional Neural Network (CNN) model is developed. The ensemble model consists of all possible combinations of the MobileNetV2, Visual Geometry Group (VGG16), and DenseNet169 models. MobileNetV2 and DenseNet169 performed well as single-classifier models, each with an accuracy of 94%, while the ensemble model (MobileNetV2 + DenseNet169) achieved an accuracy of 96.9%. Using the synchronous data-parallel model in Distributed TensorFlow, the training process accelerated performance by 98.6% and outperformed other conventional approaches.
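SMOTE, as used here to rebalance the X-ray classes, synthesizes new minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbors. A minimal sketch of that interpolation step is below; the feature vectors and neighbor count are invented, and a real pipeline would typically use a library implementation such as imbalanced-learn's `SMOTE`:

```python
import random

def nearest_neighbors(x, samples, k):
    """Return the k minority samples closest to x (squared Euclidean)."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return sorted((s for s in samples if s is not x), key=lambda s: dist(x, s))[:k]

def smote_sample(x, minority, k=2, rng=random):
    """Create one synthetic sample on the segment between x and a neighbor."""
    neighbor = rng.choice(nearest_neighbors(x, minority, k))
    gap = rng.random()  # uniform in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]

# Toy 2-D minority class: the synthetic point lies between existing samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]
synthetic = smote_sample(minority[0], minority)
```

Because the synthetic points lie on segments between real minority samples, they densify the minority region rather than duplicating observations.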
We analyze the suitability of existing pre-trained transformer-based language models (PLMs) for abstractive text summarization of German technical healthcare texts. The study focuses on the multilingual capabilities of these models and their ability to perform abstractive text summarization in the healthcare field. The research hypothesis was that large language models could perform high-quality abstractive text summarization of German technical healthcare texts, even if the model is not specifically trained in that language. Through experiments, the research questions explore the performance of transformer language models in dealing with complex syntactic constructs, the difference in performance between models trained in English and German, and the impact of translating the source text into English before summarization. We evaluated four PLMs (GPT-3, a translation-based approach also utilizing GPT-3, a German-language model, and a domain-specific biomedical model). The evaluation considered informativeness, using three types of metrics based on Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and the quality of results, which was manually evaluated on five aspects. The results show that text summarization models can be used in the German healthcare domain and that domain-independent language models achieved the best results. The study shows that text summarization models can simplify the search for pre-existing German knowledge in various domains.
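ROUGE-based informativeness scores of the kind used in this evaluation reduce to n-gram overlap between a candidate summary and a reference. A self-contained sketch of ROUGE-1 precision, recall, and F1 (the sentences are invented examples; real evaluations would also apply stemming and use ROUGE-2/ROUGE-L variants):

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Unigram-overlap ROUGE-1 precision, recall, and F1."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    p = overlap / max(sum(cand.values()), 1)
    r = overlap / max(sum(ref.values()), 1)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Candidate reproduces 4 of the reference's 5 tokens.
p, r, f = rouge_1("the patient was discharged", "the patient was discharged home")
```

Recall-oriented use (the "R" in ROUGE) asks how much of the reference the summary covers, which is why shorter candidates are penalized on recall rather than precision.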
Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development. Traditional neural network methods, such as BiLSTM, can be ineffective due to the lack of lab data for model training and the overshadowing of crucial features within sequence concatenation. The current work proposes a less data-consuming model incorporating a pre-trained gene sequence model and a mutual information inference operator. Our methodology utilizes gene alignment and deduplication algorithms to preprocess gene sequences, enhancing the model's capacity to discern and focus on distinctions among input gene pairs. The model, the DNA Pretrained Cross-Immunity Protection Inference model (DPCIPI), outperforms state-of-the-art (SOTA) models in predicting hemagglutination inhibition titer from influenza viral gene sequences alone. For binary cross-immunity prediction, the improvement is 1.58% in F1, 2.34% in precision, 1.57% in recall, and 1.57% in accuracy. For multilevel cross-immunity prediction, the improvement is 2.12% in F1, 3.50% in precision, 2.19% in recall, and 2.19% in accuracy. Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity. With expanding gene data and advancements in pre-trained models, this approach promises significant impacts on vaccine development and public health.
Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste. The existing object detection method based on an ImageNet pre-trained model is an effective way of sorting. Owing to significant domain gaps between natural images and kitchen waste images, it is difficult to reflect the characteristics of diverse scales and dense distribution in kitchen waste based on an ImageNet pre-trained model, leading to poor generalisation. In this article, the authors propose the first pre-trained model for kitchen waste sorting, called KitWaSor, which combines both contrastive learning (CL) and masked image modelling (MIM) through self-supervised learning (SSL). First, to address the issue of diverse scales, the authors propose a mixed masking strategy by introducing an incomplete masking branch based on the original random masking branch. It prevents the complete loss of small-scale objects while avoiding excessive leakage of large-scale object pixels. Second, to address the issue of dense distribution, the authors introduce semantic consistency constraints on the basis of the mixed masking strategy. That is, object semantic reasoning is performed through semantic consistency constraints to compensate for the lack of contextual information. To train KitWaSor, the authors construct the first million-level kitchen waste dataset across seasonal and regional distributions, named KWD-Million. Extensive experiments show that KitWaSor achieves state-of-the-art (SOTA) performance on the two most relevant downstream tasks for kitchen waste sorting (i.e. image classification and object detection), demonstrating the effectiveness of the proposed KitWaSor.
With the current success of large-scale pre-trained models (PTMs), how to efficiently adapt PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters. Previous work focuses on designing parameter-efficient tuning paradigms but still needs to save and compute the gradient of the whole computational graph. In this paper, we propose y-Tuning, an efficient yet effective paradigm to adapt frozen large-scale PTMs to specific downstream tasks. y-Tuning learns dense representations for the labels y defined in a given task and aligns them to fixed feature representations. Without computing the gradients of the text encoder at the training phase, y-Tuning is not only parameter-efficient but also training-efficient. Experimental results show that for DeBERTa-XXL with 1.6 billion parameters, y-Tuning achieves more than 96% of the performance of full fine-tuning on the GLUE benchmark with only 2% tunable parameters and much lower training costs.
In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of semantic interaction information between entities and relations, difficulties in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three innovative components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to significantly improve the model's performance on domain-specific relational triple extraction tasks. We first propose an innovative attention interaction module. This method significantly enhances the semantic interaction capabilities between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that effectively combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model's adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, aiming to generate domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several aspects, with F1 scores exceeding the state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
We present an approach to classify medical text at the sentence level automatically. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
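Adapter modules of the kind employed here insert a small bottleneck network into each frozen transformer layer and train only those few parameters, with a residual connection preserving the pre-trained representation. A minimal, framework-free sketch of one adapter forward pass follows; the dimensions and weights are illustrative assumptions, not the paper's configuration:

```python
def matvec(W, x):
    """Plain matrix-vector product over nested lists."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, vi) for vi in v]

def adapter(x, W_down, W_up):
    """Bottleneck adapter: down-project, nonlinearity, up-project,
    then add the residual so an untrained adapter can start near identity."""
    h = relu(matvec(W_down, x))   # hidden size d -> bottleneck r (r << d)
    delta = matvec(W_up, h)       # bottleneck r -> hidden size d
    return [xi + di for xi, di in zip(x, delta)]

# Hidden size d=4, bottleneck r=2; a zero up-projection gives the exact
# identity mapping, which is the usual near-identity initialization.
x = [0.5, -1.0, 2.0, 0.0]
W_down = [[0.1] * 4, [-0.2] * 4]
W_up_zero = [[0.0, 0.0] for _ in range(4)]
```

With only the `W_down`/`W_up` pairs trainable, the tuned parameter count scales with the bottleneck width rather than the full model size.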
Named Entity Recognition (NER) is crucial for extracting structured information from text. While traditional methods rely on rules, Conditional Random Fields (CRFs), or deep learning, the advent of large-scale Pre-trained Language Models (PLMs) offers new possibilities. PLMs excel at contextual learning, potentially simplifying many natural language processing tasks. However, their application to NER remains underexplored. This paper investigates leveraging the GPT-3 PLM for NER without fine-tuning. We propose a novel scheme that utilizes carefully crafted templates and context examples selected based on semantic similarity. Our experimental results demonstrate the feasibility of this approach, suggesting a promising direction for harnessing PLMs in NER.
The Coronavirus Disease 2019 (COVID-19) is wreaking havoc around the world, placing enormous pressure on national health systems and medical staff. One of the most effective and critical steps in the fight against COVID-19 is to examine the patient's lungs using chest X-ray and CT images generated by radiation imaging. In this paper, five Keras-based deep learning models (ResNet50, InceptionResNetV2, Xception, transfer learning, and pre-trained VGGNet16) are applied to formulate classification-detection approaches for COVID-19. Two benchmark methods, SVM (Support Vector Machine) and CNN (Convolutional Neural Networks), are provided for comparison with the classification-detection approaches based on the performance indicators, i.e., precision, recall, F1 scores, confusion matrix, classification accuracy, and three types of AUC (Area Under Curve). The highest classification accuracies derived by the classification-detection approaches on 5857 chest X-rays and 767 chest CTs are 84% and 75%, respectively, which shows that the Keras-based deep learning approaches facilitate accurate and effective COVID-19-assisted detection.
This study aimed to prepare landslide susceptibility maps for the Pithoragarh district in Uttarakhand, India, using advanced ensemble models that combined Radial Basis Function Networks (RBFN) with three ensemble learning techniques: DAGGING (DG), MULTIBOOST (MB), and ADABOOST (AB). This combination resulted in three distinct ensemble models: DG-RBFN, MB-RBFN, and AB-RBFN. Additionally, a traditional weighted method, Information Value (IV), and a benchmark machine learning (ML) model, the Multilayer Perceptron Neural Network (MLP), were employed for comparison and validation. The models were developed using ten landslide conditioning factors, which included slope, aspect, elevation, curvature, land cover, geomorphology, overburden depth, lithology, distance to rivers, and distance to roads. These factors were instrumental in predicting the output variable, the probability of landslide occurrence. Statistical analysis of the models' performance indicated that the DG-RBFN model, with an Area Under the ROC Curve (AUC) of 0.931, outperformed the other models. The AB-RBFN model achieved an AUC of 0.929, the MB-RBFN model had an AUC of 0.913, and the MLP model recorded an AUC of 0.926. These results suggest that the advanced ensemble ML model DG-RBFN was more accurate than the traditional statistical model, the single MLP model, and the other ensemble models in preparing trustworthy landslide susceptibility maps, thereby enhancing land use planning and decision-making.
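The Information Value method used here as the traditional baseline assigns each class of a conditioning factor a weight IV = ln((landslide pixels in class / total landslide pixels) / (class pixels / total pixels)); positive values mark classes over-represented among landslides. A small sketch with invented pixel counts (the counts are not the study's data):

```python
import math

def information_value(class_landslide, class_total, all_landslide, all_total):
    """IV of one factor class: log ratio of its landslide density
    to the overall landslide density."""
    landslide_ratio = class_landslide / all_landslide
    area_ratio = class_total / all_total
    return math.log(landslide_ratio / area_ratio)

# Invented counts: a steep-slope class covering 10% of the area but hosting
# 30% of mapped landslide pixels is positively weighted.
iv_steep = information_value(300, 10_000, 1_000, 100_000)
# A flat class covering 40% of the area with 10% of landslides scores negative.
iv_flat = information_value(100, 40_000, 1_000, 100_000)
```

Summing the IV weights of a cell's factor classes gives its susceptibility index, which is then binned into susceptibility zones for mapping.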
Conducting predictability studies is essential for tracing the sources of forecast errors, which not only leads to the improvement of observation and forecasting systems, but also enhances the understanding of weather and climate phenomena. In the past few decades, dynamical numerical models have been the primary tools for predictability studies, achieving significant progress. Nowadays, with advances in artificial intelligence (AI) techniques and the accumulation of vast meteorological data, modeling weather and climate events using modern data-driven approaches is becoming trendy, with FourCastNet, Pangu-Weather, and GraphCast as successful pioneers. In this perspective article, we suggest that AI models should not be limited to forecasting but should be expanded to predictability studies, leveraging AI's advantages of high efficiency and self-contained optimization modules. To this end, we first remark that AI models should possess high simulation capability with fine spatiotemporal resolution for two kinds of predictability studies. AI models with high simulation capabilities comparable to numerical models can be considered to provide solutions to partial differential equations in a data-driven way. Then, we highlight several specific predictability issues with well-determined nonlinear optimization formulations, which can be well studied using AI models and hold significant scientific value. In addition, we advocate for the incorporation of AI models into the synergistic cycle of the cognition-observation-model paradigm. Comprehensive predictability studies have the potential to transform "big data" into "big and better data" and shift the focus from "AI for forecasts" to "AI for science", ultimately advancing the development of the atmospheric and oceanic sciences.
We propose an integrated method of data-driven and mechanism models for well logging formation evaluation, explicitly focusing on predicting reservoir parameters such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, with the increasing complexity of geological conditions in this industry, there is a growing demand for improved accuracy in reservoir parameter prediction, leading to higher costs associated with manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters, which suffer from low interpretation efficiency, strong subjectivity, and suitability only for ideal conditions. The application of artificial intelligence to the interpretation of logging data provides a new solution to the problems of traditional methods and is expected to improve the accuracy and efficiency of interpretation. If large and high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively in logging data interpretation. Furthermore, data-driven models often act as "black boxes" without explaining their predictions or ensuring compliance with primary physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into the machine learning pipeline through the network structure, loss function, and optimization algorithm. We employ a Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation, which can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
Large language models (LLMs) have undergone significant expansion and have been increasingly integrated across various domains. Notably, in the realm of robot task planning, LLMs harness their advanced reasoning and language comprehension capabilities to formulate precise and efficient action plans based on natural language instructions. However, for embodied tasks, where robots interact with complex environments, text-only LLMs often face challenges due to a lack of compatibility with robotic visual perception. This study provides a comprehensive overview of the emerging integration of LLMs and multimodal LLMs into various robotic tasks. Additionally, we propose a framework that utilizes multimodal GPT-4V to enhance embodied task planning through the combination of natural language instructions and robot visual perceptions. Our results, based on diverse datasets, indicate that GPT-4V effectively enhances robot performance in embodied tasks. This extensive survey and evaluation of LLMs and multimodal LLMs across a variety of robotic tasks enriches the understanding of LLM-centric embodied intelligence and provides forward-looking insights towards bridging the gap in Human-Robot-Environment interaction.
Developing sensorless techniques for estimating battery expansion is essential for effective mechanical state monitoring, improving the accuracy of digital twin simulation and abnormality detection. Therefore, this paper presents a data-driven approach to expansion estimation using electromechanically coupled models with machine learning. The proposed method integrates reduced-order impedance models with data-driven mechanical models, coupling the electrochemical and mechanical states through the state of charge (SOC) and mechanical pressure within a state estimation framework. The coupling relationship was established through experimental insights into pressure-related impedance parameters and the nonlinear mechanical behavior with SOC and pressure. The data-driven model was interpreted by introducing a novel swelling coefficient defined by component stiffnesses to capture the nonlinear mechanical behavior across various mechanical constraints. Sensitivity analysis of the impedance model shows that updating model parameters with pressure can reduce the mean absolute error of simulated voltage by 20 mV and the SOC estimation error by 2%. The results demonstrate the model's estimation capabilities, achieving a root mean square error of less than 1 kPa when the maximum expansion force is between 30 kPa and 120 kPa, outperforming calibrated stiffness models and other machine learning techniques. The model's robustness and generalizability are further supported by its effective handling of SOC estimation and pressure measurement errors. This work highlights the importance of the proposed framework in enhancing state estimation and fault diagnosis for lithium-ion batteries.
Influenced by complex external factors, the displacement-time curves of reservoir landslides demonstrate both short-term and long-term diversity and dynamic complexity. It is difficult for existing methods, including regression models and neural network models, to perform multi-characteristic coupled displacement prediction because they fail to consider landslide creep characteristics. This paper integrates the creep characteristics of landslides with non-linear intelligent algorithms and proposes a dynamic intelligent landslide displacement prediction method based on a combination of the Biological Growth model (BG), Convolutional Neural Network (CNN), and Long Short-Term Memory network (LSTM). This prediction approach improves three different biological growth models, thereby effectively extracting landslide creep characteristic parameters. Simultaneously, it integrates external factors (rainfall and reservoir water level) to construct a comprehensive internal and external dataset for data augmentation, which is input into the improved CNN-LSTM model. Thereafter, harnessing the robust feature extraction capabilities and spatial translation invariance of the CNN, the model autonomously captures short-term local fluctuation characteristics of landslide displacement, and combines the LSTM's efficient handling of long-term nonlinear temporal data to improve prediction performance. An evaluation of the Liangshuijing landslide in the Three Gorges Reservoir Area indicates that BG-CNN-LSTM exhibits high prediction accuracy and excellent generalization capabilities when dealing with various types of landslides. The research provides an innovative approach to achieving whole-process, real-time, high-precision displacement predictions for multi-characteristic coupled landslides.
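Biological growth models of the kind improved here describe cumulative creep displacement as an S-shaped curve; the logistic form s(t) = K / (1 + a·e^(-b·t)) is one common choice. A sketch with invented parameters follows (K, a, and b are illustrative assumptions, not values fitted in the paper):

```python
import math

def logistic_displacement(t, K=100.0, a=50.0, b=0.1):
    """Logistic growth curve: displacement saturates at capacity K (mm)
    with growth rate b; a sets the initial offset."""
    return K / (1.0 + a * math.exp(-b * t))

# Displacement is monotonically increasing and approaches the asymptote K,
# mimicking the decelerating tail of a creep displacement-time curve.
series = [logistic_displacement(t) for t in range(0, 201, 10)]
```

Fitting such a curve to monitoring data yields creep characteristic parameters (capacity, growth rate) that can then be fed to the downstream CNN-LSTM alongside rainfall and reservoir-level inputs.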
Depressive disorder is a chronic, recurring, and potentially life-endangering neuropsychiatric disease. According to a report by the World Health Organization, the global population suffering from depression is experiencing a significant annual increase. Despite its prevalence and considerable impact on people, little is known about its pathogenesis. One major reason is the scarcity of reliable animal models due to the absence of consensus on the pathology and etiology of depression. Furthermore, the neural circuit mechanisms of depression induced by various factors are particularly complex. Considering the variability in depressive behavior patterns and neurobiological mechanisms among different animal models of depression, a comparison between the neural circuits of depression induced by various factors is essential for its treatment. In this review, we mainly summarize the most widely used behavioral animal models and neural circuits under different triggers of depression, aiming to provide a theoretical basis for depression prevention.
Sporadic E (Es) layers in the ionosphere are characterized by intense plasma irregularities in the E region at altitudes of 90-130 km. Because they can significantly influence radio communications and navigation systems, accurate forecasting of Es layers is crucial for ensuring the precision and dependability of navigation satellite systems. In this study, we present Es predictions made by an empirical model and by a deep learning model, and analyze their differences comprehensively by comparing the model predictions to satellite radio occultation (RO) measurements and ground-based ionosonde observations. The deep learning model exhibited significantly better performance, as indicated by the high correlation coefficient (r = 0.87) between its predictions and RO observations, than did the empirical model (r = 0.53). This study highlights the importance of integrating artificial intelligence technology into ionosphere modelling generally, and into predicting Es layer occurrences and characteristics in particular.
Recently, tool learning with large language models (LLMs) has emerged as a promising paradigm for augmenting the capabilities of LLMs to tackle highly complex problems. Despite growing attention and rapid advancements in this field, the existing literature remains fragmented and lacks systematic organization, posing barriers to entry for newcomers. This gap motivates us to conduct a comprehensive survey of existing work on tool learning with LLMs. In this survey, we focus on reviewing the existing literature from two primary aspects: (1) why tool learning is beneficial and (2) how tool learning is implemented, enabling a comprehensive understanding of tool learning with LLMs. We first explore the "why" by reviewing both the benefits of tool integration and the inherent benefits of the tool learning paradigm from six specific aspects. In terms of "how", we systematically review the literature according to a taxonomy of four key stages in the tool learning workflow: task planning, tool selection, tool calling, and response generation. Additionally, we provide a detailed summary of existing benchmarks and evaluation methods, categorizing them according to their relevance to different stages. Finally, we discuss current challenges and outline potential future directions, aiming to inspire both researchers and industrial developers to further explore this emerging and promising area.
Background: Due to the widespread use of cell phone devices today, numerous research studies have focused on the adverse effects of electromagnetic radiation on human neuropsychological and reproductive systems. In most studies, oxidative stress has been identified as the primary pathophysiological mechanism underlying the harmful effects of electromagnetic waves. This paper aims to provide a holistic review of the protective effects of melatonin against cell phone-induced electromagnetic waves on various organs. Methods: This study is a systematic review of articles chosen by searching Google Scholar, PubMed, Embase, Scopus, Web of Science, and Science Direct using the keywords 'melatonin', 'cell phone radiation', and 'animal model'. The search focused on articles written in English, which were reviewed and evaluated. The PRISMA process was used to review the articles chosen for the study, and the JBI checklist was used to check the quality of the reviewed articles. Results: In the final review of 11 quality-checked articles, the effects of melatonin in the intervention group, the effects of electromagnetic waves in the case group, and the amount of melatonin in the chosen organs, i.e. the brain, skin, eyes, testis, and kidney, were thoroughly examined. The review showed that electromagnetic waves increase cellular anti-oxidative activity in different tissues such as the brain, the skin, the eyes, the testis, and the kidneys. Melatonin can considerably augment the anti-oxidative system of cells and protect tissues; these measurements were significantly increased in control groups. Electromagnetic waves can induce tissue atrophy and cell death in various organs including the brain and the skin, and this effect was greatly decreased by melatonin. Conclusion: Our review confirms that melatonin effectively protects the organs of animal models against electromagnetic waves. In light of this conclusion and the current worldwide use of melatonin, future studies should advance to the stages of human clinical trials. We also recommend that more research in the field of melatonin physiology be conducted in order to protect exposed cells from dying, and that melatonin should be considered as a pharmaceutical option for treating the complications resulting from electromagnetic waves in humans.
文摘Sentiment analysis,a cornerstone of natural language processing,has witnessed remarkable advancements driven by deep learning models which demonstrated impressive accuracy in discerning sentiment from text across various domains.However,the deployment of such models in resource-constrained environments presents a unique set of challenges that require innovative solutions.Resource-constrained environments encompass scenarios where computing resources,memory,and energy availability are restricted.To empower sentiment analysis in resource-constrained environments,we address the crucial need by leveraging lightweight pre-trained models.These models,derived from popular architectures such as DistilBERT,MobileBERT,ALBERT,TinyBERT,ELECTRA,and SqueezeBERT,offer a promising solution to the resource limitations imposed by these environments.By distilling the knowledge from larger models into smaller ones and employing various optimization techniques,these lightweight models aim to strike a balance between performance and resource efficiency.This paper endeavors to explore the performance of multiple lightweight pre-trained models in sentiment analysis tasks specific to such environments and provide insights into their viability for practical deployment.
Abstract: Pneumonia is an acute lung infection that has caused many fatalities globally. Radiologists often employ chest X-rays to identify pneumonia, since they are presently the most effective imaging method for this purpose. Computer-aided diagnosis of pneumonia using deep learning techniques is widely used due to its effectiveness and performance. In the proposed method, the Synthetic Minority Oversampling Technique (SMOTE) is used to eliminate the class imbalance in the X-ray dataset. To compensate for the paucity of accessible data, pre-trained transfer learning is used, and an ensemble Convolutional Neural Network (CNN) model is developed. The ensemble model consists of all possible combinations of the MobileNetV2, Visual Geometry Group (VGG16), and DenseNet169 models. MobileNetV2 and DenseNet169 performed well as single classifiers, each with an accuracy of 94%, while the ensemble model (MobileNetV2 + DenseNet169) achieved an accuracy of 96.9%. Using the synchronous data-parallel model in distributed TensorFlow, the training process was accelerated by 98.6% and outperformed other conventional approaches.
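SMOTE balances a dataset by synthesizing new minority-class samples rather than duplicating existing ones. A minimal sketch of the core interpolation idea (real SMOTE interpolates toward one of the k nearest neighbours; here a random minority partner is used for brevity, and the feature vectors are toy values):

```python
import random

def smote_oversample(minority, n_new, seed=0):
    """Minimal SMOTE sketch: create synthetic minority samples by linear
    interpolation between a minority sample and another minority sample.
    (Real SMOTE restricts the partner to the k nearest neighbours.)"""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([ai + lam * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic

# Toy 2-D minority-class feature vectors.
minority = [[0.0, 1.0], [1.0, 2.0], [2.0, 0.5]]
new_samples = smote_oversample(minority, n_new=4)
```

Each synthetic point lies on the segment between two real minority samples, so the oversampled class stays inside its original feature region.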
Abstract: We analyze the suitability of existing pre-trained transformer-based language models (PLMs) for abstractive text summarization of German technical healthcare texts. The study focuses on the multilingual capabilities of these models and their ability to perform abstractive text summarization in the healthcare field. The research hypothesis was that large language models could perform high-quality abstractive summarization of German technical healthcare texts even if the model is not specifically trained in that language. Through experiments, the research questions explore the performance of transformer language models in dealing with complex syntactic constructs, the difference in performance between models trained in English and German, and the impact of translating the source text to English before conducting the summarization. We evaluated four PLM approaches (GPT-3, a translation-based approach also utilizing GPT-3, a German-language model, and a domain-specific biomedical model). The evaluation considered informativeness, using three metrics based on Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and the quality of the results, which was manually evaluated on five aspects. The results show that text summarization models can be used in the German healthcare domain and that domain-independent language models achieved the best results. The study proves that text summarization models can simplify the search for pre-existing German knowledge in various domains.
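ROUGE scores measure n-gram overlap between a candidate summary and a reference. A minimal ROUGE-1 sketch (unigram overlap; the real metric family also covers bigrams and longest common subsequences, and the example sentences are invented):

```python
from collections import Counter

def rouge1_scores(reference, candidate):
    """Minimal ROUGE-1: unigram overlap between reference and candidate.
    Returns (recall, precision, f1)."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = (2 * recall * precision / (recall + precision)) if overlap else 0.0
    return recall, precision, f1

r, p, f = rouge1_scores("the patient shows mild symptoms",
                        "patient shows symptoms")
```

Recall rewards covering the reference's content; precision penalizes padding the summary with extra words.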
Funding: Supported by the Bill & Melinda Gates Foundation and the Minderoo Foundation.
Abstract: Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development. Traditional neural network methods, such as BiLSTM, can be ineffective due to the lack of laboratory data for model training and the overshadowing of crucial features within sequence concatenation. The current work proposes a less data-hungry model incorporating a pre-trained gene sequence model and a mutual information inference operator. Our methodology utilizes gene alignment and deduplication algorithms to preprocess gene sequences, enhancing the model's capacity to discern and focus on distinctions among input gene pairs. The model, the DNA Pretrained Cross-Immunity Protection Inference model (DPCIPI), outperforms state-of-the-art (SOTA) models in predicting hemagglutination inhibition titer from influenza viral gene sequences alone. For binary cross-immunity prediction, the improvement is 1.58% in F1, 2.34% in precision, 1.57% in recall, and 1.57% in accuracy. For multilevel cross-immunity prediction, the improvement is 2.12% in F1, 3.50% in precision, 2.19% in recall, and 2.19% in accuracy. Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity. With expanding gene data and advancements in pre-trained models, this approach promises significant impacts on vaccine development and public health.
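The preprocessing idea of focusing the model on distinctions between an input gene pair can be illustrated with a trivial helper that, after alignment, keeps only the positions where the two sequences differ. This is an assumed simplification for illustration, not DPCIPI's actual operator, and the sequences are toy strings:

```python
def site_differences(seq_a, seq_b):
    """Keep only the sites where two aligned gene sequences differ,
    so downstream features concentrate on the mutations."""
    assert len(seq_a) == len(seq_b), "sequences must be aligned first"
    return [(i, a, b)
            for i, (a, b) in enumerate(zip(seq_a, seq_b))
            if a != b]

# Toy aligned fragments differing at one site.
diffs = site_differences("ATGGCA", "ATGACA")
```

Filtering identical sites keeps shared background from drowning out the few positions that actually drive antigenic difference.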
Funding: National Key Research and Development Program of China, Grant/Award Number: 2021YFC1910402.
Abstract: Intelligent sorting is an important prerequisite for the fully quantitative consumption and harmless disposal of kitchen waste. Existing object detection methods based on ImageNet pre-trained models are an effective way of sorting. Owing to significant domain gaps between natural images and kitchen waste images, it is difficult for an ImageNet pre-trained model to reflect the characteristics of diverse scales and dense distribution in kitchen waste, leading to poor generalisation. In this article, the authors propose the first pre-trained model for kitchen waste sorting, called KitWaSor, which combines contrastive learning (CL) and masked image modelling (MIM) through self-supervised learning (SSL). First, to address the issue of diverse scales, the authors propose a mixed masking strategy by introducing an incomplete masking branch alongside the original random masking branch. It prevents the complete loss of small-scale objects while avoiding excessive leakage of large-scale object pixels. Second, to address the issue of dense distribution, the authors introduce semantic consistency constraints on the basis of the mixed masking strategy. That is, object semantic reasoning is performed through semantic consistency constraints to compensate for the lack of contextual information. To train KitWaSor, the authors construct the first million-scale kitchen waste dataset spanning seasonal and regional distributions, named KWD-Million. Extensive experiments show that KitWaSor achieves state-of-the-art (SOTA) performance on the two downstream tasks most relevant to kitchen waste sorting (i.e., image classification and object detection), demonstrating the effectiveness of the proposed KitWaSor.
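The two-branch masking idea can be sketched on a flat patch grid: one branch masks aggressively at random (as in standard MIM), while a second branch masks a smaller fraction so small objects are less likely to vanish entirely. The ratios and branch structure below are illustrative assumptions, not KitWaSor's published configuration:

```python
import random

def mixed_mask(num_patches, ratio_full=0.75, ratio_partial=0.4, seed=0):
    """Sketch of a mixed masking strategy on a flattened patch grid:
    a heavily-masked random branch plus an 'incomplete' branch that
    masks fewer patches."""
    rng = random.Random(seed)
    idx = list(range(num_patches))
    rng.shuffle(idx)
    full_branch = set(idx[: int(num_patches * ratio_full)])
    rng.shuffle(idx)
    partial_branch = set(idx[: int(num_patches * ratio_partial)])
    return full_branch, partial_branch

# 14 x 14 = 196 patches, as in a typical ViT-style tokenization.
full, partial = mixed_mask(num_patches=196)
```

The reconstruction loss would then be computed per branch; the lighter branch preserves enough pixels of small objects for their semantics to survive masking.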
Funding: National Key R&D Program of China (No. 2020AAA0108702); National Natural Science Foundation of China (Grant No. 62022027).
Abstract: With the current success of large-scale pre-trained models (PTMs), how to efficiently adapt PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters. Previous work focuses on designing parameter-efficient tuning paradigms but needs to save and compute the gradient of the whole computational graph. In this paper, we propose y-Tuning, an efficient yet effective paradigm to adapt frozen large-scale PTMs to specific downstream tasks. y-Tuning learns dense representations for the labels y defined in a given task and aligns them to fixed feature representations. Without computing the gradients of the text encoder at the training phase, y-Tuning is not only parameter-efficient but also training-efficient. Experimental results show that for DeBERTa-XXL with 1.6 billion parameters, y-Tuning achieves more than 96% of the performance of full fine-tuning on the GLUE benchmark with only 2% tunable parameters and much lower training costs.
Funding: Science and Technology Innovation 2030 Major Project of "New Generation Artificial Intelligence", granted by the Ministry of Science and Technology, Grant Number 2020AAA0109300.
Abstract: In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of semantic interaction information between entities and relations, difficulties in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three innovative components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to significantly improve the model's performance on domain-specific relational triple extraction tasks. We first propose an innovative attention interaction module. This method significantly enhances the semantic interaction capabilities between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that effectively combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model's adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, aiming to generate domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several aspects, with F1 scores exceeding state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
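One simple way to realize an SLM/LLM voting strategy is confidence-gated deferral: trust the fine-tuned small model when it is confident, and route only the hard samples to the LLM. The threshold and routing rule below are assumptions for illustration; the paper's actual voting scheme may differ:

```python
def vote(slm_prediction, slm_confidence, llm_prediction, threshold=0.8):
    """Confidence-gated voting sketch: keep the fine-tuned small model's
    answer when it is confident; otherwise defer the (challenging)
    sample to the large language model."""
    if slm_confidence >= threshold:
        return slm_prediction
    return llm_prediction

# Toy relation labels for two samples.
easy = vote("born_in", slm_confidence=0.95, llm_prediction="works_for")
hard = vote("born_in", slm_confidence=0.50, llm_prediction="works_for")
```

This keeps LLM calls (slow, expensive) for the minority of samples where the cheap specialist is unsure.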
Abstract: We present an approach to automatically classify medical text at the sentence level. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
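An adapter is a small bottleneck inserted inside a frozen PLM layer: down-project the hidden state, apply a nonlinearity, up-project, and add a residual connection. Only the two small projection matrices are trained. A dimension-3, bottleneck-2 toy sketch (all weights are made-up values):

```python
def adapter_forward(hidden, w_down, w_up):
    """Minimal adapter sketch: down-project, ReLU, up-project, then add
    the residual. Only w_down and w_up would be trained; the PLM's own
    weights stay frozen."""
    down = [max(0.0, sum(h * w for h, w in zip(hidden, col)))
            for col in w_down]
    up = [sum(d * w for d, w in zip(down, col)) for col in w_up]
    return [h + u for h, u in zip(hidden, up)]

hidden = [1.0, -0.5, 0.25]                      # frozen PLM hidden state (dim 3)
w_down = [[0.1, 0.2, 0.0], [0.0, -0.1, 0.3]]    # 3 -> 2 bottleneck
w_up = [[0.5, 0.0], [0.0, 0.5], [0.1, 0.1]]     # 2 -> 3 back up
out = adapter_forward(hidden, w_down, w_up)
```

Because the adapter starts near zero and is residual, the frozen model's behavior is preserved at initialization and only nudged during fine-tuning.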
文摘Named Entity Recognition(NER)is crucial for extracting structured information from text.While traditional methods rely on rules,Conditional Random Fields(CRFs),or deep learning,the advent of large-scale Pre-trained Language Models(PLMs)offers new possibilities.PLMs excel at contextual learning,potentially simplifying many natural language processing tasks.However,their application to NER remains underexplored.This paper investigates leveraging the GPT-3 PLM for NER without fine-tuning.We propose a novel scheme that utilizes carefully crafted templates and context examples selected based on semantic similarity.Our experimental results demonstrate the feasibility of this approach,suggesting a promising direction for harnessing PLMs in NER.
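The scheme's two ingredients, a fixed template and similarity-ranked in-context examples, can be sketched without any model call. Here bag-of-words cosine stands in for the paper's semantic similarity, and the template wording, example pool, and entity tags are all hypothetical:

```python
from collections import Counter
import math

def cosine(a, b):
    """Bag-of-words cosine similarity (a cheap stand-in for
    embedding-based semantic similarity)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_ner_prompt(query, example_pool, k=2):
    """Pick the k examples most similar to the query and fill a fixed
    NER template; the resulting string would be sent to the PLM."""
    ranked = sorted(example_pool,
                    key=lambda ex: cosine(ex["text"], query),
                    reverse=True)
    lines = ["Extract all PERSON and LOCATION entities."]
    for ex in ranked[:k]:
        lines.append(f"Text: {ex['text']}\nEntities: {ex['entities']}")
    lines.append(f"Text: {query}\nEntities:")
    return "\n".join(lines)

pool = [
    {"text": "Alice visited Paris",
     "entities": "Alice (PERSON), Paris (LOCATION)"},
    {"text": "Stocks fell sharply today", "entities": "none"},
]
prompt = build_ner_prompt("Bob visited Berlin", pool, k=1)
```

Selecting demonstrations by similarity to the query, rather than at random, is what lets a frozen PLM imitate the right extraction pattern.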
Funding: This project is supported by the National Natural Science Foundation of China (NSFC) (Nos. 61902158, 61806087) and the Graduate Student Innovation Program for Academic Degrees in General Universities in Jiangsu Province (No. KYZZ16-0337).
Abstract: The Coronavirus Disease 2019 (COVID-19) is wreaking havoc around the world, placing enormous pressure on national health systems and medical staff. One of the most effective and critical steps in the fight against COVID-19 is to examine a patient's lungs using the chest X-ray and CT images generated by radiation imaging. In this paper, five Keras-based deep learning models (ResNet50, InceptionResNetV2, Xception, transfer learning, and pre-trained VGGNet16) are applied to formulate classification-detection approaches for COVID-19. Two benchmark methods, SVM (Support Vector Machine) and CNN (Convolutional Neural Network), are provided for comparison with the classification-detection approaches based on the performance indicators, i.e., precision, recall, F1 score, confusion matrix, classification accuracy, and three types of AUC (Area Under Curve). The highest classification accuracies derived by the classification-detection approaches based on 5857 chest X-rays and 767 chest CTs are 84% and 75%, respectively, which shows that the Keras-based deep learning approaches facilitate accurate and effective COVID-19-assisted detection.
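The listed indicators (precision, recall, F1) all derive from confusion-matrix counts. A minimal reference implementation, with toy counts for illustration:

```python
def prf1(tp, fp, fn):
    """Precision, recall and F1 from confusion-matrix counts:
    precision = TP/(TP+FP), recall = TP/(TP+FN),
    F1 = harmonic mean of the two."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy counts: 80 true positives, 20 false positives, 10 false negatives.
p, r, f = prf1(tp=80, fp=20, fn=10)
```

Reporting all three matters for medical screening: recall tracks missed cases, while precision tracks false alarms, and F1 balances them.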
Funding: Supported by the University of Transport Technology under the project entitled "Application of Machine Learning Algorithms in Landslide Susceptibility Mapping in Mountainous Areas", grant number DTTD2022-16.
Abstract: This study aimed to prepare landslide susceptibility maps for the Pithoragarh district in Uttarakhand, India, using advanced ensemble models that combined Radial Basis Function Networks (RBFN) with three ensemble learning techniques: DAGGING (DG), MULTIBOOST (MB), and ADABOOST (AB). This combination resulted in three distinct ensemble models: DG-RBFN, MB-RBFN, and AB-RBFN. Additionally, a traditional weighted method, Information Value (IV), and a benchmark machine learning (ML) model, the Multilayer Perceptron Neural Network (MLP), were employed for comparison and validation. The models were developed using ten landslide conditioning factors: slope, aspect, elevation, curvature, land cover, geomorphology, overburden depth, lithology, distance to rivers, and distance to roads. These factors were instrumental in predicting the output variable, the probability of landslide occurrence. Statistical analysis of the models' performance indicated that the DG-RBFN model, with an Area Under the ROC Curve (AUC) of 0.931, outperformed the other models. The AB-RBFN model achieved an AUC of 0.929, the MB-RBFN model had an AUC of 0.913, and the MLP model recorded an AUC of 0.926. These results suggest that the advanced ensemble ML model DG-RBFN was more accurate than the traditional statistical model, the single MLP model, and the other ensemble models in preparing trustworthy landslide susceptibility maps, thereby enhancing land use planning and decision-making.
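The AUC values used to rank these models have a simple probabilistic reading: the chance that a randomly chosen positive (landslide) location is scored above a randomly chosen negative one. That view gives a compact rank-based implementation (the scores and labels below are toy values):

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the probability that a random positive outranks a random
    negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy susceptibility scores; label 1 = observed landslide.
value = auc([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 1])
```

An AUC of 0.5 means the map ranks locations no better than chance; 1.0 means every landslide site outranks every stable site.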
Funding: Supported in part by the National Natural Science Foundation of China (Grant Nos. 42288101, 42405147 and 42475054) and in part by the China National Postdoctoral Program for Innovative Talents (Grant No. BX20230071).
Abstract: Conducting predictability studies is essential for tracing the sources of forecast errors, which not only leads to the improvement of observation and forecasting systems but also enhances the understanding of weather and climate phenomena. In the past few decades, dynamical numerical models have been the primary tools for predictability studies, achieving significant progress. Nowadays, with advances in artificial intelligence (AI) techniques and the accumulation of vast meteorological data, modeling weather and climate events using modern data-driven approaches is becoming trendy, where FourCastNet, Pangu-Weather, and GraphCast are successful pioneers. In this perspective article, we suggest that AI models should not be limited to forecasting but be expanded to predictability studies, leveraging AI's advantages of high efficiency and self-contained optimization modules. To this end, we first remark that AI models should possess high simulation capability with fine spatiotemporal resolution for two kinds of predictability studies. AI models with simulation capabilities comparable to numerical models can be considered to provide solutions to partial differential equations in a data-driven way. Then, we highlight several specific predictability issues with well-determined nonlinear optimization formulations, which can be well studied using AI models, holding significant scientific value. In addition, we advocate for the incorporation of AI models into the synergistic cycle of the cognition-observation-model paradigm. Comprehensive predictability studies have the potential to transform "big data" into "big and better data" and shift the focus from "AI for forecasts" to "AI for science", ultimately advancing the development of the atmospheric and oceanic sciences.
Funding: Supported by the National Key Research and Development Program (2019YFA0708301); the National Natural Science Foundation of China (51974337); the Strategic Cooperation Projects of CNPC and CUPB (ZLZX2020-03); the Science and Technology Innovation Fund of CNPC (2021DQ02-0403); and the Open Fund of the Petroleum Exploration and Development Research Institute of CNPC (2022-KFKT-09).
Abstract: We propose an integrated method of data-driven and mechanism models for well logging formation evaluation, explicitly focusing on predicting reservoir parameters such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, with the increasing complexity of geological conditions in this industry, there is a growing demand for improved accuracy in reservoir parameter prediction, leading to higher costs associated with manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters, which suffer from low interpretation efficiency, strong subjectivity, and suitability only for ideal conditions. The application of artificial intelligence to the interpretation of logging data provides a new solution to the problems of traditional methods and is expected to improve both the accuracy and the efficiency of interpretation. If large, high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively to logging data interpretation. Furthermore, data-driven models often act as "black boxes" without explaining their predictions or ensuring compliance with primary physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into machine learning in terms of network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation, which can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
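One concrete way to embed a logging mechanism model into a loss function is to penalize predictions that disagree with a classical petrophysical relation such as Archie's equation. The loss form below is an assumed illustration of the general idea, not PIAE's published objective, and all parameter values are toy numbers:

```python
def archie_water_saturation(porosity, rt, rw=0.05, a=1.0, m=2.0, n=2.0):
    """Archie's equation, a classic logging mechanism model:
    Sw = ((a * Rw) / (phi^m * Rt))^(1/n), with formation water
    resistivity Rw and true resistivity Rt."""
    return ((a * rw) / (porosity ** m * rt)) ** (1.0 / n)

def physics_constrained_loss(pred_sw, porosity, rt, lam=0.5):
    """Sketch of a physics-constraint term: penalize the network's
    water-saturation prediction for deviating from Archie's equation.
    Trainable without any labeled saturation values."""
    return lam * (pred_sw - archie_water_saturation(porosity, rt)) ** 2

loss = physics_constrained_loss(pred_sw=0.5, porosity=0.2, rt=10.0)
```

Added to a reconstruction loss, such a term steers an otherwise black-box network toward physically admissible outputs even when labels are scarce.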
Funding: Supported by the National Natural Science Foundation of China (62376219 and 62006194), the Foundational Research Project in Specialized Discipline (Grant No. G2024WD0146), and the Faculty Construction Project (Grant No. 24GH0201148).
Abstract: Large language models (LLMs) have undergone significant expansion and have been increasingly integrated across various domains. Notably, in the realm of robot task planning, LLMs harness their advanced reasoning and language comprehension capabilities to formulate precise and efficient action plans based on natural language instructions. However, for embodied tasks, where robots interact with complex environments, text-only LLMs often face challenges due to a lack of compatibility with robotic visual perception. This study provides a comprehensive overview of the emerging integration of LLMs and multimodal LLMs into various robotic tasks. Additionally, we propose a framework that utilizes multimodal GPT-4V to enhance embodied task planning through the combination of natural language instructions and robot visual perceptions. Our results, based on diverse datasets, indicate that GPT-4V effectively enhances robot performance in embodied tasks. This extensive survey and evaluation of LLMs and multimodal LLMs across a variety of robotic tasks enriches the understanding of LLM-centric embodied intelligence and provides forward-looking insights toward bridging the gap in Human-Robot-Environment interaction.
Funding: This work was supported by the Fund for Excellent Youth Scholars of China (Grant No. 52222708) and the National Natural Science Foundation of China (Grant No. 51977007). Part of this work is supported by the research project "SPEED" (03XP0585) at RWTH Aachen University, funded by the German Federal Ministry of Education and Research (BMBF).
Abstract: Developing sensorless techniques for estimating battery expansion is essential for effective mechanical state monitoring, improving the accuracy of digital twin simulation and abnormality detection. Therefore, this paper presents a data-driven approach to expansion estimation using electromechanically coupled models with machine learning. The proposed method integrates reduced-order impedance models with data-driven mechanical models, coupling the electrochemical and mechanical states through the state of charge (SOC) and mechanical pressure within a state estimation framework. The coupling relationship was established through experimental insights into pressure-related impedance parameters and the nonlinear mechanical behavior with SOC and pressure. The data-driven model was interpreted by introducing a novel swelling coefficient, defined by component stiffnesses, to capture the nonlinear mechanical behavior across various mechanical constraints. Sensitivity analysis of the impedance model shows that updating model parameters with pressure can reduce the mean absolute error of simulated voltage by 20 mV and the SOC estimation error by 2%. The results demonstrate the model's estimation capabilities, achieving a root mean square error of less than 1 kPa when the maximum expansion force is between 30 kPa and 120 kPa, outperforming calibrated stiffness models and other machine learning techniques. The model's robustness and generalizability are further supported by its effective handling of SOC estimation and pressure measurement errors. This work highlights the importance of the proposed framework in enhancing state estimation and fault diagnosis for lithium-ion batteries.
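The mechanical side of such a model rests on a simple observation: a swelling cell and its fixture deform together like springs in series, so their compliances add. A toy sketch of turning component stiffnesses into a pressure estimate (the functional form and all numbers are assumptions for illustration, not the paper's swelling-coefficient definition):

```python
def equivalent_stiffness(stiffnesses):
    """Springs in series: compliances (1/k) add, so the equivalent
    stiffness is the reciprocal of the summed compliances."""
    return 1.0 / sum(1.0 / k for k in stiffnesses)

def expansion_pressure(free_swelling_mm, stiffnesses):
    """Toy estimate: pressure change [kPa] produced by a free swelling
    displacement [mm] acting on the equivalent stiffness [kPa/mm] of
    the cell stack plus its mechanical constraint."""
    return equivalent_stiffness(stiffnesses) * free_swelling_mm

# Cell stack 400 kPa/mm in series with a 600 kPa/mm fixture.
p = expansion_pressure(0.2, [400.0, 600.0])
```

The stiffer the constraint relative to the cell, the more of the electrochemically driven swelling shows up as measurable pressure rather than displacement.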
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52308340), the Chongqing Talent Innovation and Entrepreneurship Demonstration Team Project (Grant No. cstc2024ycjh-bgzxm0012), and the Science and Technology Projects supported by China Coal Technology and Engineering Chongqing Design and Research Institute (Group) Co., Ltd. (Grant No. H20230317).
Abstract: Influenced by complex external factors, the displacement-time curve of reservoir landslides demonstrates both short-term and long-term diversity and dynamic complexity. It is difficult for existing methods, including regression models and neural network models, to perform multi-characteristic coupled displacement prediction because they fail to consider landslide creep characteristics. This paper integrates the creep characteristics of landslides with nonlinear intelligent algorithms and proposes a dynamic intelligent landslide displacement prediction method based on a combination of the Biological Growth model (BG), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM) network. This prediction approach improves three different biological growth models, thereby effectively extracting landslide creep characteristic parameters. Simultaneously, it integrates external factors (rainfall and reservoir water level) to construct a comprehensive internal and external dataset for data augmentation, which is input into the improved CNN-LSTM model. Thereafter, harnessing the robust feature extraction capabilities and spatial translation invariance of the CNN, the model autonomously captures short-term local fluctuation characteristics of landslide displacement, and combines the LSTM's efficient handling of long-term nonlinear temporal data to improve prediction performance. An evaluation on the Liangshuijing landslide in the Three Gorges Reservoir Area indicates that BG-CNN-LSTM exhibits high prediction accuracy and excellent generalization capabilities when dealing with various types of landslides. The research provides an innovative approach to achieving whole-process, real-time, high-precision displacement predictions for multi-characteristic coupled landslides.
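The role of a biological growth model here is to supply a saturating creep trend, leaving a residual for the CNN-LSTM to learn from rainfall and reservoir level. A sketch using the logistic curve, one common growth model (whether it is among the paper's three improved models is not stated here; the parameters are toy values):

```python
import math

def logistic_growth(t, capacity, rate, midpoint):
    """Logistic growth curve: displacement rises and saturates toward
    `capacity`, mimicking landslide creep."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))

def decompose(displacement, t, capacity, rate, midpoint):
    """Split an observed displacement into a creep trend (growth model)
    and a residual fluctuation, the part a CNN-LSTM would predict from
    external factors in this scheme."""
    trend = logistic_growth(t, capacity, rate, midpoint)
    return trend, displacement - trend

# Toy observation of 55 mm at t = 10 against a fitted logistic trend.
trend, residual = decompose(displacement=55.0, t=10,
                            capacity=100.0, rate=0.5, midpoint=10)
```

Modeling trend and residual separately lets the physically interpretable curve absorb the slow creep while the neural network focuses on the rainfall-driven fluctuations.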
Funding: Supported by the Brain & Behavior Research Foundation (30233).
Abstract: Depressive disorder is a chronic, recurring, and potentially life-endangering neuropsychiatric disease. According to a report by the World Health Organization, the global population suffering from depression is increasing significantly every year. Despite its prevalence and considerable impact on people, little is known about its pathogenesis. One major reason is the scarcity of reliable animal models, due to the absence of consensus on the pathology and etiology of depression. Furthermore, the neural circuit mechanisms of depression induced by various factors are particularly complex. Considering the variability in depressive behavior patterns and neurobiological mechanisms among different animal models of depression, a comparison between the neural circuits of depression induced by various factors is essential for its treatment. In this review, we summarize the most widely used behavioral animal models and neural circuits under different triggers of depression, aiming to provide a theoretical basis for depression prevention.
Funding: Supported by the Project of Stable Support for Youth Team in Basic Research Field, CAS (Grant No. YSBR-018); the National Natural Science Foundation of China (Grant Nos. 42188101, 42130204); the B-type Strategic Priority Program of CAS (Grant No. XDB41000000); the National Natural Science Foundation of China (NSFC) Distinguished Overseas Young Talents Program; the Innovation Program for Quantum Science and Technology (2021ZD0300301); and the Open Research Project of Large Research Infrastructures of CAS, "Study on the interaction between low/mid-latitude atmosphere and ionosphere based on the Chinese Meridian Project". The project was also supported by the National Key Laboratory of Deep Space Exploration (Grant No. NKLDSE2023A002), the Open Fund of the Anhui Provincial Key Laboratory of Intelligent Underground Detection (Grant No. APKLIUD23KF01), and the China National Space Administration (CNSA) pre-research Project on Civil Aerospace Technologies (Nos. D010305, D010301).
Abstract: Sporadic E (Es) layers in the ionosphere are characterized by intense plasma irregularities in the E region at altitudes of 90-130 km. Because they can significantly influence radio communications and navigation systems, accurate forecasting of Es layers is crucial for ensuring the precision and dependability of navigation satellite systems. In this study, we present Es predictions made by an empirical model and by a deep learning model, and analyze their differences comprehensively by comparing the model predictions to satellite radio occultation (RO) measurements and ground-based ionosonde observations. The deep learning model exhibited significantly better performance, as indicated by its high coefficient of correlation (r = 0.87) between RO observations and predictions, than did the empirical model (r = 0.53). This study highlights the importance of integrating artificial intelligence technology into ionosphere modelling generally, and into predicting Es layer occurrences and characteristics in particular.
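The r values used to compare the two models are Pearson correlation coefficients between predicted and observed Es quantities. A minimal reference implementation, with a toy perfectly correlated series:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series:
    covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy series: predictions exactly proportional to observations give r = 1.
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```

An r of 0.87 versus 0.53 therefore means the deep learning model's predictions track the observed variability far more linearly than the empirical model's do.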
Funding: Funded by the National Key R&D Program of China (2023YFA1008704), the National Natural Science Foundation of China (Grant No. 62377044), the Beijing Key Laboratory of Big Data Management and Analysis Methods, the Major Innovation & Planning Interdisciplinary Platform for the "Double-First Class" Initiative, funds for building world-class universities (disciplines) of Renmin University of China, and PCC@RUC. The authors would like to extend their sincere gratitude to Yankai Lin for his constructive feedback throughout the development of this work.
Abstract: Recently, tool learning with large language models (LLMs) has emerged as a promising paradigm for augmenting the capabilities of LLMs to tackle highly complex problems. Despite growing attention and rapid advancements in this field, the existing literature remains fragmented and lacks systematic organization, posing barriers to entry for newcomers. This gap motivates us to conduct a comprehensive survey of existing work on tool learning with LLMs. In this survey, we focus on reviewing the existing literature from two primary aspects: (1) why tool learning is beneficial and (2) how tool learning is implemented, enabling a comprehensive understanding of tool learning with LLMs. We first explore the "why" by reviewing both the benefits of tool integration and the inherent benefits of the tool learning paradigm from six specific aspects. In terms of "how", we systematically review the literature according to a taxonomy of four key stages in the tool learning workflow: task planning, tool selection, tool calling, and response generation. Additionally, we provide a detailed summary of existing benchmarks and evaluation methods, categorizing them according to their relevance to the different stages. Finally, we discuss current challenges and outline potential future directions, aiming to inspire both researchers and industrial developers to further explore this emerging and promising area.
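The survey's four-stage workflow can be made concrete with a toy pipeline in which hard-coded rules stand in for the LLM's decisions at each stage. The registry, routing rule, and calculator tool are all hypothetical illustrations of the stage boundaries, not any surveyed system:

```python
def plan(task):
    """Task planning: decide whether a tool is needed (toy rule:
    anything containing a digit goes to the calculator)."""
    return "calculator" if any(ch.isdigit() for ch in task) else None

# Tool registry; eval is restricted to plain arithmetic for this toy.
TOOLS = {"calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

def select_tool(name):
    """Tool selection: look the chosen tool up in the registry."""
    return TOOLS.get(name)

def run(task):
    """Minimal pipeline over the four stages: task planning, tool
    selection, tool calling, response generation."""
    tool_name = plan(task)
    if tool_name is None:
        return "No tool needed."
    tool = select_tool(tool_name)
    result = tool(task)                 # tool calling
    return f"The answer is {result}."   # response generation

reply = run("12*3")
```

Keeping the stages separate, as the taxonomy does, makes it clear where a given benchmark or failure mode belongs: a wrong plan, a wrong tool, a bad call, or a bad final answer.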
Funding: Deputy for Research and Technology, Kermanshah University of Medical Sciences, Grant/Award Number: 4030031.
Abstract: Background: Due to the widespread use of cell phone devices today, numerous research studies have focused on the adverse effects of electromagnetic radiation on human neuropsychological and reproductive systems. In most studies, oxidative stress has been identified as the primary pathophysiological mechanism underlying the harmful effects of electromagnetic waves. This paper aims to provide a holistic review of the protective effects of melatonin against cell phone-induced electromagnetic waves on various organs. Methods: This study is a systematic review of articles chosen by searching Google Scholar, PubMed, Embase, Scopus, Web of Science, and Science Direct using the keywords 'melatonin', 'cell phone radiation', and 'animal model'. The search focused on articles written in English, which were reviewed and evaluated. The PRISMA process was used to review the articles chosen for the study, and the JBI checklist was used to check the quality of the reviewed articles. Results: In the final review of 11 valid quality-checked articles, the effects of melatonin in the intervention group, the effects of electromagnetic waves in the case group, and the amount of melatonin in the chosen organs, i.e., the brain, skin, eyes, testis, and kidney, were thoroughly examined. The review showed that electromagnetic waves increase cellular anti-oxidative activity in different tissues such as the brain, the skin, the eyes, the testis, and the kidneys. Melatonin can considerably augment the anti-oxidative system of cells and protect tissues; these measurements were significantly increased in control groups. Electromagnetic waves can induce tissue atrophy and cell death in various organs, including the brain and the skin, and this effect was greatly decreased by melatonin. Conclusion: Our review confirms that melatonin effectively protects the organs of animal models against electromagnetic waves. In light of this conclusion and the current worldwide use of melatonin, future studies should advance to the stages of human clinical trials. We also recommend that more research in the field of melatonin physiology be conducted in order to protect exposed cells from dying, and that melatonin should be considered as a pharmaceutical option for treating the complications resulting from electromagnetic waves in humans.