This study was aimed to prepare landslide susceptibility maps for the Pithoragarh district in Uttarakhand,India,using advanced ensemble models that combined Radial Basis Function Networks(RBFN)with three ensemble lear...This study was aimed to prepare landslide susceptibility maps for the Pithoragarh district in Uttarakhand,India,using advanced ensemble models that combined Radial Basis Function Networks(RBFN)with three ensemble learning techniques:DAGGING(DG),MULTIBOOST(MB),and ADABOOST(AB).This combination resulted in three distinct ensemble models:DG-RBFN,MB-RBFN,and AB-RBFN.Additionally,a traditional weighted method,Information Value(IV),and a benchmark machine learning(ML)model,Multilayer Perceptron Neural Network(MLP),were employed for comparison and validation.The models were developed using ten landslide conditioning factors,which included slope,aspect,elevation,curvature,land cover,geomorphology,overburden depth,lithology,distance to rivers and distance to roads.These factors were instrumental in predicting the output variable,which was the probability of landslide occurrence.Statistical analysis of the models’performance indicated that the DG-RBFN model,with an Area Under ROC Curve(AUC)of 0.931,outperformed the other models.The AB-RBFN model achieved an AUC of 0.929,the MB-RBFN model had an AUC of 0.913,and the MLP model recorded an AUC of 0.926.These results suggest that the advanced ensemble ML model DG-RBFN was more accurate than traditional statistical model,single MLP model,and other ensemble models in preparing trustworthy landslide susceptibility maps,thereby enhancing land use planning and decision-making.展开更多
We propose an integrated method of data-driven and mechanism models for well logging formation evaluation,explicitly focusing on predicting reservoir parameters,such as porosity and water saturation.Accurately interpr...We propose an integrated method of data-driven and mechanism models for well logging formation evaluation,explicitly focusing on predicting reservoir parameters,such as porosity and water saturation.Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas.However,with the increasing complexity of geological conditions in this industry,there is a growing demand for improved accuracy in reservoir parameter prediction,leading to higher costs associated with manual interpretation.The conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters,which suffer from low interpretation efficiency,intense subjectivity,and suitability for ideal conditions.The application of artificial intelligence in the interpretation of logging data provides a new solution to the problems existing in traditional methods.It is expected to improve the accuracy and efficiency of the interpretation.If large and high-quality datasets exist,data-driven models can reveal relationships of arbitrary complexity.Nevertheless,constructing sufficiently large logging datasets with reliable labels remains challenging,making it difficult to apply data-driven models effectively in logging data interpretation.Furthermore,data-driven models often act as“black boxes”without explaining their predictions or ensuring compliance with primary physical constraints.This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models.Prior knowledge of logging data interpretation is embedded into machine learning regarding network structure,loss function,and optimization algorithm.We employ the Physically Informed Auto-Encoder(PIAE)to predict porosity and water saturation,which can be trained without labeled reservoir parameters using self-supervised learning techniques.This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.展开更多
Conducting predictability studies is essential for tracing the source of forecast errors,which not only leads to the improvement of observation and forecasting systems,but also enhances the understanding of weather an...Conducting predictability studies is essential for tracing the source of forecast errors,which not only leads to the improvement of observation and forecasting systems,but also enhances the understanding of weather and climate phenomena.In the past few decades,dynamical numerical models have been the primary tools for predictability studies,achieving significant progress.Nowadays,with the advances in artificial intelligence(AI)techniques and accumulations of vast meteorological data,modeling weather and climate events using modern data-driven approaches is becoming trendy,where FourCastNet,Pangu-Weather,and GraphCast are successful pioneers.In this perspective article,we suggest AI models should not be limited to forecasting but be expanded to predictability studies,leveraging AI's advantages of high efficiency and self-contained optimization modules.To this end,we first remark that AI models should possess high simulation capability with fine spatiotemporal resolution for two kinds of predictability studies.AI models with high simulation capabilities comparable to numerical models can be considered to provide solutions to partial differential equations in a data-driven way.Then,we highlight several specific predictability issues with well-determined nonlinear optimization formulizations,which can be well-studied using AI models,holding significant scientific value.In addition,we advocate for the incorporation of AI models into the synergistic cycle of the cognition–observation–model paradigm.Comprehensive predictability studies have the potential to transform“big data”to“big and better data”and shift the focus from“AI for forecasts”to“AI for science”,ultimately advancing the development of the atmospheric and oceanic sciences.展开更多
Developing sensorless techniques for estimating battery expansion is essential for effective mechanical state monitoring,improving the accuracy of digital twin simulation and abnormality detection.Therefore,this paper...Developing sensorless techniques for estimating battery expansion is essential for effective mechanical state monitoring,improving the accuracy of digital twin simulation and abnormality detection.Therefore,this paper presents a data-driven approach to expansion estimation using electromechanical coupled models with machine learning.The proposed method integrates reduced-order impedance models with data-driven mechanical models,coupling the electrochemical and mechanical states through the state of charge(SOC)and mechanical pressure within a state estimation framework.The coupling relationship was established through experimental insights into pressure-related impedance parameters and the nonlinear mechanical behavior with SOC and pressure.The data-driven model was interpreted by introducing a novel swelling coefficient defined by component stiffnesses to capture the nonlinear mechanical behavior across various mechanical constraints.Sensitivity analysis of the impedance model shows that updating model parameters with pressure can reduce the mean absolute error of simulated voltage by 20 mV and SOC estimation error by 2%.The results demonstrate the model's estimation capabilities,achieving a root mean square error of less than 1 kPa when the maximum expansion force is from 30 kPa to 120 kPa,outperforming calibrated stiffness models and other machine learning techniques.The model's robustness and generalizability are further supported by its effective handling of SOC estimation and pressure measurement errors.This work highlights the importance of the proposed framework in enhancing state estimation and fault diagnosis for lithium-ion batteries.展开更多
Large language models(LLMs)have undergone significant expansion and have been increasingly integrated across various domains.Notably,in the realm of robot task planning,LLMs harness their advanced reasoning and langua...Large language models(LLMs)have undergone significant expansion and have been increasingly integrated across various domains.Notably,in the realm of robot task planning,LLMs harness their advanced reasoning and language comprehension capabilities to formulate precise and efficient action plans based on natural language instructions.However,for embodied tasks,where robots interact with complex environments,textonly LLMs often face challenges due to a lack of compatibility with robotic visual perception.This study provides a comprehensive overview of the emerging integration of LLMs and multimodal LLMs into various robotic tasks.Additionally,we propose a framework that utilizes multimodal GPT-4V to enhance embodied task planning through the combination of natural language instructions and robot visual perceptions.Our results,based on diverse datasets,indicate that GPT-4V effectively enhances robot performance in embodied tasks.This extensive survey and evaluation of LLMs and multimodal LLMs across a variety of robotic tasks enriches the understanding of LLM-centric embodied intelligence and provides forward-looking insights towards bridging the gap in Human-Robot-Environment interaction.展开更多
Influenced by complex external factors,the displacement-time curve of reservoir landslides demonstrates both short-term and long-term diversity and dynamic complexity.It is difficult for existing methods,including Reg...Influenced by complex external factors,the displacement-time curve of reservoir landslides demonstrates both short-term and long-term diversity and dynamic complexity.It is difficult for existing methods,including Regression models and Neural network models,to perform multi-characteristic coupled displacement prediction because they fail to consider landslide creep characteristics.This paper integrates the creep characteristics of landslides with non-linear intelligent algorithms and proposes a dynamic intelligent landslide displacement prediction method based on a combination of the Biological Growth model(BG),Convolutional Neural Network(CNN),and Long ShortTerm Memory Network(LSTM).This prediction approach improves three different biological growth models,thereby effectively extracting landslide creep characteristic parameters.Simultaneously,it integrates external factors(rainfall and reservoir water level)to construct an internal and external comprehensive dataset for data augmentation,which is input into the improved CNN-LSTM model.Thereafter,harnessing the robust feature extraction capabilities and spatial translation invariance of CNN,the model autonomously captures short-term local fluctuation characteristics of landslide displacement,and combines LSTM's efficient handling of long-term nonlinear temporal data to improve prediction performance.An evaluation of the Liangshuijing landslide in the Three Gorges Reservoir Area indicates that BG-CNN-LSTM exhibits high prediction accuracy,excellent generalization capabilities when dealing with various types of landslides.The research provides an innovative approach to achieving the whole-process,realtime,high-precision displacement predictions for multicharacteristic coupled landslides.展开更多
Depressive disorder is a chronic,recurring,and potentially life-endangering neuropsychiatric disease.According to a report by the World Health Organization,the global population suffering from depression is experienci...Depressive disorder is a chronic,recurring,and potentially life-endangering neuropsychiatric disease.According to a report by the World Health Organization,the global population suffering from depression is experiencing a significant annual increase.Despite its prevalence and considerable impact on people,little is known about its pathogenesis.One major reason is the scarcity of reliable animal models due to the absence of consensus on the pathology and etiology of depression.Furthermore,the neural circuit mechanism of depression induced by various factors is particularly complex.Considering the variability in depressive behavior patterns and neurobiological mechanisms among different animal models of depression,a comparison between the neural circuits of depression induced by various factors is essential for its treatment.In this review,we mainly summarize the most widely used behavioral animal models and neural circuits under different triggers of depression,aiming to provide a theoretical basis for depression prevention.展开更多
Sporadic E(Es)layers in the ionosphere are characterized by intense plasma irregularities in the E region at altitudes of 90-130 km.Because they can significantly influence radio communications and navigation systems,...Sporadic E(Es)layers in the ionosphere are characterized by intense plasma irregularities in the E region at altitudes of 90-130 km.Because they can significantly influence radio communications and navigation systems,accurate forecasting of Es layers is crucial for ensuring the precision and dependability of navigation satellite systems.In this study,we present Es predictions made by an empirical model and by a deep learning model,and analyze their differences comprehensively by comparing the model predictions to satellite RO measurements and ground-based ionosonde observations.The deep learning model exhibited significantly better performance,as indicated by its high coefficient of correlation(r=0.87)with RO observations and predictions,than did the empirical model(r=0.53).This study highlights the importance of integrating artificial intelligence technology into ionosphere modelling generally,and into predicting Es layer occurrences and characteristics,in particular.展开更多
Background:Due to the widespread use of cell phone devices today,numerous re-search studies have focused on the adverse effects of electromagnetic radiation on human neuropsychological and reproductive systems.In most...Background:Due to the widespread use of cell phone devices today,numerous re-search studies have focused on the adverse effects of electromagnetic radiation on human neuropsychological and reproductive systems.In most studies,oxidative stress has been identified as the primary pathophysiological mechanism underlying the harmful effects of electromagnetic waves.This paper aims to provide a holistic review of the protective effects of melatonin against cell phone-induced electromag-netic waves on various organs.Methods:This study is a systematic review of articles chosen by searching Google Scholar,PubMed,Embase,Scopus,Web of Science,and Science Direct using the key-words‘melatonin’,‘cell phone radiation’,and‘animal model’.The search focused on articles written in English,which were reviewed and evaluated.The PRISMA process was used to review the articles chosen for the study,and the JBI checklist was used to check the quality of the reviewed articles.Results:In the final review of 11 valid quality-checked articles,the effects of me-latonin in the intervention group,the effects of electromagnetic waves in the case group,and the amount of melatonin in the chosen organ,i.e.brain,skin,eyes,testis and the kidney were thoroughly examined.The review showed that electromagnetic waves increase cellular anti-oxidative activity in different tissues such as the brain,the skin,the eyes,the testis,and the kidneys.Melatonin can considerably augment the anti-oxidative system of cells and protect tissues;these measurements were sig-nificantly increased in control groups.Electromagnetic waves can induce tissue atro-phy and cell death in various organs including the brain and the skin and this effect was highly decreased by melatonin.Conclusion:Our review confirms that melatonin effectively protects the organs of an-imal models against electromagnetic waves.In light of this conclusion and the current world-wide use of melatonin,future studies should advance to the stages of human clinical trials.We also recommend that more research in the field of melatonin physi-ology is conducted in order to protect exposed cells from dying and that melatonin should be considered as a pharmaceutical option for treating the complications result-ing from electromagnetic waves in humans.展开更多
Purpose:Evaluating the quality of academic journal articles is a time consuming but critical task for national research evaluation exercises,appointments and promotion.It is therefore important to investigate whether ...Purpose:Evaluating the quality of academic journal articles is a time consuming but critical task for national research evaluation exercises,appointments and promotion.It is therefore important to investigate whether Large Language Models(LLMs)can play a role in this process.Design/methodology/approach:This article assesses which ChatGPT inputs(full text without tables,figures,and references;title and abstract;title only)produce better quality score estimates,and the extent to which scores are affected by ChatGPT models and system prompts.Findings:The optimal input is the article title and abstract,with average ChatGPT scores based on these(30 iterations on a dataset of 51 papers)correlating at 0.67 with human scores,the highest ever reported.ChatGPT 4o is slightly better than 3.5-turbo(0.66),and 4o-mini(0.66).Research limitations:The data is a convenience sample of the work of a single author,it only includes one field,and the scores are self-evaluations.Practical implications:The results suggest that article full texts might confuse LLM research quality evaluations,even though complex system instructions for the task are more effective than simple ones.Thus,whilst abstracts contain insufficient information for a thorough assessment of rigour,they may contain strong pointers about originality and significance.Finally,linear regression can be used to convert the model scores into the human scale scores,which is 31%more accurate than guessing.Originality/value:This is the first systematic comparison of the impact of different prompts,parameters and inputs for ChatGPT research quality evaluations.展开更多
The integration of artificial intelligence(AI)technology,particularly large language models(LLMs),has become essential across various sectors due to their advanced language comprehension and generation capabilities.De...The integration of artificial intelligence(AI)technology,particularly large language models(LLMs),has become essential across various sectors due to their advanced language comprehension and generation capabilities.Despite their transformative impact in fields such as machine translation and intelligent dialogue systems,LLMs face significant challenges.These challenges include safety,security,and privacy concerns that undermine their trustworthiness and effectiveness,such as hallucinations,backdoor attacks,and privacy leakage.Previous works often conflated safety issues with security concerns.In contrast,our study provides clearer and more reasonable definitions for safety,security,and privacy within the context of LLMs.Building on these definitions,we provide a comprehensive overview of the vulnerabilities and defense mechanisms related to safety,security,and privacy in LLMs.Additionally,we explore the unique research challenges posed by LLMs and suggest potential avenues for future research,aiming to enhance the robustness and reliability of LLMs in the face of emerging threats.展开更多
Cardiac arrest(CA)is a critical condition in the field of cardiovascular medicine.Despite successful resuscitation,patients continue to have a high mortality rate,largely due to post CA syndrome(PCAS).However,the inju...Cardiac arrest(CA)is a critical condition in the field of cardiovascular medicine.Despite successful resuscitation,patients continue to have a high mortality rate,largely due to post CA syndrome(PCAS).However,the injury and pathophysiological mechanisms underlying PCAS remain unclear.Experimental animal models are valuable tools for exploring the etiology,pathogenesis,and potential interventions for CA and PCAS.Current CA animal models include electrical induction of ventricular fibrillation(VF),myocardial infarction,high potassium,asphyxia,and hemorrhagic shock.Although these models do not fully replicate the complexity of clinical CA,the mechanistic insights they provide remain highly relevant,including post-CA brain injury(PCABI),post-CA myocardial dysfunction(PAMD),systemic ischaemia/reperfusion injury(IRI),and the persistent precipitating pathology.Summarizing the methods of establishing CA models,the challenges encountered in the modeling process,and the mechanisms of PCAS can provide a foundation for developing standardized CA modeling protocols.展开更多
Neuromyelitis optica spectrum disorders are neuroinflammatory demyelinating disorders that lead to permanent visual loss and motor dysfunction.To date,no effective treatment exists as the exact causative mechanism rem...Neuromyelitis optica spectrum disorders are neuroinflammatory demyelinating disorders that lead to permanent visual loss and motor dysfunction.To date,no effective treatment exists as the exact causative mechanism remains unknown.Therefore,experimental models of neuromyelitis optica spectrum disorders are essential for exploring its pathogenesis and in screening for therapeutic targets.Since most patients with neuromyelitis optica spectrum disorders are seropositive for IgG autoantibodies against aquaporin-4,which is highly expressed on the membrane of astrocyte endfeet,most current experimental models are based on aquaporin-4-IgG that initially targets astrocytes.These experimental models have successfully simulated many pathological features of neuromyelitis optica spectrum disorders,such as aquaporin-4 loss,astrocytopathy,granulocyte and macrophage infiltration,complement activation,demyelination,and neuronal loss;however,they do not fully capture the pathological process of human neuromyelitis optica spectrum disorders.In this review,we summarize the currently known pathogenic mechanisms and the development of associated experimental models in vitro,ex vivo,and in vivo for neuromyelitis optica spectrum disorders,suggest potential pathogenic mechanisms for further investigation,and provide guidance on experimental model choices.In addition,this review summarizes the latest information on pathologies and therapies for neuromyelitis optica spectrum disorders based on experimental models of aquaporin-4-IgG-seropositive neuromyelitis optica spectrum disorders,offering further therapeutic targets and a theoretical basis for clinical trials.展开更多
This paper presents a high-fidelity lumpedparameter(LP)thermal model(HF-LPTM)for permanent magnet synchronous machines(PMSMs)in electric vehicle(EV)applications,where various cooling techniques are considered,includin...This paper presents a high-fidelity lumpedparameter(LP)thermal model(HF-LPTM)for permanent magnet synchronous machines(PMSMs)in electric vehicle(EV)applications,where various cooling techniques are considered,including frame forced air/liquid cooling,oil jet cooling for endwinding,and rotor shaft cooling.To address the temperature misestimation in the LP thermal modelling due to assumptions of concentrated loss input and uniform heat flows,the developed HF-LPTM introduces two compensation thermal resistances for the winding and PM components,which are analytically derived from the multi-dimensional heat transfer equations and are robust against different load/thermal conditions.As validated by the finite element analysis method and experiments,the conventional LPTMs exhibit significant winding temperature deviations,while the proposed HF-LPTM can accurately predict both the midpoint and average temperatures.The developed HFLPTM is further used to assess the effectiveness of various cooling techniques under different scenarios,i.e.,steady-state thermal states under the rated load condition,and transient temperature profiles under city,freeway,and hybrid(city+freeway)driving cycles.Results indicate that no single cooling technique can maintain both winding and PM temperatures within safety limits.The combination of frame liquid cooling and oil jet cooling for end winding can sufficiently mitigate PMSM thermal stress in EV applications.展开更多
Software security poses substantial risks to our society because software has become part of our life. Numerous techniques have been proposed to resolve or mitigate the impact of software security issues. Among them, ...Software security poses substantial risks to our society because software has become part of our life. Numerous techniques have been proposed to resolve or mitigate the impact of software security issues. Among them, software testing and analysis are two of the critical methods, which significantly benefit from the advancements in deep learning technologies. Due to the successful use of deep learning in software security, recently,researchers have explored the potential of using large language models(LLMs) in this area. In this paper, we systematically review the results focusing on LLMs in software security. We analyze the topics of fuzzing, unit test, program repair, bug reproduction, data-driven bug detection, and bug triage. We deconstruct these techniques into several stages and analyze how LLMs can be used in the stages. We also discuss the future directions of using LLMs in software security, including the future directions for the existing use of LLMs and extensions from conventional deep learning research.展开更多
ChatGPT is a powerful artificial intelligence(AI)language model that has demonstrated significant improvements in various natural language processing(NLP) tasks. However, like any technology, it presents potential sec...ChatGPT is a powerful artificial intelligence(AI)language model that has demonstrated significant improvements in various natural language processing(NLP) tasks. However, like any technology, it presents potential security risks that need to be carefully evaluated and addressed. In this survey, we provide an overview of the current state of research on security of using ChatGPT, with aspects of bias, disinformation, ethics, misuse,attacks and privacy. We review and discuss the literature on these topics and highlight open research questions and future directions.Through this survey, we aim to contribute to the academic discourse on AI security, enriching the understanding of potential risks and mitigations. We anticipate that this survey will be valuable for various stakeholders involved in AI development and usage, including AI researchers, developers, policy makers, and end-users.展开更多
The three-dimensional(3D)geometry of a fault is a critical control on earthquake nucleation,dynamic rupture,stress triggering,and related seismic hazards.Therefore,a 3D model of an active fault can significantly impro...The three-dimensional(3D)geometry of a fault is a critical control on earthquake nucleation,dynamic rupture,stress triggering,and related seismic hazards.Therefore,a 3D model of an active fault can significantly improve our understanding of seismogenesis and our ability to evaluate seismic hazards.Utilising the SKUA GoCAD software,we constructed detailed seismic fault models for the 2021 M_(S)6.4 Yangbi earthquake in Yunnan,China,using two sets of relocated earthquake catalogs and focal mechanism solutions following a convenient 3D fault modeling workflow.Our analysis revealed a NW-striking main fault with a high-angle SW dip,accompanied by two branch faults.Interpretation of one dataset revealed a single NNW-striking branch fault SW of the main fault,whereas the other dataset indicated four steep NNE-striking segments with a left-echelon pattern.Additionally,a third ENE-striking short fault was identified NE of the main fault.In combination with the spatial distribution of pre-existing faults,our 3D fault models indicate that the Yangbi earthquake reactivated pre-existing NW-and NE-striking fault directions rather than the surface-exposed Weixi-Qiaohou-Weishan Fault zone.The occurrence of the Yangbi earthquake demonstrates that the reactivation of pre-existing faults away from active fault zones,through either cascade or conjugate rupture modes,can cause unexpected moderate-large earthquakes and severe disasters,necessitating attention in regions like southeast Xizang,which have complex fault systems.展开更多
BACKGROUND Inflammatory bowel disease(IBD)is a global health burden that affects millions of individuals worldwide,necessitating extensive patient education.Large language models(LLMs)hold promise for addressing patie...BACKGROUND Inflammatory bowel disease(IBD)is a global health burden that affects millions of individuals worldwide,necessitating extensive patient education.Large language models(LLMs)hold promise for addressing patient information needs.However,LLM use to deliver accurate and comprehensible IBD-related medical information has yet to be thoroughly investigated.AIM To assess the utility of three LLMs(ChatGPT-4.0,Claude-3-Opus,and Gemini-1.5-Pro)as a reference point for patients with IBD.METHODS In this comparative study,two gastroenterology experts generated 15 IBD-related questions that reflected common patient concerns.These questions were used to evaluate the performance of the three LLMs.The answers provided by each model were independently assessed by three IBD-related medical experts using a Likert scale focusing on accuracy,comprehensibility,and correlation.Simultaneously,three patients were invited to evaluate the comprehensibility of their answers.Finally,a readability assessment was performed.RESULTS Overall,each of the LLMs achieved satisfactory levels of accuracy,comprehensibility,and completeness when answering IBD-related questions,although their performance varies.All of the investigated models demonstrated strengths in providing basic disease information such as IBD definition as well as its common symptoms and diagnostic methods.Nevertheless,when dealing with more complex medical advice,such as medication side effects,dietary adjustments,and complication risks,the quality of answers was inconsistent between the LLMs.Notably,Claude-3-Opus generated answers with better readability than the other two models.CONCLUSION LLMs have the potential as educational tools for patients with IBD;however,there are discrepancies between the models.Further optimization and the development of specialized models are necessary to ensure the accuracy and safety of the information provided.展开更多
Modeling HIV/AIDS progression is critical for understanding disease dynamics and improving patient care. This study compares the Exponential and Weibull survival models, focusing on their ability to capture state-spec...Modeling HIV/AIDS progression is critical for understanding disease dynamics and improving patient care. This study compares the Exponential and Weibull survival models, focusing on their ability to capture state-specific failure rates in HIV/AIDS progression. While the Exponential model offers simplicity with a constant hazard rate, it often fails to accommodate the complexities of dynamic disease progression. In contrast, the Weibull model provides flexibility by allowing hazard rates to vary over time. Both models are evaluated within the frameworks of the Cox Proportional Hazards (Cox PH) and Accelerated Failure Time (AFT) models, incorporating critical covariates such as age, gender, CD4 count, and ART status. Statistical evaluation metrics, including Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), log-likelihood, and Pseudo-R2, were employed to assess model performance across diverse patient subgroups. Results indicate that the Weibull model consistently outperforms the Exponential model in dynamic scenarios, such as younger patients and those with co-infections, while maintaining robustness in stable contexts. This study highlights the trade-off between flexibility and simplicity in survival modeling, advocating for tailored model selection to balance interpretability and predictive accuracy. These findings provide valuable insights for optimizing HIV/AIDS management strategies and advancing survival analysis methodologies.展开更多
Large-scale Language Models(LLMs)have achieved significant breakthroughs in Natural Language Processing(NLP),driven by the pre-training and fine-tuning paradigm.While this approach allows models to specialize in speci...Large-scale Language Models(LLMs)have achieved significant breakthroughs in Natural Language Processing(NLP),driven by the pre-training and fine-tuning paradigm.While this approach allows models to specialize in specific tasks with reduced training costs,the substantial memory requirements during fine-tuning present a barrier to broader deployment.Parameter-Efficient Fine-Tuning(PEFT)techniques,such as Low-Rank Adaptation(LoRA),and parameter quantization methods have emerged as solutions to address these challenges by optimizing memory usage and computational efficiency.Among these,QLoRA,which combines PEFT and quantization,has demonstrated notable success in reducing memory footprints during fine-tuning,prompting the development of various QLoRA variants.Despite these advancements,the quantitative impact of key variables on the fine-tuning performance of quantized LLMs remains underexplored.This study presents a comprehensive analysis of these key variables,focusing on their influence across different layer types and depths within LLM architectures.Our investigation uncovers several critical findings:(1)Larger layers,such as MLP layers,can maintain performance despite reductions in adapter rank,while smaller layers,like self-attention layers,aremore sensitive to such changes;(2)The effectiveness of balancing factors depends more on specific values rather than layer type or depth;(3)In quantization-aware fine-tuning,larger layers can effectively utilize smaller adapters,whereas smaller layers struggle to do so.These insights suggest that layer type is a more significant determinant of fine-tuning success than layer depth when optimizing quantized LLMs.Moreover,for the same discount of trainable parameters,reducing the trainable parameters in a larger layer is more effective in preserving fine-tuning accuracy than in a smaller one.This study provides valuable guidance for more efficient fine-tuning strategies and opens avenues for further research into optimizing LLM fine-tuning in resource-constrained environments.展开更多
基金the University of Transport Technology under the project entitled“Application of Machine Learning Algorithms in Landslide Susceptibility Mapping in Mountainous Areas”with grant number DTTD2022-16.
文摘This study was aimed to prepare landslide susceptibility maps for the Pithoragarh district in Uttarakhand,India,using advanced ensemble models that combined Radial Basis Function Networks(RBFN)with three ensemble learning techniques:DAGGING(DG),MULTIBOOST(MB),and ADABOOST(AB).This combination resulted in three distinct ensemble models:DG-RBFN,MB-RBFN,and AB-RBFN.Additionally,a traditional weighted method,Information Value(IV),and a benchmark machine learning(ML)model,Multilayer Perceptron Neural Network(MLP),were employed for comparison and validation.The models were developed using ten landslide conditioning factors,which included slope,aspect,elevation,curvature,land cover,geomorphology,overburden depth,lithology,distance to rivers and distance to roads.These factors were instrumental in predicting the output variable,which was the probability of landslide occurrence.Statistical analysis of the models’performance indicated that the DG-RBFN model,with an Area Under ROC Curve(AUC)of 0.931,outperformed the other models.The AB-RBFN model achieved an AUC of 0.929,the MB-RBFN model had an AUC of 0.913,and the MLP model recorded an AUC of 0.926.These results suggest that the advanced ensemble ML model DG-RBFN was more accurate than traditional statistical model,single MLP model,and other ensemble models in preparing trustworthy landslide susceptibility maps,thereby enhancing land use planning and decision-making.
基金supported by National Key Research and Development Program (2019YFA0708301)National Natural Science Foundation of China (51974337)+2 种基金the Strategic Cooperation Projects of CNPC and CUPB (ZLZX2020-03)Science and Technology Innovation Fund of CNPC (2021DQ02-0403)Open Fund of Petroleum Exploration and Development Research Institute of CNPC (2022-KFKT-09)
文摘We propose an integrated method of data-driven and mechanism models for well logging formation evaluation,explicitly focusing on predicting reservoir parameters,such as porosity and water saturation.Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas.However,with the increasing complexity of geological conditions in this industry,there is a growing demand for improved accuracy in reservoir parameter prediction,leading to higher costs associated with manual interpretation.The conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters,which suffer from low interpretation efficiency,intense subjectivity,and suitability for ideal conditions.The application of artificial intelligence in the interpretation of logging data provides a new solution to the problems existing in traditional methods.It is expected to improve the accuracy and efficiency of the interpretation.If large and high-quality datasets exist,data-driven models can reveal relationships of arbitrary complexity.Nevertheless,constructing sufficiently large logging datasets with reliable labels remains challenging,making it difficult to apply data-driven models effectively in logging data interpretation.Furthermore,data-driven models often act as“black boxes”without explaining their predictions or ensuring compliance with primary physical constraints.This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models.Prior knowledge of logging data interpretation is embedded into machine learning regarding network structure,loss function,and optimization algorithm.We employ the Physically Informed Auto-Encoder(PIAE)to predict porosity and water saturation,which can be trained without labeled reservoir parameters using self-supervised learning techniques.This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
基金in part supported by the National Natural Science Foundation of China(Grant Nos.42288101,42405147 and 42475054)in part by the China National Postdoctoral Program for Innovative Talents(Grant No.BX20230071)。
文摘Conducting predictability studies is essential for tracing the source of forecast errors,which not only leads to the improvement of observation and forecasting systems,but also enhances the understanding of weather and climate phenomena.In the past few decades,dynamical numerical models have been the primary tools for predictability studies,achieving significant progress.Nowadays,with the advances in artificial intelligence(AI)techniques and accumulations of vast meteorological data,modeling weather and climate events using modern data-driven approaches is becoming trendy,where FourCastNet,Pangu-Weather,and GraphCast are successful pioneers.In this perspective article,we suggest AI models should not be limited to forecasting but be expanded to predictability studies,leveraging AI's advantages of high efficiency and self-contained optimization modules.To this end,we first remark that AI models should possess high simulation capability with fine spatiotemporal resolution for two kinds of predictability studies.AI models with high simulation capabilities comparable to numerical models can be considered to provide solutions to partial differential equations in a data-driven way.Then,we highlight several specific predictability issues with well-determined nonlinear optimization formulizations,which can be well-studied using AI models,holding significant scientific value.In addition,we advocate for the incorporation of AI models into the synergistic cycle of the cognition–observation–model paradigm.Comprehensive predictability studies have the potential to transform“big data”to“big and better data”and shift the focus from“AI for forecasts”to“AI for science”,ultimately advancing the development of the atmospheric and oceanic sciences.
基金Fund supported this work for Excellent Youth Scholars of China(Grant No.52222708)the National Natural Science Foundation of China(Grant No.51977007)+1 种基金Part of this work is supported by the research project“SPEED”(03XP0585)at RWTH Aachen Universityfunded by the German Federal Ministry of Education and Research(BMBF)。
文摘Developing sensorless techniques for estimating battery expansion is essential for effective mechanical state monitoring,improving the accuracy of digital twin simulation and abnormality detection.Therefore,this paper presents a data-driven approach to expansion estimation using electromechanical coupled models with machine learning.The proposed method integrates reduced-order impedance models with data-driven mechanical models,coupling the electrochemical and mechanical states through the state of charge(SOC)and mechanical pressure within a state estimation framework.The coupling relationship was established through experimental insights into pressure-related impedance parameters and the nonlinear mechanical behavior with SOC and pressure.The data-driven model was interpreted by introducing a novel swelling coefficient defined by component stiffnesses to capture the nonlinear mechanical behavior across various mechanical constraints.Sensitivity analysis of the impedance model shows that updating model parameters with pressure can reduce the mean absolute error of simulated voltage by 20 mV and SOC estimation error by 2%.The results demonstrate the model's estimation capabilities,achieving a root mean square error of less than 1 kPa when the maximum expansion force is from 30 kPa to 120 kPa,outperforming calibrated stiffness models and other machine learning techniques.The model's robustness and generalizability are further supported by its effective handling of SOC estimation and pressure measurement errors.This work highlights the importance of the proposed framework in enhancing state estimation and fault diagnosis for lithium-ion batteries.
基金supported by National Natural Science Foundation of China(62376219 and 62006194)Foundational Research Project in Specialized Discipline(Grant No.G2024WD0146)Faculty Construction Project(Grant No.24GH0201148).
文摘Large language models(LLMs)have undergone significant expansion and have been increasingly integrated across various domains.Notably,in the realm of robot task planning,LLMs harness their advanced reasoning and language comprehension capabilities to formulate precise and efficient action plans based on natural language instructions.However,for embodied tasks,where robots interact with complex environments,textonly LLMs often face challenges due to a lack of compatibility with robotic visual perception.This study provides a comprehensive overview of the emerging integration of LLMs and multimodal LLMs into various robotic tasks.Additionally,we propose a framework that utilizes multimodal GPT-4V to enhance embodied task planning through the combination of natural language instructions and robot visual perceptions.Our results,based on diverse datasets,indicate that GPT-4V effectively enhances robot performance in embodied tasks.This extensive survey and evaluation of LLMs and multimodal LLMs across a variety of robotic tasks enriches the understanding of LLM-centric embodied intelligence and provides forward-looking insights towards bridging the gap in Human-Robot-Environment interaction.
基金the funding support from the National Natural Science Foundation of China(Grant No.52308340)Chongqing Talent Innovation and Entrepreneurship Demonstration Team Project(Grant No.cstc2024ycjh-bgzxm0012)the Science and Technology Projects supported by China Coal Technology and Engineering Chongqing Design and Research Institute(Group)Co.,Ltd..(Grant No.H20230317)。
文摘Influenced by complex external factors,the displacement-time curve of reservoir landslides demonstrates both short-term and long-term diversity and dynamic complexity.It is difficult for existing methods,including Regression models and Neural network models,to perform multi-characteristic coupled displacement prediction because they fail to consider landslide creep characteristics.This paper integrates the creep characteristics of landslides with non-linear intelligent algorithms and proposes a dynamic intelligent landslide displacement prediction method based on a combination of the Biological Growth model(BG),Convolutional Neural Network(CNN),and Long ShortTerm Memory Network(LSTM).This prediction approach improves three different biological growth models,thereby effectively extracting landslide creep characteristic parameters.Simultaneously,it integrates external factors(rainfall and reservoir water level)to construct an internal and external comprehensive dataset for data augmentation,which is input into the improved CNN-LSTM model.Thereafter,harnessing the robust feature extraction capabilities and spatial translation invariance of CNN,the model autonomously captures short-term local fluctuation characteristics of landslide displacement,and combines LSTM's efficient handling of long-term nonlinear temporal data to improve prediction performance.An evaluation of the Liangshuijing landslide in the Three Gorges Reservoir Area indicates that BG-CNN-LSTM exhibits high prediction accuracy,excellent generalization capabilities when dealing with various types of landslides.The research provides an innovative approach to achieving the whole-process,realtime,high-precision displacement predictions for multicharacteristic coupled landslides.
基金supported by the Brain&Behavior Research Foundation(30233).
文摘Depressive disorder is a chronic,recurring,and potentially life-endangering neuropsychiatric disease.According to a report by the World Health Organization,the global population suffering from depression is experiencing a significant annual increase.Despite its prevalence and considerable impact on people,little is known about its pathogenesis.One major reason is the scarcity of reliable animal models due to the absence of consensus on the pathology and etiology of depression.Furthermore,the neural circuit mechanism of depression induced by various factors is particularly complex.Considering the variability in depressive behavior patterns and neurobiological mechanisms among different animal models of depression,a comparison between the neural circuits of depression induced by various factors is essential for its treatment.In this review,we mainly summarize the most widely used behavioral animal models and neural circuits under different triggers of depression,aiming to provide a theoretical basis for depression prevention.
基金supported by the Project of Stable Support for Youth Team in Basic Research Field,CAS(grant No.YSBR-018)the National Natural Science Foundation of China(grant Nos.42188101,42130204)+4 种基金the B-type Strategic Priority Program of CAS(grant no.XDB41000000)the National Natural Science Foundation of China(NSFC)Distinguished Overseas Young Talents Program,Innovation Program for Quantum Science and Technology(2021ZD0300301)the Open Research Project of Large Research Infrastructures of CAS-“Study on the interaction between low/mid-latitude atmosphere and ionosphere based on the Chinese Meridian Project”.The project was supported also by the National Key Laboratory of Deep Space Exploration(Grant No.NKLDSE2023A002)the Open Fund of Anhui Provincial Key Laboratory of Intelligent Underground Detection(Grant No.APKLIUD23KF01)the China National Space Administration(CNSA)pre-research Project on Civil Aerospace Technologies No.D010305,D010301.
文摘Sporadic E(Es)layers in the ionosphere are characterized by intense plasma irregularities in the E region at altitudes of 90-130 km.Because they can significantly influence radio communications and navigation systems,accurate forecasting of Es layers is crucial for ensuring the precision and dependability of navigation satellite systems.In this study,we present Es predictions made by an empirical model and by a deep learning model,and analyze their differences comprehensively by comparing the model predictions to satellite RO measurements and ground-based ionosonde observations.The deep learning model exhibited significantly better performance,as indicated by its high coefficient of correlation(r=0.87)with RO observations and predictions,than did the empirical model(r=0.53).This study highlights the importance of integrating artificial intelligence technology into ionosphere modelling generally,and into predicting Es layer occurrences and characteristics,in particular.
基金Deputy for Research and Technology,Kermanshah University of Medical Sciences,Grant/Award Number:4030031。
文摘Background:Due to the widespread use of cell phone devices today,numerous re-search studies have focused on the adverse effects of electromagnetic radiation on human neuropsychological and reproductive systems.In most studies,oxidative stress has been identified as the primary pathophysiological mechanism underlying the harmful effects of electromagnetic waves.This paper aims to provide a holistic review of the protective effects of melatonin against cell phone-induced electromag-netic waves on various organs.Methods:This study is a systematic review of articles chosen by searching Google Scholar,PubMed,Embase,Scopus,Web of Science,and Science Direct using the key-words‘melatonin’,‘cell phone radiation’,and‘animal model’.The search focused on articles written in English,which were reviewed and evaluated.The PRISMA process was used to review the articles chosen for the study,and the JBI checklist was used to check the quality of the reviewed articles.Results:In the final review of 11 valid quality-checked articles,the effects of me-latonin in the intervention group,the effects of electromagnetic waves in the case group,and the amount of melatonin in the chosen organ,i.e.brain,skin,eyes,testis and the kidney were thoroughly examined.The review showed that electromagnetic waves increase cellular anti-oxidative activity in different tissues such as the brain,the skin,the eyes,the testis,and the kidneys.Melatonin can considerably augment the anti-oxidative system of cells and protect tissues;these measurements were sig-nificantly increased in control groups.Electromagnetic waves can induce tissue atro-phy and cell death in various organs including the brain and the skin and this effect was highly decreased by melatonin.Conclusion:Our review confirms that melatonin effectively protects the organs of an-imal models against electromagnetic waves.In light of this conclusion and the current world-wide use of melatonin,future studies should advance to the stages of human clinical trials.We also recommend that more research in the field of melatonin physi-ology is conducted in order to protect exposed cells from dying and that melatonin should be considered as a pharmaceutical option for treating the complications result-ing from electromagnetic waves in humans.
文摘Purpose:Evaluating the quality of academic journal articles is a time consuming but critical task for national research evaluation exercises,appointments and promotion.It is therefore important to investigate whether Large Language Models(LLMs)can play a role in this process.Design/methodology/approach:This article assesses which ChatGPT inputs(full text without tables,figures,and references;title and abstract;title only)produce better quality score estimates,and the extent to which scores are affected by ChatGPT models and system prompts.Findings:The optimal input is the article title and abstract,with average ChatGPT scores based on these(30 iterations on a dataset of 51 papers)correlating at 0.67 with human scores,the highest ever reported.ChatGPT 4o is slightly better than 3.5-turbo(0.66),and 4o-mini(0.66).Research limitations:The data is a convenience sample of the work of a single author,it only includes one field,and the scores are self-evaluations.Practical implications:The results suggest that article full texts might confuse LLM research quality evaluations,even though complex system instructions for the task are more effective than simple ones.Thus,whilst abstracts contain insufficient information for a thorough assessment of rigour,they may contain strong pointers about originality and significance.Finally,linear regression can be used to convert the model scores into the human scale scores,which is 31%more accurate than guessing.Originality/value:This is the first systematic comparison of the impact of different prompts,parameters and inputs for ChatGPT research quality evaluations.
基金supported by the National Key R&D Program of China under Grant No.2022YFB3103500the National Natural Science Foundation of China under Grants No.62402087 and No.62020106013+3 种基金the Sichuan Science and Technology Program under Grant No.2023ZYD0142the Chengdu Science and Technology Program under Grant No.2023-XT00-00002-GXthe Fundamental Research Funds for Chinese Central Universities under Grants No.ZYGX2020ZB027 and No.Y030232063003002the Postdoctoral Innovation Talents Support Program under Grant No.BX20230060.
文摘The integration of artificial intelligence(AI)technology,particularly large language models(LLMs),has become essential across various sectors due to their advanced language comprehension and generation capabilities.Despite their transformative impact in fields such as machine translation and intelligent dialogue systems,LLMs face significant challenges.These challenges include safety,security,and privacy concerns that undermine their trustworthiness and effectiveness,such as hallucinations,backdoor attacks,and privacy leakage.Previous works often conflated safety issues with security concerns.In contrast,our study provides clearer and more reasonable definitions for safety,security,and privacy within the context of LLMs.Building on these definitions,we provide a comprehensive overview of the vulnerabilities and defense mechanisms related to safety,security,and privacy in LLMs.Additionally,we explore the unique research challenges posed by LLMs and suggest potential avenues for future research,aiming to enhance the robustness and reliability of LLMs in the face of emerging threats.
基金supported by the National Key Research and Development Program(2021YFC3002205)the Postgraduate Research and Innovation Program of Tianjin Municipal Education Commission(2022BKY113),China.
文摘Cardiac arrest(CA)is a critical condition in the field of cardiovascular medicine.Despite successful resuscitation,patients continue to have a high mortality rate,largely due to post CA syndrome(PCAS).However,the injury and pathophysiological mechanisms underlying PCAS remain unclear.Experimental animal models are valuable tools for exploring the etiology,pathogenesis,and potential interventions for CA and PCAS.Current CA animal models include electrical induction of ventricular fibrillation(VF),myocardial infarction,high potassium,asphyxia,and hemorrhagic shock.Although these models do not fully replicate the complexity of clinical CA,the mechanistic insights they provide remain highly relevant,including post-CA brain injury(PCABI),post-CA myocardial dysfunction(PAMD),systemic ischaemia/reperfusion injury(IRI),and the persistent precipitating pathology.Summarizing the methods of establishing CA models,the challenges encountered in the modeling process,and the mechanisms of PCAS can provide a foundation for developing standardized CA modeling protocols.
文摘Neuromyelitis optica spectrum disorders are neuroinflammatory demyelinating disorders that lead to permanent visual loss and motor dysfunction.To date,no effective treatment exists as the exact causative mechanism remains unknown.Therefore,experimental models of neuromyelitis optica spectrum disorders are essential for exploring its pathogenesis and in screening for therapeutic targets.Since most patients with neuromyelitis optica spectrum disorders are seropositive for IgG autoantibodies against aquaporin-4,which is highly expressed on the membrane of astrocyte endfeet,most current experimental models are based on aquaporin-4-IgG that initially targets astrocytes.These experimental models have successfully simulated many pathological features of neuromyelitis optica spectrum disorders,such as aquaporin-4 loss,astrocytopathy,granulocyte and macrophage infiltration,complement activation,demyelination,and neuronal loss;however,they do not fully capture the pathological process of human neuromyelitis optica spectrum disorders.In this review,we summarize the currently known pathogenic mechanisms and the development of associated experimental models in vitro,ex vivo,and in vivo for neuromyelitis optica spectrum disorders,suggest potential pathogenic mechanisms for further investigation,and provide guidance on experimental model choices.In addition,this review summarizes the latest information on pathologies and therapies for neuromyelitis optica spectrum disorders based on experimental models of aquaporin-4-IgG-seropositive neuromyelitis optica spectrum disorders,offering further therapeutic targets and a theoretical basis for clinical trials.
文摘This paper presents a high-fidelity lumpedparameter(LP)thermal model(HF-LPTM)for permanent magnet synchronous machines(PMSMs)in electric vehicle(EV)applications,where various cooling techniques are considered,including frame forced air/liquid cooling,oil jet cooling for endwinding,and rotor shaft cooling.To address the temperature misestimation in the LP thermal modelling due to assumptions of concentrated loss input and uniform heat flows,the developed HF-LPTM introduces two compensation thermal resistances for the winding and PM components,which are analytically derived from the multi-dimensional heat transfer equations and are robust against different load/thermal conditions.As validated by the finite element analysis method and experiments,the conventional LPTMs exhibit significant winding temperature deviations,while the proposed HF-LPTM can accurately predict both the midpoint and average temperatures.The developed HFLPTM is further used to assess the effectiveness of various cooling techniques under different scenarios,i.e.,steady-state thermal states under the rated load condition,and transient temperature profiles under city,freeway,and hybrid(city+freeway)driving cycles.Results indicate that no single cooling technique can maintain both winding and PM temperatures within safety limits.The combination of frame liquid cooling and oil jet cooling for end winding can sufficiently mitigate PMSM thermal stress in EV applications.
文摘Software security poses substantial risks to our society because software has become part of our life. Numerous techniques have been proposed to resolve or mitigate the impact of software security issues. Among them, software testing and analysis are two of the critical methods, which significantly benefit from the advancements in deep learning technologies. Due to the successful use of deep learning in software security, recently,researchers have explored the potential of using large language models(LLMs) in this area. In this paper, we systematically review the results focusing on LLMs in software security. We analyze the topics of fuzzing, unit test, program repair, bug reproduction, data-driven bug detection, and bug triage. We deconstruct these techniques into several stages and analyze how LLMs can be used in the stages. We also discuss the future directions of using LLMs in software security, including the future directions for the existing use of LLMs and extensions from conventional deep learning research.
文摘ChatGPT is a powerful artificial intelligence(AI)language model that has demonstrated significant improvements in various natural language processing(NLP) tasks. However, like any technology, it presents potential security risks that need to be carefully evaluated and addressed. In this survey, we provide an overview of the current state of research on security of using ChatGPT, with aspects of bias, disinformation, ethics, misuse,attacks and privacy. We review and discuss the literature on these topics and highlight open research questions and future directions.Through this survey, we aim to contribute to the academic discourse on AI security, enriching the understanding of potential risks and mitigations. We anticipate that this survey will be valuable for various stakeholders involved in AI development and usage, including AI researchers, developers, policy makers, and end-users.
基金financial support from the National Key R&D Program of China (No. 2021YFC3000600)National Natural Science Foundation of China (No. 41872206)National Nonprofit Fundamental Research Grant of China, Institute of Geology, China, Earthquake Administration (No. IGCEA2010)
文摘The three-dimensional(3D)geometry of a fault is a critical control on earthquake nucleation,dynamic rupture,stress triggering,and related seismic hazards.Therefore,a 3D model of an active fault can significantly improve our understanding of seismogenesis and our ability to evaluate seismic hazards.Utilising the SKUA GoCAD software,we constructed detailed seismic fault models for the 2021 M_(S)6.4 Yangbi earthquake in Yunnan,China,using two sets of relocated earthquake catalogs and focal mechanism solutions following a convenient 3D fault modeling workflow.Our analysis revealed a NW-striking main fault with a high-angle SW dip,accompanied by two branch faults.Interpretation of one dataset revealed a single NNW-striking branch fault SW of the main fault,whereas the other dataset indicated four steep NNE-striking segments with a left-echelon pattern.Additionally,a third ENE-striking short fault was identified NE of the main fault.In combination with the spatial distribution of pre-existing faults,our 3D fault models indicate that the Yangbi earthquake reactivated pre-existing NW-and NE-striking fault directions rather than the surface-exposed Weixi-Qiaohou-Weishan Fault zone.The occurrence of the Yangbi earthquake demonstrates that the reactivation of pre-existing faults away from active fault zones,through either cascade or conjugate rupture modes,can cause unexpected moderate-large earthquakes and severe disasters,necessitating attention in regions like southeast Xizang,which have complex fault systems.
基金Supported by the China Health Promotion Foundation Young Doctors'Research Foundation for Inflammatory Bowel Disease,the Taishan Scholars Program of Shandong Province,China,No.tsqn202306343National Natural Science Foundation of China,No.82270578.
文摘BACKGROUND Inflammatory bowel disease(IBD)is a global health burden that affects millions of individuals worldwide,necessitating extensive patient education.Large language models(LLMs)hold promise for addressing patient information needs.However,LLM use to deliver accurate and comprehensible IBD-related medical information has yet to be thoroughly investigated.AIM To assess the utility of three LLMs(ChatGPT-4.0,Claude-3-Opus,and Gemini-1.5-Pro)as a reference point for patients with IBD.METHODS In this comparative study,two gastroenterology experts generated 15 IBD-related questions that reflected common patient concerns.These questions were used to evaluate the performance of the three LLMs.The answers provided by each model were independently assessed by three IBD-related medical experts using a Likert scale focusing on accuracy,comprehensibility,and correlation.Simultaneously,three patients were invited to evaluate the comprehensibility of their answers.Finally,a readability assessment was performed.RESULTS Overall,each of the LLMs achieved satisfactory levels of accuracy,comprehensibility,and completeness when answering IBD-related questions,although their performance varies.All of the investigated models demonstrated strengths in providing basic disease information such as IBD definition as well as its common symptoms and diagnostic methods.Nevertheless,when dealing with more complex medical advice,such as medication side effects,dietary adjustments,and complication risks,the quality of answers was inconsistent between the LLMs.Notably,Claude-3-Opus generated answers with better readability than the other two models.CONCLUSION LLMs have the potential as educational tools for patients with IBD;however,there are discrepancies between the models.Further optimization and the development of specialized models are necessary to ensure the accuracy and safety of the information provided.
文摘Modeling HIV/AIDS progression is critical for understanding disease dynamics and improving patient care. This study compares the Exponential and Weibull survival models, focusing on their ability to capture state-specific failure rates in HIV/AIDS progression. While the Exponential model offers simplicity with a constant hazard rate, it often fails to accommodate the complexities of dynamic disease progression. In contrast, the Weibull model provides flexibility by allowing hazard rates to vary over time. Both models are evaluated within the frameworks of the Cox Proportional Hazards (Cox PH) and Accelerated Failure Time (AFT) models, incorporating critical covariates such as age, gender, CD4 count, and ART status. Statistical evaluation metrics, including Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), log-likelihood, and Pseudo-R2, were employed to assess model performance across diverse patient subgroups. Results indicate that the Weibull model consistently outperforms the Exponential model in dynamic scenarios, such as younger patients and those with co-infections, while maintaining robustness in stable contexts. This study highlights the trade-off between flexibility and simplicity in survival modeling, advocating for tailored model selection to balance interpretability and predictive accuracy. These findings provide valuable insights for optimizing HIV/AIDS management strategies and advancing survival analysis methodologies.
基金supported by the National Key R&D Program of China(No.2021YFB0301200)National Natural Science Foundation of China(No.62025208).
文摘Large-scale Language Models(LLMs)have achieved significant breakthroughs in Natural Language Processing(NLP),driven by the pre-training and fine-tuning paradigm.While this approach allows models to specialize in specific tasks with reduced training costs,the substantial memory requirements during fine-tuning present a barrier to broader deployment.Parameter-Efficient Fine-Tuning(PEFT)techniques,such as Low-Rank Adaptation(LoRA),and parameter quantization methods have emerged as solutions to address these challenges by optimizing memory usage and computational efficiency.Among these,QLoRA,which combines PEFT and quantization,has demonstrated notable success in reducing memory footprints during fine-tuning,prompting the development of various QLoRA variants.Despite these advancements,the quantitative impact of key variables on the fine-tuning performance of quantized LLMs remains underexplored.This study presents a comprehensive analysis of these key variables,focusing on their influence across different layer types and depths within LLM architectures.Our investigation uncovers several critical findings:(1)Larger layers,such as MLP layers,can maintain performance despite reductions in adapter rank,while smaller layers,like self-attention layers,aremore sensitive to such changes;(2)The effectiveness of balancing factors depends more on specific values rather than layer type or depth;(3)In quantization-aware fine-tuning,larger layers can effectively utilize smaller adapters,whereas smaller layers struggle to do so.These insights suggest that layer type is a more significant determinant of fine-tuning success than layer depth when optimizing quantized LLMs.Moreover,for the same discount of trainable parameters,reducing the trainable parameters in a larger layer is more effective in preserving fine-tuning accuracy than in a smaller one.This study provides valuable guidance for more efficient fine-tuning strategies and opens avenues for further research into optimizing LLM fine-tuning in resource-constrained environments.