There is increasing public concern about biological interactions with, and the potential health effects of, low-frequency electric and magnetic fields. Recently, the International Commission on Non-Ionizing Radiation Protection (ICNIRP) published new exposure guidelines for these fields. The aim of this paper is to demonstrate the calculation of the currents and electric fields induced in the human body by external electric fields at 60 Hz, using anatomically realistic numerical human models, and to compare those results with the basic restrictions proposed in the new guidelines. When a human is exposed to an electric field of 1 kV/m at 60 Hz, a short-circuit current of 18 μA flows through the ankles. Furthermore, exposure to an external electric field at the reference level induces an electric field of 40 mV/m in the nervous tissue of the adult model, which is well below the basic restrictions established in the ICNIRP guidelines for occupational exposure.
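The figures quoted in this abstract lend themselves to a quick arithmetic check. The sketch below assumes that, in the quasi-static regime at 60 Hz, the induced current scales linearly with the external field strength; that linearity, the function names, and the restriction value used in the usage note are illustrative assumptions and are not taken from the paper or from the ICNIRP guidelines.

```python
# Value reported in the abstract: 18 uA through the ankles at 1 kV/m, 60 Hz.
SHORT_CIRCUIT_UA_PER_KV_M = 18.0


def short_circuit_current_ua(field_kv_per_m: float) -> float:
    """Scale the reported current to another field strength.

    Assumes linear scaling with the external field (an assumption of this
    sketch, not a result stated in the abstract).
    """
    if field_kv_per_m < 0:
        raise ValueError("field strength must be non-negative")
    return SHORT_CIRCUIT_UA_PER_KV_M * field_kv_per_m


def safety_margin(induced_mv_per_m: float, restriction_mv_per_m: float) -> float:
    """Return how many times smaller the induced field is than a restriction.

    The restriction is supplied by the caller; no guideline value is
    hard-coded here.
    """
    if induced_mv_per_m <= 0 or restriction_mv_per_m <= 0:
        raise ValueError("field values must be positive")
    return restriction_mv_per_m / induced_mv_per_m
```

For example, `short_circuit_current_ua(5.0)` gives 90.0 μA under the linearity assumption, and `safety_margin(40.0, 100.0)` gives 2.5 for a hypothetical restriction of 100 mV/m; the actual basic restriction should be read from the guidelines themselves.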
Objective To evaluate the predictive validity of IRIS™ (Intuitive Surgical®, Sunnyvale, CA, USA) as a planning tool for robot-assisted partial nephrectomy (RAPN) by assessing the degree of overlap with intraoperative execution. Methods Thirty-one patients scheduled for RAPN by four experienced urologists were enrolled in a prospective study. Prior to surgery, urologists reviewed the IRIS™ three-dimensional model on an iOS app and completed a questionnaire outlining their surgical plan, including surgical approach and ischemia technique, as well as their confidence in executing this plan. Postoperatively, questionnaires assessing the procedural approach, clinical utility, efficiency, and effectiveness of IRIS™ were completed. The degree of overlap between the preoperative and intraoperative questionnaires, and between the planned approach and actual execution of the procedure, was analyzed. Questionnaires were answered on a 5-point Likert scale, and scores of 4 or greater were considered positive. Results Mean age was 65.1 years, with a mean tumor size of 27.7 mm (interquartile range 17.5-44.0 mm). Hilar tumors accounted for 32.3% of cases; 48.4% of patients had R.E.N.A.L. nephrometry scores of 7-9. On preoperative questionnaires, the surgeons reported in 67.7% of cases that they were confident they could perform the procedure successfully, and on intraoperative questionnaires, they reported in 96.8% of cases that IRIS™ helped achieve good spatial sensation of the anatomy. There was a high degree of overlap between preoperative and intraoperative questionnaires for the surgical approach, interpretation of anatomical details, and clinical utility. When comparing plans for selective or off-clamp approaches, the preoperative plan was executed intraoperatively in 90.0% of cases. Conclusion A high degree of overlap between the preoperative surgical approach and intraoperative RAPN execution was found using IRIS™. This is the first study to evaluate the predictive accuracy of IRIS™ during RAPN by comparing the preoperative plan with intraoperative execution.
Gastrointestinal (GI) cancers represent a major global health concern due to their high incidence and mortality rates. Foundation models (FMs), also referred to as large models, represent a novel class of artificial intelligence technologies that have demonstrated considerable potential in addressing these challenges. These models encompass large language models (LLMs), vision FMs (VFMs), and multimodal LLMs (MLLMs), all of which utilize transformer architectures and self-supervised pre-training on extensive unlabeled datasets to achieve robust cross-domain generalization. This review delineates the principal applications of these models: LLMs facilitate the structuring of clinical narratives, extraction of insights from medical records, and enhancement of physician-patient communication; VFMs are employed in the analysis of endoscopic, radiological, and pathological images for lesion detection and staging; MLLMs integrate heterogeneous data modalities, including imaging, textual information, and genomic data, to support diagnostic processes, treatment prediction, and prognostic evaluation. Despite these promising developments, several challenges remain, such as the need for data standardization, limited diversity within training datasets, substantial computational resource requirements, and ethical-legal concerns. In conclusion, FMs exhibit significant potential to advance research and clinical management of GI cancers. Future research efforts should prioritize the refinement of these models, promote international collaborations, and adopt interdisciplinary approaches. Such a comprehensive strategy is essential to fully harness the capabilities of FMs, driving substantial progress in the fight against GI malignancies.
Myasthenia gravis is a chronic autoimmune disorder that affects the neuromuscular junction, leading to fluctuating skeletal muscle fatigability. The majority of myasthenia gravis patients have detectable antibodies in their serum that target the acetylcholine receptor, muscle-specific kinase, or related proteins. Current treatment for myasthenia gravis involves symptomatic therapy; immunosuppressive drugs such as corticosteroids, azathioprine, and mycophenolate mofetil; and thymectomy, which is primarily indicated in patients with thymoma or thymic hyperplasia. However, this condition continues to pose significant challenges, including unpredictable and variable disease progression, differing responses to individual therapies, and substantial long-term side effects associated with standard treatments (including an increased risk of infections, osteoporosis, and diabetes), underscoring the need for a more personalized approach to treatment. Furthermore, about fifteen percent of patients, termed "refractory myasthenia gravis patients", do not respond adequately to standard therapies. In this context, the introduction of molecular therapies has marked a significant advance in myasthenia gravis management. Advances in understanding myasthenia gravis pathogenesis, especially the role of pathogenic antibodies, have driven the development of these biological drugs, which offer more selective, rapid, and safer alternatives to traditional immunosuppressants. This review aims to provide a comprehensive overview of emerging therapeutic strategies targeting specific immune pathways in myasthenia gravis, with a particular focus on preclinical evidence, therapeutic rationale, and clinical translation of B-cell depletion therapies, neonatal Fc receptor inhibitors, and complement inhibitors.
The application of visual-language large models in the field of medical health has gradually become a research focus. These models combine image understanding with natural language processing and can simultaneously process multi-modality data such as medical images and medical reports. They can not only recognize images but also understand the semantic relationship between images and text, effectively integrate medical information, and provide strong support for clinical decision-making and disease diagnosis. Visual-language large models perform well on specific medical tasks and also show strong potential and high intelligence in general task models. This paper provides a comprehensive review of visual-language large models in the field of medical health. Specifically, it first introduces the theoretical basis and technical principles. It then introduces specific application scenarios in the field of medical health, including modality fusion, semi-supervised learning, weakly supervised learning, unsupervised learning, cross-domain models, and general models. Finally, challenges including insufficient data, interpretability, and practical deployment are discussed. In light of these challenges, four potential future development directions are given.
In the era of AI, especially large models, the importance of open source has become increasingly prominent. First, open source allows innovation to avoid starting from scratch. Through iterative innovation, it promotes technical exchange and learning globally. Second, the resources required for large model R&D are difficult for a single institution to obtain. The evaluation of general large models also requires the participation of experts from various industries. Third, without open source collaboration, it is difficult to form a unified upper-layer software ecosystem. Therefore, open source has become an important cooperation mechanism for promoting the development of AI and large models. Two cases illustrate how open source and international standards interact with each other.
Lung cancer has one of the highest rates of incidence and mortality worldwide, making research on its mechanisms and treatments crucial. Animal models are essential in lung cancer research as they accurately replicate the biological characteristics and treatment outcomes seen in human diseases. Currently, various lung cancer models have been established, including chemical induction models, orthotopic transplantation models, ectopic transplantation models, metastasis models, and gene editing mouse models. Additionally, lung cancer grafts can be categorized into two types: tissue-based and cell-based grafts. This paper summarizes the phenotypes, advantages, and disadvantages of various induction methods based on their modeling techniques. The goal is to enhance the simulation of clinical lung cancer characteristics and to establish a solid foundation for future clinical research.
Climate model prediction has been improved by enhancing model resolution as well as by the implementation of sophisticated physical parameterization and the refinement of data assimilation systems [section 6.1 in Wang et al. (2025)]. In relation to seasonal forecasting and climate projection in the East Asian summer monsoon season, proper simulation of the seasonal migration of rain bands by models is a challenging and limiting factor [section 7.1 in Wang et al. (2025)].
Large language models (LLMs) have emerged as transformative tools in radiology artificial intelligence (AI), offering significant capabilities in areas such as image report generation, clinical decision support, and workflow optimization. The first part of this manuscript presents a comprehensive overview of the current state of LLM applications in radiology, including their historical evolution, technical foundations, and practical uses. Despite notable advances, inherent architectural constraints, such as token-level sequential processing, limit their ability to perform deep abstract reasoning and holistic contextual understanding, which are critical for fine-grained diagnostic interpretation. We provide a critical perspective on current LLMs and discuss key challenges, including model reliability, bias, and explainability, highlighting the pressing need for novel approaches to advance radiology AI. Large concept models (LCMs) represent a nascent and promising paradigm in radiology AI, designed to transcend the limitations of token-level processing by utilizing higher-order conceptual representations and multimodal data integration. The second part of this manuscript introduces the foundational principles and theoretical framework of LCMs, highlighting their potential to facilitate enhanced semantic reasoning, long-range context synthesis, and improved clinical decision-making. Critically, the core of this section is the proposal of a novel theoretical framework for LCMs, formalized and extended from our group's foundational concept-based models, which were the world's earliest articulation of this paradigm for medical AI. This conceptual shift has since been externally validated and propelled by the recent publication of the LCM architectural proposal by Meta AI, providing a large-scale engineering blueprint for the future development of this technology. We also outline future research directions and the transformative implications of this emerging AI paradigm for radiologic practice, aiming to provide a blueprint for advancing toward human-like conceptual understanding in AI. While challenges persist, we are at the very beginning of a new era, and it is not unreasonable to hope that future advancements will overcome these hurdles, pushing the boundaries of AI in radiology far beyond even the most state-of-the-art models of today.
Large models, such as large language models (LLMs), vision-language models (VLMs), and multimodal agents, have become key elements in artificial intelligence (AI) systems. Their rapid development has greatly improved perception, generation, and decision-making in various fields. However, their vast scale and complexity bring about new security challenges. Issues such as backdoor vulnerabilities during training, jailbreaking in multimodal reasoning, and data provenance and copyright auditing have made security a critical focus for both academia and industry.
BACKGROUND Non-erosive reflux disease (NERD), the main gastroesophageal reflux subtype, features reflux symptoms without mucosal damage. Anxiety is linked to visceral hypersensitivity in NERD, yet the mechanisms and animal models remain unclear. AIM To establish a translational NERD rat model with anxiety comorbidity via tail clamping and to study corticotropin-releasing hormone (CRH)-mediated neuroimmune pathways in visceral hypersensitivity and esophageal injury. METHODS Sprague-Dawley (SD) and Wistar rats were grouped into sham, model, and modified groups (n = 10 each). The treatments for the modified groups were as follows: SD rats received ovalbumin/aluminum hydroxide suspension + acid perfusion ± tail clamping (40 minutes/day for 7 days), while Wistar rats received fructose water + tail clamping. Esophageal pathology, visceral sensitivity, and behavior were assessed. Serum CRH, calcitonin gene-related peptide (CGRP), 5-hydroxytryptamine (5-HT), and mast cell tryptase (MCT), as well as central amygdala (CeA) CRH mRNA, were measured via ELISA and qRT-PCR. RESULTS Tail clamping induced anxiety, worsening visceral hypersensitivity (lower abdominal withdrawal reflex thresholds, P < 0.05) and esophageal injury (dilated intercellular spaces and mitochondrial edema). Both models showed raised serum CRH, CGRP, 5-HT, and MCT (P < 0.01) and CeA CRH mRNA expression (P < 0.01). Behavioral tests confirmed anxiety-like phenotypes. NERD-anxiety rats showed clinical-like symptom severity without erosion. CONCLUSION Tail clamping induces anxiety in NERD models, worsening visceral hypersensitivity via CRH neuroimmune dysregulation, offering a translational model and highlighting CRH as a treatment target.
Noninvasive brain stimulation techniques offer promising therapeutic and regenerative prospects in neurological diseases by modulating brain activity and improving cognitive and motor functions. Given the paucity of knowledge about the underlying modes of action and optimal treatment modalities, a thorough translational investigation of noninvasive brain stimulation in preclinical animal models is urgently needed. Thus, we reviewed the current literature on the mechanistic underpinnings of noninvasive brain stimulation in models of central nervous system impairment, with a particular emphasis on traumatic brain injury and stroke. Due to the lack of translational models for most of the noninvasive brain stimulation techniques proposed, we confined this review to the most relevant techniques used in humans, i.e., transcranial magnetic stimulation and transcranial direct current stimulation. We searched the literature in PubMed, encompassing the MEDLINE and PMC databases, for studies published between January 1, 2020 and September 30, 2024. Thirty-five studies were eligible. Transcranial magnetic stimulation and transcranial direct current stimulation demonstrated distinct strengths in augmenting rehabilitation after stroke and traumatic brain injury, with emerging mechanistic evidence. Overall, we identified neuronal, inflammatory, microvascular, and apoptotic pathways highlighted in the literature. This review also highlights a lack of translational surrogate parameters to bridge the gap between preclinical findings and their clinical translation.
The brain is the most complex human organ, and commonly used models, such as two-dimensional cell cultures and animal brains, often lack the sophistication needed to model it accurately in research. In this context, human cerebral organoids have emerged as valuable tools offering a more complex, versatile, and human-relevant system than traditional animal models, which are often unable to replicate the intricate architecture and functionality of the human brain. Since human cerebral organoids are a state-of-the-art model for the study of neurodevelopment and different pathologies affecting the brain, this field is under constant development, and work in this area is abundant. In this review, we give a complete overview of human cerebral organoid technology, starting from the different types of protocols that exist to generate different human cerebral organoids. We continue with the use of brain organoids for the study of brain pathologies, highlighting neurodevelopmental, psychiatric, neurodegenerative, brain tumor, and infectious diseases. Because of the potential value of human cerebral organoids, we describe their use in transplantation, drug screening, and toxicology assays. We also discuss the technologies available to study cell diversity and physiological characteristics of organoids. Finally, we summarize the limitations that currently exist in the field, such as the development of vasculature and microglia, and highlight some of the novel approaches being pursued through bioengineering.
Background Cotton is one of the most important commercial crops after food crops, especially in countries like India, where it is grown extensively under rainfed conditions. Because of its use in multiple industries, such as the textile, medicine, and automobile industries, it has great commercial importance. The crop's performance is strongly influenced by prevailing weather dynamics. As the climate changes, assessing how weather changes affect crop performance is essential. Among the various techniques available, crop models are the most effective and widely used tools for predicting yields. Results This study compares statistical and machine learning models to assess their ability to predict cotton yield across the major producing districts of Karnataka, India, using a long-term dataset spanning 1990 to 2023 that includes yield and weather factors. The artificial neural networks (ANNs) performed best, with acceptable yield deviations within ±10% during both the vegetative stage (F1) and the mid stage (F2) of cotton. The model evaluation metrics, namely root mean square error (RMSE), normalized root mean square error (nRMSE), and modelling efficiency (EF), were also within acceptance limits in most districts. Furthermore, the tested ANN model was used to assess the importance of the dominant weather factors influencing crop yield in each district. Specifically, morning relative humidity as an individual parameter, and its interaction with maximum and minimum temperature, had a major influence on cotton yield in most of the districts for which yield was predicted. These differences highlighted the differential interactions of weather factors in cotton yield formation in each district, and the individual response to each weather factor under different soil and management conditions across the major cotton growing districts of Karnataka. Conclusions Compared with statistical models, machine learning models such as ANNs proved more efficient in forecasting cotton yield owing to their ability to consider the interactive effects of weather factors on yield formation at different growth stages. This highlights the suitability of ANNs for yield forecasting in rainfed conditions and for studying the relative impacts of weather factors on yield. Thus, the study aims to provide valuable insights to support stakeholders in planning effective crop management strategies and formulating relevant policies.
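The evaluation metrics named in this abstract can be sketched in a few lines. The abstract does not give formulas, so the code below uses the commonly cited definitions (nRMSE as RMSE divided by the observed mean, in percent, and EF as one minus the ratio of squared errors to the variance of the observations around their mean); whether the study uses exactly these forms is an assumption, and all function names are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted yields."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse_percent(observed, predicted):
    """RMSE normalized by the mean observed yield, in percent (assumed form)."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_obs


def modelling_efficiency(observed, predicted):
    """EF = 1 - sum((p - o)^2) / sum((o - mean(o))^2) (assumed form)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    error_ss = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    total_ss = sum((o - mean_obs) ** 2 for o in observed)
    if total_ss == 0:
        raise ValueError("observed values must not all be identical")
    return 1.0 - error_ss / total_ss


def percent_deviation(observed, predicted):
    """Per-sample deviation of predicted from observed yield, in percent."""
    return [100.0 * (p - o) / o for o, p in zip(observed, predicted)]
```

With made-up yields `observed = [10, 20, 30]` and `predicted = [12, 18, 33]`, the per-sample deviations are 20%, -10%, and 10%, so only the last two would fall within a ±10% acceptance band.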
To the Editor: Laparoscopic liver resection (LLR) is widely used as a standard procedure for liver malignancies and benign diseases. Consensus guidelines stated that LLR may be feasible and safe in experienced centers. Evidence has shown that LLR is less invasive and has better patient prognosis than conventional procedures [1]. However, laparoscopic anatomic liver resection (LALR) such as segment 8 (S8) resection is still challenging due to difficulties in segmental mapping and surgical techniques [2,3]. Liver S8 is in a deep-seated area surrounded by the ribs and the diaphragm, and closely connected to the right and middle hepatic veins and inferior vena cava. Furthermore, the Glissonean pedicle of segment 8 (G8) is located deep in the liver parenchyma, lacking anatomical landmarks and making forceps manipulation difficult. Therefore, LALR-S8 has been described as the most challenging procedure [4].
Extracting data from visually rich documents and charts using traditional methods that rely on OCR-based parsing poses multiple challenges, including layout complexity in unstructured formats, limitations in recognizing visual elements, and the correlation between different parts of the documents, as well as domain-specific semantics. Simply extracting text is not sufficient; advanced reasoning capabilities are proving to be essential to analyze content and answer questions accurately. This paper aims to evaluate the ability of large language models (LLMs) to correctly answer questions about various types of charts, comparing their performance when using images as input versus directly parsing PDF files. To retrieve the images from the PDF, ColPali, a model leveraging state-of-the-art visual language models, is used to identify the relevant page containing the appropriate chart for each question. Google's Gemini multimodal models were used to answer a set of questions through two approaches: 1) processing images derived from PDF documents and 2) directly utilizing the content of the same PDFs. Our findings underscore the limitations of traditional OCR-based approaches in visually rich document understanding (VrDU) and demonstrate the advantages of multimodal methods in both data extraction and reasoning tasks. Through structured benchmarking of chart question answering (CQA) across input formats, our work contributes to the advancement of chart understanding (CU) and the broader field of multimodal document analysis. Using two diverse and information-rich sources, the World Health Statistics 2024 report by the World Health Organisation and the Global Banking Annual Review 2024 by McKinsey & Company, we examine the performance of multimodal LLMs across different input modalities, comparing their effectiveness in processing charts as images versus parsing directly from PDF content. These documents were selected due to their multimodal nature, combining dense textual analysis with varied visual representations, thus presenting realistic challenges for vision-language models. This comparison is aimed at assessing how advanced models perform with different input formats and at determining whether an image-based approach enhances chart comprehension in terms of accurate data extraction and reasoning capabilities.
The application of generative artificial intelligence (AI) is bringing about notable changes in anime creation. This paper surveys recent advancements and applications of diffusion and language models in anime generation, focusing on their demonstrated potential to enhance production efficiency through automation and personalization. Despite these benefits, it is crucial to acknowledge the substantial initial computational investments required for training and deploying these models. We conduct an in-depth survey of cutting-edge generative AI technologies, encompassing models such as Stable Diffusion and GPT, and appraise pivotal large-scale datasets alongside quantifiable evaluation metrics. Review of the surveyed literature indicates the achievement of considerable maturity in the capacity of AI models to synthesize high-quality, aesthetically compelling anime visual images from textual prompts, alongside discernible progress in the generation of coherent narratives. However, achieving perfect long-form consistency, mitigating artifacts like flickering in video sequences, and enabling fine-grained artistic control remain critical ongoing challenges. Building upon these advancements, research efforts have increasingly pivoted towards the synthesis of higher-dimensional content, such as video and three-dimensional assets, with recent studies demonstrating significant progress in this burgeoning field. Nevertheless, formidable challenges endure amidst these advancements. Foremost among these are the substantial computational exigencies requisite for training and deploying these sophisticated models, particularly pronounced in the realm of high-dimensional generation such as video synthesis. Additional persistent hurdles include maintaining spatial-temporal consistency across complex scenes and mitigating ethical considerations surrounding bias and the preservation of human creative autonomy. This research underscores the transformative potential and inherent complexities of AI-driven synergy within the creative industries. We posit that future research should be dedicated to the synergistic fusion of diffusion and autoregressive models, the integration of multimodal inputs, and the balanced consideration of ethical implications, particularly regarding bias and the preservation of human creative autonomy, thereby establishing a robust foundation for the advancement of anime creation and the broader landscape of AI-driven content generation.
The unprecedented scale of large models, such as large language models (LLMs) and text-to-image diffusion models, has raised critical concerns about the unauthorized use of copyrighted data during model training. These concerns have spurred a growing demand for dataset copyright auditing techniques, which aim to detect and verify potential infringements in the training data of commercial AI systems. This paper presents a survey of existing auditing solutions, categorizing them across key dimensions: data modality, model training stage, data overlap scenarios, and model access levels. We highlight major trends, including the prevalence of black-box auditing methods and the emphasis on fine-tuning rather than pre-training. Through an in-depth analysis of 12 representative works, we extract four key observations that reveal the limitations of current methods. Furthermore, we identify three open challenges and propose future directions for robust, multimodal, and scalable auditing solutions. Our findings underscore the urgent need to establish standardized benchmarks and develop auditing frameworks that are resilient to low watermark densities and applicable in diverse deployment settings.
In recent years, large vision-language models (VLMs) have achieved significant breakthroughs in cross-modal understanding and generation. However, the safety issues arising from their multimodal interactions have become prominent. VLMs are vulnerable to jailbreak attacks, in which attackers craft carefully designed prompts to bypass safety mechanisms, leading the models to generate harmful content. To address this, we investigate the alignment between visual inputs and task execution, uncovering locality defects and attention biases in VLMs. Based on these findings, we propose VOTI, a novel jailbreak framework leveraging visual obfuscation and task induction. VOTI subtly embeds malicious keywords within neutral image layouts to evade detection, and breaks down harmful queries into a sequence of subtasks. This approach disperses malicious intent across modalities, exploiting VLMs' over-reliance on local visual cues and their fragility in multi-step reasoning to bypass global safety mechanisms. Implemented as an automated framework, VOTI integrates large language models as red-team assistants to generate and iteratively optimize jailbreak strategies. Extensive experiments across seven mainstream VLMs demonstrate VOTI's effectiveness, achieving a 73.46% attack success rate on GPT-4o-mini. These results reveal critical vulnerabilities in VLMs, highlighting the urgent need for more robust defenses and better multimodal alignment.
Sentiment analysis,a cornerstone of natural language processing,has witnessed remarkable advancements driven by deep learning models which demonstrated impressive accuracy in discerning sentiment from text across vari...Sentiment analysis,a cornerstone of natural language processing,has witnessed remarkable advancements driven by deep learning models which demonstrated impressive accuracy in discerning sentiment from text across various domains.However,the deployment of such models in resource-constrained environments presents a unique set of challenges that require innovative solutions.Resource-constrained environments encompass scenarios where computing resources,memory,and energy availability are restricted.To empower sentiment analysis in resource-constrained environments,we address the crucial need by leveraging lightweight pre-trained models.These models,derived from popular architectures such as DistilBERT,MobileBERT,ALBERT,TinyBERT,ELECTRA,and SqueezeBERT,offer a promising solution to the resource limitations imposed by these environments.By distilling the knowledge from larger models into smaller ones and employing various optimization techniques,these lightweight models aim to strike a balance between performance and resource efficiency.This paper endeavors to explore the performance of multiple lightweight pre-trained models in sentiment analysis tasks specific to such environments and provide insights into their viability for practical deployment.展开更多
Abstract: There is increasing public concern about biological interactions with, and the potential health effects of, low-frequency electric and magnetic fields. Recently, the ICNIRP (International Commission on Non-Ionizing Radiation Protection) published new exposure guidelines for these fields. The aim of this paper is to demonstrate the calculation of the currents and electric fields induced in the human body by external electric fields at 60 Hz, using anatomically realistic numerical human models, and to compare those results with the basic restrictions proposed by the new guidelines. The results show that when a human is exposed to an electric field of 1 kV/m at 60 Hz, a short-circuit current of 18 μA flows through the ankles. Furthermore, exposure to external electric fields at the reference level induces an electric field of 40 mV/m in the nervous tissue of the adult model, which is well below the basic restrictions established in the ICNIRP guidelines for occupational exposure.
Abstract: Objective: To evaluate the predictive validity of IRIS™ (Intuitive Surgical®, Sunnyvale, CA, USA) as a planning tool for robot-assisted partial nephrectomy (RAPN) by assessing the degree of overlap with intraoperative execution. Methods: Thirty-one patients scheduled for RAPN by four experienced urologists were enrolled in a prospective study. Prior to surgery, urologists reviewed the IRIS™ three-dimensional model on an iPhone Operating System (iOS) app and completed a questionnaire outlining their surgical plan, including surgical approach and ischemia technique, as well as their confidence in executing this plan. Postoperatively, questionnaires assessing the procedural approach, clinical utility, efficiency, and effectiveness of IRIS™ were completed. The degree of overlap between the preoperative and intraoperative questionnaires and between the planned approach and actual execution of the procedure was analyzed. Questionnaires were answered on a 5-point Likert scale, and scores of 4 or greater were considered positive. Results: Mean age was 65.1 years, with a mean tumor size of 27.7 mm (interquartile range 17.5-44.0 mm). Hilar tumors accounted for 32.3% of cases; 48.4% of patients had R.E.N.A.L. nephrometry scores of 7-9. On preoperative questionnaires, the surgeons reported that in 67.7% of cases they were confident that they could perform the procedure successfully, and on intraoperative questionnaires, the surgeons reported that in 96.8% of cases IRIS™ helped achieve good spatial sensation of the anatomy. There was a high degree of overlap between preoperative and intraoperative questionnaires for the surgical approach, interpreting anatomical details, and clinical utility. When comparing plans for selective or off-clamp, the preoperative plan was executed in 90.0% of cases intraoperatively. Conclusion: A high degree of overlap between the preoperative surgical approach and intraoperative RAPN execution was found using IRIS™. This is the first study to evaluate the predictive accuracy of IRIS™ during RAPN by comparing preoperative plan and intraoperative execution.
Funding: Supported by the Open Project Program of Panxi Crops Research and Utilization Key Laboratory of Sichuan Province, No. SZKF202302; and the Fundamental Research Funds for the Central Universities, No. 2019CDYGYB024.
Abstract: Gastrointestinal (GI) cancers represent a major global health concern due to their high incidence and mortality rates. Foundation models (FMs), also referred to as large models, represent a novel class of artificial intelligence technologies that have demonstrated considerable potential in addressing these challenges. These models encompass large language models (LLMs), vision FMs (VFMs), and multimodal LLMs (MLLMs), all of which utilize transformer architectures and self-supervised pre-training on extensive unlabeled datasets to achieve robust cross-domain generalization. This review delineates the principal applications of these models: LLMs facilitate the structuring of clinical narratives, extraction of insights from medical records, and enhancement of physician-patient communication; VFMs are employed in the analysis of endoscopic, radiological, and pathological images for lesion detection and staging; MLLMs integrate heterogeneous data modalities, including imaging, textual information, and genomic data, to support diagnostic processes, treatment prediction, and prognostic evaluation. Despite these promising developments, several challenges remain, such as the need for data standardization, limited diversity within training datasets, substantial computational resource requirements, and ethical-legal concerns. In conclusion, FMs exhibit significant potential to advance research and clinical management of GI cancers. Future research efforts should prioritize the refinement of these models, promote international collaborations, and adopt interdisciplinary approaches. Such a comprehensive strategy is essential to fully harness the capabilities of FMs, driving substantial progress in the fight against GI malignancies.
Abstract: Myasthenia gravis is a chronic autoimmune disorder that affects the neuromuscular junction, leading to fluctuating skeletal muscle fatigability. The majority of myasthenia gravis patients have detectable antibodies in their serum, targeting acetylcholine receptor, muscle-specific kinase, or related proteins. Current treatment for myasthenia gravis involves symptomatic therapy, immunosuppressive drugs such as corticosteroids, azathioprine, and mycophenolate mofetil, and thymectomy, which is primarily indicated in patients with thymoma or thymic hyperplasia. However, this condition continues to pose significant challenges, including an unpredictable and variable disease progression, differing response to individual therapies, and substantial long-term side effects associated with standard treatments (including an increased risk of infections, osteoporosis, and diabetes), underscoring the necessity for a more personalized approach to treatment. Furthermore, about fifteen percent of patients, called "refractory myasthenia gravis patients", do not respond adequately to standard therapies. In this context, the introduction of molecular therapies has marked a significant advance in myasthenia gravis management. Advances in understanding myasthenia gravis pathogenesis, especially the role of pathogenic antibodies, have driven the development of these biological drugs, which offer more selective, rapid, and safer alternatives to traditional immunosuppressants. This review aims to provide a comprehensive overview of emerging therapeutic strategies targeting specific immune pathways in myasthenia gravis, with a particular focus on preclinical evidence, therapeutic rationale, and clinical translation of B-cell depletion therapies, neonatal Fc receptor inhibitors, and complement inhibitors.
Funding: The Natural Science Foundation of Hebei Province (F2024501044).
Abstract: The application of visual-language large models in the field of medical health has gradually become a research focus. These models combine image understanding with natural language processing and can simultaneously process multi-modality data such as medical images and medical reports. They can not only recognize images but also understand the semantic relationship between images and texts, effectively integrate medical information, and provide strong support for clinical decision-making and disease diagnosis. The visual-language large model performs well on specific medical tasks and also shows strong potential and high intelligence in general-task models. This paper provides a comprehensive review of the visual-language large model in the field of medical health. Specifically, this paper first introduces the theoretical basis and technical principles. Then, it introduces the specific application scenarios in the field of medical health, including modality fusion, semi-supervised learning, weakly supervised learning, unsupervised learning, cross-domain models, and general models. Finally, the challenges, including insufficient data, interpretability, and practical deployment, are discussed. In light of these challenges, four potential future development directions are given.
Abstract: In the era of AI, especially large models, the importance of open source has become increasingly prominent. First, open source allows innovation to avoid starting from scratch. Through iterative innovation, it promotes technical exchanges and learning globally. Second, the resources required for large model R&D are difficult for a single institution to obtain. The evaluation of general large models also requires the participation of experts from various industries. Third, without open source collaboration, it is difficult to form a unified upper-layer software ecosystem. Therefore, open source has become an important cooperation mechanism to promote the development of AI and large models. Two cases illustrate how open source and international standards interact with each other.
Funding: Sichuan Provincial Administration of Traditional Chinese Medicine, Grant/Award Number: 2023MS564; National Natural Science Foundation of China, Grant/Award Number: 82474436.
Abstract: Lung cancer has one of the highest rates of incidence and mortality worldwide, making research on its mechanisms and treatments crucial. Animal models are essential in lung cancer research as they accurately replicate the biological characteristics and treatment outcomes seen in human diseases. Currently, various lung cancer models have been established, including chemical induction models, orthotopic transplantation models, ectopic transplantation models, metastasis models, and gene editing mouse models. Additionally, lung cancer grafts can be categorized into two types: tissue-based and cell-based grafts. This paper summarizes the phenotypes, advantages, and disadvantages of various induction methods based on their modeling techniques. The goal is to enhance the simulation of clinical lung cancer characteristics and to establish a solid foundation for future clinical research.
Abstract: Climate model prediction has been improved by enhancing model resolution as well as by implementing sophisticated physical parameterization and refining data assimilation systems [section 6.1 in Wang et al. (2025)]. In relation to seasonal forecasting and climate projection in the East Asian summer monsoon season, proper simulation of the seasonal migration of rain bands by models is a challenging and limiting factor [section 7.1 in Wang et al. (2025)].
Abstract: Large language models (LLMs) have emerged as transformative tools in radiology artificial intelligence (AI), offering significant capabilities in areas such as image report generation, clinical decision support, and workflow optimization. The first part of this manuscript presents a comprehensive overview of the current state of LLM applications in radiology, including their historical evolution, technical foundations, and practical uses. Despite notable advances, inherent architectural constraints, such as token-level sequential processing, limit their ability to perform deep abstract reasoning and holistic contextual understanding, which are critical for fine-grained diagnostic interpretation. We provide a critical perspective on current LLMs and discuss key challenges, including model reliability, bias, and explainability, highlighting the pressing need for novel approaches to advance radiology AI. Large concept models (LCMs) represent a nascent and promising paradigm in radiology AI, designed to transcend the limitations of token-level processing by utilizing higher-order conceptual representations and multimodal data integration. The second part of this manuscript introduces the foundational principles and theoretical framework of LCMs, highlighting their potential to facilitate enhanced semantic reasoning, long-range context synthesis, and improved clinical decision-making. Critically, the core of this section is the proposal of a novel theoretical framework for LCMs, formalized and extended from our group's foundational concept-based models, the world's earliest articulation of this paradigm for medical AI. This conceptual shift has since been externally validated and propelled by the recent publication of the LCM architectural proposal by Meta AI, providing a large-scale engineering blueprint for the future development of this technology. We also outline future research directions and the transformative implications of this emerging AI paradigm for radiologic practice, aiming to provide a blueprint for advancing toward human-like conceptual understanding in AI. While challenges persist, we are at the very beginning of a new era, and it is not unreasonable to hope that future advancements will overcome these hurdles, pushing the boundaries of AI in radiology far beyond even the most state-of-the-art models of today.
Abstract: Large models, such as large language models (LLMs), vision-language models (VLMs), and multimodal agents, have become key elements in artificial intelligence (AI) systems. Their rapid development has greatly improved perception, generation, and decision-making in various fields. However, their vast scale and complexity bring about new security challenges. Issues such as backdoor vulnerabilities during training, jailbreaking in multimodal reasoning, and data provenance and copyright auditing have made security a critical focus for both academia and industry.
Funding: Supported by the National Key Specialty of Traditional Chinese Medicine (Spleen and Stomach Diseases), No. 0500004; National Natural Science Foundation of China, No. 82205104 and No. 82104850; Hospital Capability Enhancement Project of Xiyuan Hospital, CACMS, No. XYZX0303-07; and the Fundamental Research Funds for the Central Public Welfare Research Institutes, Excellent Young Scientists Training Program of China Academy of Chinese Medical Sciences, No. ZZ16-YQ-002.
Abstract: BACKGROUND: Non-erosive reflux disease (NERD), the main gastroesophageal reflux subtype, features reflux symptoms without mucosal damage. Anxiety is linked to visceral hypersensitivity in NERD, yet the mechanisms and animal models are unclear. AIM: To establish a translational NERD rat model with anxiety comorbidity via tail clamping and study corticotropin-releasing hormone (CRH)-mediated neuroimmune pathways in visceral hypersensitivity and esophageal injury. METHODS: Sprague-Dawley (SD) and Wistar rats were grouped into sham, model, and modified groups (n = 10 each). The treatments for the modified groups were as follows: SD rats received ovalbumin/aluminum hydroxide suspension + acid perfusion ± tail clamping (40 minutes/day for 7 days), while Wistar rats received fructose water + tail clamping. Esophageal pathology, visceral sensitivity, and behavior were assessed. Serum CRH, calcitonin gene-related peptide (CGRP), 5-hydroxytryptamine (5-HT), and mast cell tryptase (MCT), and central amygdala (CeA) CRH mRNA, were measured via ELISA and qRT-PCR. RESULTS: Tail clamping induced anxiety, worsening visceral hypersensitivity (lower abdominal withdrawal reflex thresholds, P < 0.05) and esophageal injury (dilated intercellular spaces and mitochondrial edema). Both models showed raised serum CRH, CGRP, 5-HT, and MCT (P < 0.01) and CeA CRH mRNA expression (P < 0.01). Behavioral tests confirmed anxiety-like phenotypes. NERD-anxiety rats showed clinical-like symptom severity without erosion. CONCLUSION: Tail clamping induces anxiety in NERD models, worsening visceral hypersensitivity via CRH neuroimmune dysregulation, offering a translational model and highlighting CRH as a treatment target.
Funding: Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project ID 431549029-SFB 1451; the Marga-und-Walter-Boll-Stiftung (#210-10-15) (to MAR); and a stipend from the 'Gerok Program' (Faculty of Medicine, University of Cologne, Germany).
Abstract: Noninvasive brain stimulation techniques offer promising therapeutic and regenerative prospects in neurological diseases by modulating brain activity and improving cognitive and motor functions. Given the paucity of knowledge about the underlying modes of action and optimal treatment modalities, a thorough translational investigation of noninvasive brain stimulation in preclinical animal models is urgently needed. Thus, we reviewed the current literature on the mechanistic underpinnings of noninvasive brain stimulation in models of central nervous system impairment, with a particular emphasis on traumatic brain injury and stroke. Due to the lack of translational models for most of the noninvasive brain stimulation techniques proposed, we confined this review to the most relevant techniques used in humans, i.e., transcranial magnetic stimulation and transcranial direct current stimulation. We searched the literature in PubMed, encompassing the MEDLINE and PMC databases, for studies published between January 1, 2020 and September 30, 2024. Thirty-five studies were eligible. Transcranial magnetic stimulation and transcranial direct current stimulation demonstrated distinct strengths in augmenting rehabilitation after stroke and traumatic brain injury, with emerging mechanistic evidence. Overall, we identified neuronal, inflammatory, microvascular, and apoptotic pathways highlighted in the literature. This review also highlights a lack of translational surrogate parameters to bridge the gap between preclinical findings and their clinical translation.
Funding: Supported by Grant PID2021-126715OB-IOO financed by MCIN/AEI/10.13039/501100011033 and "ERDF A way of making Europe"; by Grant PI22CIII/00055 funded by Instituto de Salud Carlos III (ISCIII); the UFIECPY 398/19 (PEJ2018-004965) grant to RGS funded by AEI (Spain); the UFIECPY-396/19 (PEJ2018-004961) grant financed by MCIN (Spain); the FI23CIII/00003 grant funded by ISCIII-PFIS (Spain) to PMM; the UFIECPY 328/22 (PEJ-2021-TL/BMD-21001) grant to LM financed by CAM (Spain); and the grant by CAPES (Coordination for the Improvement of Higher Education Personnel) through the PDSE program (Programa de Doutorado Sanduiche no Exterior) to VSCG financed by MEC (Brazil).
Abstract: The brain is the most complex human organ, and commonly used models, such as two-dimensional cell cultures and animal brains, often lack the sophistication needed for accurate use in research. In this context, human cerebral organoids have emerged as valuable tools offering a more complex, versatile, and human-relevant system than traditional animal models, which are often unable to replicate the intricate architecture and functionality of the human brain. Since human cerebral organoids are a state-of-the-art model for the study of neurodevelopment and different pathologies affecting the brain, this field is under constant development, and work in this area is abundant. In this review, we give a complete overview of human cerebral organoid technology, starting from the different types of protocols that exist to generate different human cerebral organoids. We continue with the use of brain organoids for the study of brain pathologies, highlighting neurodevelopmental, psychiatric, neurodegenerative, brain tumor, and infectious diseases. Because of the potential value of human cerebral organoids, we describe their use in transplantation, drug screening, and toxicology assays. We also discuss the technologies available to study cell diversity and physiological characteristics of organoids. Finally, we summarize the limitations that currently exist in the field, such as the development of vasculature and microglia, and highlight some of the novel approaches being pursued through bioengineering.
Funding: Funded through the India Meteorological Department, New Delhi, India, under the Forecasting Agricultural output using Space, Agrometeorology and Land based observations (FASAL) project, fund number No. ASC/FASAL/KT-11/01/HQ-2010.
Abstract: Background: Cotton is one of the most important commercial crops after food crops, especially in countries like India, where it is grown extensively under rainfed conditions. Because of its use in multiple industries, such as the textile, medicine, and automobile industries, it has great commercial importance. The crop's performance is greatly influenced by prevailing weather dynamics. As the climate changes, assessing how weather changes affect crop performance is essential. Among the various techniques available, crop models are the most effective and widely used tools for predicting yields. Results: This study compares statistical and machine learning models to assess their ability to predict cotton yield across the major producing districts of Karnataka, India, using a long-term dataset spanning 1990 to 2023 that includes yield and weather factors. The artificial neural networks (ANNs) performed best, with acceptable yield deviations within ±10% during both the vegetative stage (F1) and mid stage (F2) for cotton. The model evaluation metrics, namely root mean square error (RMSE), normalized root mean square error (nRMSE), and modelling efficiency (EF), were also within the acceptance limits in most districts. Furthermore, the tested ANN model was used to assess the importance of the dominant weather factors influencing crop yield in each district. Specifically, morning relative humidity as an individual parameter and its interaction with maximum and minimum temperature had a major influence on cotton yield in most of the districts for which yield was predicted. These differences highlighted the differential interactions of weather factors in each district for cotton yield formation, and the individual response of each weather factor under different soil and management conditions across the major cotton-growing districts of Karnataka. Conclusions: Compared with statistical models, machine learning models such as ANNs showed higher efficiency in forecasting cotton yield due to their ability to consider the interactive effects of weather factors on yield formation at different growth stages. This highlights the suitability of ANNs for yield forecasting in rainfed conditions and for studying the relative impacts of weather factors on yield. Thus, the study aims to provide valuable insights to support stakeholders in planning effective crop management strategies and formulating relevant policies.
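The evaluation metrics named in the cotton-yield abstract above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the common textbook definitions of RMSE, nRMSE (RMSE as a percentage of the mean observed yield), and modelling efficiency (EF, in its Nash-Sutcliffe form), none of which the abstract spells out. The function names and the yield figures are hypothetical and chosen only for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted yields."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def nrmse_percent(observed, predicted):
    """RMSE normalized by the mean observed yield, in percent (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_obs


def modelling_efficiency(observed, predicted):
    """Modelling efficiency (Nash-Sutcliffe form, assumed): 1 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def percent_deviation(observed, predicted):
    """Per-season deviation of predicted from observed yield, in percent."""
    return [100.0 * (p - o) / o for o, p in zip(observed, predicted)]


# Hypothetical yields (kg/ha) for three seasons; not data from the study.
observed = [400.0, 500.0, 600.0]
predicted = [420.0, 480.0, 630.0]

print(rmse(observed, predicted))
print(nrmse_percent(observed, predicted))
print(modelling_efficiency(observed, predicted))
print(percent_deviation(observed, predicted))
```

With these made-up numbers every seasonal deviation stays inside the ±10% band the abstract treats as acceptable; the acceptance limits for nRMSE and EF are not given in the abstract, so none are encoded here.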
Abstract: To the Editor: Laparoscopic liver resection (LLR) is widely used as a standard procedure for liver malignancies and benign diseases. Consensus guidelines stated that LLR may be feasible and safe in experienced centers. Evidence has shown that LLR is less invasive and has better patient prognosis than conventional procedures [1]. However, laparoscopic anatomic liver resection (LALR), such as segment 8 (S8) resection, is still challenging due to difficulties in segmental mapping and surgical techniques [2,3]. Liver S8 is in a deep-seated area surrounded by the ribs and the diaphragm, and closely connected to the right and middle hepatic veins and inferior vena cava. Furthermore, the Glissonean pedicle of segment 8 (G8) is located deep in the liver parenchyma, lacking anatomical landmarks and making forceps manipulation difficult. Therefore, LALR-S8 has been described as the most challenging procedure [4].
Funding: Supported by a grant from the Ministry of Research, Innovation and Digitization, CNCS/CCCDI-UEFISCDI, project number COFUND-CETP-SMART-LEM-1, within PNCDI IV.
Abstract: Extracting data from visually rich documents and charts using traditional methods that rely on OCR-based parsing poses multiple challenges, including layout complexity in unstructured formats, limitations in recognizing visual elements, and the correlation between different parts of the documents, as well as domain-specific semantics. Simply extracting text is not sufficient; advanced reasoning capabilities are proving to be essential to analyze content and answer questions accurately. This paper aims to evaluate the ability of Large Language Models (LLMs) to correctly answer questions about various types of charts, comparing their performance when using images as input versus directly parsing PDF files. To retrieve the images from the PDF, ColPali, a model leveraging state-of-the-art visual language models, is used to identify the relevant page containing the appropriate chart for each question. Google's Gemini multimodal models were used to answer a set of questions through two approaches: 1) processing images derived from PDF documents and 2) directly utilizing the content of the same PDFs. Our findings underscore the limitations of traditional OCR-based approaches in visually rich document understanding (VrDU) and demonstrate the advantages of multimodal methods in both data extraction and reasoning tasks. Through structured benchmarking of chart question answering (CQA) across input formats, our work contributes to the advancement of chart understanding (CU) and the broader field of multimodal document analysis. Using two diverse and information-rich sources, the World Health Statistics 2024 report by the World Health Organisation and the Global Banking Annual Review 2024 by McKinsey & Company, we examine the performance of multimodal LLMs across different input modalities, comparing their effectiveness in processing charts as images versus parsing directly from PDF content. These documents were selected due to their multimodal nature, combining dense textual analysis with varied visual representations, thus presenting realistic challenges for vision-language models. This comparison is aimed at assessing how advanced models perform with different input formats and at determining whether an image-based approach enhances chart comprehension in terms of accurate data extraction and reasoning capabilities.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62202210).
Abstract: The application of generative artificial intelligence (AI) is bringing about notable changes in anime creation. This paper surveys recent advancements and applications of diffusion and language models in anime generation, focusing on their demonstrated potential to enhance production efficiency through automation and personalization. Despite these benefits, it is crucial to acknowledge the substantial initial computational investments required for training and deploying these models. We conduct an in-depth survey of cutting-edge generative AI technologies, encompassing models such as Stable Diffusion and GPT, and appraise pivotal large-scale datasets alongside quantifiable evaluation metrics. Review of the surveyed literature indicates that AI models have reached considerable maturity in synthesizing high-quality, aesthetically compelling anime images from textual prompts, alongside discernible progress in the generation of coherent narratives. However, achieving perfect long-form consistency, mitigating artifacts like flickering in video sequences, and enabling fine-grained artistic control remain critical ongoing challenges. Building upon these advancements, research efforts have increasingly pivoted towards the synthesis of higher-dimensional content, such as video and three-dimensional assets, with recent studies demonstrating significant progress in this burgeoning field. Nevertheless, formidable challenges endure amidst these advancements. Foremost among these are the substantial computational demands of training and deploying these sophisticated models, which are particularly pronounced in high-dimensional generation such as video synthesis. Additional persistent hurdles include maintaining spatial-temporal consistency across complex scenes and addressing ethical considerations surrounding bias and the preservation of human creative autonomy. This research underscores the transformative potential and inherent complexities of AI-driven synergy within the creative industries. We posit that future research should be dedicated to the synergistic fusion of diffusion and autoregressive models, the integration of multimodal inputs, and the balanced consideration of ethical implications, particularly regarding bias and the preservation of human creative autonomy, thereby establishing a robust foundation for the advancement of anime creation and the broader landscape of AI-driven content generation.
Funding: Supported in part by NSFC under Grant Nos. 62402379, U22A2029, and U24A20237.
Abstract: The unprecedented scale of large models, such as large language models (LLMs) and text-to-image diffusion models, has raised critical concerns about the unauthorized use of copyrighted data during model training. These concerns have spurred a growing demand for dataset copyright auditing techniques, which aim to detect and verify potential infringements in the training data of commercial AI systems. This paper presents a survey of existing auditing solutions, categorizing them across key dimensions: data modality, model training stage, data overlap scenarios, and model access levels. We highlight major trends, including the prevalence of black-box auditing methods and the emphasis on fine-tuning rather than pre-training. Through an in-depth analysis of 12 representative works, we extract four key observations that reveal the limitations of current methods. Furthermore, we identify three open challenges and propose future directions for robust, multimodal, and scalable auditing solutions. Our findings underscore the urgent need to establish standardized benchmarks and develop auditing frameworks that are resilient to low watermark densities and applicable in diverse deployment settings.
Abstract: In recent years, large vision-language models (VLMs) have achieved significant breakthroughs in cross-modal understanding and generation. However, the safety issues arising from their multimodal interactions have become prominent. VLMs are vulnerable to jailbreak attacks, where attackers craft carefully designed prompts to bypass safety mechanisms, leading the models to generate harmful content. To address this, we investigate the alignment between visual inputs and task execution, uncovering locality defects and attention biases in VLMs. Based on these findings, we propose VOTI, a novel jailbreak framework leveraging visual obfuscation and task induction. VOTI subtly embeds malicious keywords within neutral image layouts to evade detection, and breaks down harmful queries into a sequence of subtasks. This approach disperses malicious intent across modalities, exploiting VLMs’ over-reliance on local visual cues and their fragility in multi-step reasoning to bypass global safety mechanisms. Implemented as an automated framework, VOTI integrates large language models as red-team assistants to generate and iteratively optimize jailbreak strategies. Extensive experiments across seven mainstream VLMs demonstrate VOTI’s effectiveness, achieving a 73.46% attack success rate on GPT-4o-mini. These results reveal critical vulnerabilities in VLMs, highlighting the urgent need for more robust defenses and better multimodal alignment.
Abstract: Sentiment analysis, a cornerstone of natural language processing, has witnessed remarkable advancements driven by deep learning models, which have demonstrated impressive accuracy in discerning sentiment from text across various domains. However, the deployment of such models in resource-constrained environments presents a unique set of challenges that require innovative solutions. Resource-constrained environments encompass scenarios where computing resources, memory, and energy availability are restricted. To enable sentiment analysis in such environments, we leverage lightweight pre-trained models. These models, derived from popular architectures such as DistilBERT, MobileBERT, ALBERT, TinyBERT, ELECTRA, and SqueezeBERT, offer a promising solution to the resource limitations imposed by these environments. By distilling the knowledge from larger models into smaller ones and employing various optimization techniques, these lightweight models aim to strike a balance between performance and resource efficiency. This paper explores the performance of multiple lightweight pre-trained models in sentiment analysis tasks specific to such environments and provides insights into their viability for practical deployment.