Chinese abbreviations improve communicative efficiency by extracting key components from longer expressions.They are widely used in both daily communication and professional domains.However,existing abbreviation gener...Chinese abbreviations improve communicative efficiency by extracting key components from longer expressions.They are widely used in both daily communication and professional domains.However,existing abbreviation generation methods still face two major challenges.First,sequence-labeling-based approaches often neglect contextual meaning by making binary decisions at the character level,leading to abbreviations that fail to capture semantic completeness.Second,generation-basedmethods rely heavily on a single decoding process,which frequently produces correct abbreviations but ranks them lower due to inadequate semantic evaluation.To address these limitations,we propose a novel two-stage frameworkwithGeneration–Iterative Optimization forAbbreviation(GIOA).In the first stage,we design aChain-of-Thought prompting strategy and incorporate definitional and situational contexts to generate multiple abbreviation candidates.In the second stage,we introduce a Semantic Preservation Dynamic Adjustment mechanism that alternates between character-level importance estimation and semantic restoration to optimize candidate ranking.Experiments on two public benchmark datasets show that our method outperforms existing state-of-the-art approaches,achieving Hit@1 improvements of 15.15%and 13.01%,respectively,while maintaining consistent results in Hit@3.展开更多
Large language models(LLMs)have revolutionized AI applications across diverse domains.However,their widespread deployment has introduced critical security vulnerabilities,particularly prompt injection attacks that man...Large language models(LLMs)have revolutionized AI applications across diverse domains.However,their widespread deployment has introduced critical security vulnerabilities,particularly prompt injection attacks that manipulate model behavior through malicious instructions.Following Kitchenham’s guidelines,this systematic review synthesizes 128 peer-reviewed studies from 2022 to 2025 to provide a unified understanding of this rapidly evolving threat landscape.Our findings reveal a swift progression from simple direct injections to sophisticated multimodal attacks,achieving over 90%success rates against unprotected systems.In response,defense mechanisms show varying effectiveness:input preprocessing achieves 60%–80%detection rates and advanced architectural defenses demonstrate up to 95%protection against known patterns,though significant gaps persist against novel attack vectors.We identified 37 distinct defense approaches across three categories,but standardized evaluation frameworks remain limited.Our analysis attributes these vulnerabilities to fundamental LLM architectural limitations,such as the inability to distinguish instructions from data and attention mechanism vulnerabilities.This highlights critical research directions such as formal verification methods,standardized evaluation protocols,and architectural innovations for inherently secure LLM designs.展开更多
The energy correlations of prompt fission neutrons have not yet been considered in the related coincidence and multiplication measurement techniques.To measure and verify the energy correlations,an experiment was perf...The energy correlations of prompt fission neutrons have not yet been considered in the related coincidence and multiplication measurement techniques.To measure and verify the energy correlations,an experiment was performed with a total measurement duration of approximately 1200 h.In the experiment,eight CLYC detectors and sixteen EJ309 liquid scintillation detectors were utilized,and the fission moment was tagged with the measured fissionγ-rays.The relative ratios of the energy spectra of the neutrons correlated with different energy neutrons to the^(252)Cf fission neutron energy spectra were obtained.The present results may be helpful for studying fission physics and nuclear technology applications.展开更多
Information extraction(IE)aims to automatically identify and extract information about specific interests from raw texts.Despite the abundance of solutions based on fine-tuning pretrained language models,IE in the con...Information extraction(IE)aims to automatically identify and extract information about specific interests from raw texts.Despite the abundance of solutions based on fine-tuning pretrained language models,IE in the context of fewshot and zero-shot scenarios remains highly challenging due to the scarcity of training data.Large language models(LLMs),on the other hand,can generalize well to unseen tasks with few-shot demonstrations or even zero-shot instructions and have demonstrated impressive ability for a wide range of natural language understanding or generation tasks.Nevertheless,it is unclear,whether such effectiveness can be replicated in the task of IE,where the target tasks involve specialized schema and quite abstractive entity or relation concepts.In this paper,we first examine the validity of LLMs in executing IE tasks with an established prompting strategy and further propose multiple types of augmented prompting methods,including the structured fundamental prompt(SFP),the structured interactive reasoning prompt(SIRP),and the voting-enabled structured interactive reasoning prompt(VESIRP).The experimental results demonstrate that while directly promotes inferior performance,the proposed augmented prompt methods significantly improve the extraction accuracy,achieving comparable or even better performance(e.g.,zero-shot FewNERD,FewNERD-INTRA)than state-of-theart methods that require large-scale training samples.This study represents a systematic exploration of employing instruction-following LLM for the task of IE.It not only establishes a performance benchmark for this novel paradigm but,more importantly,validates a practical technical pathway through the proposed prompt enhancement method,offering a viable solution for efficient IE in low-resource settings.展开更多
Large Language Models(LLMs)have significantly advanced human-computer interaction by improving natural language understanding and generation.However,their vulnerability to adversarial prompts–carefully designed input...Large Language Models(LLMs)have significantly advanced human-computer interaction by improving natural language understanding and generation.However,their vulnerability to adversarial prompts–carefully designed inputs that manipulate model outputs–presents substantial challenges.This paper introduces a classification-based approach to detect adversarial prompts by utilizing both prompt features and prompt response features.Elevenmachine learning models were evaluated based on key metrics such as accuracy,precision,recall,and F1-score.The results show that the Convolutional Neural Network–Long Short-Term Memory(CNN-LSTM)cascade model delivers the best performance,especially when using prompt features,achieving an accuracy of over 97%in all adversarial scenarios.Furthermore,the Support Vector Machine(SVM)model performed best with prompt response features,particularly excelling in prompt type classification tasks.Classification results revealed that certain types of adversarial attacks,such as“Word Level”and“Adversarial Prefix”,were particularly difficult to detect,as indicated by their low recall and F1-scores.These findings suggest that more subtle manipulations can evade detection mechanisms.In contrast,attacks like“Sentence Level”and“Adversarial Insertion”were easier to identify,due to the model’s effectiveness in recognizing inserted content.Natural Language Processing(NLP)techniques played a critical role by enabling the extraction of semantic and syntactic features from both prompts and their corresponding responses.These insights highlight the importance of combining traditional and deep learning approaches,along with advanced NLP techniques,to build more reliable adversarial prompt detection systems for LLMs.展开更多
This paper proposes a knowledge-enhanced disease diagnosis method based on a prompt learning framework.Addressing challenges such as the complexity ofmedical terminology,the difficulty of constructingmedical knowledge...This paper proposes a knowledge-enhanced disease diagnosis method based on a prompt learning framework.Addressing challenges such as the complexity ofmedical terminology,the difficulty of constructingmedical knowledge graphs,and the scarcity of medical data,the method retrieves structured knowledge from clinical cases via external knowledge graphs.The method retrieves structured knowledge from external knowledge graphs related to clinical cases,encodes it,and injects it into the prompt templates to enhance the language model’s understanding and reasoning capabilities for the task.We conducted experiments on three public datasets:CHIP-CTC,IMCS-V2-NER,and KUAKE-QTR.The results indicate that the proposedmethod significantly outperforms existing models acrossmultiple evaluation metrics.Additionally,ablation studies confirmed the critical role of the knowledge injection module,as the removal of this module resulted in a significant drop in F1 score.The experimental results demonstrate that the proposed method not only effectively improves the accuracy of disease diagnosis but also enhances the interpretability of the predictions,providing more reliable support and evidence for clinical diagnosis.展开更多
Microdispersion technology is crucial for a variety of applications in both the chemical and biomedical fields.The precise and rapid characterization of microdroplets and microbubbles is essential for research as well...Microdispersion technology is crucial for a variety of applications in both the chemical and biomedical fields.The precise and rapid characterization of microdroplets and microbubbles is essential for research as well as for optimizing and controlling industrial processes.Traditional methods often rely on time-consuming manual analysis.Although some deep learning-based computer vision methods have been proposed for automated identification and characterization,these approaches often rely on supervised learning,which requires labeled data for model training.This dependency on labeled data can be time-consuming and expensive,especially when working with large and complex datasets.To address these challenges,we propose Micro Flow SAM,an innovative,motion-prompted,annotation-free,and training-free instance segmentation approach.By utilizing motion of microdroplets and microbubbles as prompts,our method directs large-scale vision models to perform accurate instance segmentation without the need for annotated data or model training.This approach eliminates the need for human intervention in data labeling and reduces computational costs,significantly streamlining the data analysis process.We demonstrate the effectiveness of Micro Flow SAM across 12 diverse datasets,achieving outstanding segmentation results that are competitive with traditional methods.This novel approach not only accelerates the analysis process but also establishes a foundation for efficient process control and optimization in microfluidic applications.Micro Flow SAM represents a breakthrough in reducing the complexities and resource demands of instance segmentation,enabling faster insights and advancements in the microdispersion field.展开更多
With the rapid development of intelligent video surveillance technology,pedestrian re-identification has become increasingly important inmulti-camera surveillance systems.This technology plays a critical role in enhan...With the rapid development of intelligent video surveillance technology,pedestrian re-identification has become increasingly important inmulti-camera surveillance systems.This technology plays a critical role in enhancing public safety.However,traditional methods typically process images and text separately,applying upstream models directly to downstream tasks.This approach significantly increases the complexity ofmodel training and computational costs.Furthermore,the common class imbalance in existing training datasets limitsmodel performance improvement.To address these challenges,we propose an innovative framework named Person Re-ID Network Based on Visual Prompt Technology andMulti-Instance Negative Pooling(VPM-Net).First,we incorporate the Contrastive Language-Image Pre-training(CLIP)pre-trained model to accurately map visual and textual features into a unified embedding space,effectively mitigating inconsistencies in data distribution and the training process.To enhancemodel adaptability and generalization,we introduce an efficient and task-specific Visual Prompt Tuning(VPT)technique,which improves the model’s relevance to specific tasks.Additionally,we design two key modules:the Knowledge-Aware Network(KAN)and theMulti-Instance Negative Pooling(MINP)module.The KAN module significantly enhances the model’s understanding of complex scenarios through deep contextual semantic modeling.MINP module handles samples,effectively improving the model’s ability to distinguish fine-grained features.The experimental outcomes across diverse datasets underscore the remarkable performance of VPM-Net.These results vividly demonstrate the unique advantages and robust reliability of VPM-Net in fine-grained retrieval tasks.展开更多
The goal of infrared and visible image fusion(IVIF)is to integrate the unique advantages of both modalities to achieve a more comprehensive understanding of a scene.However,existing methods struggle to effectively han...The goal of infrared and visible image fusion(IVIF)is to integrate the unique advantages of both modalities to achieve a more comprehensive understanding of a scene.However,existing methods struggle to effectively handle modal disparities,resulting in visual degradation of the details and prominent targets of the fused images.To address these challenges,we introduce Prompt Fusion,a prompt-based approach that harmoniously combines multi-modality images under the guidance of semantic prompts.Firstly,to better characterize the features of different modalities,a contourlet autoencoder is designed to separate and extract the high-/low-frequency components of different modalities,thereby improving the extraction of fine details and textures.We also introduce a prompt learning mechanism using positive and negative prompts,leveraging Vision-Language Models to improve the fusion model's understanding and identification of targets in multi-modality images,leading to improved performance in downstream tasks.Furthermore,we employ bi-level asymptotic convergence optimization.This approach simplifies the intricate non-singleton non-convex bi-level problem into a series of convergent and differentiable single optimization problems that can be effectively resolved through gradient descent.Our approach advances the state-of-the-art,delivering superior fusion quality and boosting the performance of related downstream tasks.Project page:https://github.com/hey-it-s-me/PromptFusion.展开更多
Prompt fission neutron spectra(PFNS)have a significant role in nuclear science and technology.In this study,the PFNS for^(239)Pu are evaluated using both differential and integral experimental data.A method that lever...Prompt fission neutron spectra(PFNS)have a significant role in nuclear science and technology.In this study,the PFNS for^(239)Pu are evaluated using both differential and integral experimental data.A method that leverages integral criticality benchmark experiments to constrain the PFNS data is introduced.The measured central values of the PFNS are perturbed by constructing a covariance matrix.The PFNS are sampled using two types of covariance matrices,either generated with an assumed correlation matrix and incorporating experimental uncertainties or derived directly from experimental reports.The joint Monte Carlo transport code is employed to perform transport simulations on five criticality benchmark assemblies by utilizing perturbed PFNS data.Extensive simulations result in an optimized PFNS that shows improved agreement with the integral criticality benchmark experiments.This study introduces a novel approach for optimizing differential experimental data through integral experiments,particularly when a covariance matrix is not provided.展开更多
Dialogue State Tracking(DST)is a critical component of task-oriented spoken dialogue systems(SDS),tasked with maintaining an accurate representation of the conversational state by predicting slots and their correspond...Dialogue State Tracking(DST)is a critical component of task-oriented spoken dialogue systems(SDS),tasked with maintaining an accurate representation of the conversational state by predicting slots and their corresponding values.Recent advances leverage Large Language Models(LLMs)with prompt-based tuning to improve tracking accuracy and efficiency.However,these approaches often incur substantial computational and memory overheads and typically address slot extraction implicitly within prompts,without explicitly modeling the complex dependencies between slots and values.In this work,we propose PUGG,a novel DST framework that constructs schema-driven prompts to fine-tune GPT-2 and utilizes its tokenizer to implement a memory encoder.PUGG explicitly extracts slot values via GPT-2 and employs Graph Attention Networks(GATs)to model and reason over the intricate relationships between slots and their associated values.We evaluate PUGG on four publicly available datasets,where it achieves stateof-the-art performance across multiple evaluation metrics,highlighting its robustness and generalizability in diverse conversational scenarios.Our results indicate that the integration of GPT-2 substantially reduces model complexity and memory consumption by streamlining key processes.Moreover,prompt tuning enhances the model’s flexibility and precision in extracting relevant slot-value pairs,while the incorporation of GATs facilitates effective relational reasoning,leading to improved dialogue state representations.展开更多
Large language models(LLMs)have demonstrated remarkable generalization abilities across multiple tasks in natural language processing(NLP).For multi-step reasoning tasks,chain-of-thought(CoT)prompting facilitates step...Large language models(LLMs)have demonstrated remarkable generalization abilities across multiple tasks in natural language processing(NLP).For multi-step reasoning tasks,chain-of-thought(CoT)prompting facilitates step-by-step thinking,leading to improved performance.However,despite significant advancements in LLMs,current CoT prompting performs suboptimally on smaller-scale models that have fewer parameters.Additionally,the common paradigm of few-shot CoT prompting relies on a set of manual demonstrations,with performance contingent on the quality of these annotations and varying with task-specific requirements.To address these limitations,we propose a select-and-answer prompting method(SAP)to enhance language model performance on reasoning tasks without the need for manual demonstrations.This method comprises two primary steps:guiding the model to conduct preliminary analysis and generate several candidate answers based on the prompting;allowing the model to provide final answers derived from these candidate answers.The proposed prompting strategy is evaluated across two language models of varying sizes and six datasets.On ChatGLM-6B,SAP consistently outperforms few-shot CoT across all datasets.For GPT-3.5,SAP achieves comparable performance to few-shot CoT and outperforms zero-shot CoT in most cases.These experimental results indicate that SAP can significantly improve the accuracy of language models in reasoning tasks.展开更多
基金supported by the National Key Research and Development Program of China(2020AAA0109300)the Shanghai Collaborative Innovation Center of data intelligence technology(No.0232-A1-8900-24-13).
文摘Chinese abbreviations improve communicative efficiency by extracting key components from longer expressions.They are widely used in both daily communication and professional domains.However,existing abbreviation generation methods still face two major challenges.First,sequence-labeling-based approaches often neglect contextual meaning by making binary decisions at the character level,leading to abbreviations that fail to capture semantic completeness.Second,generation-basedmethods rely heavily on a single decoding process,which frequently produces correct abbreviations but ranks them lower due to inadequate semantic evaluation.To address these limitations,we propose a novel two-stage frameworkwithGeneration–Iterative Optimization forAbbreviation(GIOA).In the first stage,we design aChain-of-Thought prompting strategy and incorporate definitional and situational contexts to generate multiple abbreviation candidates.In the second stage,we introduce a Semantic Preservation Dynamic Adjustment mechanism that alternates between character-level importance estimation and semantic restoration to optimize candidate ranking.Experiments on two public benchmark datasets show that our method outperforms existing state-of-the-art approaches,achieving Hit@1 improvements of 15.15%and 13.01%,respectively,while maintaining consistent results in Hit@3.
基金supported by 2023 Higher Education Scientific Research Planning Project of China Society of Higher Education(No.23PG0408)2023 Philosophy and Social Science Research Programs in Jiangsu Province(No.2023SJSZ0993)+2 种基金Nantong Science and Technology Project(No.JC2023070)Key Project of Jiangsu Province Education Science 14th Five-Year Plan(Grant No.B-b/2024/02/41)the Open Fund of Advanced Cryptography and System Security Key Laboratory of Sichuan Province(Grant No.SKLACSS-202407).
文摘Large language models(LLMs)have revolutionized AI applications across diverse domains.However,their widespread deployment has introduced critical security vulnerabilities,particularly prompt injection attacks that manipulate model behavior through malicious instructions.Following Kitchenham’s guidelines,this systematic review synthesizes 128 peer-reviewed studies from 2022 to 2025 to provide a unified understanding of this rapidly evolving threat landscape.Our findings reveal a swift progression from simple direct injections to sophisticated multimodal attacks,achieving over 90%success rates against unprotected systems.In response,defense mechanisms show varying effectiveness:input preprocessing achieves 60%–80%detection rates and advanced architectural defenses demonstrate up to 95%protection against known patterns,though significant gaps persist against novel attack vectors.We identified 37 distinct defense approaches across three categories,but standardized evaluation frameworks remain limited.Our analysis attributes these vulnerabilities to fundamental LLM architectural limitations,such as the inability to distinguish instructions from data and attention mechanism vulnerabilities.This highlights critical research directions such as formal verification methods,standardized evaluation protocols,and architectural innovations for inherently secure LLM designs.
基金supported by the National Natural Science Foundation of China(No.12105257)the Research and Development Fund(No.JMJJ202401)。
文摘The energy correlations of prompt fission neutrons have not yet been considered in the related coincidence and multiplication measurement techniques.To measure and verify the energy correlations,an experiment was performed with a total measurement duration of approximately 1200 h.In the experiment,eight CLYC detectors and sixteen EJ309 liquid scintillation detectors were utilized,and the fission moment was tagged with the measured fissionγ-rays.The relative ratios of the energy spectra of the neutrons correlated with different energy neutrons to the^(252)Cf fission neutron energy spectra were obtained.The present results may be helpful for studying fission physics and nuclear technology applications.
基金supported by the National Natural Science Foundation of China(62222212).
文摘Information extraction(IE)aims to automatically identify and extract information about specific interests from raw texts.Despite the abundance of solutions based on fine-tuning pretrained language models,IE in the context of fewshot and zero-shot scenarios remains highly challenging due to the scarcity of training data.Large language models(LLMs),on the other hand,can generalize well to unseen tasks with few-shot demonstrations or even zero-shot instructions and have demonstrated impressive ability for a wide range of natural language understanding or generation tasks.Nevertheless,it is unclear,whether such effectiveness can be replicated in the task of IE,where the target tasks involve specialized schema and quite abstractive entity or relation concepts.In this paper,we first examine the validity of LLMs in executing IE tasks with an established prompting strategy and further propose multiple types of augmented prompting methods,including the structured fundamental prompt(SFP),the structured interactive reasoning prompt(SIRP),and the voting-enabled structured interactive reasoning prompt(VESIRP).The experimental results demonstrate that while directly promotes inferior performance,the proposed augmented prompt methods significantly improve the extraction accuracy,achieving comparable or even better performance(e.g.,zero-shot FewNERD,FewNERD-INTRA)than state-of-theart methods that require large-scale training samples.This study represents a systematic exploration of employing instruction-following LLM for the task of IE.It not only establishes a performance benchmark for this novel paradigm but,more importantly,validates a practical technical pathway through the proposed prompt enhancement method,offering a viable solution for efficient IE in low-resource settings.
文摘Large Language Models(LLMs)have significantly advanced human-computer interaction by improving natural language understanding and generation.However,their vulnerability to adversarial prompts–carefully designed inputs that manipulate model outputs–presents substantial challenges.This paper introduces a classification-based approach to detect adversarial prompts by utilizing both prompt features and prompt response features.Elevenmachine learning models were evaluated based on key metrics such as accuracy,precision,recall,and F1-score.The results show that the Convolutional Neural Network–Long Short-Term Memory(CNN-LSTM)cascade model delivers the best performance,especially when using prompt features,achieving an accuracy of over 97%in all adversarial scenarios.Furthermore,the Support Vector Machine(SVM)model performed best with prompt response features,particularly excelling in prompt type classification tasks.Classification results revealed that certain types of adversarial attacks,such as“Word Level”and“Adversarial Prefix”,were particularly difficult to detect,as indicated by their low recall and F1-scores.These findings suggest that more subtle manipulations can evade detection mechanisms.In contrast,attacks like“Sentence Level”and“Adversarial Insertion”were easier to identify,due to the model’s effectiveness in recognizing inserted content.Natural Language Processing(NLP)techniques played a critical role by enabling the extraction of semantic and syntactic features from both prompts and their corresponding responses.These insights highlight the importance of combining traditional and deep learning approaches,along with advanced NLP techniques,to build more reliable adversarial prompt detection systems for LLMs.
基金supported by the National Natural Science Foundation of China(Grant No.62162014).
文摘This paper proposes a knowledge-enhanced disease diagnosis method based on a prompt learning framework.Addressing challenges such as the complexity ofmedical terminology,the difficulty of constructingmedical knowledge graphs,and the scarcity of medical data,the method retrieves structured knowledge from clinical cases via external knowledge graphs.The method retrieves structured knowledge from external knowledge graphs related to clinical cases,encodes it,and injects it into the prompt templates to enhance the language model’s understanding and reasoning capabilities for the task.We conducted experiments on three public datasets:CHIP-CTC,IMCS-V2-NER,and KUAKE-QTR.The results indicate that the proposedmethod significantly outperforms existing models acrossmultiple evaluation metrics.Additionally,ablation studies confirmed the critical role of the knowledge injection module,as the removal of this module resulted in a significant drop in F1 score.The experimental results demonstrate that the proposed method not only effectively improves the accuracy of disease diagnosis but also enhances the interpretability of the predictions,providing more reliable support and evidence for clinical diagnosis.
基金the financial support from National Natural Science Foundation of China (21991104)。
文摘Microdispersion technology is crucial for a variety of applications in both the chemical and biomedical fields.The precise and rapid characterization of microdroplets and microbubbles is essential for research as well as for optimizing and controlling industrial processes.Traditional methods often rely on time-consuming manual analysis.Although some deep learning-based computer vision methods have been proposed for automated identification and characterization,these approaches often rely on supervised learning,which requires labeled data for model training.This dependency on labeled data can be time-consuming and expensive,especially when working with large and complex datasets.To address these challenges,we propose Micro Flow SAM,an innovative,motion-prompted,annotation-free,and training-free instance segmentation approach.By utilizing motion of microdroplets and microbubbles as prompts,our method directs large-scale vision models to perform accurate instance segmentation without the need for annotated data or model training.This approach eliminates the need for human intervention in data labeling and reduces computational costs,significantly streamlining the data analysis process.We demonstrate the effectiveness of Micro Flow SAM across 12 diverse datasets,achieving outstanding segmentation results that are competitive with traditional methods.This novel approach not only accelerates the analysis process but also establishes a foundation for efficient process control and optimization in microfluidic applications.Micro Flow SAM represents a breakthrough in reducing the complexities and resource demands of instance segmentation,enabling faster insights and advancements in the microdispersion field.
基金funded by the Key Research and Development Program of Hubei Province,China(Grant No.2023BEB024)the Young and Middle-aged Scientific and Technological Innova-tion Team Plan in Higher Education Institutions inHubei Province,China(GrantNo.T2023007)the key projects ofHubei Provincial Department of Education(No.D20161403).
文摘With the rapid development of intelligent video surveillance technology,pedestrian re-identification has become increasingly important inmulti-camera surveillance systems.This technology plays a critical role in enhancing public safety.However,traditional methods typically process images and text separately,applying upstream models directly to downstream tasks.This approach significantly increases the complexity ofmodel training and computational costs.Furthermore,the common class imbalance in existing training datasets limitsmodel performance improvement.To address these challenges,we propose an innovative framework named Person Re-ID Network Based on Visual Prompt Technology andMulti-Instance Negative Pooling(VPM-Net).First,we incorporate the Contrastive Language-Image Pre-training(CLIP)pre-trained model to accurately map visual and textual features into a unified embedding space,effectively mitigating inconsistencies in data distribution and the training process.To enhancemodel adaptability and generalization,we introduce an efficient and task-specific Visual Prompt Tuning(VPT)technique,which improves the model’s relevance to specific tasks.Additionally,we design two key modules:the Knowledge-Aware Network(KAN)and theMulti-Instance Negative Pooling(MINP)module.The KAN module significantly enhances the model’s understanding of complex scenarios through deep contextual semantic modeling.MINP module handles samples,effectively improving the model’s ability to distinguish fine-grained features.The experimental outcomes across diverse datasets underscore the remarkable performance of VPM-Net.These results vividly demonstrate the unique advantages and robust reliability of VPM-Net in fine-grained retrieval tasks.
基金partially supported by China Postdoctoral Science Foundation(2023M730741)the National Natural Science Foundation of China(U22B2052,52102432,52202452,62372080,62302078)
文摘The goal of infrared and visible image fusion(IVIF)is to integrate the unique advantages of both modalities to achieve a more comprehensive understanding of a scene.However,existing methods struggle to effectively handle modal disparities,resulting in visual degradation of the details and prominent targets of the fused images.To address these challenges,we introduce Prompt Fusion,a prompt-based approach that harmoniously combines multi-modality images under the guidance of semantic prompts.Firstly,to better characterize the features of different modalities,a contourlet autoencoder is designed to separate and extract the high-/low-frequency components of different modalities,thereby improving the extraction of fine details and textures.We also introduce a prompt learning mechanism using positive and negative prompts,leveraging Vision-Language Models to improve the fusion model's understanding and identification of targets in multi-modality images,leading to improved performance in downstream tasks.Furthermore,we employ bi-level asymptotic convergence optimization.This approach simplifies the intricate non-singleton non-convex bi-level problem into a series of convergent and differentiable single optimization problems that can be effectively resolved through gradient descent.Our approach advances the state-of-the-art,delivering superior fusion quality and boosting the performance of related downstream tasks.Project page:https://github.com/hey-it-s-me/PromptFusion.
基金supported by the National Natural Science Foundation of China(No.12347126)。
文摘Prompt fission neutron spectra(PFNS)have a significant role in nuclear science and technology.In this study,the PFNS for^(239)Pu are evaluated using both differential and integral experimental data.A method that leverages integral criticality benchmark experiments to constrain the PFNS data is introduced.The measured central values of the PFNS are perturbed by constructing a covariance matrix.The PFNS are sampled using two types of covariance matrices,either generated with an assumed correlation matrix and incorporating experimental uncertainties or derived directly from experimental reports.The joint Monte Carlo transport code is employed to perform transport simulations on five criticality benchmark assemblies by utilizing perturbed PFNS data.Extensive simulations result in an optimized PFNS that shows improved agreement with the integral criticality benchmark experiments.This study introduces a novel approach for optimizing differential experimental data through integral experiments,particularly when a covariance matrix is not provided.
基金supported by the MSIT(Ministry of Science and ICT),Republic of Korea,under the ITRC(Information Technology Research Centre)support program(IITP-2024-RS-2024-00437191)supervised by the IITP(Institute for Information&Communications Technology Planning&Evaluation).
文摘Dialogue State Tracking(DST)is a critical component of task-oriented spoken dialogue systems(SDS),tasked with maintaining an accurate representation of the conversational state by predicting slots and their corresponding values.Recent advances leverage Large Language Models(LLMs)with prompt-based tuning to improve tracking accuracy and efficiency.However,these approaches often incur substantial computational and memory overheads and typically address slot extraction implicitly within prompts,without explicitly modeling the complex dependencies between slots and values.In this work,we propose PUGG,a novel DST framework that constructs schema-driven prompts to fine-tune GPT-2 and utilizes its tokenizer to implement a memory encoder.PUGG explicitly extracts slot values via GPT-2 and employs Graph Attention Networks(GATs)to model and reason over the intricate relationships between slots and their associated values.We evaluate PUGG on four publicly available datasets,where it achieves stateof-the-art performance across multiple evaluation metrics,highlighting its robustness and generalizability in diverse conversational scenarios.Our results indicate that the integration of GPT-2 substantially reduces model complexity and memory consumption by streamlining key processes.Moreover,prompt tuning enhances the model’s flexibility and precision in extracting relevant slot-value pairs,while the incorporation of GATs facilitates effective relational reasoning,leading to improved dialogue state representations.
基金National Natural Science Foundation of China(No.62176052)。
文摘Large language models(LLMs)have demonstrated remarkable generalization abilities across multiple tasks in natural language processing(NLP).For multi-step reasoning tasks,chain-of-thought(CoT)prompting facilitates step-by-step thinking,leading to improved performance.However,despite significant advancements in LLMs,current CoT prompting performs suboptimally on smaller-scale models that have fewer parameters.Additionally,the common paradigm of few-shot CoT prompting relies on a set of manual demonstrations,with performance contingent on the quality of these annotations and varying with task-specific requirements.To address these limitations,we propose a select-and-answer prompting method(SAP)to enhance language model performance on reasoning tasks without the need for manual demonstrations.This method comprises two primary steps:guiding the model to conduct preliminary analysis and generate several candidate answers based on the prompting;allowing the model to provide final answers derived from these candidate answers.The proposed prompting strategy is evaluated across two language models of varying sizes and six datasets.On ChatGLM-6B,SAP consistently outperforms few-shot CoT across all datasets.For GPT-3.5,SAP achieves comparable performance to few-shot CoT and outperforms zero-shot CoT in most cases.These experimental results indicate that SAP can significantly improve the accuracy of language models in reasoning tasks.