Abstract: This article proposes a document-level prompt learning approach that uses LLMs to extract timeline-based storylines. Verification tests on datasets such as ESCv1.2 and Timeline17 show that the proposed prompt + one-shot learning works well. The findings also indicate that, although timeline-based storyline extraction shows promise in practical applications of LLMs, it remains a complex natural language processing task that requires further research.
Funding: Supported by the National Natural Science Foundation of China (32172606 to Dr. Xueyong Yang and 32302543 to Dr. Shuai Wang), the National Key Research and Development Program of China (2021YFF1000100), the Beijing Joint Research Program for Germplasm Innovation and New Variety Breeding (G2022062800303), and the Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAASASTIP).
Abstract: Plant biomass is an important agronomic trait that has been subjected to intense human selection for yield improvement. The mechanism regulating biomass formation is gaining increasing attention but remains unexplored. In this study, we isolated a cucumber (Cucumis sativus L.) minicuke mutant with remarkably reduced biomass. The causative gene was identified as CsNMT1, a homologue of the Arabidopsis thaliana N-myristoyltransferase 1. Clustered regularly interspaced short palindromic repeat-based genome editing confirmed the key role of CsNMT1 in biomass regulation. Multi-omics analyses integrating metabolomic and transcriptomic data revealed the suppression of a very early step of lignin biosynthesis and the corresponding down-regulation of genes involved in lignin biosynthesis in the minicuke mutant, suggesting an unexpected pathway for regulating biomass accumulation through lignin sink strength. Our findings demonstrate the function of NMT1 in regulating plant biomass and its potential application value for biomass improvement in cucurbits.
Funding: M. Faheem is supported by VTT Technical Research Center of Finland.
Abstract: Neural machine translation (NMT) has advanced with deep learning and large-scale multilingual models, yet translation for low-resource languages often suffers from insufficient training data, which leads to hallucinations: translated content that diverges significantly from the source text. This research proposes a refined Contrastive Decoding (CD) algorithm that dynamically adjusts the weights of log probabilities from a strong expert model and a weak amateur model to mitigate hallucinations in low-resource NMT and improve translation quality. Advanced large language NMT models, including ChatGLM and LLaMA, are fine-tuned and implemented for their superior contextual understanding and cross-lingual capabilities. The refined CD algorithm evaluates multiple candidate translations using BLEU score, semantic similarity, and Named Entity Recognition accuracy. Extensive experimental results show substantial improvements in translation quality and a significant reduction in hallucination rates. Fine-tuned models achieve higher evaluation metrics than baseline and state-of-the-art models. An ablation study confirms the contribution of each methodological component and highlights the effectiveness of the refined CD algorithm and advanced models in mitigating hallucinations. Notably, the refined methodology increased the BLEU score by approximately 30% compared to baseline models.
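The contrastive decoding idea in the abstract above can be illustrated with a minimal sketch. It scores each candidate token as the expert log probability minus a weighted amateur log probability, after masking tokens the expert itself finds implausible. The abstract does not state how the weights are adjusted, so `dynamic_weight` below is a purely hypothetical rule (the amateur penalty shrinks when the expert is confident), and all function names, parameters, and toy probabilities are assumptions made for illustration, not the paper's method.

```python
import math


def contrastive_scores(expert_logp, amateur_logp, weight=1.0, alpha=0.1):
    """Score each token as expert_logp - weight * amateur_logp.

    Tokens whose expert probability falls below alpha times the top
    expert probability are masked with -inf (plausibility constraint).
    """
    cutoff = math.log(alpha) + max(expert_logp)
    return [
        e - weight * a if e >= cutoff else float("-inf")
        for e, a in zip(expert_logp, amateur_logp)
    ]


def dynamic_weight(expert_logp, base=1.0):
    """HYPOTHETICAL rule (not from the paper): penalize the amateur
    less when the expert's top token is already highly probable."""
    top_prob = math.exp(max(expert_logp))
    return base * (1.0 - top_prob)


def pick_token(expert_logp, amateur_logp, alpha=0.1):
    """Return the index of the token with the best contrastive score."""
    weight = dynamic_weight(expert_logp)
    scores = contrastive_scores(expert_logp, amateur_logp, weight, alpha)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy next-token distributions over a three-token vocabulary.
expert = [math.log(p) for p in (0.5, 0.4, 0.1)]
amateur = [math.log(p) for p in (0.7, 0.2, 0.1)]
chosen = pick_token(expert, amateur)
```

In this toy case the expert slightly prefers token 0, but the amateur prefers it even more strongly, so the contrastive score favors token 1, which the expert rates highly and the amateur does not.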
Abstract: This study explores the potential and limitations of ChatGPT in translation, focusing on its application in Neural Machine Translation (NMT). By combining theoretical analysis with empirical research, the study evaluates ChatGPT’s strengths and weaknesses. It finds that ChatGPT performs well on technical documents, with high translation quality and efficiency. However, its limitations become evident with cultural nuances and emotional expressions, where semantic deviation or cultural loss often occurs. Moreover, ChatGPT struggles with creative translation, failing to convey the artistic style and emotional depth of original texts such as literary works and advertisements. The study proposes optimized paths for human-machine collaboration, emphasizing the crucial role of human translators in cultural adaptation and quality assurance. It suggests incorporating multimodal data, dynamic feedback mechanisms, and pragmatic reasoning techniques to enhance machine translation capabilities. The study concludes that while ChatGPT serves as an efficient translation tool, complex tasks require human-machine synergy to achieve high-quality cross-cultural communication.
Funding: Supported by the National High Technology Research and Development Programme of China (No. 2015AA015405).
Abstract: Social media platforms such as Twitter serve as a novel news medium and have become increasingly popular since their establishment. Large-scale, first-hand, user-generated tweets motivate automatic event detection on Twitter. Previous unsupervised approaches detected events by clustering words, using burstiness, which measures surging frequencies of words in certain time windows. However, event clusters represented by a set of individual words are difficult to understand. This issue is addressed by building a document-level event detection model that directly calculates the burstiness of tweets, leveraging distributed word representations to model semantic information and thereby avoid sparsity. Results show that the document-level model not only offers event summaries that are directly human-readable, but also gives significantly improved accuracy on unsupervised tweet event detection compared to previous methods, which are based on words or segments.
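To make the notion of burstiness in the abstract above concrete, here is a minimal sketch of a word-level burst score. The abstract only says that burstiness measures surging frequencies in a time window; the z-score formulation, the function name, and the toy counts are assumptions for illustration and are not taken from the paper, whose model scores whole tweets rather than single words.

```python
from statistics import mean, pstdev


def burstiness(history_counts, current_count):
    """Z-score of the current window's count against earlier windows.

    ASSUMPTION: a z-score is one common way to quantify a frequency
    surge; the paper's exact definition is not given in the abstract.
    """
    mu = mean(history_counts)
    sigma = pstdev(history_counts)
    # Fall back to a unit scale when the history has no variance.
    return (current_count - mu) / (sigma if sigma > 0 else 1.0)


# Counts of one word in four earlier windows, then a surge to 10.
score = burstiness([2, 2, 2, 2], 10)
```

A large positive score marks a window where the word appears far more often than usual, which is the signal that word-clustering approaches group into events.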
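Abstract: The RS10-CLOUD rapid development platform is an important component of the RS10-CLOUD cloud platform and belongs to a major national project. It is a low-code development platform aimed at the fragmented manufacturing-management market that supports the implementation of enterprise production management applications. This paper describes the optimization of a module based on the RS10-CLOUD rapid development platform. The motivation is that Chinese enterprises now switch production across many different sectors, and during such switches the ambiguity in the classification and meaning of business data in industrial production management environments implicitly raises the cost of manual control in production management. The RS10-CLOUD industrial management software therefore introduces logic for unified management and dynamic activation of industrial labels: industrial terms and industrial business labels are defined centrally during development and injected into pages, and during software use they can be maintained continuously, with changes taking effect on pages automatically. Under this development model, neural machine translation (NMT) augmented with prior knowledge is applied to this class of software as a novel step. Considering training feasibility and the accuracy of the translation model, TLA (target language lemmas) is used to constrain training, yielding a Chinese-to-English translation model. The work offers a basis for consolidating and comparing the meanings of industrial attributes in China and abroad, for eventually achieving multilingual translation, and thereby for reducing production management costs in joint ventures and export-oriented industries.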