Abstract: Model evaluation using benchmark datasets is an important method for measuring the capability of large language models (LLMs) in specific domains, and it is mainly used to assess their knowledge and reasoning abilities. To better assess the capability of LLMs in the agricultural domain, Agri-Eval was proposed as a benchmark for assessing the knowledge and reasoning ability of LLMs in agriculture. The Agri-Eval assessment dataset covers seven major disciplines in the agricultural domain: crop science, horticulture, plant protection, animal husbandry, forest science, aquaculture science, and grass science, and contains a total of 2283 questions. Among domestic general-purpose LLMs, DeepSeek R1 performed best, with an accuracy of 75.49%. Among international general-purpose LLMs, Gemini 2.0 pro exp 0205 stood out as the top performer, achieving an accuracy of 74.28%. As an LLM for the agriculture vertical, Shennong V2.0 outperformed all the LLMs in China, and its answer accuracy on agricultural knowledge exceeded that of all existing general-purpose LLMs. Agri-Eval helps LLM developers comprehensively evaluate a model's capability in agriculture through a variety of tasks and tests, promoting the development of LLMs in this field.
Funding: This study is financed by the European Union-NextGenerationEU, through the National Recovery and Resilience Plan of the Republic of Bulgaria, Project No. BG-RRP-2.013-0001.
Abstract: Covert timing channels (CTC) exploit network resources to establish hidden communication pathways, posing significant risks to data security and policy compliance. Therefore, detecting such hidden and dangerous threats remains one of the security challenges. This paper proposes LinguTimeX, a new framework that combines natural language processing with artificial intelligence, along with explainable Artificial Intelligence (AI), not only to detect CTC but also to provide insights into the decision process. LinguTimeX performs multidimensional feature extraction by fusing linguistic attributes with temporal network patterns to identify covert channels precisely. LinguTimeX demonstrates strong effectiveness in detecting CTC across multiple languages, namely English, Arabic, and Chinese. Specifically, the LSTM and RNN models achieved F1 scores of 90% on the English dataset, 89% on the Arabic dataset, and 88% on the Chinese dataset, showcasing their superior performance and ability to generalize across multiple languages. This highlights their robustness in detecting CTCs within security systems, regardless of the language or cultural context of the data. In contrast, the DeepForest model produced F1 scores ranging from 86% to 87% across the same datasets, further confirming its effectiveness in CTC detection. Although other algorithms also showed reasonable accuracy, the LSTM and RNN models consistently outperformed them in multilingual settings, suggesting that deep learning models might be better suited for this particular problem.
Funding: Funded by the Office of the Vice-President for Research and Development of Cebu Technological University.
Abstract: This study demonstrates a novel integration of large language models, machine learning, and multicriteria decision-making to investigate self-moderation in small online communities, a topic under-explored compared to user behavior and platform-driven moderation on social media. The proposed methodological framework (1) utilizes large language models for social media post analysis and categorization, (2) employs k-means clustering for content characterization, and (3) incorporates the TODIM (Tomada de Decisão Interativa Multicritério) method to determine moderation strategies based on expert judgments. In general, the fully integrated framework leverages the strengths of these intelligent systems in a more systematic evaluation of large-scale decision problems. When applied in social media moderation, this approach promotes nuanced and context-sensitive self-moderation by taking into account factors such as cultural background and geographic location. The application of this framework is demonstrated within Facebook groups. Eight distinct content clusters encompassing safety, harassment, diversity, and misinformation are identified. Analysis revealed a preference for content removal across all clusters, suggesting a cautious approach towards potentially harmful content. However, the framework also highlights the use of other moderation actions, like account suspension, depending on the content category. These findings contribute to the growing body of research on self-moderation and offer valuable insights for creating safer and more inclusive online spaces within smaller communities.
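As an illustration of the TODIM step named in this abstract, the following is a minimal sketch of the standard TODIM ranking computation. It is not the authors' implementation: the function name `todim_rank`, the attenuation factor `theta=1.0`, and the example actions, criteria, scores, and weights are all hypothetical. Scores are assumed to be normalized benefit values, and weights are assumed to be strictly positive.

```python
import math


def todim_rank(scores, weights, theta=1.0):
    """Return one TODIM global value in [0, 1] per alternative.

    scores[i][c] is the normalized benefit score of alternative i on
    criterion c; weights[c] must be strictly positive; theta attenuates
    losses (an assumed default, not a value from the abstract).
    """
    w_ref = max(weights)                 # reference criterion = largest weight
    rel = [w / w_ref for w in weights]   # weights relative to the reference
    rel_sum = sum(rel)
    n, m = len(scores), len(weights)

    def phi(i, j, c):
        # Partial dominance of alternative i over j on criterion c.
        diff = scores[i][c] - scores[j][c]
        if diff > 0:   # gain
            return math.sqrt(rel[c] * diff / rel_sum)
        if diff < 0:   # loss, amplified or attenuated by theta
            return -math.sqrt(rel_sum * (-diff) / rel[c]) / theta
        return 0.0

    # Sum partial dominances over all rivals and criteria.
    dominance = [
        sum(phi(i, j, c) for j in range(n) for c in range(m))
        for i in range(n)
    ]
    lo, hi = min(dominance), max(dominance)
    if hi == lo:       # all alternatives tie
        return [1.0] * n
    return [(d - lo) / (hi - lo) for d in dominance]


# Hypothetical expert scores for three moderation actions on two criteria.
actions = ["remove content", "suspend account", "warn user"]
values = todim_rank([[0.9, 0.8], [0.5, 0.6], [0.2, 0.1]], [0.6, 0.4])
ranking = sorted(zip(actions, values), key=lambda p: p[1], reverse=True)
```

The alternative with the highest global value is the preferred action for that content cluster. In practice the score matrix would come from the expert judgments the abstract mentions.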
Funding: Supported by the Hunan Provincial Educational Science Research Project “Research on Cultivating National Consciousness in College Foreign Language Courses (XJT23CGD001)”.
Abstract: This study examines how foreign language education in the artificial intelligence (AI) era could assist the cultivation of national consciousness through a technology-enhanced pedagogy of film appreciation. Using The Wild Robot as a case study, we argue that cinematic narratives serve as cultural mirrors, offering immersive, reflective, and affective sites for intercultural learning. We propose a three-layered pedagogical framework (progressing from semiotic decoding, through narrative and value comparison, to creative identity construction) that integrates intelligent tools to develop both communicative competence and an agentive sense of belonging. The approach exemplifies a humanistic turn in language teaching, aiming to form “rooted global communicators” who can engage in cross-civilization dialogue with cultural confidence and critical awareness.
Abstract: This paper discusses Embedded SQL programming techniques, covering the form of Embedded SQL based on the C/C++ language and the ORACLE9i DBMS. It also discusses the communication between Embedded SQL statements and C/C++ code, and provides the code.
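Abstract: Natural language to Structured Query Language (NL2SQL) conversion lowers the technical barrier for non-specialists operating databases, thereby improving user experience and work efficiency. In addition, Retrieval-Augmented Generation (RAG) can improve NL2SQL performance by introducing an external knowledge base. To address the high miss rate of retrieval strategies and the weak relevance of recalled context in current applications of RAG to NL2SQL, a sequential-retrieval-and-reranking RAG (RAG-SRR) method is proposed to optimize knowledge base construction, the retrieval and recall strategy, and prompt design. First, a domain knowledge base is built from three aspects: question-answer pairs, technical terms, and database structure. The question-answer pairs are built from high-frequency handling and query questions in the supervision of cultural relic and artwork auctions, the technical terms from auction industry standards, and the database structure from the data of the art auction website 雅昌艺术拍卖网. Second, the retrieval stage adopts a sequential retrieval strategy with different priorities set for the three types of knowledge base, and the retrieved information is reranked at the recall stage. Finally, principles for optimized prompt design and a prompt template are given. Experimental results show that, on the domain dataset and the Spider dataset, compared with methods based on the BERT (Bidirectional Encoder Representations from Transformers) model and the RESDSQL (Ranking-enhanced Encoding plus a Skeleton-aware Decoding framework for text-to-SQL) model, RAG-SRR improves execution accuracy by at least 19.50 and 24.20, and 12.17 and 8.90 percentage points, respectively. With the same large language model, RAG-SRR improves execution accuracy over unoptimized RAG by at least 12.83 and 15.60 percentage points, respectively, and over C3SQL by at least 1.50 and 3.10 percentage points, respectively. With Llama3.1-8B, compared with DIN-SQL, execution accuracy improves by 0.30 percentage points on the Chinese-corpus dataset and differs by at most 3.90 percentage points on the English-corpus dataset; with Qwen2.5-7B, however, execution accuracy improves by 1.60 and 4.10 percentage points, respectively. RAG-SRR therefore shows strong practicality and portability.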
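The retrieval flow described in this abstract (query the three knowledge bases in priority order, rerank what was recalled, then fill a prompt template) might be sketched roughly as follows. This is an illustrative assumption rather than the RAG-SRR implementation: the token-overlap scorer stands in for a real retriever and reranker, and the priority order, bonus values, thresholds, sample entries, and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    source: str   # which knowledge base the text came from
    text: str
    score: float


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query tokens found in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def retrieve_in_order(query, stores, per_store_k=2, min_score=0.1):
    """Query each store in list order (the assumed priority) and keep top hits."""
    hits = []
    for source, docs in stores:
        scored = [Hit(source, d, overlap_score(query, d)) for d in docs]
        scored = [h for h in scored if h.score >= min_score]
        scored.sort(key=lambda h: h.score, reverse=True)
        hits.extend(scored[:per_store_k])
    return hits


def rerank(hits, priority_bonus):
    """Reorder recalled hits by relevance plus a per-source priority bonus."""
    return sorted(
        hits,
        key=lambda h: h.score + priority_bonus.get(h.source, 0.0),
        reverse=True,
    )


def build_prompt(question, hits):
    """Fill a simple prompt template with the reranked context."""
    context = "\n".join(f"[{h.source}] {h.text}" for h in hits)
    return f"Context:\n{context}\n\nQuestion: {question}\nSQL:"


# Hypothetical knowledge bases, listed in an assumed priority order.
stores = [
    ("qa_pairs", ["Q: total hammer price per auction house "
                  "A: SELECT house, SUM(hammer_price) FROM lots GROUP BY house"]),
    ("terms", ["hammer price means the winning bid amount",
               "reserve is the minimum sale amount"]),
    ("schema", ["table lots columns house hammer_price sale_date"]),
]
question = "total hammer price per auction house"
hits = rerank(retrieve_in_order(question, stores),
              {"qa_pairs": 0.2, "terms": 0.1})
prompt = build_prompt(question, hits)
```

Capping each store separately keeps one large knowledge base from crowding out the others before reranking, which is one plausible reading of retrieving "in order" with per-type priorities.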
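Abstract: With the development of natural language processing, artificial intelligence, and multi-domain database applications, demand for intelligent database query systems has grown rapidly. In Chinese-language contexts in particular, accurate query generation has become essential in industries such as finance, healthcare, and customer service. Existing SQL generation methods struggle with Chinese semantic parsing, multi-domain adaptability, and semantic consistency in human-computer interaction, which limits cross-domain handling of complex queries. To address these challenges, MH-CSQL (multi-domain human-computer interaction for Chinese SQL generation algorithm), a Chinese-oriented multi-domain interactive SQL generation algorithm, is proposed. It combines historical information with curriculum learning to strengthen natural language understanding, and supports multi-domain databases in handling a variety of query tasks. Experimental results show that MH-CSQL outperforms traditional methods in both accuracy and adaptability. In addition, visualizations of the results of the human-computer interaction model are presented, supporting the application prospects of MH-CSQL in areas such as intelligent question answering.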
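Abstract: In the development of smart cities, fine-grained management and intelligent services for transportation systems face the challenge of processing massive heterogeneous data. Traditional traffic information query systems suffer from highly heterogeneous data sources, insufficient natural language interaction capability, and limited coverage of long-tail query scenarios. Based on the ChatGLM3 large language model, this article builds an intelligent data query system that incorporates NL2SQL (Natural Language to Structured Query Language) technology. Through dynamic schema alignment, LoRA fine-tuning, and multi-dimensional prompt engineering, the system converts complex natural language queries in the transportation domain into precise SQL statements. Experimental results show that the fine-tuned model reaches an accuracy of 78.9% on traffic information query tasks, 15.8 percentage points higher than the baseline model. The study provides a technical path for the intelligent transformation of traffic management and systematically explores the deep adaptation of large models to vertical domains.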