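Abstract: To address the poor performance and high resource consumption of natural-language data processing models, a fusion algorithm based on multi-scale feature extraction and an attention mechanism is proposed. Features are extracted at different scales and a weighting algorithm is applied to the feature maps to strengthen attention to features at particular scales; the natural-language data processing model is then optimized with this fusion algorithm. Simulation results show that the fusion algorithm extracts features well and significantly improves data processing capability. Comparing the optimized natural language processing (NLP) data processing model with the CSAMT data processing model, the BETG data processing model, and the NLP data processing model before optimization shows that the NLP model optimized with CBAM-MS-CNN outperforms the other models on all performance measures. The results indicate that the fusion algorithm can meet business requirements such as high reliability and intelligent processing in unstructured data management within electronic handover workflows, can improve data processing efficiency and data quality, and can reduce the workload of manual data entry and manual data review.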
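The idea of extracting features at several scales and then weighting them can be illustrated with a minimal sketch. This is not the CBAM-MS-CNN model from the abstract: the fixed averaging kernels, the kernel sizes, the softmax weighting, and every function name below are assumptions chosen only to show the general pattern.

```python
import math


def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal without padding."""
    width = len(kernel)
    return [
        sum(signal[start + offset] * kernel[offset] for offset in range(width))
        for start in range(len(signal) - width + 1)
    ]


def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(value - peak) for value in values]
    total = sum(exps)
    return [value / total for value in exps]


def multi_scale_attention(signal, kernel_sizes=(2, 3, 5)):
    """Pool one feature per scale, then fuse the scales with softmax weights."""
    pooled = []
    for size in kernel_sizes:
        # Assumption: a fixed averaging kernel stands in for learned filters.
        feature_map = conv1d_valid(signal, [1.0 / size] * size)
        pooled.append(max(feature_map))  # global max pooling per scale
    weights = softmax(pooled)  # scales with larger responses get more weight
    fused = sum(weight * value for weight, value in zip(weights, pooled))
    return pooled, weights, fused


signal = [0.0, 0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0]  # hypothetical input
pooled, weights, fused = multi_scale_attention(signal)
```

In a trained network the kernels and the attention weights would be learned rather than fixed; the sketch only shows how per-scale responses can be reweighted before fusion.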
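Abstract: In recent years, large language models (LLMs) have made remarkable progress in natural language processing (NLP) and related fields, demonstrating strong language understanding and generation capabilities. In practical applications, however, LLMs still face many challenges. Among them, hallucination has attracted wide attention from academia and industry. Detecting LLM hallucinations effectively has become a key challenge for ensuring reliable, safe, and trustworthy use in downstream tasks such as text generation. This study surveys hallucination detection methods for LLMs. First, it introduces the concept of LLMs, clarifies the definition and classification of hallucination, systematically reviews the characteristics of each stage of the LLM life cycle from construction to deployment, and analyzes in depth the mechanisms and causes of hallucination. Second, based on practical application needs and considering factors such as differences in model transparency across task scenarios, it divides hallucination detection methods into two classes, those for white-box models and those for black-box models, and reviews and compares them in detail. It then analyzes and summarizes the current mainstream hallucination detection benchmarks, laying a foundation for subsequent hallucination detection work. Finally, it points out potential research methods and new challenges in LLM hallucination detection.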
Funding: National Natural Science Foundation of China (No. 81874429); Digital and Applied Research Platform for Diagnosis of Traditional Chinese Medicine (No. 49021003005); 2018 Hunan Provincial Postgraduate Research Innovation Project (No. CX2018B465); Excellent Youth Project of Hunan Education Department in 2018 (No. 18B241).
Abstract: Objective Natural language processing (NLP) was used to mine and visualize the core content of syndrome element syndrome differentiation (SESD). Methods The first step was to build a text mining and analysis environment in Python and to build a corpus from the core chapters of SESD. The second step was to digitize the corpus; the main steps included word segmentation, information cleaning and merging, construction of a document-term matrix, dictionary compilation, and information conversion. The third step was to mine and display the internal information of the SESD corpus by means of word clouds, keyword extraction, and visualization. Results NLP played a positive role in computer recognition and comprehension of SESD. Different chapters had different keywords and weights. Deficiency syndrome elements, such as "Qi deficiency", "Yang deficiency", and "Yin deficiency", were an important component of SESD. The important excess syndrome elements included "Blood stasis", "Qi stagnation", etc. Core syndrome elements were closely related. Conclusions Syndrome differentiation and treatment was the core of SESD. Using NLP to mine syndrome differentiation could help reveal the internal relationships within syndrome differentiation and provide a basis for artificial intelligence to learn syndrome differentiation.
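The digitization step described in the Methods (cleaning, dictionary compilation, matrix construction) can be sketched as follows. This is a minimal illustration under stated assumptions: the chapters are taken to be already segmented, the tokens and the stopword list are hypothetical rather than drawn from the SESD corpus, and `build_document_term_matrix` is an illustrative name, not the authors' code.

```python
from collections import Counter

# Hypothetical, already-segmented chapters (not text from the SESD corpus).
chapters = {
    "chapter_1": ["qi_deficiency", "yang_deficiency", "qi_deficiency", "the"],
    "chapter_2": ["blood_stasis", "qi_stagnation", "qi_deficiency", "the"],
}
stopwords = {"the"}  # assumed cleaning rule


def build_document_term_matrix(docs, stopwords):
    """Return (vocabulary, matrix) with one row of term counts per document."""
    cleaned = {
        name: [token for token in tokens if token not in stopwords]
        for name, tokens in docs.items()
    }
    # The sorted vocabulary plays the role of the compiled dictionary.
    vocabulary = sorted({token for tokens in cleaned.values() for token in tokens})
    column = {term: index for index, term in enumerate(vocabulary)}
    matrix = []
    for tokens in cleaned.values():
        row = [0] * len(vocabulary)
        for term, count in Counter(tokens).items():
            row[column[term]] = count
        matrix.append(row)
    return vocabulary, matrix


vocabulary, matrix = build_document_term_matrix(chapters, stopwords)
```

Each row can then feed keyword extraction or a word cloud, since the column counts give the per-chapter term weights.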
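Abstract: With growing computing power and the spread of smart devices, society is gradually entering an era of intelligence. As the literature and information centers of universities, university libraries need to undergo an intelligent transformation to improve service quality. This article therefore uses intelligent question-answering technology to design a library intelligent question-answering system based on natural language processing (NLP), introducing a new model for library reference services and improving the level and efficiency of library service.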
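Abstract: With the rapid development of artificial intelligence, natural language processing (NLP) technology has been widely applied in many fields. This article proposes an NLP-based method for associating knowledge points with the question bank in an intelligent training system. The method uses NLP to analyze the text of training materials, automatically extracts knowledge points, and builds an association model between knowledge points and the question bank to assign test-paper questions automatically. The method can effectively raise the intelligence level of the training system and improve training efficiency and quality.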
Funding: Natural Science Foundation of Zhejiang Province (No. GF20F020063); Fujian Province Young and Middle-Aged Teacher Education Research Project (No. JAT170480).
Abstract: Language disorder, a common manifestation of Alzheimer’s disease (AD), has attracted widespread attention in recent years. This paper uses a novel natural language processing (NLP) method, compared with the latest deep learning technology, to detect AD and explore lexical performance. The proposed approach has two stages. First, the dialogue contents are summarized into two groups, one per category. Second, the term frequency-inverse document frequency (TF-IDF) algorithm is used to extract the keywords of the transcripts, and the similarity of keywords between the groups is calculated separately by cosine distance. Several deep learning methods are used for performance comparison. Meanwhile, the keywords with the best performance are used to analyze AD patients’ lexical performance. In the Predictive Challenge of Alzheimer’s Disease held by iFlytek in 2019, the proposed AD diagnosis model achieves better performance in binary classification by adjusting the number of keywords. The F1 score of the model improves considerably over the baseline of 75.4%, and the training process is simple and efficient. An analysis of the model's keywords finds that AD patients use fewer nouns and verbs than normal controls. A computer-assisted AD diagnosis model for a small Chinese dataset is proposed in this paper, which provides a potential way to assist the diagnosis of AD and to analyze lexical performance in clinical settings.
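The second stage (TF-IDF keyword extraction followed by a cosine comparison) can be sketched as below. This is an illustration under assumptions, not the authors' model: the transcripts are hypothetical and already tokenized, the smoothed IDF form is one common choice, and the function names are invented for the example. The sketch computes cosine similarity; cosine distance is one minus that value.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # Assumption: smoothed IDF, which keeps every weight positive.
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def top_keywords(vector, k):
    """Keep the k highest-weighted terms (ties broken alphabetically)."""
    ranked = sorted(vector.items(), key=lambda item: (-item[1], item[0]))
    return dict(ranked[:k])


def cosine_similarity(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical, already-tokenized transcripts (not data from the paper).
transcripts = [
    ["the", "boy", "takes", "the", "cookie", "from", "the", "jar"],
    ["the", "girl", "asks", "the", "boy", "for", "a", "cookie"],
    ["that", "thing", "is", "there", "and", "it", "is", "that"],
]
vectors = [top_keywords(vector, k=5) for vector in tf_idf(transcripts)]
sim_01 = cosine_similarity(vectors[0], vectors[1])
sim_02 = cosine_similarity(vectors[0], vectors[2])
```

The parameter `k` corresponds to the number of keywords, which the abstract says was adjusted to improve binary classification.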
Abstract: As Natural Language Processing (NLP) continues to advance, driven by the emergence of sophisticated large language models such as ChatGPT, there has been a notable growth in research activity. This rapid uptake reflects increasing interest in the field and prompts critical inquiries into ChatGPT’s applicability in the NLP domain. This review paper systematically investigates the role of ChatGPT in diverse NLP tasks, including information extraction, Named Entity Recognition (NER), event extraction, relation extraction, Part of Speech (PoS) tagging, text classification, sentiment analysis, emotion recognition, and text annotation. The novelty of this work lies in its comprehensive analysis of the existing literature, addressing a critical gap in understanding ChatGPT’s adaptability, limitations, and optimal application. In this paper, we employed a systematic stepwise approach following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework to direct our search process and seek relevant studies. Our review reveals ChatGPT’s significant potential in enhancing various NLP tasks. Its adaptability in information extraction tasks, sentiment analysis, and text classification showcases its ability to comprehend diverse contexts and extract meaningful details. Additionally, ChatGPT’s flexibility in annotation tasks reduces manual effort and accelerates the annotation process, making it a valuable asset in NLP development and research. Furthermore, GPT-4 and prompt engineering emerge as a complementary mechanism, empowering users to guide the model and enhance overall accuracy. Despite its promising potential, challenges persist. The performance of ChatGPT needs to be tested using more extensive datasets and diverse data structures. Moreover, its limitations in handling domain-specific language and the need for fine-tuning in specific applications highlight the importance of further investigations to address these issues.
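Abstract: The development of natural language processing (NLP) technology has injected new momentum into many industries, and with the rapid growth of online education, integrating the two has become a hot topic. This paper presents the design and implementation of a question-bank examination system based on the browser/server (B/S) model, with Vue.js as the front-end framework and Flask as the back-end framework. Question-entry templates are used to split and enter exam questions, and, with the help of the Textrank4zh and word2vec modules, a hidden Markov model based on the TextRank algorithm is built to assemble test papers. The project reduces teachers' workload while better assessing how well students have mastered the material.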
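The TextRank algorithm named in the abstract ranks words by running PageRank over a word co-occurrence graph. The sketch below shows only that ranking step, in plain Python, under assumptions: it does not use the Textrank4zh or word2vec modules, it does not include the hidden Markov model or the test-paper assembly logic, and the tokens, window size, and function name are hypothetical.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50):
    """Rank tokens with PageRank over an undirected co-occurrence graph.

    `window` counts the word itself, so window=2 links adjacent words only.
    """
    neighbors = defaultdict(set)
    for position, word in enumerate(tokens):
        for other in tokens[position + 1:position + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        # Each word receives an equal share of every neighbor's score.
        scores = {
            word: (1 - damping) + damping * sum(
                scores[other] / len(neighbors[other]) for other in neighbors[word]
            )
            for word in neighbors
        }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical segmented tokens from an exam question (not project data).
tokens = ["matrix", "rank", "matrix", "inverse", "matrix", "determinant"]
ranked = textrank_keywords(tokens)
top_word = ranked[0][0]
```

For Chinese exam questions the tokens would first come from a word segmenter; the highest-ranked words could then serve as the knowledge-point labels that a paper-assembly step selects on.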