AI-Generated Text Detection: A Comprehensive Review of Active and Passive Approaches
1
Authors: Lingyun Xiang Nian Li Yuling Liu Jiayong Hu 《Computers, Materials & Continua》 2026, No. 3, pp. 201-229 (29 pages)
The rapid advancement of large language models (LLMs) has driven the pervasive adoption of AI-generated content (AIGC), while also raising concerns about misinformation, academic misconduct, biased or harmful content, and other risks. Detecting AI-generated text has thus become essential to safeguard the authenticity and reliability of digital information. This survey reviews recent progress in detection methods, categorizing approaches into passive and active categories based on their reliance on intrinsic textual features or embedded signals. Passive detection is further divided into surface linguistic feature-based and language model-based methods, whereas active detection encompasses watermarking-based and semantic retrieval-based approaches. This taxonomy enables systematic comparison of methodological differences in model dependency, applicability, and robustness. A key challenge for AI-generated text detection is that existing detectors are highly vulnerable to adversarial attacks, particularly paraphrasing, which substantially compromises their effectiveness. Addressing this gap highlights the need for future research on enhancing robustness and cross-domain generalization. By synthesizing current advances and limitations, this survey provides a structured reference for the field and outlines pathways toward more reliable and scalable detection solutions.
Keywords: AI-generated text detection; large language models; text classification; watermarking
Research on the Classification of Digital Cultural Texts Based on ASSC-TextRCNN Algorithm
2
Authors: Zixuan Guo Houbin Wang Sameer Kumar Yuanfang Chen 《Computers, Materials & Continua》 2026, No. 3, pp. 2119-2145 (27 pages)
With the rapid development of digital culture, a large number of cultural texts are presented in digital and networked form. These texts are sparse, real-time, and non-standard in expression, which poses serious challenges to traditional classification methods. To cope with these problems, this paper proposes a new ASSC (ALBERT, SVD, Self-Attention and Cross-Entropy)-TextRCNN digital cultural text classification model. Based on the TextRCNN framework, the ALBERT pre-trained language model is introduced to improve the depth and accuracy of semantic embedding. Combined with a dual attention mechanism, the model's ability to capture and model potential key information in short texts is strengthened. Singular Value Decomposition (SVD) replaces the traditional max pooling operation, which effectively reduces the feature loss rate and retains more key semantic information. The cross-entropy loss function is used to optimize the prediction results, making the model more robust in class distribution learning. The experimental results indicate that, in the digital cultural text classification task, the proposed ASSC-TextRCNN method achieves an 11.85% relative improvement in accuracy and an 11.97% relative increase in the F1 score over the baseline model. Meanwhile, the relative error rate decreases by 53.18%. This achievement not only validates the effectiveness and advanced nature of the proposed approach but also offers a novel technical route and methodological underpinnings for the intelligent analysis and dissemination of digital cultural texts. It holds great significance for promoting the in-depth exploration and value realization of digital culture.
Keywords: text classification; natural language processing; TextRCNN model; ALBERT pre-training; singular value decomposition; cross-entropy loss function
Context-Aware Spam Detection Using BERT Embeddings with Multi-Window CNNs
3
Authors: Sajid Ali Qazi Mazhar Ul Haq Ala Saleh Alluhaidan Muhammad Shahid Anwar Sadique Ahmad Leila Jamel 《Computer Modeling in Engineering & Sciences》 2026, No. 1, pp. 1296-1310 (15 pages)
Spam emails remain one of the most persistent threats to digital communication, necessitating effective detection solutions that safeguard both individuals and organisations. We propose a spam email classification framework that uses Bidirectional Encoder Representations from Transformers (BERT) for contextual feature extraction and a multiple-window Convolutional Neural Network (CNN) for classification. To identify semantic nuances in email content, BERT embeddings are used, and CNN filters extract discriminative n-gram patterns at various levels of detail, enabling accurate spam identification. The proposed model outperformed Word2Vec-based baselines on a sample of 5728 labelled emails, achieving an accuracy of 98.69%, AUC of 0.9981, F1 Score of 0.9724, and MCC of 0.9639. With a medium kernel size of (6, 9) and compact multi-window CNN architectures, it improves performance. Cross-validation illustrates stability and generalization across folds. By balancing high recall with minimal false positives, our method provides a reliable and scalable solution for current spam detection in advanced deep learning. By combining contextual embedding and a neural architecture, this study develops a security analysis method.
Keywords: e-mail spam detection; BERT embedding; text classification; cybersecurity; CNN
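A Method for Classifying Basic Research and Applied Research Papers Based on the BERT-TextCNN Model
4
Authors: 张萌萌 钟永恒 刘佳 《科技管理研究》 2026, No. 1, pp. 256-267 (12 pages)
This study aims to build an efficient and accurate classification model for determining whether an individual paper belongs to basic research or applied research. It constructs a BERT-TextCNN model that incorporates semi-automatic annotation: the semi-automatic annotation strategy reduces manual labeling effort and improves classification efficiency, BERT generates the text vectors, and TextCNN extracts the key features. The classification results for the quantum information field are then analyzed with bibliometric methods and the BERTopic model. The results show that the model reaches an F1 score of 0.896, an improvement of 2.1% and 7.9% over BERT and TextCNN respectively, and that it significantly outperforms large language models such as Baichuan4-Turbo, DeepSeek-V3, and GLM-4-Plus, with F1 gains of 12.2%, 13.1%, and 18.8% respectively. This both confirms the advantage of fusing semantic representations with local features and effectively overcomes the "high recall, low precision" weakness of large language models in domain-specific classification. Applying the model to the quantum information field shows that basic research concentrates on directions such as quantum states and entanglement and ion spin, while applied research focuses on key distribution, quantum sensing, and network components. The study offers a new method for classifying scientific literature and has significant application value for research evaluation and resource optimization.
Keywords: literature classification; deep learning; semi-automatic annotation; text mining; quantum information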
GSPT-CVAE: A New Controlled Long Text Generation Method Based on T-CVAE
5
Authors: Tian Zhao Jun Tu Puzheng Quan Ruisheng Xiong 《Computers, Materials & Continua》 2025, No. 7, pp. 1351-1377 (27 pages)
Aiming at the problems of incomplete characterization of text relations, poor guidance of potential representations, and low quality of model generation in the field of controllable long text generation, this paper proposes a new GSPT-CVAE model (Graph Structured Processing, Single Vector, and Potential Attention Computing Transformer-Based Conditioned Variational Autoencoder model). The model obtains a more comprehensive representation of textual relations by graph-structured processing of the input text, and at the same time obtains a single vector representation by weighted merging of the vector sequences after graph-structured processing to get an effective potential representation. In the process of potential representation guiding text generation, the model adopts a combination of traditional embedding and potential attention calculation to give full play to the guiding role of potential representation for generating text, to improve the controllability and effectiveness of text generation. The experimental results show that the model has excellent representation learning ability and can learn rich and useful textual relationship representations. The model also achieves satisfactory results in the effectiveness and controllability of text generation and can generate long texts that match the given constraints. The ROUGE-1 F1 score of this model is 0.243, the ROUGE-2 F1 score is 0.041, the ROUGE-L F1 score is 0.22, and the PPL-Word score is 34.303, which gives the GSPT-CVAE model a certain advantage over the baseline model. Meanwhile, this paper compares this model with the state-of-the-art generative models T5, GPT-4, Llama2, and so on, and the experimental results show that the GSPT-CVAE model has a certain competitiveness.
Keywords: controllable text generation; textual graph structuring; text relationships; potential characterization
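The abstract above reports ROUGE-1 F1 among its metrics. As a reminder of what that number measures, here is a minimal sketch of ROUGE-1 F1 as clipped unigram overlap. It assumes lowercased whitespace tokenization and a single reference; the function name `rouge1_f1` is illustrative, and the paper's own evaluation setup (tokenizer, stemming, multiple references) is not stated in the abstract.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1 from clipped unigram overlap (assumes whitespace tokens)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token,
    # so a repeated word is only credited as often as both texts contain it.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, scoring the candidate "the cat sat" against the reference "the cat sat on the mat" gives precision 1, recall 0.5, and therefore an F1 of 2/3.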
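Log Anomaly Detection Based on Transformer and Text-CNN (cited by: 1)
6
Authors: 尹春勇 张小虎 《计算机工程与科学》 PKU Core, 2025, No. 3, pp. 448-458 (11 pages)
Log data is among the most important data resources in a software system, recording detailed information about the system while it runs, and automated log anomaly detection is essential to maintaining system security. With the wide application of large language models in natural language processing, many Transformer-based log anomaly detection methods have been proposed. Traditional Transformer-based methods, however, struggle to capture the local features of log sequences. To address this problem, LogTC, a log anomaly detection method based on Transformer and Text-CNN, is proposed. First, rule matching converts logs into structured log data while retaining the useful information in each log statement. Second, depending on the characteristics of the logs, fixed windows or session windows divide the log statements into log sequences. Third, the natural language processing technique Sentence-BERT generates semantic representations of the log statements. Finally, the semantic vectors of the log sequences are fed into the LogTC model for detection. Experimental results show that LogTC effectively detects anomalies in log data and achieves good results on both of the two datasets tested.
Keywords: log anomaly detection; deep learning; word embedding; Transformer; Text-CNN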
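RNSQL: Text2SQL Generation Incorporating Denormalization
7
Authors: 帖军 范子琪 孙翀 郑禄 朱柏尔 《计算机应用与软件》 PKU Core, 2025, No. 9, pp. 31-37, 86 (8 pages)
Text2SQL is an important task in natural language processing research and plays a key role in intelligent question-answering systems; its core task is to automatically convert a question posed in natural language into an SQL query. Current research focuses on improving the matching accuracy of SQL clause subtasks but neglects the syntactic correctness of the generated SQL, and SQL generation involving multi-table joins still produces many errors. A neural-network-based Text2SQL method is therefore proposed that restructures the database schema through denormalization and attends to the correctness of SQL syntax generation; it is called the Reverse Normalization SQL network (RNSQL). Theoretical analysis and experiments on the public Spider dataset show that RNSQL effectively improves the quality of the Text2SQL task.
Keywords: denormalization; semantic parsing; Text2SQL; slot filling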
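Design of the Coherence Imaging Spectroscopy Diagnostic System on the J-TEXT Tokamak
8
Authors: 聂林 吴骏彬 龙婷 雷驰 严伟 李杨波 张霄翼 J-TEXT Experimental Team 《核聚变与等离子体物理》 PKU Core, 2025, No. 3, pp. 273-279 (7 pages)
Coherence imaging spectroscopy is a passive spectroscopic diagnostic that uses a high-speed camera to produce two-dimensional images of impurity ion flow velocity at the plasma boundary, and it plays an important role in studying toroidal rotation and impurity ion distribution in tokamak boundary and divertor plasmas. A coherence imaging spectroscopy diagnostic system based mainly on the C III line (464.88 nm) has been developed and deployed on the J-TEXT device. The system is designed with a 12° optical field of view and mainly observes the edge plasma region on the high-field side of J-TEXT. In terms of performance, the system has a time resolution of 2 ms and a spatial resolution of 11 mm in the vertical direction. The diagnostic system has completed experimental testing and has obtained key data from the plasma boundary, providing a new experimental tool for boundary physics research.
Keywords: coherence imaging spectroscopy diagnostic; toroidal velocity; J-TEXT tokamak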
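Chinese Short-Text Sentiment Classification: A Transformer-TextCNN Model with Enhanced Position Awareness
9
Authors: 李浩君 王耀东 汪旭辉 《计算机工程与应用》 PKU Core, 2025, No. 11, pp. 216-226 (11 pages)
To address the insufficient capture of text position information and key features in current Chinese short-text sentiment classification models, a Transformer-TextCNN sentiment classification model with enhanced position awareness is proposed. BERT's learnable absolute position encoding and sinusoidal position encoding strengthen the model's position awareness; the global context understanding of the Transformer is combined with the local feature capture of TextCNN to extract the global and local features of Chinese short texts respectively; and a sentiment feature output service that combines enhanced position awareness with feature collaboration is built to classify the sentiment of Chinese short texts accurately. Experimental results show that the model reaches an accuracy of 90.23% on a video bullet-comment (danmaku) dataset and 87.38% on the SMP2020 dataset. Compared with the best baseline model, accuracy improves by 1.98 and 0.44 percentage points on the two datasets respectively, giving better classification performance on Chinese short-text sentiment classification.
Keywords: text sentiment classification; BERT; Transformer; TextCNN; position encoding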
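The abstract above mentions sinusoidal position encoding as one of its two position signals. The sketch below shows the standard fixed sinusoidal table from the original Transformer formulation, in plain Python. How the paper combines it with BERT's learned position embeddings is not described in the abstract, so this covers only the sinusoidal part; the function name and the base of 10000 are the conventional choices, assumed here.

```python
import math


def sinusoidal_position_encoding(num_positions: int, dim: int) -> list[list[float]]:
    """Fixed sinusoidal table: even columns use sin, odd columns use cos."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(0, dim, 2):
            # Each sin/cos pair shares one wavelength; wavelengths grow
            # geometrically with the column index i.
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))
            row.append(math.cos(angle))
        table.append(row)
    return table
```

Position 0 always encodes to alternating 0 and 1, and each row is typically added to the token embedding at the same position before the first attention layer.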
Text Structured Algorithm of Lung Cancer Cases Based on Deep Learning
10
Authors: MI Linhui YUAN Junyi ZHOU Yankang HOU Xumin 《Journal of Shanghai Jiaotong University (Science)》 2025, No. 4, pp. 778-789 (12 pages)
Surgical site infections (SSIs) are the most common healthcare-related infections in patients with lung cancer. Constructing a lung cancer SSI risk prediction model requires the extraction of relevant risk factors from lung cancer case texts, which involves two types of text structuring tasks: attribute discrimination and attribute extraction. This article proposes a joint model, Multi-BGLC, around these two types of tasks, using bidirectional encoder representations from transformers (BERT) as the encoder and fine-tuning the decoder composed of graph convolutional neural network (GCNN) + long short-term memory (LSTM) + conditional random field (CRF) based on cancer case data. The GCNN is used for attribute discrimination, whereas the LSTM and CRF are used for attribute extraction. The experiment verified the effectiveness and accuracy of the model compared with other baseline models.
Keywords: text structuring; text classification; sequence labeling; data augmentation; lung cancer; electronic medical record
Safeguarding a Treasure Trove: Sakya Monastery Preserves Relics and Ancient Texts
11
Authors: Palden Nyima (Text/Photos) 《China's Tibet》 2025, No. 6, pp. 36-39 (4 pages)
Since the launch of a digitization project for the protection and utilization of ancient texts in the Sakya Monastery of the Xizang Autonomous Region in 2012, significant efforts and achievements have been made in ancient text preservation.
Keywords: protection; utilization; ancient texts; digitization; Sakya Monastery; ancient text preservation; digitization project; Xizang Autonomous Region
Multilingual Text Summarization in Healthcare Using Pre-Trained Transformer-Based Language Models
12
Authors: Josua Käser Thomas Nagy Patrick Stirnemann Thomas Hanne 《Computers, Materials & Continua》 2025, No. 4, pp. 201-217 (17 pages)
We analyze the suitability of existing pre-trained transformer-based language models (PLMs) for abstractive text summarization on German technical healthcare texts. The study focuses on the multilingual capabilities of these models and their ability to perform the task of abstractive text summarization in the healthcare field. The research hypothesis was that large language models could perform high-quality abstractive text summarization on German technical healthcare texts, even if the model is not specifically trained in that language. Through experiments, the research questions explore the performance of transformer language models in dealing with complex syntax constructs, the difference in performance between models trained in English and German, and the impact of translating the source text to English before conducting the summarization. We conducted an evaluation of four PLMs (GPT-3, a translation-based approach also utilizing GPT-3, a German language model, and a domain-specific bio-medical model approach). The evaluation considered the informativeness using 3 types of metrics based on Recall-Oriented Understudy for Gisting Evaluation (ROUGE) and the quality of results, which is manually evaluated considering 5 aspects. The results show that text summarization models could be used in the German healthcare domain and that domain-independent language models achieved the best results. The study proves that text summarization models can simplify the search for pre-existing German knowledge in various domains.
Keywords: text summarization; pre-trained transformer-based language models; large language models; technical healthcare texts; natural language processing
Heimtextil grows and starts with over 3,000 exhibitors and design icon Patricia Urquiola
13
《China Textile》 2025, No. 1, pp. 54-55 (2 pages)
On January 14, Heimtextil kicked off the new trade fair year with over 3,000 exhibitors from 65 countries. With steady growth, the leading trade fair for home and contract textiles and textile design is strongly positioned. This makes it a reliable platform for international participants. At the opening, architect and designer Patricia Urquiola presented her installation 'among-us' at Heimtextil.
Keywords: textiles; exhibitor; text
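A Chinese Text-to-SQL Model for Graduate Admissions Consultation (cited by: 1)
14
Authors: 王庆丰 李旭 姚春龙 程腾腾 《计算机工程》 PKU Core, 2025, No. 3, pp. 362-368 (7 pages)
Graduate admissions consultation is a representative question-answering scenario with high query frequency over a short period. Existing admissions question-answering systems based on word vectors and similar methods return answers that are not precise enough, and their question banks must be updated every year. To address these problems, the RESDSQL model, based on text-to-structured-query-language (Text-to-SQL) technology, is introduced; it converts a natural language question into an SQL statement, queries a structured database for the answer, and returns it. High-frequency questions from graduate admissions were collected, question and SQL statement templates were built from the real admissions data of three universities, and a dataset was constructed by filling the templates, giving 1501 training examples and 386 test examples. The RoBERTa model in RESDSQL was replaced with XLM-RoBERTa, which has stronger multilingual generation capability, and the T5 model was replaced with mT5; the models were fine-tuned on the target-domain dataset and achieved high accuracy on admissions questions, with the mT5-large model reaching an execution accuracy of 0.95 and an exact match rate of 1. Compared with the C3SQL method, which is based on ChatGPT3.5 and uses zero-shot prompting, the model is better in both performance and cost.
Keywords: Chinese text-to-SQL; natural language query; Chinese SQL statement generation; pre-trained model; Text-to-SQL dataset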
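The abstract above describes building a Text-to-SQL dataset by filling paired question and SQL templates, then reporting execution accuracy and exact match. The following is a minimal sketch of both ideas. The table name `admissions`, its columns, the templates, and the slot values are all hypothetical and chosen only for illustration; the paper's actual templates and schema are not given in the abstract.

```python
import sqlite3
from itertools import product

# Hypothetical paired templates and slot values (not from the paper).
QUESTION_TEMPLATE = "How many students does {school} plan to admit to {major} in {year}?"
SQL_TEMPLATE = (
    "SELECT quota FROM admissions "
    "WHERE school = '{school}' AND major = '{major}' AND year = {year}"
)
SLOTS = {
    "school": ["University A", "University B"],
    "major": ["Computer Science", "Mathematics"],
    "year": [2024, 2025],
}


def build_pairs():
    """Fill both templates with every slot combination to get (question, SQL) pairs."""
    keys = list(SLOTS)
    pairs = []
    for values in product(*(SLOTS[k] for k in keys)):
        slot = dict(zip(keys, values))
        pairs.append((QUESTION_TEMPLATE.format(**slot), SQL_TEMPLATE.format(**slot)))
    return pairs


def exact_match(predicted_sql, gold_sql):
    """String-level match after collapsing whitespace and case."""
    return " ".join(predicted_sql.lower().split()) == " ".join(gold_sql.lower().split())


def execution_match(conn, predicted_sql, gold_sql):
    """True when both queries return the same rows; invalid SQL counts as a miss."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False
    return sorted(predicted) == sorted(conn.execute(gold_sql).fetchall())
```

The two metrics can disagree: a prediction that reorders the `WHERE` conditions fails this simple exact match yet returns the same rows, which is why Text-to-SQL work usually reports both.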
Smart Approaches to Efficient Text Mining for Categorizing Sexual Reproductive Health Short Messages into Key Themes
15
Authors: Tobias Makai Mayumbo Nyirenda 《Open Journal of Applied Sciences》 2024, No. 2, pp. 511-532 (22 pages)
To promote behavioral change among adolescents in Zambia, the National HIV/AIDS/STI/TB Council, in collaboration with UNICEF, developed the Zambia U-Report platform. This platform provides young people with improved access to information on various Sexual Reproductive Health topics through Short Messaging Service (SMS) messages. Over the years, the platform has accumulated millions of incoming and outgoing messages, which need to be categorized into key thematic areas for better tracking of sexual reproductive health knowledge gaps among young people. The current manual categorization process of these text messages is inefficient and time-consuming, and this study aims to automate the process for improved analysis using text-mining techniques. Firstly, the study investigates the current text message categorization process and identifies a list of categories adopted by counselors over time, which are then used to build and train a categorization model. Secondly, the study presents a proof-of-concept tool that automates the categorization of U-Report messages into key thematic areas using the developed categorization model. Finally, it compares the performance and effectiveness of the developed proof-of-concept tool against the manual system. The study used a dataset comprising 206,625 text messages. The current process would take roughly 2.82 years to categorize this dataset, whereas the trained SVM model would require only 6.4 minutes while achieving an accuracy of 70.4%, demonstrating that the automated method is significantly faster, more scalable, and more consistent than the current manual categorization. These advantages make the SVM model a more efficient and effective tool for categorizing large unstructured text datasets. These results and the proof-of-concept tool developed demonstrate the potential for enhancing the efficiency and accuracy of message categorization on the Zambia U-Report platform and other similar text message-based platforms.
Keywords: Knowledge Discovery in Text (KDT); Sexual Reproductive Health (SRH); Text Categorization; Text Classification; Text Extraction; Text Mining; Feature Extraction; Automated Classification; Process Performance; Stemming and Lemmatization; Natural Language Processing (NLP)
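The abstract above compares 2.82 years of manual work with 6.4 minutes of SVM inference for 206,625 messages. The short calculation below makes the implied rates explicit. It assumes 365-day calendar years of continuous time for the manual figure; the abstract does not say how the 2.82-year estimate was derived, so the per-day rate is only indicative.

```python
MESSAGES = 206_625
MANUAL_YEARS = 2.82
SVM_MINUTES = 6.4

# Assumption: a year is 365 calendar days of elapsed time.
MINUTES_PER_YEAR = 365 * 24 * 60

manual_minutes = MANUAL_YEARS * MINUTES_PER_YEAR
speedup = manual_minutes / SVM_MINUTES                 # elapsed-time ratio
manual_per_day = MESSAGES / (MANUAL_YEARS * 365)       # implied manual throughput
svm_per_second = MESSAGES / (SVM_MINUTES * 60)         # implied model throughput

print(f"speedup: {speedup:,.0f}x")
print(f"manual: {manual_per_day:.0f} messages/day")
print(f"SVM: {svm_per_second:.0f} messages/second")
```

Under that assumption the manual figure corresponds to about 200 messages per calendar day, and the elapsed-time ratio is on the order of 2 × 10^5. The 70.4% accuracy reported in the abstract is a separate consideration that this arithmetic does not capture.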
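A Conversational Question-Answering System for Bibliographic Search Based on Retrieval-Augmented Text-to-SQL Generation
16
Authors: 王震宇 朱学芳 张君冬 杨睿 刘崧印 《数据分析与知识发现》 PKU Core, 2025, No. 11, pp. 165-174 (10 pages)
[Objective] To address the difficulty of accurately mapping natural language queries to structured database queries in bibliographic search, this paper builds a conversational question-answering system and proposes an improved method. [Methods] The system uses the Model Context Protocol to integrate a large language model seamlessly with an external database. On this basis, because example-driven Text-to-SQL generation is easily affected by noise and domain differences, an example selection strategy based on contrastive learning is designed: a text embedding model is fine-tuned to attend more closely to the syntactic structure and retrieval intent of a query, which improves the quality of similarity ranking. Experiments are run on a purpose-built semantic parsing dataset for bibliographic search, comparing system performance under zero-shot and few-shot conditions. [Results] Compared with the zero-shot setting, the DeepSeek-V3 model using the proposed method improves SQL execution accuracy by 18.5 percentage points in the 5-shot setting, confirming the effectiveness of the example selection strategy for domain-specific Text-to-SQL tasks. [Limitations] Because the experimental dataset has limited coverage, the system's adaptability to cross-domain queries needs further improvement. [Conclusions] The study demonstrates the effectiveness of combining large language models with a contrastive-learning example selection strategy for intelligent bibliographic search, and can inform the construction and optimization of conversational question-answering systems in other vertical domains.
Keywords: bibliographic search; retrieval-augmented generation; Text-to-SQL; conversational question-answering system; Model Context Protocol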
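Unsupervised Extractive Summarization of Long Documents Based on TextRank and Self-Attention
17
Authors: 邢玲 程兵 闫强 《计算机应用与软件》 PKU Core, 2025, No. 3, pp. 274-283 (10 pages)
For automatic text summarization of long Chinese documents, two models that combine TextRank with self-attention are proposed: TRAI and TRAO. TRAI takes a weighted sum of sentence similarity, obtained by counting co-occurring characters, and sentence relevance, obtained from self-attention, and uses it as the TextRank edge weight in the iterative computation that scores sentences. TRAO first scores sentences with TextRank; it then uses self-attention to re-represent each sentence as a distributed vector that incorporates information from the whole document, computes the cosine similarity between sentences on that basis, and uses it as the TextRank edge weight in a second iterative scoring; the weighted sum of the two scores is the final sentence score. Both models rank sentences by score to obtain a candidate summary. To remove redundancy, Maximal Marginal Relevance (MMR) is used to select the summary sentences from the candidate summary. In experiments on a purpose-built long-document dataset, the proposed methods improve significantly on TextRank in ROUGE metrics.
Keywords: Chinese long-text summarization; TextRank; self-attention mechanism; distributed vector representation; semantic information; fusion of document information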
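The abstract above relies on two generic building blocks: TextRank scoring over a weighted sentence graph, and MMR selection to drop redundant sentences. The sketch below shows plain versions of both, given any precomputed symmetric similarity matrix. It does not reproduce TRAI or TRAO: the self-attention edge weights are not implemented, and the damping factor 0.85, the MMR trade-off `lam`, and the function names are assumptions for illustration.

```python
def textrank(weights, damping=0.85, iterations=50):
    """Score sentences on a weighted graph by power iteration (diagonal ignored)."""
    n = len(weights)
    # Total outgoing edge weight per sentence, excluding self-loops.
    out_sum = [sum(w for k, w in enumerate(row) if k != j) for j, row in enumerate(weights)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if j != i and out_sum[j] > 0
            )
            updated.append((1 - damping) / n + damping * rank)
        scores = updated
    return scores


def mmr_select(scores, similarity, k, lam=0.7):
    """Greedy MMR: trade sentence score against similarity to already chosen sentences."""
    selected = []
    candidates = set(range(len(scores)))
    while candidates and len(selected) < k:
        def marginal(i):
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * scores[i] - (1 - lam) * redundancy
        best = max(sorted(candidates), key=marginal)
        selected.append(best)
        candidates.remove(best)
    return selected


# Sentences 0 and 1 are near duplicates; sentence 2 covers different content.
similarity = [
    [0.0, 0.9, 0.2],
    [0.9, 0.0, 0.2],
    [0.2, 0.2, 0.0],
]
scores = textrank(similarity)
summary = mmr_select(scores, similarity, k=2)
```

In this toy graph the two near-duplicate sentences receive the highest TextRank scores, but MMR picks only one of them and then prefers the distinct third sentence, which is the redundancy-removal effect the abstract describes.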
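A Weibo Topic Clustering Analysis Method Based on Text2Vec_AE_KMeans
18
Authors: 万文桐 黄润才 《智能计算机与应用》 2025, No. 5, pp. 82-89 (8 pages)
Traditional topic clustering methods model Weibo text with static word vectors, which cope poorly with the non-standard expressions and polysemy of Weibo text and thereby degrade clustering quality and topic description. To address this, a Weibo topic clustering analysis method based on Text2Vec_AE_KMeans, combining deep text feature extraction with clustering, is proposed. First, a Text2Vec pre-trained model, designed on the MacBert pre-trained model and the CoSENT sentence modeling method, produces semantic representations of Weibo topic texts, overcoming the shortcomings of static word vectors in text feature modeling. Next, an AutoEncoder dimensionality-reduction network with nonlinear activation functions reduces the high-dimensional nonlinear text features. Finally, the KMeans_C-TF-IDF algorithm performs clustering analysis of the Weibo texts, capturing the topic distribution from the perspective of the clusters. On a real Weibo topic dataset, the proposed method performs well on clustering evaluation metrics compared with traditional static word vector modeling, and the generated topic information is readily identifiable.
Keywords: topic clustering analysis; CoSENT; Text2Vec; autoencoder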
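Resilience of the Global Home Textile Industry: Heimtextil 2025 Sets a New Record for Exhibition Scale (cited by: 1)
19
Authors: 钟梦夏 《中国纺织》 2025, No. 1, pp. 96-97 (2 pages)
From January 14 to 17, Heimtextil 2025, the Frankfurt international trade fair for home and contract textiles (hereafter "Heimtextil 2025"), was held at the Frankfurt exhibition center in Germany. The four-day fair brought together more than 3,000 exhibitors from 142 countries and regions and more than 50,000 visitors, and several figures, including the number of exhibitors, the number of visitors, and visitor satisfaction, set new records.
Keywords: exhibition scale; home textile industry; Frankfurt exhibition; visitor satisfaction; text; textiles; He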
From text to image: challenges in integrating vision into ChatGPT for medical image interpretation
20
Authors: Shunsuke Koga Wei Du 《Neural Regeneration Research》 SCIE CAS, 2025, No. 2, pp. 487-488 (2 pages)
Large language models (LLMs), such as ChatGPT developed by OpenAI, represent a significant advancement in artificial intelligence (AI), designed to understand, generate, and interpret human language by analyzing extensive text data. Their potential integration into clinical settings offers a promising avenue that could transform clinical diagnosis and decision-making processes in the future (Thirunavukarasu et al., 2023). This article aims to provide an in-depth analysis of LLMs' current and potential impact on clinical practices. Their ability to generate differential diagnosis lists underscores their potential as invaluable tools in medical practice and education (Hirosawa et al., 2023; Koga et al., 2023).
Keywords: image; diagnosis; text