Highly Available Cross-Domain Machine-Generated Text Detection Method
Abstract Artificial intelligence generated content (AIGC) has seriously affected the authenticity and reliability of information, causing numerous technical and social problems such as data pollution, disputes over property ownership, and a crisis of credibility. Existing machine-generated text detection methods mainly target specific domains and have relatively low detection accuracy, and they are even harder to apply to cross-domain data such as sensitive, private, or small-sample data. To address this problem, a highly available cross-domain machine-generated text detection method is proposed. The method first selects the class-center samples within any single domain to train a domain-specific encoder, using domain features to enhance boundary distinguishability. It then constructs an orthogonal loss function and, jointly with the domain-specific encoder, trains a domain-general encoder, reinforcing the features common to machine-generated text to support detection across multiple domains. Experimental results on real-world data show that a detection model trained on a single domain achieves high detection accuracy in other domains without fine-tuning, indicating a wide range of applicability and strong practicality.
Authors LUO Senlin; YANG Zongyuan; PAN Limin; ZHOU Jinjie; MEN Yuanhao; LI Ye (School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China; China Network Coordination Emergency Response Team/China Coordination Center, Beijing 100029, China)
Source Transactions of Beijing Institute of Technology (PKU Core journal), 2025, No. 12, pp. 1296-1304 (9 pages)
Funding National "242" Information Security Program (2020A065).
Keywords machine-generated text detection; domain generalization; pre-trained language model
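
The abstract names two ingredients without giving formulas: selecting class-center samples, and an orthogonal loss that couples the domain-specific and domain-general encoders. The sketch below is one plausible reading, not the authors' implementation. It assumes that "class-center samples" means the samples nearest to each class centroid in embedding space, and that the orthogonal loss can be approximated by the mean squared cosine similarity between a sample's two embeddings. The function names `select_class_center_samples` and `orthogonality_penalty`, and the parameter `k`, are hypothetical.

```python
from math import sqrt


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def select_class_center_samples(embeddings, labels, k):
    """For each label, return the indices of the k samples nearest the class centroid.

    Assumption: "class-center" is read here as nearest-to-centroid in
    Euclidean distance; the paper may define it differently.
    """
    selected = {}
    for label in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == label]
        center = centroid([embeddings[i] for i in idx])
        # Sort this class's sample indices by distance to the class centroid.
        idx.sort(
            key=lambda i: sqrt(
                sum((a - b) ** 2 for a, b in zip(embeddings[i], center))
            )
        )
        selected[label] = idx[:k]
    return selected


def orthogonality_penalty(specific, general):
    """Mean squared cosine similarity between paired embedding vectors.

    Assumption: a stand-in for the paper's orthogonal loss. The value is 0
    when each sample's domain-specific and domain-general embeddings are
    orthogonal, and 1 when they are parallel.
    """
    total = 0.0
    for s, g in zip(specific, general):
        dot = sum(a * b for a, b in zip(s, g))
        norm = sqrt(sum(a * a for a in s)) * sqrt(sum(b * b for b in g))
        # Zero vectors contribute no penalty instead of dividing by zero.
        total += (dot / norm) ** 2 if norm > 0 else 0.0
    return total / len(specific)
```

In a training loop, a penalty of this kind would typically be added to the classification loss of the domain-general encoder, so that it is pushed toward features the domain-specific encoder does not already capture. The weighting between the two terms is not stated in the abstract.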