Large-Capacity Constructive Information Hiding Based on Song Ci Generation
Abstract Text is a typical medium for disseminating information and one of the most commonly used kinds of cover data in information hiding research, and it has attracted extensive attention in the information security community. Text generation based on natural language processing (NLP) is becoming increasingly popular, but its application to information hiding is still at an elementary stage and has not yet produced satisfactory results. For information hiding algorithms based on text generation, the main challenge is to increase hiding capacity while preserving the quality of the generated text. To address this, we propose a novel constructive information hiding algorithm based on Song Ci generation. First, the model is pre-trained on Song Ci text data, and a set of customized indicators is introduced to improve modeling performance, especially robustness in format, rhythm, and sentence integrity. The Song Ci generation model is built on the attention mechanism, with a transformer-based auto-regressive language model as its backbone. Second, a metrical module is designed according to the inherent format information of Song Ci tunes. During generation, the metrical module is fed into the generation model, and Song Ci verses are produced through the combined effect of symbol set design and encoding. To hide secret information, the metrical module is reconstructed by choosing, according to the secret information, among different tunes (with level or oblique rhymes), tune templates, keywords, rhythms, and rhyming characters. Information extraction is the inverse of hiding. It does not require the Song Ci generation model; it only needs index lookups against the metrical template and the dictionary library, which makes extraction more efficient. Experimental results show that the proposed algorithm generates Song Ci with strict format, clear rhythm, and high sentence integrity, and that the generated Song Ci offer high security and an average hiding capacity of 21 bits per sentence. Its overall performance is significantly better than that of several previously reported algorithms. We also trained our model with information hiding and a typical current poetry generation model without information hiding on the same training data; the resulting performance indices of the two models, such as perplexity, are close. A comparison of Song Ci randomly generated by the two models shows that the semantic quality of the generated texts is good. We further analyze the algorithm's resistance to semantic-similarity detection, and the results show that Song Ci with and without hidden information can hardly be distinguished in the semantic space. The time complexity, tampering detection, and security of the algorithm are also discussed, and future research directions are given.
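The abstract describes hiding as a sequence of choices (tune, keyword, rhyme, and so on) and extraction as a plain index lookup that needs no generation model. The Python sketch below illustrates only that index-mapping idea. The tables, their sizes, and the function names `hide` and `extract` are hypothetical and are not taken from the paper; the actual algorithm additionally conditions a Song Ci generation model on the chosen metrical module, which is omitted here.

```python
# Hypothetical shared tables (illustrative only, not the paper's dictionaries).
# Each table has a power-of-two size, so one choice encodes a fixed number of bits.
TABLES = {
    "tune": ["Huanxisha", "Pusaman", "Zhegutian", "Dielianhua"],      # 2 bits
    "keyword": ["spring", "autumn", "moon", "wine",
                "dream", "rain", "mountain", "flower"],               # 3 bits
    "rhyme": ["ong", "ang"],                                          # 1 bit
}


def bits_per_choice(options):
    """Number of bits one selection from a power-of-two-sized list can carry."""
    return len(options).bit_length() - 1


def hide(bits: str) -> dict:
    """Consume secret bits by selecting one entry from each table in order."""
    choices, pos = {}, 0
    for name, options in TABLES.items():
        k = bits_per_choice(options)
        chunk = bits[pos:pos + k].ljust(k, "0")  # pad a short message with zeros
        choices[name] = options[int(chunk, 2)]
        pos += k
    return choices


def extract(choices: dict) -> str:
    """Recover the bits by looking up each choice's index in the shared tables."""
    out = []
    for name, options in TABLES.items():
        k = bits_per_choice(options)
        out.append(format(options.index(choices[name]), f"0{k}b"))
    return "".join(out)


if __name__ == "__main__":
    secret = "101101"
    chosen = hide(secret)
    print(chosen)
    print(extract(chosen) == secret)
```

In this toy setting the capacity is simply the sum of the per-table bit counts (6 bits here). The receiver only needs the same tables and the observed choices, which mirrors the abstract's point that extraction works by indexing templates and dictionaries rather than by running the generator.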
Authors QIN Chuan (秦川); LI Rong-Shou (李蓉受); QIAN Zhen-Xing (钱振兴); ZHANG Xin-Peng (张新鹏) (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093; School of Computer Science, Fudan University, Shanghai 200433)
Source Chinese Journal of Computers (《计算机学报》), 2023, No. 1, pp. 17-30 (14 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
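Funding Supported by the National Natural Science Foundation of China General Program (No. 62172280), the National Natural Science Foundation of China Key Program (No. U20B2051), and the Natural Science Foundation of Shanghai (No. 21ZR1444600).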
Keywords text generation; constructive information hiding; Song Ci; metrical control; hiding capacity