Images play an important role in transmitting information over the Internet, and how to ensure and improve their security has become a focus of international research. We combine a DNA codec with the quantum Arnold transform (QArT) to propose a new double encryption algorithm for quantum color images that improves the security and robustness of image encryption. First, we use the biological characteristics of DNA codecs to encode and decode the pixel color information in quantum color images, achieving pixel-level diffusion. Second, we use QArT to scramble the position information of the quantum images and use the resulting image as the key matrix for quantum XOR operations. All quantum operations in this paper are reversible, so a ciphertext image can be decrypted by running the encryption process in reverse. We conduct encryption and decryption simulation experiments on three color images: "Monkey", "Flower", and "House". The experimental results show that the encrypted images have similar histogram peak values and correlation. Averaged over the three RGB channels, the normalized pixel change rate (NPCR) is 99.61%, the uniform average change intensity (UACI) is 33.41%, and the information entropy is about 7.9992. In addition, the robustness of the proposed algorithm is verified by simulating noise interference in a practical scenario.
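The three metrics reported above can be illustrated with a short classical sketch. This is not the paper's quantum simulation: it assumes each color channel is available as a flat list of 8-bit integers (0 to 255), and the function names `npcr`, `uaci`, and `entropy` are hypothetical helpers introduced only for this example. For reference, two independent uniformly random 8-bit images give an NPCR of about 99.61% and a UACI of about 33.46%, and the maximum entropy of an 8-bit channel is 8 bits.

```python
import math
from collections import Counter


def npcr(c1, c2):
    """Percentage of pixel positions whose values differ between two channels."""
    if len(c1) != len(c2) or not c1:
        raise ValueError("channels must be non-empty and of equal length")
    changed = sum(1 for a, b in zip(c1, c2) if a != b)
    return 100.0 * changed / len(c1)


def uaci(c1, c2):
    """Mean absolute pixel difference, as a percentage of the 8-bit maximum (255)."""
    if len(c1) != len(c2) or not c1:
        raise ValueError("channels must be non-empty and of equal length")
    total = sum(abs(a - b) for a, b in zip(c1, c2))
    return 100.0 * total / (255 * len(c1))


def entropy(channel):
    """Shannon entropy in bits per pixel of one 8-bit channel."""
    if not channel:
        raise ValueError("channel must be non-empty")
    n = len(channel)
    counts = Counter(channel)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())
```

In a differential-attack test, `c1` and `c2` would typically be the same channel of two ciphertexts whose plaintexts differ in a single pixel; the per-channel results are then averaged over R, G, and B.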
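To investigate the effect of X-codec on the audio generation performance of large language models, this study uses the LibriSpeech dataset to analyze how corpus characteristics (duration and timbre) affect the performance of an X-codec-based large language model (LLM) on audio generation tasks. Measurements of the similarity objective (Sim-O) score and the user test mean opinion score (UTMOS) show that when the utterance duration exceeds 10 s (i.e., long utterances) and the voice is male, both Sim-O and UTMOS have significantly higher arithmetic means and significantly lower standard deviations than the other groups in the corresponding feature category. Long male-voice utterances are therefore more likely to bring an LLM that uses X-codec to its best performance. These results can provide theoretical support for optimizing audio codec design.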
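Leveraging the capabilities of discrete neural audio codecs, large language models (LLMs) have been widely regarded as a promising approach to zero-shot text-to-speech (TTS) synthesis. However, sampling-based decoding strategies, while bringing rich diversity to speech generation, also introduce robustness issues such as typos, omissions, and repetitions. To address these issues, we propose VALL-E R, a robust and efficient zero-shot TTS system built on VALL-E. Specifically, we introduce a phoneme monotonic alignment strategy that strengthens the mapping between phonemes and the acoustic sequence by constraining acoustic tokens to strictly match their associated phonemes, which ensures more precise alignment. In addition, we adopt a codec-merging approach that downsamples the discrete codes in the shallow quantization layer, reducing decoding computation while preserving high-quality speech output. Benefiting from these strategies, VALL-E R achieves significantly improved phoneme controllability and demonstrates strong robustness by approaching the word error rate of ground-truth speech. Moreover, the system requires fewer autoregressive inference steps, which cuts inference time by more than 60% and greatly improves inference efficiency.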
This paper studies and implements a programmable RS codec module for a satellite communication modem. An FPGA serves as the core of the implementation, with some ASICs used as necessary auxiliary components. The module comprises the RS codec unit, the interleaver and deinterleaver unit, the scrambler and descrambler unit, and the frame synchronization unit. The module has been realized successfully and can be programmed on-line to meet the requirements of IESS 308/309/310, which include many specifications for different service types and data rates. Combining the FPGA with ASICs greatly reduces the size of the circuit, dramatically increases its flexibility, and further strengthens its stability. Furthermore, the module is based on the software radio concept and can be easily integrated into the complete satellite communication modem.
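As a rough illustration of what an interleaver and deinterleaver pair does, the following is a minimal software sketch of a generic block interleaver. It is an assumption made for illustration only: the abstract does not describe the interleaver structure, so this is not the module's FPGA design and not the specific scheme required by IESS 308/309/310. The names `interleave` and `deinterleave` and the `rows`/`cols` parameters are hypothetical.

```python
def interleave(symbols, rows, cols):
    """Write symbols into a rows x cols grid row by row, then read column by column."""
    if len(symbols) != rows * cols:
        raise ValueError("input length must equal rows * cols")
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse of interleave: restore the original row-by-row order."""
    if len(symbols) != rows * cols:
        raise ValueError("input length must equal rows * cols")
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]
```

Spreading adjacent symbols apart in this way means that a burst of channel errors is distributed across several RS codewords after deinterleaving, which is why an interleaver is usually paired with an RS codec.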
Funding: Project supported by the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021MF049), the Joint Fund of Natural Science Foundation of Shandong Province (Grant Nos. ZR2022LLZ012 and ZR2021LLZ001), and the Key R&D Program of Shandong Province, China (Grant No. 2023CXGC010901).