Recent Progresses in Deep Learning Based Acoustic Models (cited by: 11)
1
Authors: Dong Yu, Jinyu Li. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2017, No. 3, pp. 396-409 (14 pages)
In this paper, we summarize recent progress made in deep learning based acoustic models and the motivation and insights behind the surveyed techniques. We first discuss models such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs) that can effectively exploit variable-length contextual information, and their various combinations with other models. We then describe models that are optimized end-to-end and emphasize feature representations learned jointly with the rest of the system, the connectionist temporal classification (CTC) criterion, and the attention-based sequence-to-sequence translation model. We further illustrate robustness issues in speech recognition systems, and discuss acoustic model adaptation, speech enhancement and separation, and robust training strategies. We also cover modeling techniques that lead to more efficient decoding and discuss possible future directions in acoustic model research.
Keywords: attention model; convolutional neural network (CNN); connectionist temporal classification (CTC); deep learning (DL); long short-term memory (LSTM); permutation invariant training; speech adaptation; speech processing; speech recognition; speech separation
AN IMPROVED ALGORITHM OF GMM VOICE CONVERSION SYSTEM BASED ON CHANGING THE TIME-SCALE
2
Authors: Zhou Ying, Zhang Linghua. Journal of Electronics (China), 2011, No. 4, pp. 518-523 (6 pages)
This paper presents an improved method for a voice conversion system based on Gaussian Mixture Models (GMM) that changes the time-scale of speech. The Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT) model is adopted to extract the spectrum features, and the GMMs are trained to generate the conversion function. The spectrum features of the source speech are converted by the conversion function. The time-scale of speech is changed by extracting the converted features and adding them to the spectrum. The converted voice was evaluated by subjective and objective measurements. The results confirm that the transformed speech not only approximates the characteristics of the target speaker, but is also more natural and more intelligible.
Keywords: Gaussian Mixture Models (GMM); Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT); time-scale; voice conversion
Steganography algorithm for adaptive multi-rate wideband speech based on algebraic codebook search
3
Authors: Ruan Ye, Xu Yanyan, Ke Dengfeng, Su Kaile. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2018, No. 3, pp. 71-79 (9 pages)
With the popularity of adaptive multi-rate wideband (AMR-WB) audio in mobile communication, many AMR-WB based techniques have been proposed to transmit covert messages, for example by using a similar compression architecture to transmit secret information during the process of compression. However, if a sender does not have the original waveform audio format (WAV) audio, the architecture cannot be used. In this paper, a new covert message method, which takes effect after WAV audio is compressed into AMR-WB speech, is proposed. This method takes advantage of algebraic codebook search. Aiming at improving speed and reducing search space, it does not perform algebraic codebook search using the optimal search algorithm, and it does not reach the positions of non-zero pulses via depth-first tree search that characterizes the energy of audio. According to the features of search methods and the codebook index construction, every track in each subframe is analyzed to find the proper positions for embedding secret information. Experimental results show that the proposed method has satisfactory capacity and simplicity regardless of the compression process.
Keywords: steganography algorithm; WAV audio; compressed audio; adaptive multi-rate wideband speech