Abstract: A new type of vocoder system based on formant analysis is presented in this paper. The LMS (least mean squares) adaptive algorithm is used to track the formants of speech signals. Computer simulation results show that the new vocoder has better synthesized speech quality.
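As a rough illustration of how an LMS-adapted predictor can follow a resonance, the sketch below adapts a second-order linear predictor to a test tone and reads the resonance frequency off the predictor's pole pair. It is a generic textbook LMS predictor, not the vocoder described in the abstract; the function names, the step size `mu`, the order and the 700 Hz test tone are all assumptions chosen for the example. Real speech would need a higher order and a step size normalized to signal power.

```python
import math

def lms_predictor(signal, order=2, mu=0.05):
    """Adapt linear-prediction weights sample by sample with the LMS rule."""
    w = [0.0] * order
    for n in range(order, len(signal)):
        past = [signal[n - k] for k in range(1, order + 1)]  # x[n-1], x[n-2], ...
        pred = sum(wk * xk for wk, xk in zip(w, past))
        err = signal[n] - pred                                # prediction error
        w = [wk + 2.0 * mu * err * xk for wk, xk in zip(w, past)]
    return w

def resonance_hz(w, fs):
    """Pole-pair frequency of the 2nd-order predictor 1 - w0*z^-1 - w1*z^-2."""
    radius = math.sqrt(-w[1])  # assumes w1 < 0, i.e. a complex pole pair
    return math.acos(w[0] / (2.0 * radius)) * fs / (2.0 * math.pi)

fs = 8000  # assumed sampling rate in Hz
tone = [math.sin(2 * math.pi * 700 * n / fs) for n in range(4000)]
weights = lms_predictor(tone)
print(resonance_hz(weights, fs))  # should land close to the 700 Hz test tone
```

Because the weights are updated at every sample, the same loop can follow a resonance that drifts over time, which is the property that makes LMS attractive for tracking.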
Abstract: This study was concerned with the short vowels in Modern Standard Arabic (MSA) words with a Consonant Vowel-Consonant Vowel-Consonant Vowel (CVCVCV) structure, and the long vowels in words with a Consonant Vowel Vowel-Consonant (CVVC) structure. Although language studies dispute the precise number of Arabic vowels, this study adopted the view that Arabic has three short vowels and that elongating each of them yields the three long vowels, because this is the view of the classical Arabic linguists, and Classical Arabic is the source of MSA. Previous studies have reported that the first and second formant values (F1, F2) can represent the vowels. In this study, the formants were measured using LPC (Linear Predictive Coding), the measurements were checked against the formant patterns reported in other studies, and the formants were then used to investigate the relationship between short and long vowels. Furthermore, the study examined whether the speakers' dialect affects the formant values even when the spoken language is MSA; statistical measures were calculated to evaluate the relationship.
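The abstract does not say how the LPC measurement was implemented. One common approach is sketched below under stated assumptions: autocorrelation-method LPC solved with the Levinson-Durbin recursion, followed by picking the peaks of the LPC spectral envelope. The helper names, the toy two-resonance "vowel", the 80 Hz bandwidth and the 700/1200 Hz targets are all hypothetical and are not taken from the study.

```python
import cmath
import math

def lpc(frame, order):
    """Autocorrelation-method LPC via Levinson-Durbin; returns [1, a1, ..., ap]."""
    r = [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
         for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # remaining prediction error
    return a

def envelope_peaks(a, fs, step_hz=5.0):
    """Frequencies (Hz) at which the LPC envelope 1/|A(e^jw)| peaks."""
    freqs = [i * step_hz for i in range(int(fs / 2 / step_hz) + 1)]
    mags = []
    for f in freqs:
        w = 2 * math.pi * f / fs
        A = sum(ak * cmath.exp(-1j * w * k) for k, ak in enumerate(a))
        mags.append(1.0 / abs(A))
    return [freqs[i] for i in range(1, len(mags) - 1)
            if mags[i - 1] < mags[i] >= mags[i + 1]]

def toy_vowel(formants_hz, fs, n=600, bandwidth_hz=80.0):
    """Impulse response of cascaded two-pole resonators (synthetic test signal)."""
    x = [1.0] + [0.0] * (n - 1)
    r = math.exp(-math.pi * bandwidth_hz / fs)
    for f in formants_hz:
        c1, c2 = 2 * r * math.cos(2 * math.pi * f / fs), -r * r
        y = []
        for i in range(n):
            y1 = y[i - 1] if i >= 1 else 0.0
            y2 = y[i - 2] if i >= 2 else 0.0
            y.append(x[i] + c1 * y1 + c2 * y2)
        x = y
    return x

fs = 8000  # assumed sampling rate in Hz
frame = toy_vowel([700.0, 1200.0], fs)
peaks = envelope_peaks(lpc(frame, order=4), fs)
print(peaks)  # two peaks, expected near the synthetic F1 and F2
```

On recorded speech the frame would normally be pre-emphasized and windowed first, and the LPC order is usually chosen from the sampling rate; those steps are omitted here to keep the sketch short.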
Funding: Project (61062011) supported by the National Natural Science Foundation of China; Project (2010GXNSFA013128) supported by the Natural Science Foundation of Guangxi Province, China.
Abstract: To improve Mandarin vowel pronunciation quality assessment, a novel formant feature was proposed and applied to formant classification for Chinese Mandarin vowel pronunciation quality evaluation. The formant candidates of each frame were plotted on the time-frequency plane to form a bitmap, and the Gabor feature of this bitmap was extracted to represent the formant trajectory. The feature was then classified with a GMM, and the classification posterior probability was mapped to a pronunciation quality grade. Experiments comparing the Gabor-transform-based formant trajectory feature with several traditionally used features show that this method achieves a human-machine scoring correlation coefficient (CC) of 0.842, which is better than the 0.832 obtained by traditional speech recognition techniques. Because the long-term information used in formant classification and the short-term information used in speech recognition are complementary, the two sets of results were also combined with linear or nonlinear methods to further improve evaluation performance. Experiments on PSK show that the best CC, 0.913, is obtained with a neural network; this is very close to the inter-human rating correlation of 0.94.
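To make the evaluation metric concrete, the sketch below computes a Pearson correlation coefficient between machine and human scores and shows a simple linear fusion of two scorers. The scores are toy numbers constructed so that the two scorers' errors cancel exactly; they are not data from the paper, and the equal-weight fusion is only one possible linear combination, not the authors' method.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def fuse(scores_a, scores_b, alpha=0.5):
    """Linear fusion: alpha * scores_a + (1 - alpha) * scores_b."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(scores_a, scores_b)]

# Toy scores (hypothetical): each scorer is off by 0.5, in opposite directions.
human = [1.0, 2.0, 3.0, 4.0, 5.0]
formant_based = [1.5, 1.5, 3.5, 3.5, 5.5]
recognition_based = [0.5, 2.5, 2.5, 4.5, 4.5]

fused = fuse(formant_based, recognition_based)
print(pearson(formant_based, human))
print(pearson(recognition_based, human))
print(pearson(fused, human))
```

In this constructed case the fused score correlates better with the human score than either scorer alone, which is the effect that complementary information sources are expected to produce.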
Abstract: Arabic texts suffer from missing short vowels. Arabic speech recognition is not as good as English speech recognition because the short vowels are not recognized. Arabic also differs from English in characteristics such as the number of vowels. English has more than 24 vowels that are close to each other in pronunciation, whereas Arabic has only three short vowels, which are far from each other in both utterance and measurement; elongating these short vowels gives the long vowels. Researchers have reported that vowels can be recognized using formants. The formant measurements of Arabic vowels are also far from each other, so the vowels can be recognized and Arabic speech recognition can give more accurate results. The paper applies this idea to the corpus Phonemes of Arabic. It uses the Euclidean distance between formant values to recognize Arabic short vowels in words with a CV3 structure, and it uses the Linear Predictive Coding method and MATLAB to develop programs that extract the formants and calculate the mean formant values of the short vowels in the corpus, which are then used to identify the short vowels within words in the corpus. The results showed that higher recognition rates for the short vowels in words are achieved when highly qualified readers are chosen to read the Arabic text. The paper shows that some characteristics of a language can be used for vowel recognition or to enhance existing speech recognition methods.
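The Euclidean-distance step can be sketched as a nearest-mean classifier over (F1, F2) pairs. The paper's programs were written in MATLAB and its measured means are not given in the abstract, so the Python version below and the mean values in `VOWEL_MEANS` are illustrative placeholders only.

```python
import math

# Hypothetical mean (F1, F2) values in Hz for the three short vowels.
# These are placeholders for illustration, not measurements from the corpus.
VOWEL_MEANS = {
    "a": (700.0, 1400.0),
    "i": (300.0, 2200.0),
    "u": (350.0, 900.0),
}

def classify_vowel(f1, f2, means=VOWEL_MEANS):
    """Return the vowel whose mean (F1, F2) is nearest in Euclidean distance."""
    return min(means,
               key=lambda v: math.hypot(f1 - means[v][0], f2 - means[v][1]))

print(classify_vowel(680.0, 1350.0))
print(classify_vowel(320.0, 2100.0))
print(classify_vowel(400.0, 1000.0))
```

The wider the separation between the vowel means, the more measurement error the classifier can tolerate, which is why the well-separated Arabic short vowels suit this simple method.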
Abstract: Rhythm Formant Theory (RFT), a modulation-theoretic approach to the physical modelling of speech rhythm, is described and applied in an exploratory analysis of the rhetorical rhythms of read-aloud Mandarin Chinese translations of the IPA benchmark text The North Wind and the Sun. Rhythm Formant Analysis (RFA), a methodology for empirically investigating Rhythm Formant Theory without prior annotation of the speech signal, is presented in some detail, with the aim of studying rhythm variation in larger units throughout longer texts, rather than restricting analysis to words, phrases and sentences. A test case of read-aloud narratives was investigated, with the null hypothesis that male and female readers do not differ in rhetorical reading strategies. RFA was used to generate vectors of low frequency (LF) variation in spectrograms, for analysis with hierarchical clustering methods. The clustering indicates that the null hypothesis was falsified and rhetorical differences between female and male speakers were tentatively confirmed. Ongoing work includes the analysis of linguistic factors underlying LF variation. In the conclusion, RFT is placed into a more general framework of a Speech Modulation Frequency Scale of modulation types.
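The general idea of looking for rhythm in the low-frequency spectrum of a signal's amplitude envelope can be illustrated as follows. This is a simplified sketch and not the RFA implementation: the rectification-based envelope, the 0.5 to 10 Hz analysis band, the function name and the synthetic 4 Hz modulated tone are all assumptions, and the hierarchical clustering step is omitted.

```python
import cmath
import math

def lf_spectrum(signal, fs, f_min=0.5, f_max=10.0, f_step=0.25):
    """Magnitude spectrum of the rectified signal at low frequencies only."""
    env = [abs(s) for s in signal]          # crude amplitude envelope
    mean = sum(env) / len(env)
    env = [e - mean for e in env]           # remove the DC component
    count = int(round((f_max - f_min) / f_step)) + 1
    spectrum = []
    for i in range(count):
        f = f_min + i * f_step
        acc = sum(e * cmath.exp(-2j * math.pi * f * n / fs)
                  for n, e in enumerate(env))
        spectrum.append((f, abs(acc) / len(env)))
    return spectrum

# Synthetic stand-in for speech: a 150 Hz carrier with a 4 Hz amplitude rhythm.
fs = 1000          # assumed sampling rate in Hz
duration_s = 4
signal = [(1 + 0.8 * math.cos(2 * math.pi * 4 * n / fs))
          * math.sin(2 * math.pi * 150 * n / fs)
          for n in range(fs * duration_s)]

spectrum = lf_spectrum(signal, fs)
dominant_hz = max(spectrum, key=lambda fm: fm[1])[0]
print(dominant_hz)  # frequency of the strongest low-frequency component
```

A vector of such low-frequency magnitudes per recording is the kind of input a hierarchical clustering method could then compare across speakers.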