Journal Articles
1,094 articles found
Noise Removal in Speech Processing Using Spectral Subtraction (Cited by: 5)
1
Authors: Marc Karam, Hasan F. Khazaal, Heshmat Aglan, Cliston Cole. Journal of Signal and Information Processing, 2014, No. 2, pp. 32-41 (10 pages)
Spectral subtraction is used in this research as a method to remove noise from noisy speech signals in the frequency domain. This method consists of computing the spectrum of the noisy speech using the Fast Fourier Transform (FFT) and subtracting the average magnitude of the noise spectrum from the noisy speech spectrum. We applied spectral subtraction to the speech signal “Real graph”. A digital audio recorder system embedded in a personal computer was used to sample the speech signal “Real graph”, to which we digitally added vacuum cleaner noise. The noise removal algorithm was implemented using Matlab software by storing the noisy speech data into Hanning time-windowed half-overlapped data buffers, computing the corresponding spectrums using the FFT, removing the noise from the noisy speech, and reconstructing the speech back into the time domain using the inverse Fast Fourier Transform (IFFT). The performance of the algorithm was evaluated by calculating the Speech to Noise Ratio (SNR). Frame averaging was introduced as an optional technique that could improve the SNR. Seventeen different configurations with various lengths of the Hanning time windows, various degrees of data buffer overlapping, and various numbers of frames to be averaged were investigated in view of improving the SNR. Results showed that using one-fourth overlapped data buffers with 128-point Hanning windows and no frame averaging leads to the best performance in removing noise from the noisy speech.
Keywords: speech processing, spectral subtraction, noise removal, fast Fourier transform, inverse fast Fourier transform
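The pipeline the abstract describes (Hanning-windowed, overlapped buffers → FFT → magnitude subtraction → IFFT reconstruction) can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' Matlab code: the half-wave rectification, the hop size for one-fourth overlap, and the pre-computed per-bin noise magnitude are all assumptions.

```python
import numpy as np

def spectral_subtract(noisy, noise_mag, win=128, hop=96):
    """Magnitude spectral subtraction with Hanning windows and
    overlap-add reconstruction (hop = 3*win//4 gives one-fourth
    overlapped buffers, as in the paper's best configuration)."""
    window = np.hanning(win)
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for start in range(0, len(noisy) - win + 1, hop):
        frame = noisy[start:start + win] * window
        spec = np.fft.rfft(frame)
        # Subtract the average noise magnitude, flooring at zero
        # (half-wave rectification), and keep the noisy phase.
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
        clean = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n=win)
        out[start:start + win] += clean * window
        norm[start:start + win] += window ** 2
    norm[norm == 0] = 1.0          # avoid division by zero at the edges
    return out / norm
```

With `noise_mag` estimated as the mean magnitude spectrum of noise-only frames, this reduces the mean squared error of a tone buried in white noise; the window/hop choices mirror the configuration the paper found best.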
AN ANALYSIS OF ACOUSTIC CHARACTERISTICS OF CLEFT PALATE SPEECH WITH COMPUTERIZED SPEECH SIGNAL PROCESSING SYSTEM (Cited by: 1)
2
Authors: 李锦峰, 刘建华. Journal of Pharmaceutical Analysis (CAS), 1996, No. 2, pp. 162-165 (4 pages)
The acoustic characteristics of the Chinese vowels of 24 children with cleft palate and 10 normal control children were analyzed by a computerized speech signal processing system (CSSPS), and speech articulation was judged with the Glossary of Cleft Palate Speech (GCPS). The listening judgement showed that speech articulation was significantly different between the two groups (P<0.01). The objective quantitative measurement suggested that the formant pattern (FP) of vowels in children with cleft palate differed from that of the normal control children, except for the vowel [a] (P<0.05). The acoustic vowelgraph of the Chinese vowels, which directly demonstrates the relationship between vocal space and speech perception, was stated with the first formant frequency (F1) and the second formant frequency (F2). The authors conclude that the values of F1 and F2 point to the upward and backward tongue movement to close the cleft, which reflects the vocal transmission characteristics of cleft palate speech.
Keywords: cleft palate speech, Chinese vowels, formant pattern, speech articulation, computerized speech signal processing system
A Multi-Band Speech Enhancement Algorithm Exploiting Iterative Processing for Enhancement of Single Channel Speech
3
Authors: Navneet Upadhyay, Abhijit Karmakar. Journal of Signal and Information Processing, 2013, No. 2, pp. 197-211 (15 pages)
This paper proposes a multi-band speech enhancement algorithm exploiting iterative processing for enhancement of single channel speech. In the proposed algorithm, the output of the multi-band spectral subtraction (MBSS) algorithm is used again as the input signal for the next iteration. Since, after the first MBSS processing step, the additive noise transforms into remnant noise, the remnant noise needs to be re-estimated. The proposed algorithm reduces the remnant musical noise further by feeding the enhanced output signal back to the input and performing the operation repeatedly. The newly estimated remnant noise is then used in the next MBSS step. This procedure is iterated a small number of times. The proposed algorithm estimates noise in each iteration, and spectral over-subtraction is executed independently in each band. Experiments are conducted for various types of noise. The performance of the proposed enhancement algorithm is evaluated for various noise types at different SNR levels using 1) objective quality measures: signal-to-noise ratio (SNR), segmental SNR, and perceptual evaluation of speech quality (PESQ); and 2) a subjective quality measure: mean opinion score (MOS). The results of the proposed enhancement algorithm are compared with the popular MBSS algorithm. Experimental results as well as the objective and subjective quality measurements confirm that the enhanced speech obtained from the proposed algorithm is more pleasant to listeners than speech enhanced by the classical MBSS algorithm.
Keywords: speech enhancement, multi-band spectral subtraction, iterative processing, remnant musical noise
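Segmental SNR, one of the objective measures used above, is simply the average of per-frame SNRs in dB, conventionally clipped to a fixed range so that silent frames do not dominate the average. A minimal sketch (the frame size and the [-10, 35] dB clipping limits are common conventions, not values taken from the paper):

```python
import numpy as np

def segmental_snr(clean, enhanced, frame=256, eps=1e-10):
    """Mean of per-frame SNRs in dB, each clipped to [-10, 35] dB."""
    n = min(len(clean), len(enhanced)) // frame * frame
    c = clean[:n].reshape(-1, frame)
    e = enhanced[:n].reshape(-1, frame)
    num = np.sum(c ** 2, axis=1)               # per-frame signal energy
    den = np.sum((c - e) ** 2, axis=1)         # per-frame error energy
    snr = 10.0 * np.log10((num + eps) / (den + eps))
    return float(np.mean(np.clip(snr, -10.0, 35.0)))
```

A perfect enhancement returns the 35 dB ceiling; any residual noise lowers the score, which is what makes the measure useful for comparing iterations.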
Noise-Feedback-Based MVDR-MTGAN Multi-Channel Speech Enhancement
4
Authors: 王霄雪, 刘拓, 江志健, 郑能恒. 《深圳大学学报(理工版)》 (PKU Core), 2026, No. 1, pp. 93-100, I0003-I0004 (10 pages)
Most mainstream multi-channel speech enhancement systems adopt a cascaded beamforming/post-filtering architecture. In non-stationary noise scenarios, beamforming is prone to spatial-filtering failure caused by noise-estimation bias, while deep-learning-based post-filtering improves residual-noise suppression but is computationally expensive. This paper proposes a closed-loop enhancement framework that combines minimum variance distortionless response (MVDR) beamforming with a multi-target generative adversarial network (MTGAN), using a noise-estimation feedback mechanism to achieve joint spatial-spectral optimization. A dual-branch MTGAN generator performs post-filtering and noise estimation simultaneously, and the estimated noise is fed back dynamically into the MVDR covariance-matrix update, forming a closed iterative optimization loop. Simulation experiments on the public diverse environments multi-channel acoustic noise database (DEMAND) show that the noise-feedback mechanism effectively improves MVDR output performance. Compared with the existing MVDR-CRUSE system, the proposed MVDR+MTGAN method keeps model complexity low (10.5% fewer parameters) while achieving significant gains on speech-quality metrics: the average segmental SNR improves by 6.56 dB and the overall quality prediction score rises by 0.17. The method offers an efficient solution for multi-channel speech enhancement in complex acoustic scenes.
Keywords: speech processing, multi-channel speech enhancement, minimum variance distortionless response, multi-target generative adversarial network, noise feedback
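The MVDR beamformer at the core of this framework computes, per frequency bin, the weights w = R⁻¹d / (dᴴR⁻¹d) from a noise covariance matrix R and a steering vector d, so that the look direction is passed undistorted while noise power is minimized. A minimal narrowband sketch (the uniform-linear-array geometry, half-wavelength spacing, and angles are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def mvdr_weights(R, d):
    """MVDR beamformer weights: w = R^{-1} d / (d^H R^{-1} d).
    R: (M, M) Hermitian noise covariance, d: (M,) steering vector."""
    Rinv_d = np.linalg.solve(R, d)          # avoids explicit inversion
    return Rinv_d / (d.conj() @ Rinv_d)

def steering_vector(m, theta, spacing=0.5):
    """Plane-wave steering vector for an M-element uniform linear
    array; `spacing` is the element spacing in wavelengths."""
    return np.exp(-2j * np.pi * spacing * np.arange(m) * np.sin(theta))
```

By construction wᴴd = 1 (the distortionless constraint), while a strong interferer baked into R is attenuated by orders of magnitude; the paper's feedback loop refreshes R from the MTGAN's noise estimate before each such solve.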
A Meta-Analysis of the Irrelevant Speech Effect in Complex Speech Processing Tasks
5
Authors: 张楠, 李旭奎. 《心理科学进展》 (PKU Core), 2026, No. 1, pp. 60-82, I0001-I0010 (33 pages)
Numerous studies have demonstrated an irrelevant speech effect in simple speech processing tasks, but for more complex speech processing tasks it remains unresolved whether the effect still exists, how large it is, what the source of the interference is, what its pathway is, and which moderator variables influence it. This study applied a random-effects model to meta-analyze 30 retrieved studies (113 independent effect sizes). After removing outliers, irrelevant speech significantly impaired performance on complex speech processing tasks, though the overall effect size was small. The effect was moderated by participants' age group, the intelligibility, loudness, and predictability of the irrelevant speech, the type of meaningless speech, and the type of complex speech processing task; intelligibility further interacted with age group, task type, the task's linguistic unit, and the writing system of the task language. In addition, the semantic-interference hypothesis proved more plausible than the phonological-interference hypothesis, and regarding the pathway of the effect, the process-interference hypothesis received some support. Future research should explore other potential moderators of the irrelevant speech effect in complex speech processing tasks, which can inform the use of audio, video, and other modern educational technologies in multimedia instruction, guide the design and optimization of learning environments to reduce interference from background noise, and provide theoretical support for developing more effective interventions for people with cognitive deficits.
Keywords: irrelevant speech effect, speech processing, interference mechanism, meta-analysis
A Voiceprint Recognition Method Based on DenseNet and Transfer Learning
6
Authors: 陈润强, 王卫辰, 徐亚博, 李烈. 《现代电子技术》 (PKU Core), 2026, No. 2, pp. 171-177 (7 pages)
Traditional voiceprint recognition methods are affected by environmental noise and within-speaker variation, making further accuracy gains difficult. This paper therefore proposes a spectrogram-based voiceprint recognition method built on DenseNet and transfer learning to further improve system performance. A DenseNet voiceprint recognition model is first trained on source-domain speech; transfer learning then adapts the source-trained DenseNet model to the target-domain training data; finally, the transferred model is validated on the target-domain test data, and the voiceprint recognition performance of the DenseNet and ResNet models before and after transfer is compared. Experimental results show that, compared with the original ResNet model, the original DenseNet model, and the transfer-learned ResNet model, the transfer-learned DenseNet model improves recognition accuracy by 3.89%, 6.67%, and 3.34% respectively, and converges faster.
Keywords: voiceprint recognition, DenseNet, transfer learning, spectrogram, ResNet, speech signal processing
Speech-Music-Noise Discrimination in Sound Indexing of Multimedia Documents
7
Authors: Lamia Bouafif, Noureddine Ellouze. Sound & Vibration, 2018, No. 6, pp. 2-10 (9 pages)
Sound indexing and segmentation of digital documents, especially on the internet and in digital libraries, are very useful to simplify and to accelerate multimedia document retrieval. We can imagine extracting multimedia files not only by keywords but also by speech semantic content. The main difficulty of this operation is the parameterization and modelling of the sound track and the discrimination of the speech, music and noise segments. In this paper, we present a Speech/Music/Noise indexing interface designed for audio discrimination in multimedia documents. The program uses a statistical method based on ANN and HMM classifiers. After pre-emphasis and segmentation, the audio segments are analysed by the cepstral acoustic analysis method. The developed system was evaluated on a database of music songs with Arabic speech segments under several noisy environments.
Keywords: speech processing, audio indexing, training and recognition
Analysis of Deaf Speakers’ Speech Signal for Understanding the Acoustic Characteristics by Territory Specific Utterances
8
Authors: Nirmaladevi Jaganathan, Bommannaraja Kanagaraj. Circuits and Systems, 2016, No. 8, pp. 1709-1721 (13 pages)
An important concern within the deaf community is the inability to hear partially or totally. This may affect the development of language during childhood, which limits their habitual existence. Consequently, to facilitate such deaf speakers through certain assistive mechanisms, an effort has been made to understand the acoustic characteristics of deaf speakers by evaluating territory-specific utterances. Speech signals were acquired from 32 normal and 32 deaf speakers uttering ten Indian native Tamil language words. The speech parameters pitch, formants, signal-to-noise ratio, energy, intensity, jitter and shimmer are analyzed. From the results, it has been observed that the acoustic characteristics of deaf speakers differ significantly, and their quantitative measures dominate those of the normal speakers for the words considered. The study also reveals that the informative part of speech in normal and deaf speakers may be identified using the acoustic features. In addition, these attributes may be used for differential correction of deaf speakers' speech signals and help listeners understand the conveyed information.
Keywords: deaf speaker, hard of hearing, deaf speech processing, assistive mechanism for deaf speakers, speech correction, speech signal processing
Design and Engineering Implementation of a Multimodal Intelligent Processing System for Open-Source Intelligence
9
Authors: 董泽云, 甘莅豪, 薛楠, 陆泰廷. 《大数据》, 2026, No. 1, pp. 71-83 (13 pages)
To address the problems of modality fragmentation, weak structuring capability, and poor user interactivity in open-source intelligence systems, this paper proposes an intelligent information processing system that integrates computer vision, natural language processing, and text-to-speech technologies. Based on multi-source heterogeneous data, a complete closed-loop pipeline covering data collection, preprocessing, deep modeling, intelligent decision-making, and user-interaction feedback is designed, with key breakthroughs in cross-modal data fusion, structured processing of intelligence content, speech broadcasting, and multimedia visualization. Experimental results show that the system performs well on key metrics such as intelligence-extraction accuracy, response time, and explainable user feedback; it is modular and extensible, and suits scenarios such as government security, financial risk control, and public-opinion monitoring.
Keywords: open-source intelligence, computer vision, natural language processing, text-to-speech, speech recognition, multimodal fusion, large language models, artificial intelligence
Design of an Intelligent Personalized Teaching Audio Processing System Based on Speech Synthesis
10
Authors: 徐若依, 卢佳欣. 《计算机应用文摘》, 2026, No. 3, pp. 149-151 (3 pages)
Speech synthesis is an important branch of artificial intelligence and is already widely used in intelligent systems. This paper designs an intelligent personalized teaching-audio processing system based on speech synthesis, aiming to give learners a personalized teaching experience through intelligent audio generation and thereby improve learning interest and outcomes. By analyzing each student's learning characteristics, progress, and individual needs, the system dynamically generates customized teaching audio, helping students improve their results during self-directed study outside class.
Keywords: speech synthesis, intelligent audio processing system, learning characteristics, learning progress, self-directed learning
Enhanced Frequency-Domain Frost Algorithm Using Conjugate Gradient Techniques for Speech Enhancement (Cited by: 1)
11
Authors: Shengkui Zhao, Douglas L. Jones. Journal of Electronic Science and Technology (CAS), 2012, No. 2, pp. 158-162 (5 pages)
In this paper, the frequency-domain Frost algorithm is enhanced by using conjugate gradient techniques for speech enhancement. Unlike the non-adaptive approach of computing the optimum minimum variance distortionless response (MVDR) solution with the correlation matrix inversion, the Frost algorithm, implementing the stochastic constrained least mean square (LMS) algorithm, can adaptively converge to the MVDR solution in the mean-square sense, but with a very slow convergence rate. We propose a frequency-domain constrained conjugate gradient (FDCCG) algorithm to speed up the convergence. The devised FDCCG algorithm avoids the matrix inversion and exhibits fast convergence. Speech enhancement experiments for a target speech signal corrupted by two and five interfering speech signals are demonstrated using a four-channel acoustic-vector-sensor (AVS) microphone array and show superior performance.
Keywords: adaptive signal processing, convergence, correlation, speech enhancement, microphone arrays
Enhancing Parkinson’s Disease Diagnosis Accuracy Through Speech Signal Algorithm Modeling (Cited by: 1)
12
Authors: Omar M. El-Habbak, Abdelrahman M. Abdelalim, Nour H. Mohamed, Habiba M. Abd-Elaty, Mostafa A. Hammouda, Yasmeen Y. Mohamed, Mohanad A. Taifor, Ali W. Mohamed. Computers, Materials & Continua (SCIE, EI), 2022, No. 2, pp. 2953-2969 (17 pages)
Parkinson’s disease (PD), one of whose symptoms is dysphonia, is a prevalent neurodegenerative disease. The use of outdated diagnosis techniques, which yield inaccurate and unreliable results, continues to represent an obstacle in early-stage detection and diagnosis for clinical professionals in the medical field. To solve this issue, the study proposes using machine learning and deep learning models to analyze processed speech signals of patients’ voice recordings. Datasets of these processed speech signals were obtained and experimented on by random forest and logistic regression classifiers. Results were highly successful, with 90% accuracy produced by the random forest classifier and 81.5% by the logistic regression classifier. Furthermore, a deep neural network was implemented to investigate if such variation in method could add to the findings. It proved to be effective, as the neural network yielded an accuracy of nearly 92%. Such results suggest that it is possible to accurately diagnose early-stage PD through merely testing patients’ voices. This research calls for a revolutionary diagnostic approach in decision support systems, and is the first step in a market-wide implementation of healthcare software dedicated to the aid of clinicians in early diagnosis of PD.
Keywords: early diagnosis, logistic regression, neural network, Parkinson’s disease, random forest, speech signal processing algorithms
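One of the classifiers used in the study, logistic regression, reduces to a few lines of batch gradient descent. The sketch below trains on toy two-feature data that is only an illustrative stand-in for the dysphonia features; it is not the study's dataset or pipeline.

```python
import numpy as np

def train_logreg(X, y, lr=0.1, steps=2000):
    """Plain batch-gradient-descent logistic regression (no
    regularization): minimizes the mean cross-entropy loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        g = p - y                                 # gradient of the loss wrt logits
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def predict(X, w, b):
    """Hard 0/1 decisions at the 0.5 probability threshold."""
    return (X @ w + b > 0).astype(int)
```

On well-separated synthetic classes this recovers high training accuracy, which is the kind of baseline the paper's 81.5% logistic-regression result should be read against.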
Implementation of a Speech Recognition Engine Based on the Java Speech API Specification (Cited by: 2)
13
Authors: 倪素萍, 董滨, 赵庆卫, 颜永红. 《微计算机应用》, 2005, No. 2, pp. 168-172 (5 pages)
This paper introduces the system framework of a speech recognition engine conforming to the Java Speech API (JSAPI) specification, describes an approach and implementation strategy for building a JSAPI recognition engine on top of an existing C/C++ recognition engine, proposes and analyzes a concrete implementation method centered on event handling and state handling, and completes a JSAPI-compliant speech recognition software system.
Keywords: speech recognition engine, event handling, Java Speech API specification
Underdetermined Blind Mixing Matrix Estimation Using STWP Analysis for Speech Source Signals (Cited by: 2)
14
Authors: Behzad Mozaffari Tazehkand, Mohammad Ali Tinati. Wireless Sensor Network, 2010, No. 11, pp. 854-860 (7 pages)
Wavelet packets decompose signals into broader components using linear spectral bisecting. The mixing matrix is the key issue in the Blind Source Separation (BSS) literature, especially in under-determined cases. In this paper, we propose a simple and novel method in Short Time Wavelet Packet (STWP) analysis to blindly estimate the mixing matrix of speech signals from noise-free linear mixtures in over-complete cases. The Laplacian model is considered in short-time wavelet packets and is applied to each histogram of packets. The Expectation Maximization (EM) algorithm is used to train the model and calculate the model parameters. In our simulations, comparisons with other recent results are computed, and it is shown that our results are better. It is also shown that the computational complexity of the model is decreased and consequently the speed of convergence is increased.
Keywords: ICA, CWT, DWT, BSS, WPD, Laplacian model, expectation maximization, wavelet packets, short time analysis, over-complete, blind source separation, speech processing
High Performance Speech Compression System (Cited by: 6)
15
Authors: Ke Liu, Zhichun Mu, Zhong Wang (Information Engineering School, University of Science & Technology Beijing, Beijing 100083, China). Journal of University of Science and Technology Beijing (CSCD), 2001, No. 3, pp. 229-233 (5 pages)
Since Pulse Code Modulation emerged in 1937, digitized speech has experienced rapid development due to its outstanding voice quality, reliability, robustness and security in communication. But how to reduce channel width without loss of speech quality remains a crucial problem in speech coding theory. A new full-duplex digital speech communication system based on the AMBE-1000(TM) vocoder and the ATMEL 89C51 microcontroller is introduced. It shows higher voice quality than the current mobile phone system while requiring only a quarter of the latter's channel width. The prospective areas in which the system can be applied include satellite communication, IP telephony, virtual meetings and, most importantly, the defence industry.
Keywords: digital signal processing, digital speech compression, digital communication, full-duplex, coding rate
BLIND SPEECH SEPARATION FOR ROBOTS WITH INTELLIGENT HUMAN-MACHINE INTERACTION
16
Authors: Huang Yulei, Ding Zhizhong, Dai Lirong, Chen Xiaoping. Journal of Electronics (China), 2012, No. 3, pp. 286-293 (8 pages)
Speech recognition rates deteriorate greatly in human-machine interaction when the speaker's speech mixes with a bystander's voice. This paper proposes a time-frequency approach to Blind Source Separation (BSS) for intelligent Human-Machine Interaction (HMI). The main idea of the algorithm is to simultaneously diagonalize the correlation matrices of the pre-whitened signals at different time delays for every frequency bin in the time-frequency domain. The proposed method has two merits: (1) fast convergence speed; (2) high signal-to-interference ratio of the separated signals. Numerical evaluations are used to compare the performance of the proposed algorithm with two other deconvolution algorithms. An efficient algorithm to resolve permutation ambiguity is also proposed. The proposed algorithm saves more than 10% of computational time with properly selected parameters and achieves good performance for both simulated convolutive mixtures and real room-recorded speech.
Keywords: Blind Source Separation (BSS), blind deconvolution, speech signal processing, human-machine interaction, simultaneous diagonalization
Speech Encryption with Fractional Watermark
17
Authors: Yan Sun, Cun Zhu, Qi Cui. Computers, Materials & Continua (SCIE, EI), 2022, No. 10, pp. 1817-1825 (9 pages)
Research on the features of speech and image signals is carried out from two perspectives, the time domain and the frequency domain. Speech and image signals are non-stationary, so the Fourier Transform (FT) does not suit their non-stationary characteristics. When short-term stationary speech is obtained by windowing and framing, the subsequent processing of the signal is completed by the Discrete Fourier Transform (DFT). The Fast Fourier Transform is a commonly used analysis method for speech and image signal processing in the frequency domain, but it has the problem of adjusting the window size for a desired resolution. The Fractional Fourier Transform, by contrast, offers both time-domain and frequency-domain processing capabilities. This paper performs global speech encryption by combining speech with an image under the Fractional Fourier Transform. The speech signal is embedded with a watermark image processed by the fractional transform, and the embedded watermark has the effect of rotation and superposition, which improves the security of the speech. The results show that the proposed speech encryption method achieves a higher security level through the Fractional Fourier Transform. The technique is easy to extend to practical applications.
Keywords: Fractional Fourier Transform, watermark, speech signal processing, image processing
A Rule Based System for Speech Language Context Understanding
18
Authors: Imran Sarwar Bajwa, Muhammad Abbas Choudhary. Journal of Donghua University (English Edition) (EI, CAS), 2006, No. 6, pp. 39-42 (4 pages)
Speech or natural language content is a major tool of communication. This research paper presents a natural language processing based automated system for understanding speech language text. A new rule-based model is presented for analyzing natural languages and extracting the relative meanings from the given text. The user writes natural language text in simple English in a few paragraphs, and the designed system is able to analyze the given script. After composite analysis and extraction of the associated information, the system assigns particular meanings to an assortment of speech language text on the basis of its context. The designed system uses standard speech language rules that are clearly defined for all speech languages such as English, Urdu, Chinese, Arabic, French, etc. The designed system provides a quick and reliable way to comprehend speech language context and generate the respective meanings.
Keywords: automatic text understanding, speech language processing, information extraction, language engineering
TWO KINDS OF PITCH PREDICTORS IN SPEECH COMPRESSING CODING
19
Authors: Bao Changchun, Dai Yisong, Fan Changxin (Xidian University, Xi’an 710071; Jilin University of Technology, Changchun 130025). Journal of Electronics (China), 1997, No. 3, pp. 200-208 (9 pages)
This paper studies two kinds of methods for pitch prediction in speech compression coding, i.e., open-loop and closed-loop structures. Some simplified approaches for solving the pitch predictor equation are suggested, and the performances are compared under several conditions. Computer simulation results are shown.
Keywords: speech processing, linear prediction, pitch prediction
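An open-loop pitch predictor of the kind studied here typically picks the lag that maximizes the normalized autocorrelation of a frame of speech. A minimal sketch (the default lag search range is an illustrative choice for 8 kHz-style sampling, not a value from the paper):

```python
import numpy as np

def open_loop_pitch(frame, min_lag=20, max_lag=160):
    """Open-loop pitch estimate: the lag in [min_lag, max_lag] that
    maximizes the normalized autocorrelation of the frame."""
    frame = frame - frame.mean()                  # remove DC offset
    best_lag, best_score = min_lag, -np.inf
    for lag in range(min_lag, max_lag + 1):
        a, b = frame[:-lag], frame[lag:]
        denom = np.sqrt(np.sum(a * a) * np.sum(b * b)) + 1e-12
        score = np.sum(a * b) / denom             # in [-1, 1]
        if score > best_score:
            best_score, best_lag = score, lag
    return best_lag
```

A closed-loop predictor would instead pick the lag minimizing the residual of the full analysis-by-synthesis loop; the open-loop search above is the cheap first pass the paper contrasts it with.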
An Overview of Basics Speech Recognition and Autonomous Approach for Smart Home IOT Low Power Devices
20
Authors: Jean-Yves Fourniols, Nadim Nasreddine, Christophe Escriba, Pascal Acco, Julien Roux, Georges Soto Romero. Journal of Signal and Information Processing, 2018, No. 4, pp. 239-257 (19 pages)
Automatic speech recognition, often incorrectly called voice recognition, is a computer-based software technique that analyzes audio signals captured by a microphone and translates them into machine-interpreted text. Speech processing is based on techniques that need a local CPU or cloud computing with an Internet link. An activation word starts the uplink (“OK Google”, “Alexa”, …), and voice analysis is not usually suitable for an autonomous, limited-CPU system (16-bit microcontroller) with low energy. To achieve this, the paper presents specific techniques and details an efficient voice command method compatible with an embedded IoT low-power device.
Keywords: voice recognition, speech processing, voice command, embedded device
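On a low-power device of the kind discussed above, the cheapest first stage before any activation-word matching is a frame-energy gate that keeps the heavier processing asleep until sound actually arrives. A sketch of that idea (the frame size and dB threshold are illustrative assumptions, not values from the paper):

```python
import numpy as np

def voice_activity(x, frame=160, threshold_db=-30.0):
    """Frame-energy gate: True for frames whose RMS level (in dB
    relative to full scale 1.0) exceeds the threshold."""
    n = len(x) // frame * frame
    frames = x[:n].reshape(-1, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1)) + 1e-12
    return 20.0 * np.log10(rms) > threshold_db
```

Only frames flagged True would be handed to the activation-word detector, which is what makes such a gate attractive on a 16-bit microcontroller budget.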