656 articles found
New algorithm for variable-rate linear broadcast network coding (cited by: 1)
1
Authors: 夏寅, 张惕远, 黄佳庆. 《Journal of Central South University》 (SCIE, EI, CAS), 2011, Issue 4, pp. 1193-1199, 7 pages.
To accommodate variation of the source rate in linear broadcast networks, global encoding kernels should have corresponding dimensions to guide the decoding process. The algorithm for constructing such global encoding kernels is intended to adapt heterogeneous networks to possible link failures. Linear algebra, graph theory and group theory are applied to construct one series of global encoding kernels that is applicable to all source rates. The effectiveness and existence of such global encoding kernels are proved. Based on 2 information flow, the construction algorithm is given explicitly and runs in polynomial time O(|E|·|T|·ω_max^2), with memory complexity O(|E|). Both the time and the memory complexity of the proposed algorithm can be lower by a factor of O(ω_max) than those of algorithms in related works.
Keywords: network coding; variable-rate; linear broadcast; heterogeneous network; code construction algorithm
Adaptive bands filter bank optimized by genetic algorithm for robust speech recognition system (cited by: 5)
2
Authors: 黄丽霞, G. Evangelista, 张雪英. 《Journal of Central South University》 (SCIE, EI, CAS), 2011, Issue 5, pp. 1595-1601, 7 pages.
Perceptual auditory filter banks such as the Bark-scale filter bank are widely used as front-end processing in speech recognition systems. However, the design of optimized filter banks that provide higher accuracy in recognition tasks is still an open problem. Owing to spectral analysis in feature extraction, an adaptive bands filter bank (ABFB) is presented. The design adopts flexible bandwidths and center frequencies for the frequency responses of the filters and uses a genetic algorithm (GA) to optimize the design parameters. The optimization process is realized by combining the front-end filter bank with the back-end recognition network in the performance evaluation loop. Deploying ABFB together with the zero-crossing peak amplitude (ZCPA) feature as a front end for a radial basis function (RBF) system shows significant improvement in robustness compared with the Bark-scale filter bank. In ABFB, several sub-bands are still more concentrated toward lower frequencies, but their exact locations are determined by performance rather than by perceptual criteria. For ease of optimization, only symmetrical bands are considered here, which still provide satisfactory results.
Keywords: perceptual filter banks; Bark scale; speaker independent speech recognition systems; zero-crossing peak amplitude; genetic algorithm
Speech Analysis for Diagnosis of Parkinson's Disease Using Genetic Algorithm and Support Vector Machine (cited by: 1)
3
Authors: Mohammad Shahbakhi, Danial Taheri Far, Ehsan Tahami. 《Journal of Biomedical Science and Engineering》, 2014, Issue 4, pp. 147-156, 10 pages.
Parkinson's disease (PD) is the most common disease of motor system degeneration; it occurs when the dopamine-producing cells in the substantia nigra are damaged. To detect PD, various signals have been investigated, including EEG, gait and speech. Since approximately 90 percent of people with PD suffer from speech disorders, speech analysis is considered the most common technique for this purpose. This paper proposes a new algorithm for diagnosing Parkinson's disease based on voice analysis. In the first step, a genetic algorithm (GA) selects optimized features from all extracted features. Afterwards, a network based on a support vector machine (SVM) classifies subjects as healthy or as having Parkinson's disease. The dataset consists of a range of biomedical voice signals from 31 people, 23 with Parkinson's disease and 8 healthy. The subjects were asked to pronounce the letter "A" for 3 seconds. From the signals, 22 linear and non-linear features were extracted, 14 of which were based on F0 (fundamental frequency or pitch), jitter, shimmer and noise-to-harmonics ratio, the main factors in a voice signal. Because changes in these factors are noticeable in people with PD, the optimized features were selected from among them. Classification was then investigated for different numbers of optimized features. Results show that a classification accuracy of 94.50% can be achieved with 4 optimized features, 93.66% with 7 optimized features, and 94.22% with 9 optimized features. The best classification accuracy may be achieved using Fhi (Hz), Fho (Hz), jitter (RAP) and shimmer (APQ5).
Keywords: Parkinson's disease; speech analysis; genetic algorithm; support vector machine
Enhancing Parkinson's Disease Diagnosis Accuracy Through Speech Signal Algorithm Modeling (cited by: 1)
4
Authors: Omar M. El-Habbak, Abdelrahman M. Abdelalim, Nour H. Mohamed, Habiba M. Abd-Elaty, Mostafa A. Hammouda, Yasmeen Y. Mohamed, Mohanad A. Taifor, Ali W. Mohamed. 《Computers, Materials & Continua》 (SCIE, EI), 2022, Issue 2, pp. 2953-2969, 17 pages.
Parkinson's disease (PD), one of whose symptoms is dysphonia, is a prevalent neurodegenerative disease. The use of outdated diagnosis techniques, which yield inaccurate and unreliable results, continues to be an obstacle to early-stage detection and diagnosis for clinical professionals. To address this issue, the study proposes using machine learning and deep learning models to analyze processed speech signals from patients' voice recordings. Datasets of these processed speech signals were obtained and evaluated with random forest and logistic regression classifiers. The random forest classifier reached 90% accuracy and the logistic regression classifier 81.5%. Furthermore, a deep neural network was implemented to investigate whether such a change in method could add to the findings. It proved effective, yielding an accuracy of nearly 92%. These results suggest that it is possible to diagnose early-stage PD accurately merely by testing patients' voices. This research calls for a revolutionary diagnostic approach in decision support systems, and is the first step in a market-wide implementation of healthcare software dedicated to helping clinicians diagnose PD early.
Keywords: early diagnosis; logistic regression; neural network; Parkinson's disease; random forest; speech signal processing algorithms
AN EFFECTIVE LVQ-BASED ALGORITHM FOR ROBUST SPEECH RECOGNITION
5
Authors: 朱策, 关存太, 厉大华, 何振亚. 《Journal of Southeast University(English Edition)》 (EI, CAS), 1994, Issue 1, pp. 9-12, 4 pages.
Dynamic time warping (DTW) and dynamic spectral warping (DSW) techniques are introduced into the learning vector quantization (LVQ) algorithm to construct a "dynamic" Bayes classifier for speech recognition. It can produce highly discriminative "dynamic" reference vectors to represent the temporal and spectral variabilities of speech. Recognition experiments on 19 Chinese consonants show that the "dynamic" classifier significantly outperforms the original "static" classifier.
Keywords: speech recognition; neural networks; algorithms/learning vector quantization; dynamic time warping; dynamic spectral warping
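As a rough illustration of the dynamic time warping step named in the abstract above, the following is a minimal sketch of the textbook DTW distance between two scalar sequences. It is not the paper's LVQ-based classifier or its dynamic spectral warping; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for this example.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences.

    Illustrative only: scalar samples and an absolute-difference local cost
    are assumptions, not details taken from the paper.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Two sequences that differ only by a repeated sample, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost, which is the time-axis tolerance that makes DTW useful for speech of varying duration.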
Annoyance-type speech emotion detection in working environment
6
Authors: 王青云, 赵力, 梁瑞宇, 张潇丹. 《Journal of Southeast University(English Edition)》 (EI, CAS), 2013, Issue 4, pp. 366-371, 6 pages.
In order to recognize people's annoyance emotions in the working environment and evaluate emotional well-being, emotional speech in a work environment is induced to obtain adequate samples, and a Mandarin database with two thousand samples is built. In searching for annoyance-type emotion features, the prosodic and voice quality feature parameters of the emotional statements are extracted first. Then an improved back propagation (BP) neural network based on the shuffled frog leaping algorithm (SFLA) is proposed to recognize the emotion. The recognition capabilities of the BP, radial basis function (RBF) and SFLA neural networks are compared experimentally. The results show that the recognition rate of the SFLA neural network is 4.7% better than that of the BP neural network and 4.3% better than that of the RBF neural network. The experimental results demonstrate that training the random initial data with the SFLA can optimize the connection weights and thresholds of the neural network, speed up convergence and improve the recognition rate.
Keywords: speech emotion detection; annoyance type; sentence length; shuffled frog leaping algorithm
Speech Enhancement Based on Approximate Message Passing (cited by: 1)
7
Authors: Chao Li, Ting Jiang, Sheng Wu. 《China Communications》 (SCIE, CSCD), 2020, Issue 8, pp. 187-198, 12 pages.
To overcome the limitations of conventional speech enhancement methods, such as inaccurate voice activity detection (VAD) and noise estimation, a novel speech enhancement algorithm based on approximate message passing (AMP) is adopted. AMP exploits the difference between speech and noise sparsity to remove or mute the noise in the corrupted speech, and the AMP algorithm reconstructs the clean speech efficiently. More specifically, the prior probability distribution of the speech sparsity coefficient is characterized by a Gaussian model, and the hyper-parameters of the prior model are learned by the expectation maximization (EM) algorithm. The k-nearest neighbor (k-NN) algorithm is used to learn the sparsity, based on the fact that the speech coefficients of adjacent frames are correlated. In addition, computational simulations validate the proposed algorithm, which achieves better speech enhancement performance than four baseline methods, namely Wiener filtering, subspace pursuit (SP), distributed sparsity adaptive matching pursuit (DSAMP), and expectation-maximization Gaussian-model approximate message passing (EM-GAMP), under different compression ratios and a wide range of signal-to-noise ratios (SNRs).
Keywords: speech enhancement; approximate message passing; Gaussian model; expectation maximization algorithm
An Efficient Reference Free Adaptive Learning Process for Speech Enhancement Applications (cited by: 1)
8
Authors: Girika Jyoshna, Md. Zia Ur Rahman, L. Koteswararao. 《Computers, Materials & Continua》 (SCIE, EI), 2022, Issue 2, pp. 3067-3080, 14 pages.
In conditions such as hearing impairment, speech therapy and hearing aids play a major role in reducing the impairment. Removing noise from speech signals is a key task in hearing aids as well as in speech therapy. During the transmission of speech signals, several noise components contaminate the actual speech components. This paper presents a new adaptive speech enhancement (ASE) method based on a modified version of singular spectrum analysis (MSSA). The MSSA generates a reference signal for ASE, so the ASE does not need an externally supplied reference component. The MSSA uses three key steps to generate the reference from the contaminated speech alone: decomposition, grouping and reconstruction. The generated signal is taken as the reference for variable step size adaptive learning algorithms. Two categories of adaptive learning algorithms are used in this work: the step variable adaptive learning (SVAL) algorithm and the time variable step size adaptive learning (TVAL) algorithm. Further, a sign regressor function is applied to the adaptive learning algorithms to reduce their computational complexity. The performance of the proposed schemes is measured in terms of signal-to-noise ratio improvement (SNRI), excess mean square error (EMSE) and misadjustment (MSD). For cockpit noise, these measures are found to be 29.2850, -27.6060 and 0.0758 dB, respectively, in experiments using the SVAL algorithm. Considering the reduced number of multiplications, the sign regressor version of the SVAL-based ASE method is found to be better than its counterparts.
Keywords: adaptive algorithm; speech enhancement; singular spectrum analysis; reference free noise canceller; variable step size
Applying and Comparison of Chaotic-Based Permutation Algorithms for Audio Encryption (cited by: 1)
9
Authors: Osama M. Abu Zaid, Medhat A. Tawfeek, Saad Alanazi. 《Computers, Materials & Continua》 (SCIE, EI), 2021, Issue 6, pp. 3161-3176, 16 pages.
This research presents and clarifies the application of two permutation algorithms, based on chaotic map systems, to a file of speech signals. They are the Arnold cat map-based permutation algorithm and the Baker's chaotic map-based permutation algorithm. Both algorithms are implemented on the same speech signal sample. Then, the histograms of both the original and the encrypted files are documented and plotted. The speech signal amplitude values over time for the original file are recorded and plotted against those of the encrypted and decrypted files. Furthermore, the original file is plotted against the encrypted file using spectrograms of the speech signals over the signal duration. These permutation algorithms shuffle the positions of the speech file's signal values without changing the values, to produce an encrypted speech file. A comparative analysis of the encryption and decryption procedures is presented using various statistical and experimental measures, e.g., the time of both procedures, the histogram of the encrypted audio signals, the correlation coefficient between samples in the original and encrypted signals, the Spectral Distortion (SD) test, and the Log-Likelihood Ratio (LLR) measure. The outcomes of the experimental and comparative studies demonstrate that the two permutation algorithms (Baker and Arnold) are sufficient for providing an efficient and reliable voice signal encryption solution. However, Arnold's algorithm gives better results than Baker's algorithm in most cases.
Keywords: Arnold's cat map; chaotic maps; permutation algorithms; speech encryption; Baker's chaotic map
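To make the idea of position-only scrambling in the abstract above concrete, here is a minimal sketch of a permutation driven by the standard Arnold cat map (x, y) -> (x + y, x + 2y) mod n. The abstract does not say how the authors arrange samples or choose parameters, so the n x n grid layout, the `rounds` parameter, and both function names are assumptions for illustration.

```python
def arnold_permute(samples, n, rounds=1):
    """Scramble n*n samples by iterating the Arnold cat map on an n x n grid.

    Only positions change; sample values are left untouched.
    """
    if len(samples) != n * n:
        raise ValueError("expected exactly n*n samples")
    out = list(samples)
    for _ in range(rounds):
        nxt = [None] * (n * n)
        for x in range(n):
            for y in range(n):
                nx, ny = (x + y) % n, (x + 2 * y) % n
                nxt[nx * n + ny] = out[x * n + y]
        out = nxt
    return out


def arnold_unpermute(samples, n, rounds=1):
    """Undo arnold_permute by reading each sample back from its mapped slot."""
    if len(samples) != n * n:
        raise ValueError("expected exactly n*n samples")
    out = list(samples)
    for _ in range(rounds):
        prev = [None] * (n * n)
        for x in range(n):
            for y in range(n):
                nx, ny = (x + y) % n, (x + 2 * y) % n
                prev[x * n + y] = out[nx * n + ny]
        out = prev
    return out
```

The map is periodic, so some `rounds` values return the input unchanged (3 rounds for n = 4, for example); a practical scheme would need to choose the round count with that in mind.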
Hybrid In-Vehicle Background Noise Reduction for Robust Speech Recognition: The Possibilities of Next Generation 5G Data Networks
10
Authors: Radek Martinek, Jan Baros, Rene Jaros, Lukas Danys, Jan Nedoma. 《Computers, Materials & Continua》 (SCIE, EI), 2022, Issue 6, pp. 4659-4676, 18 pages.
This pilot study focuses on the use of a hybrid LMS-ICA system for in-vehicle background noise reduction. Modern vehicles increasingly support voice commands, which are one of the pillars of autonomous and SMART vehicles. Robust speaker recognition for context-aware in-vehicle applications is limited to a certain extent by in-vehicle background noise. This article presents a new concept of a hybrid system implemented as a virtual instrument. The highly modular concept of the virtual car, used in combination with real recordings of various driving scenarios, enables effective testing of the investigated methods of in-vehicle background noise reduction. The study also presents a unique concept of an adaptive system using intelligent clusters of distributed next generation 5G data networks, which allows the exchange of interference information and/or optimal hybrid algorithm settings between individual vehicles. On average, the unfiltered voice commands were successfully recognized in 29.34% of all scenarios, while LMS reached up to 71.81%, and the LMS-ICA hybrid improved the performance further to 73.03%.
Keywords: 5G; noise reduction; hybrid algorithms; speech recognition; 5G data networks; in-vehicle background noise
Improved Attentive Recurrent Network for Applied Linguistics-Based Offensive Speech Detection
11
Authors: Manar Ahmed Hamza, Hala J. Alshahrani, Khaled Tarmissi, Ayman Yafoz, Amira Sayed A. Aziz, Mohammad Mahzari, Abu Sarwar Zamani, Ishfaq Yaseen. 《Computer Systems Science & Engineering》 (SCIE, EI), 2023, Issue 11, pp. 1691-1707, 17 pages.
Applied linguistics is one of the fields in the linguistics domain and deals with practical applications of language studies such as speech processing, language teaching, translation and speech therapy. The ever-growing Online Social Networks (OSNs) face a vital issue, namely hate speech. Among OSN-oriented security problems, the use of offensive language is the most important threat and is prevalent across the Internet. Depending on the group targeted, offensive language varies in terms of adult content, hate speech, racism, cyberbullying, abuse, trolling and profanity. Among these, hate speech is the most intimidating form of offensive language, in which the targeted groups or individuals are intimidated with the intent of creating harm, social chaos or violence. Machine Learning (ML) techniques have recently been applied to recognize hate speech-related content. The current research article introduces a Grasshopper Optimization with an Attentive Recurrent Network for Offensive Speech Detection (GOARN-OSD) model for social media. The GOARN-OSD technique integrates the concepts of DL and metaheuristic algorithms for detecting hate speech. In the presented GOARN-OSD technique, the primary stage involves data pre-processing and word embedding. Then, the study uses the Attentive Recurrent Network (ARN) model for hate speech recognition and classification. Finally, the Grasshopper Optimization Algorithm (GOA) is used as a hyperparameter optimizer to boost the performance of the hate speech recognition process. To demonstrate the performance of the proposed GOARN-OSD method, an extensive experimental analysis was conducted. The comparison study outcomes demonstrate the superior performance of the proposed GOARN-OSD model over other state-of-the-art approaches.
Keywords: applied linguistics; hate speech; offensive language; natural language processing; deep learning; grasshopper optimization algorithm
Enhanced Marathi Speech Recognition Facilitated by Grasshopper Optimisation-Based Recurrent Neural Network
12
Authors: Ravindra Parshuram Bachate, Ashok Sharma, Amar Singh, Ayman A. Aly, Abdulaziz H. Alghtani, Dac-Nhuong Le. 《Computer Systems Science & Engineering》 (SCIE, EI), 2022, Issue 11, pp. 439-454, 16 pages.
Communication is a significant part of being human and living in the world. There are many kinds of languages and variations of them; thus, a person who speaks a language may be unable to communicate effectively with someone who speaks that language in a different accent. Numerous application fields such as education, mobility, smart systems, security and health care use speech or voice recognition models extensively. However, many studies focus on Arabic, Asian and English languages while ignoring other significant languages such as Marathi, which motivates broader research on regional languages. It is necessary to understand the speech recognition field, in which the main stages are feature extraction and classification. This paper focuses on developing a speech recognition model for the Marathi language by optimizing a Recurrent Neural Network (RNN). Here, the input signal is preprocessed by smoothing and median filtering. After preprocessing, feature extraction is carried out using MFCC and spectral features to obtain precise features from the input Marathi speech corpus. The optimized RNN classifier is then used for speech recognition, where the hidden neurons of the RNN are optimized by the Grasshopper Optimization Algorithm (GOA). Finally, a comparison with conventional techniques shows that the proposed model outperforms most competing models on a benchmark dataset.
Keywords: deep learning; grasshopper optimization algorithm; recurrent neural network; speech recognition; word error rate
Hidden Markov Models for Automatic Speech Recognition
13
Authors: Mbarki Aymen, Ammari Abdelaziz, Sghaier Halim, Hassen Maaref. 《Journal of Mechanics Engineering and Automation》, 2011, Issue 1, pp. 68-73, 6 pages.
In this paper the authors look into the three problems of Hidden Markov Models (HMMs): the evaluation, the decoding and the learning problem. The authors have explored an approach to increase the effectiveness of HMMs in the speech recognition field. Although hidden Markov modeling has significantly improved the performance of current speech recognition systems, the general problem of completely fluent speaker-independent speech recognition is still far from solved. For example, no system is capable of reliably recognizing unconstrained conversational speech. Also, there is no good way to statistically infer the language structure from a limited corpus of spoken sentences. Therefore, the authors provide an overview of the theory of HMMs, discuss the role of statistical methods, and point out a range of theoretical and practical issues that deserve attention and must be understood in order to advance research in the field of speech recognition.
Keywords: hidden Markov models (HMMs); speech recognition; HMM problems; Viterbi algorithm
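The decoding problem mentioned in the abstract above is conventionally solved with the Viterbi algorithm, which also appears in the keywords. The following is a minimal sketch of the standard algorithm on a discrete HMM; the dictionary-based model layout and the function name `viterbi` are assumptions for this example and are not taken from the paper.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for an observation sequence.

    start_p[s], trans_p[r][s] and emit_p[s][o] are plain probability tables
    (an assumed layout for illustration).
    """
    # best[t][s] = (probability of the best path ending in s at time t, predecessor)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        row = {}
        for s in states:
            row[s] = max(
                (best[-1][r][0] * trans_p[r][s] * emit_p[s][o], r) for r in states
            )
        best.append(row)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for row in reversed(best[1:]):
        state = row[state][1]
        path.append(state)
    return path[::-1]
```

For long utterances the products underflow, so practical implementations usually work with log-probabilities and sums instead; that change is left out here to keep the recursion easy to read.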
Optimized Design of FIR Digital Filters Based on Genetic Algorithm (cited by: 2)
14
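Authors: 徐开军. 《信息化研究》, 2025, Issue 1, pp. 37-42, 6 pages.
Traditional FIR digital filter design methods often depend on the designer's experience and on the choice of specific functions, and they struggle to reach a globally optimal design when the filtering specifications are complex. The genetic algorithm, a stochastic search algorithm that simulates natural evolution, has strong global optimization capability and can effectively find near-optimal solutions in complex solution spaces. Applying it to FIR digital filter design offers a new way to overcome the limitations of traditional design methods. This paper studies the genetic-algorithm-based optimized design of FIR digital filters in depth, describes its design flow, and verifies through examples its advantages over traditional design methods.
Keywords: digital filter; frequency sampling method; genetic algorithm; speech processing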
Speech Transmission Based on Computer Network
15
Authors: Jie Li, Changyun Miao, Zhigang Wu. 《通讯和计算机(中英文版)》, 2005, Issue 7, pp. 59-63, 5 pages.
Speech Emotion Recognition Based on a Fused Feature Selection Method
16
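Authors: 吉训生, 罗展, 谢捷. 《计算机与数字工程》, 2025, Issue 7, pp. 1880-1884, 5 pages.
For complex, high-dimensional speech emotion features, feature selection is important for reducing model complexity and improving the emotion recognition rate. To reduce feature dimensionality, a fused feature selection method is proposed that selects the optimal feature subset through filter and wrapper methods according to different evaluation criteria. In the filter method ReliefF, the average distance between the k nearest-neighbor or non-nearest-neighbor samples is introduced to estimate the differences between samples, which effectively evaluates the feature weights; irrelevant feature vectors with negative weights are removed, and the resulting reduced feature vector serves as the input to the second stage of the fused feature selection. The wrapper method uses a two-phase mutation grey wolf optimization algorithm, which accounts for the interaction between features and the classification algorithm and introduces a two-phase mutation operator to improve the algorithm's exploitation efficiency. A random forest weight initialization method is proposed in the experiments, which retains most of the significant features in the early population and accelerates convergence. Experimental results show that the method effectively reduces feature dimensionality while improving classification performance. With a support vector machine classifier, it achieves classification accuracies of 90.66% and 78.96% on the EMODB and SAVEE datasets, respectively, with the feature dimension reduced from 1582 to 366 and 255, respectively.
Keywords: speech emotion recognition; grey wolf algorithm; ReliefF method; feature selection; fusion algorithm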
Optimization of an Automatic Multi-Category Text Classification System Based on the TF-IDF and GloVe Algorithms
17
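Authors: 刘爱琴, 王上丹. 《新世纪图书馆》, 2025, Issue 10, pp. 40-46, 7 pages.
Retrieving keywords and assigning one or more category labels in order to organize and automatically classify text efficiently is an important way to discover implicit relationships in documents and to promote knowledge dissemination and innovation. However, factors such as where the retrieved keywords are taken from, their part of speech, and whether the selection is comprehensive can lead to missing keyword semantic information and poor keyword recognition accuracy; these two problems are the main obstacles to efficient and accurate automatic document classification. On this basis, the paper builds an automatic text classification system that combines TF-IDF (Term Frequency-Inverse Document Frequency) and GloVe (Global Vectors for Word Representation). The system first improves the TF-IDF algorithm with a part-of-speech influence factor and a position weight coefficient, to compensate for the shortcomings of traditional TF-IDF in keyword recognition and semantic analysis. Next, it uses the GloVe model to further expand the keyword set, bringing the accuracy and recall of automatic text classification to 92.6% and 90.9%, respectively. Finally, comparative experiments further verify the system's effectiveness on multi-category automatic text classification tasks.
Keywords: TF-IDF algorithm; GloVe model; automatic text classification; keyword position; part of speech; semantic expansion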
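For orientation, the baseline that the abstract above sets out to improve can be sketched as plain TF-IDF. This is the unmodified textbook weighting only: the paper's part-of-speech influence factor and position weight coefficient are not specified in the abstract and are deliberately not reproduced. The function name `tf_idf`, the pre-tokenized input, and the unsmoothed `log(N / df)` form are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    weight = (term count / document length) * log(N / document frequency)
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights
```

A term that occurs in every document receives weight zero under this form, which is why such terms drop out as keyword candidates.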
Speech Enhancement Method Based on Improved Wavelet Threshold and Optimized VMD Algorithm (cited by: 3)
18
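Authors: 张礼艳, 刘增力, 彭艺. 《吉林大学学报(理学版)》 (PKU Core), 2025, Issue 2, pp. 608-621, 14 pages.
To address the degradation of signal quality and intelligibility caused by noise, echo and other interference during speech signal transmission, a speech enhancement method based on an optimized variational mode decomposition (VMD) algorithm and an improved wavelet threshold is proposed. First, the sparrow search algorithm is used to optimize the mode decomposition parameters, and the speech signal is decomposed into modal components. Second, according to the correlation coefficients between the modal components and the original signal and their center frequencies, high-frequency noise components are eliminated; the modal components close to the original signal are retained as clean speech, and the other modal components are treated as noisy speech and processed with wavelet thresholding. Finally, the clean speech and the processed noise modal components are reconstructed to obtain the enhanced speech signal. Results show that the method achieves better speech enhancement than either method alone; the optimized VMD algorithm together with the improved threshold and threshold function outperforms traditional methods, is applicable to various noise environments, and effectively improves the quality and intelligibility of speech signals.
Keywords: speech enhancement; sparrow search algorithm; variational mode decomposition; wavelet threshold; correlation coefficient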
Vocal Cord Vibration Sensor Based on S-Shaped Micro-Nano Fiber and Intelligent Speech Recognition (cited by: 1)
19
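Authors: 王智君, 黄嵊釉, 李昆, 杨杨, 陈复旦, 罗彬彬, 吴德操, 邹雪. 《光子学报》 (PKU Core), 2025, Issue 5, pp. 21-32, 12 pages.
A wearable flexible sensor for vocal-cord speech recognition was developed by embedding an S-shaped micro-nano fiber, 4 μm in diameter with a 1 mm bending radius, in a polydimethylsiloxane substrate. When a person speaks, vocal cord vibration changes the optical intensity in the sensor, which is converted into a change in an electrical signal, so that vocal cord vibration signals can be recognized. Using an object detection algorithm model, the recognition rate is 96.8% for the 26 English letters and 97.75% for everyday vocabulary, highlighting the sensor's general applicability to speech recognition. The sensor is simple to fabricate, responds quickly to vibration (222 ms), and is repeatable and stable, giving it potential application value in the medical field and in health monitoring.
Keywords: micro-nano fiber; vocal cord vibration; speech recognition; object detection algorithm; wearable sensor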
End-to-End Vietnamese Text Normalization Method Based on Edit Constraints
20
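Authors: 蒋铭, 王琳钦, 赖华, 高盛祥. 《计算机应用》 (PKU Core), 2025, Issue 2, pp. 362-370, 9 pages.
Text normalization is an indispensable step in the front-end analysis of text-to-speech (TTS) synthesis, and semantic ambiguity is its main problem, for example the semantic ambiguity of non-standard words such as numbers, dates and times. To address this problem, an end-to-end text normalization method based on edit constraints is proposed, and, taking full account of the linguistic characteristics of Vietnamese, an annotation method designed specifically for Vietnamese is devised to improve the model's ability to capture contextual semantic information. In addition, because neural network models tend to produce unrecoverable errors, an edit alignment algorithm is proposed to constrain the range of non-standard-word text and shrink the decoder's search space, thereby avoiding prediction errors on non-normalized text caused by the model's own limitations. The FastCorrect model is chosen as the baseline, and the various optimization methods are applied to it to obtain new models. Experimental results show that, in comparative experiments on different optimization methods for Vietnamese, the proposed model's precision is 23.71 percentage points higher than that of the baseline model using unannotated data, and 26.24 percentage points higher in the corresponding Chinese experiments. The proposed method therefore performs well not only on Vietnamese but also on open-source Chinese data, which verifies its applicability beyond Vietnamese. Moreover, compared with six baseline models, the model using the proposed method achieves the highest precision, 97.14%, and exceeds the two-stage weighted finite-state transducer (WFST) method by 2.29 percentage points in F1 score, demonstrating the method's advantage on the text normalization task.
Keywords: Vietnamese; text normalization; edit alignment algorithm; speech synthesis; end-to-end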