An algorithm based on Mel-frequency cepstral coefficients (MFCCs) is presented for signal feature extraction in speaker accent recognition. Several classifiers are then compared using the MFCC features. For each signal, the mean vector of its MFCC matrix serves as the input vector for pattern recognition. A sample of 330 signals, comprising 165 US voices and 165 non-US voices, is analyzed. Of the classifiers compared, k-nearest neighbors yields the highest average test accuracy over a cross-validation of size 500 and requires the least computation time.
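The pipeline in this abstract (average each MFCC matrix into one vector, then classify with k-nearest neighbors and average the test accuracy over many splits) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the MFCC matrices are already computed, it reads "a cross-validation of size 500" as 500 random train/test splits, and the names `mean_vector`, `knn_predict`, `repeated_holdout_accuracy`, as well as the defaults `k=5` and `test_fraction=0.2`, are hypothetical.

```python
import math
import random


def mean_vector(mfcc_matrix):
    """Average an MFCC matrix (list of frames, each a list of coefficients) over frames."""
    n_frames = len(mfcc_matrix)
    n_coeffs = len(mfcc_matrix[0])
    return [sum(frame[j] for frame in mfcc_matrix) / n_frames for j in range(n_coeffs)]


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k training vectors closest to `query` (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def repeated_holdout_accuracy(features, labels, k=5, n_repeats=500,
                              test_fraction=0.2, seed=0):
    """Mean test accuracy of k-NN over `n_repeats` random train/test splits.

    Assumption: this is one possible reading of "cross-validation of size 500";
    the abstract does not spell out the exact protocol.
    """
    rng = random.Random(seed)
    indices = list(range(len(features)))
    n_test = max(1, int(len(indices) * test_fraction))
    total = 0.0
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct = sum(
            knn_predict(train_x, train_y, features[i], k) == labels[i]
            for i in test_idx
        )
        total += correct / n_test
    return total / n_repeats
```

In use, each of the 330 signals would first be reduced with `mean_vector`, and the resulting list of vectors, together with the US / non-US labels, would be passed to `repeated_holdout_accuracy`. Averaging over frames discards temporal order, which keeps the input small and is one plausible reason a simple distance-based classifier can be fast here.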
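Recognizing abnormal lung sounds by auscultation is an important means of diagnosing bronchopulmonary diseases in children. The spectrogram image recognition methods commonly used in research on classifying children's abnormal lung sounds demand substantial computing resources and achieve low recognition rates. To address these problems, a hybrid model for classifying children's abnormal lung sounds is proposed that combines Mel frequency cepstral coefficient (MFCC) features, a convolutional neural network (CNN), and a bidirectional long short-term memory network (BiLSTM). The method uses the CNN to extract spatial characteristics from the MFCC features and the BiLSTM to extract temporal characteristics from the MFCC audio features, yielding the BCNnet (BILSTM CNN network) model. The authors collected and built a children's lung sound dataset. On this dataset the proposed method reaches an average accuracy of 75.3%, which is 3.7 percentage points higher than a CNN (parallel pooling) model that takes spectrograms as input, and it also improves on model size and recognition speed.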
To investigate the problem of detecting the direction of sound sources with specific features in long-duration signals, this work proposes a multi-feature adaptive voice activity detection method for long-duration array signals. It combines the multiple signal classification (MUSIC) algorithm with voice activity detection (VAD) to improve detection accuracy and localization accuracy for signal sources that have distinctive features and specific target attributes. Adaptive thresholds are set from the MFCC features and energy features of the speech signal, and voice activity detection is performed on the features of the specific sound source to improve VAD accuracy. Direction finding is then carried out on the array signal within the detected voice-active segments, and the MUSIC algorithm is used to detect specific sound sources arriving from different directions in different time periods of the long-duration signal.
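The adaptive-threshold step described in this abstract can be illustrated with a simplified sketch. The abstract sets its thresholds from both MFCC and energy features. The code below uses frame energy only, which is a deliberate simplification and not the authors' method. The function names and the parameters `frame_len`, `hop`, `margin_db`, `alpha`, and `init_frames` are all hypothetical, and the sketch assumes the recording starts with a short stretch of background noise.

```python
import math


def frame_log_energies(samples, frame_len=400, hop=160):
    """Log energy in dB of each frame of a single-channel signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))  # small offset avoids log(0)
    return energies


def adaptive_vad(energies, margin_db=6.0, alpha=0.95, init_frames=10):
    """Flag frames whose energy exceeds a running noise-floor estimate by `margin_db`.

    The noise floor starts as the mean of the first `init_frames` frames and is
    updated only on frames judged inactive, so the threshold tracks slow drifts
    in the background level.
    """
    if not energies:
        return []
    head = energies[:init_frames]
    noise = sum(head) / len(head)
    flags = []
    for energy in energies:
        active = energy > noise + margin_db
        flags.append(active)
        if not active:
            noise = alpha * noise + (1.0 - alpha) * energy
    return flags


def active_segments(flags):
    """Convert per-frame flags into (start_frame, end_frame) pairs, end exclusive."""
    segments, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(flags)))
    return segments
```

In the workflow the abstract describes, the segments returned by `active_segments` would select which stretches of the array recording are passed on to direction-of-arrival estimation with MUSIC. That stage is not shown here.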