An algorithm based on Mel-Frequency Cepstral Coefficients (MFCCs) is presented for signal feature extraction in speaker accent recognition, and several classifiers are then compared on the MFCC features. For each signal, the mean vector of the MFCC matrix serves as the input vector for pattern recognition. A sample of 330 signals, comprising 165 US voices and 165 non-US voices, is analyzed. Among the classifiers compared, k-nearest neighbors yields the highest average test accuracy, after using a cross-validation of size 500, and requires the least computation time.
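The pipeline described in this abstract (average each MFCC matrix over its frames, then classify the mean vectors with k-nearest neighbors and report average test accuracy) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the MFCC matrices are assumed to be already computed and are replaced here by made-up synthetic data, k = 3 and the 80/20 split are arbitrary choices, and "cross-validation of size 500" is read as 500 repeated random train/test splits, which the abstract does not confirm.

```python
import math
import random
from collections import Counter


def mean_vector(mfcc_matrix):
    """Average an MFCC matrix (frames x coefficients) over its frames."""
    n_frames = len(mfcc_matrix)
    return [sum(col) / n_frames for col in zip(*mfcc_matrix)]


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def average_test_accuracy(x, y, n_splits=500, test_fraction=0.2, k=3, seed=0):
    """Mean k-NN test accuracy over repeated random train/test splits (assumed protocol)."""
    rng = random.Random(seed)
    indices = list(range(len(x)))
    n_test = max(1, int(len(x) * test_fraction))
    total = 0.0
    for _ in range(n_splits):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_x, train_y, x[i], k) == y[i] for i in test_idx)
        total += correct / n_test
    return total / n_splits


def synthetic_mfcc(rng, offset, n_frames=20, n_coeffs=12):
    """Made-up stand-in for a real MFCC matrix; not data from the paper."""
    return [[rng.gauss(offset, 1.0) for _ in range(n_coeffs)] for _ in range(n_frames)]


# Toy data: two well-separated synthetic classes, 30 signals each.
rng = random.Random(42)
labels = ["US"] * 30 + ["non-US"] * 30
features = [mean_vector(synthetic_mfcc(rng, 0.0 if lab == "US" else 3.0)) for lab in labels]
accuracy = average_test_accuracy(features, labels, n_splits=20)
print(f"average test accuracy over 20 splits: {accuracy:.3f}")
```

With real recordings, the synthetic matrices would be replaced by MFCC matrices computed from each signal, and `n_splits` could be raised to 500 to mirror the figure quoted in the abstract.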
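Auscultation-based recognition of abnormal lung sounds is an important means of diagnosing bronchopulmonary diseases in children. To address the high computational cost and low recognition rate of the spectrogram image recognition methods commonly used in research on classifying children's abnormal lung sounds, a hybrid model is proposed that combines Mel frequency cepstral coefficient (MFCC) features, a convolutional neural network (CNN), and a bidirectional long short-term memory network (BiLSTM). The method uses the CNN to extract spatial characteristics from the MFCC features and the BiLSTM to extract temporal characteristics from the MFCC audio features, yielding the BCNnet (BILSTM CNN network) model. The authors collected and built a dataset of children's lung sounds; on this dataset, the proposed method reaches an average accuracy of 75.3%, which is 3.7 percentage points higher than a CNN (parallel pooling) model that takes spectrograms as input, with improvements in both model size and recognition speed.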