Abstract
Key background sounds in aircraft cockpit audio provide important evidence for flight monitoring and evaluation and for accident investigation. Recognizing cockpit voice recorder (CVR) audio is highly specialized and data-intensive, low-frequency transient background sounds are hard to identify, and engine noise interferes with recognition. To address these problems, this paper proposes a deep-learning-based method for intelligent classification of CVR background sounds. A dataset covering 10 classes of CVR background sounds is built; acoustic features are extracted with three types of feature spectrograms, and a time-delay neural network (TDNN) model is constructed. A context-aware masking module reduces the influence of noise on switch and operation sounds, and a front-end convolution module captures low-frequency transient sound signals, yielding TDNN-CF, a hybrid of convolutional and time-delay neural networks. The improved model reaches a CVR audio classification accuracy of 98.90%, which is 13.04 and 2.99 percentage points higher than the conventional convolutional neural network (CNN) and TDNN models, respectively. Compared with the classical machine learning algorithms decision tree, random forest, and K-nearest neighbors (KNN), accuracy improves by 18.07, 15.62, and 14.55 percentage points, respectively. The experimental results show that the proposed method classifies CVR audio efficiently.
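The abstract reports only the TDNN-CF accuracy and its gains over each baseline, not the baseline accuracies themselves. The short sketch below recovers the implied baseline figures by subtraction. It assumes the gains are absolute differences in percentage points, as the abstract states; the names `TDNN_CF_ACCURACY`, `GAINS`, and `implied_baselines` are illustrative and do not come from the paper.

```python
from decimal import Decimal

# Accuracy reported in the abstract for TDNN-CF, in percent.
TDNN_CF_ACCURACY = Decimal("98.90")

# Gains over each baseline reported in the abstract, in percentage points.
GAINS = {
    "CNN": Decimal("13.04"),
    "TDNN": Decimal("2.99"),
    "decision tree": Decimal("18.07"),
    "random forest": Decimal("15.62"),
    "KNN": Decimal("14.55"),
}


def implied_baselines(best, gains):
    """Return the baseline accuracy implied by each gain (best minus gain).

    Assumption: every gain is an absolute difference in percentage points.
    """
    return {name: best - gain for name, gain in gains.items()}


if __name__ == "__main__":
    for name, accuracy in implied_baselines(TDNN_CF_ACCURACY, GAINS).items():
        print(f"{name}: {accuracy}%")
```

Under that assumption the implied baselines are 85.86% (CNN), 95.91% (TDNN), 80.83% (decision tree), 83.28% (random forest), and 84.35% (KNN). These are derived values, not figures quoted from the paper.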
Authors
张迪
柴源通
曾佩佩
杨娟
ZHANG Di; CHAI Yuantong; ZENG Peipei; YANG Juan (Engineering Technology Training Center, Civil Aviation University of China, Tianjin 300300, China; College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China)
Source
《西北工业大学学报》
PKU Core Journal (北大核心)
2025, Issue 4, pp. 784-793 (10 pages)
Journal of Northwestern Polytechnical University
Funding
National Natural Science Foundation of China (U1733119)
Civil Aviation Safety Capacity Building Fund ([2023]50)
Keywords
cockpit voice recorder
sound classification
characteristic spectrum
time delay neural network
context-aware masking