Abstract
Deep-learning-based sound event localization and detection networks tend to lose key information from the input features, which makes localization and detection more difficult. To address this, a simple and parameter-free network (SimNet) based on an attention mechanism is proposed. First, a simple and parameter-free attention module (SimAM) is introduced after the residual block; its energy function helps the network focus on the deep features of each neuron in the feature map, enhancing the model's ability to discriminate richer feature information. In addition, a root mean square absolute error (RMSAE) loss function is adopted to steer training toward higher accuracy, which helps the model capture more comprehensive spatial information. Experimental results on the TAU-NIGENS Spatial Sound Events 2021 dataset show that the proposed network substantially improves on the original baseline network: the error rate (ER) and localization error (LE) are reduced to 0.394 and 12.03°, and the F1-score and localization recall (LR) are raised to 72.6% and 73.8%.
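The energy-based weighting that SimAM applies can be illustrated with a minimal sketch. It follows the commonly published SimAM formulation (inverse energy of each neuron, computed from its squared deviation from the channel mean, passed through a sigmoid) and is not taken from the authors' SimNet code. The function name `simam`, the single-channel nested-list input, and the regularizer value `e_lambda=1e-4` are assumptions made for illustration.

```python
import math


def simam(feature_map, e_lambda=1e-4):
    """Reweight one channel (a 2-D list of floats) with SimAM-style attention.

    Assumption: e_lambda is a small regularizer; 1e-4 is an illustrative value.
    """
    values = [v for row in feature_map for v in row]
    n = len(values) - 1  # number of "other" neurons in the channel
    if n < 1:
        raise ValueError("need at least two neurons in the channel")

    mu = sum(values) / len(values)
    # Squared deviation of every neuron from the channel mean.
    sq_dev = [[(v - mu) ** 2 for v in row] for row in feature_map]
    variance = sum(d for row in sq_dev for d in row) / n

    out = []
    for row, dev_row in zip(feature_map, sq_dev):
        new_row = []
        for v, d in zip(row, dev_row):
            # Inverse of the minimal energy: larger for neurons that stand out.
            inv_energy = d / (4.0 * (variance + e_lambda)) + 0.5
            weight = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid
            new_row.append(v * weight)
        out.append(new_row)
    return out


if __name__ == "__main__":
    # A single salient neuron (5.0) on a flat background (1.0).
    channel = [[1.0, 1.0, 1.0], [1.0, 5.0, 1.0], [1.0, 1.0, 1.0]]
    for row in simam(channel):
        print([round(v, 3) for v in row])
```

In this sketch the module has no learnable parameters: the weight of each neuron depends only on channel statistics, so a neuron that deviates strongly from the channel mean receives a weight closer to 1 than its neighbors. A real network would apply the same computation per channel on batched tensors.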
Authors
许春冬
汪雄
闵源
Xu Chundong; Wang Xiong; Min Yuan (School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)
Source
Foreign Electronic Measurement Technology (《国外电子测量技术》)
Peking University Core Journal
2023, No. 8, pp. 33-39 (7 pages)
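Funding
National Natural Science Foundation of China (61671442, 11864016, 11704164)
General Project of the Key Research and Development Program, Science and Technology Department of Jiangxi Province (20202BBEL53006)
Graduate Innovation Special Fund Project of Jiangxi University of Science and Technology (XY2022-S160)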
Keywords
sound event localization and detection
attention mechanism
RMSAE
convolutional neural network