End-to-End Audio Pattern Recognition Network for Overcoming Feature Limitations in Human-Machine Interaction
Authors: Zijian Sun, Yaqian Li, Haoran Liu, Haibin Li, Wenming Zhang. Computers, Materials & Continua, 2025, Issue 5, pp. 3187-3210 (24 pages)
In recent years, audio pattern recognition has emerged as a key area of research, driven by its applications in human-computer interaction, robotics, and healthcare. Traditional methods, which rely heavily on handcrafted features such as Mel filters, often suffer from information loss and limited feature representation capabilities. To address these limitations, this study proposes an innovative end-to-end audio pattern recognition framework that directly processes raw audio signals, preserving original information and extracting effective classification features. The proposed framework utilizes a dual-branch architecture: a global refinement module that retains channel and temporal details and a multi-scale embedding module that captures high-level semantic information. Additionally, a guided fusion module integrates complementary features from both branches, ensuring a comprehensive representation of audio data. Specifically, the multi-scale audio context embedding module is designed to effectively extract spatiotemporal dependencies, while the global refinement module aggregates multi-scale channel and temporal cues for enhanced modeling. The guided fusion module leverages these features to achieve efficient integration of complementary information, resulting in improved classification accuracy. Experimental results demonstrate the model's superior performance on multiple datasets, including ESC-50, UrbanSound8K, RAVDESS, and CREMA-D, with classification accuracies of 93.25%, 90.91%, 92.36%, and 70.50%, respectively. These results highlight the robustness and effectiveness of the proposed framework, which significantly outperforms existing approaches. By addressing critical challenges such as information loss and limited feature representation, this work provides new insights and methodologies for advancing audio classification and multimodal interaction systems.
Keywords: audio pattern recognition; raw audio; end-to-end network; feature fusion
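The dual-branch data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`global_branch`, `multiscale_branch`, `guided_fusion`), the averaging kernels, the scale values, and the sigmoid gate are all assumptions, chosen only to show how two feature sets derived from a raw waveform might be combined without a handcrafted Mel front end.

```python
import math

def conv1d(signal, kernel, stride=1):
    """Valid 1-D cross-correlation applied directly to a raw waveform."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]

def global_branch(wave):
    # Hypothetical stand-in for the global refinement module:
    # a short kernel keeps fine temporal detail, summarized by simple statistics.
    fine = conv1d(wave, [0.5, 0.5])
    return [sum(fine) / len(fine), max(fine), min(fine)]

def multiscale_branch(wave, scales=(4, 16, 64)):
    # Hypothetical stand-in for the multi-scale embedding module:
    # one averaging kernel per scale, reduced to a mean absolute response.
    feats = []
    for s in scales:
        out = conv1d(wave, [1.0 / s] * s, stride=s)
        feats.append(sum(abs(v) for v in out) / len(out))
    return feats

def guided_fusion(a, b):
    # Assumed gating rule: a sigmoid gate computed from both branches
    # forms a convex combination of the paired features.
    fused = []
    for x, y in zip(a, b):
        gate = 1.0 / (1.0 + math.exp(-(x - y)))
        fused.append(gate * x + (1.0 - gate) * y)
    return fused

# A 5-cycle sine wave stands in for a raw audio clip.
wave = [math.sin(2 * math.pi * 5 * t / 256) for t in range(256)]
fused = guided_fusion(global_branch(wave), multiscale_branch(wave))
print(len(fused))  # prints 3
```

In a real system the fused vector would feed a classifier, and each branch would be a learned network rather than fixed kernels. The sketch only shows the shape of the pipeline: raw samples in, two complementary feature sets, one gated combination out.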