Speech-evoked EEG decoding based on Multi-scale Attention Temporal Encoding Network
Abstract EEG signals evoked by covert speech (whispered or imagined) have complex features, and such data are difficult to acquire. To address this, a Multi-scale Attention Temporal Encoding Network (MATE-Net) is proposed: the model is trained on relatively abundant overt speech data and then applied to covert speech decoding tasks. An Inception-based multi-receptive-field module extracts multi-scale features from the input signals; a bidirectional GRU captures contextual dependencies in both directions and strengthens the representation of temporal dynamics; residual connections are added to ease the training of the deep network by keeping gradients stable during backpropagation; and a multi-head attention mechanism captures both local and global temporal dependencies, strengthening the representation of salient features. In five-fold cross-validation on the overt speech decoding task, the model reached an average test-set accuracy of 74.30%, with Spearman and Pearson correlation coefficients of 0.884 and 0.942, respectively. The pre-trained MATE-Net was then successfully applied to whispered and imagined speech tasks, where it achieved effective reconstruction of speech spectrograms.
Authors YAO Zihao; JIA Hairong; LI Yarong; CHEN Guijun (College of Electronic and Information Engineering, Taiyuan University of Technology, Taiyuan 030024, China)
Source Journal of Zhejiang University (Engineering Science) (《浙江大学学报(工学版)》), PKU Core Journal, 2026, Issue 4, pp. 896-905 (10 pages)
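Funding National Natural Science Foundation of China (62201377); Shanxi Province Basic Research Program (202403021211098); Shanxi Province Graduate Innovation Project (RC2400005582).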
Keywords brain-computer interface; electroencephalography (EEG); overt speech; whispered speech; imagined speech
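
The abstract reports Spearman and Pearson correlation coefficients (0.884 and 0.942) as reconstruction-quality metrics. For readers unfamiliar with them, the following is a minimal pure-Python sketch of the textbook definitions. It is not the authors' evaluation code: the record does not say over which axis (time frames, frequency bins, or flattened spectrograms) the coefficients were computed, so the function names and the flat-list inputs here are illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance normalized by both standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical values standing in for a target and a reconstructed
    # spectrogram, flattened to one dimension. Not data from the paper.
    target = [0.10, 0.40, 0.35, 0.80, 0.95]
    reconstructed = [0.12, 0.30, 0.41, 0.70, 0.99]
    print("Pearson :", pearson(target, reconstructed))
    print("Spearman:", spearman(target, reconstructed))
```

Pearson measures linear agreement between the two signals, while Spearman measures only whether their ordering agrees, so a reconstruction that is monotonically but non-linearly distorted can score high on Spearman and lower on Pearson. In practice, `scipy.stats.pearsonr` and `scipy.stats.spearmanr` provide the same quantities.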