Abstract
Speech is a widely used input interface on embedded mobile devices. Cloud-based providers offer powerful spoken language understanding (SLU) services, but they also pose a serious threat to user privacy, because sensitive information is processed remotely. To address this, disentanglement-based privacy-preserving encoders have been proposed to strip sensitive information from speech signals without impairing SLU functionality. However, such encoders are often memory-intensive and computationally demanding, which limits their practicality on small, resource-constrained devices. Based on extensive experiments, this paper observes a key phenomenon: SLU relies on global information from the entire utterance, whereas the recognition of privacy-sensitive words depends predominantly on local information. Leveraging this observation, we propose SILENCE (SImpLe ENCodEr designed for efficient privacy-preserving SLU offloading), an efficient encoder system for speech intent understanding. We implemented SILENCE on an STM32H7 microcontroller unit and evaluated it under various attack scenarios. The results show that SILENCE achieves speech intent classification accuracy and privacy protection comparable to those of more complex privacy-preserving encoders, while delivering up to a 53.3-fold speedup and a 134.1-fold reduction in memory footprint. This is the first time that privacy-preserving SLU services have been realized on a microcontroller unit with only 1 MB of memory.
Authors
蔡栋琪
王尚广
张泽凌
马骁
徐梦炜
CAI Dong-qi; WANG Shang-guang; ZHANG Ze-ling; MA Xiao; XU Meng-wei (Department of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China; State Key Laboratory of Networking and Switching Technology, Beijing 100876, China)
Source
《电子学报》
Peking University Core Journal (北大核心)
2025, No. 8, pp. 2601-2613 (13 pages)
Acta Electronica Sinica
Funding
National Natural Science Foundation of China (No. 62032003, No. U21B2016, No. 62425203)
Young Elite Scientists Sponsorship Program by CAST (No. 2023QNRC001)
Keywords
spoken language understanding (SLU)
resource-constrained devices
privacy preservation
microcontroller unit
speech intent classification
memory efficiency