
Detection of mood states in patients with bipolar disorder based on deep learning speech analysis (cited by: 1)

Emotional time-based detection of patients with bipolar disorder based on deep learning speech analysis
Abstract
Objective: To use a speech-based deep learning approach to distinguish depressive and manic mood states in patients with bipolar disorder (BD).
Methods: Sixty-one patients with BD who attended the psychiatric outpatient department of Peking University Sixth Hospital between June 2018 and March 2022 were enrolled. Mood state was assessed with the Quick Inventory of Depressive Symptomatology, the Mood Disorder Questionnaire, and the Young Mania Rating Scale. Speech was recorded from all patients, yielding 190 samples each for remission, depressed mood, and manic mood. A speech analysis library in Python was used to extract 136 features from the recordings, including Mel-frequency cepstral coefficients and zero-crossing rate, and a LIGHT-SERNET-like network was trained to detect mood state. Accuracy was used to assess overall model performance; sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the receiver operating characteristic (ROC) curve were used to assess the predictions for each of the three mood states. Demographic characteristics were compared across mood states with the Kruskal-Wallis H test or the χ² test.
Results: Age (H=25.83, P<0.001), years of education (H=25.25, P<0.001), and marital status (χ²=23.81, P<0.001) differed significantly among patients in the three mood states; gender did not (χ²=4.63, P=0.099). The LIGHT-SERNET-like model detected the three mood states with an accuracy of 0.84. For remission, sensitivity was 0.88, specificity 0.93, PPV 0.87, and NPV 0.94. For depressed mood, sensitivity was 0.82, specificity 0.92, PPV 0.84, and NPV 0.92. For manic mood, sensitivity was 0.82, specificity 0.91, PPV 0.83, and NPV 0.91. The areas under the ROC curves for the three mood states were similar, and all were above 0.90.
Conclusion: The model built by deep learning analysis of speech with a LIGHT-SERNET-like network discriminates well between depressive and manic mood states in bipolar disorder.
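The per-state figures in the abstract (sensitivity, specificity, PPV, NPV) are one-vs-rest metrics for a three-class classifier. The following is a minimal sketch of how such values can be derived from a 3×3 confusion matrix. It is not the authors' code: the function names and the confusion-matrix counts are hypothetical and chosen only for illustration, and they do not reproduce the study's data.

```python
from typing import Dict, List


def safe_div(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def accuracy(cm: List[List[int]]) -> float:
    """Overall accuracy: correctly classified samples over all samples."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return safe_div(correct, total)


def one_vs_rest_metrics(
    cm: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """Per-class metrics, where cm[i][j] counts samples of true class i
    that were predicted as class j."""
    total = sum(sum(row) for row in cm)
    results = {}
    for k, name in enumerate(labels):
        tp = cm[k][k]                          # class k, predicted k
        fn = sum(cm[k]) - tp                   # class k, predicted other
        fp = sum(row[k] for row in cm) - tp    # other class, predicted k
        tn = total - tp - fn - fp              # other class, predicted other
        results[name] = {
            "sensitivity": safe_div(tp, tp + fn),
            "specificity": safe_div(tn, tn + fp),
            "ppv": safe_div(tp, tp + fp),
            "npv": safe_div(tn, tn + fn),
        }
    return results


# Hypothetical counts for illustration only (NOT the study's data).
labels = ["remission", "depressed", "manic"]
cm = [
    [8, 1, 1],  # true remission
    [1, 7, 2],  # true depressed
    [0, 2, 8],  # true manic
]

overall = accuracy(cm)
per_state = one_vs_rest_metrics(cm, labels)
for state in labels:
    print(state, {k: round(v, 2) for k, v in per_state[state].items()})
```

Treating each mood state in turn as the "positive" class and the other two as "negative" is what allows binary measures such as specificity and NPV to be reported for a three-class problem.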
Authors Li Zhiying (李志营); Ji Jun (纪俊); Zhou Shuzhe (周书喆); Li Jiaqi (李嘉琪); Li Xinhui (李欣慧); Feng Chaonan (冯超南); Guan Lili (管丽丽); Ma Zaohui (马灶晖); Ma Yantao (马燕桃) (Clinical Research Division, Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), Beijing 100191, China; College of Computer Science and Technology, Qingdao University, Qingdao 266071, China; Department of Psychology, Queen's University, Ontario, K7L 3N6, Canada; College of Electronic and Electrical Engineering, University of London, London N16AT, UK; Beijing Wanling Pangu Science and Technology Ltd., Beijing 100080, China; Zhongshan School of Medicine, Guangzhou 510080, China)
Source Chinese Journal of Psychiatry (《中华精神科杂志》), 2024, Issue 4, pp. 207-212 (6 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
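Funding Beijing Municipal Science and Technology Commission, Capital Development Special Fund (2018-2-4112); Beijing Municipal Science and Technology Commission, Major Project for Capital Clinical Characteristic Application Research and Achievement Promotion (Z171100001017086).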
Keywords Bipolar disorder; Voice; Mood states; Deep learning; LIGHT-SERNET-like network