Funding: Supported by the National Key R&D Program of China (2023YFC2506804); the National Natural Science Foundation of China (62272044 and 62227807); the Beijing Natural Science Foundation (L243034); the Ministry of Science and Technology of the People's Republic of China with the STI2030-Major Projects (2021ZD0201900); the Japan Society for the Promotion of Science (S24116); and the Teli Young Fellow Program from the Beijing Institute of Technology, China.
Abstract: In the past decade, information technologies (e.g., artificial intelligence (AI), big data, and wearables) have deeply influenced the field of mental health. As a typical breakthrough idea, computational psychophysiology (CPP) has changed the paradigm of mental healthcare from the traditional “symptom-description-driven” approach to a “data-driven” one.
Funding: Supported by the Affective Computing & HCI Innovation Research Lab between Huawei Technologies and the University of Augsburg, and the EU H2020 Project under Grant No. 101135556 (INDUX-R).
Abstract: Neural network models for audio tasks, such as automatic speech recognition (ASR) and acoustic scene classification (ASC), are susceptible to noise contamination in real-life applications. To improve audio quality, an enhancement module, which can be developed independently, is explicitly used at the front end of the target audio applications. In this paper, we present an end-to-end learning solution to jointly optimise the models for audio enhancement (AE) and the subsequent applications. To guide the optimisation of the AE module towards a target application, and especially to overcome difficult samples, we make use of the sample-wise performance measure as an indication of sample importance. In experiments, we consider four representative applications to evaluate our training paradigm, i.e., ASR, speech command recognition (SCR), speech emotion recognition (SER), and ASC. These applications span speech and non-speech tasks, semantic and non-semantic features, and transient and global information. The experimental results indicate that our proposed approach can considerably boost the noise robustness of the models, especially at low signal-to-noise ratios, for a wide range of computer audition tasks in everyday-life noisy environments.
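The sample-importance idea above can be sketched in a few lines: per-sample task losses are mapped to normalized weights so that difficult samples contribute more to the joint enhancement-plus-task objective. This is a minimal illustration under assumed names (`importance_weights`, `joint_loss`, and the `alpha` balance knob), not the authors' implementation.

```python
import numpy as np

def importance_weights(task_losses, temperature=1.0):
    """Map per-sample task losses to importance weights via a softmax,
    so that harder samples (higher loss) receive larger weights."""
    scaled = np.asarray(task_losses, dtype=float) / temperature
    scaled -= scaled.max()          # subtract max for numerical stability
    w = np.exp(scaled)
    return w / w.sum()

def joint_loss(ae_losses, task_losses, alpha=0.5):
    """Combine the audio-enhancement (AE) loss with an importance-weighted
    task loss; alpha (an assumed knob) balances the two objectives."""
    w = importance_weights(task_losses)
    weighted_task = float(np.dot(w, task_losses))
    return alpha * float(np.mean(ae_losses)) + (1 - alpha) * weighted_task
```

With equal task losses the weights reduce to a uniform average; as one sample's loss grows, its weight (and hence its influence on the gradient of the joint objective) grows with it.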
Funding: Partially supported by the National Natural Science Foundation of China (grant number 62272044); the Ministry of Science and Technology of the People's Republic of China with the STI2030-Major Projects (grant number 2021ZD0201900); the Teli Young Fellow Program from the Beijing Institute of Technology, China; the Grants-in-Aid for Scientific Research (grant number 20H00569) from the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan; the JSPS KAKENHI (grant number 20H00569), Japan; the JST Mirai Program (grant number 21473074), Japan; the JST MOONSHOT Program (grant number JPMJMS229B), Japan; and the BIT Research and Innovation Promoting Project (grant number 2023YCXZ014).
Abstract: Cardiovascular diseases are a prominent cause of mortality, emphasizing the need for early prevention and diagnosis. Utilizing artificial intelligence (AI) models, heart sound analysis emerges as a noninvasive and universally applicable approach for assessing cardiovascular health conditions. However, real-world medical data are dispersed across medical institutions, forming “data islands” due to data-sharing limitations imposed for security reasons. To this end, federated learning (FL) has been extensively employed in the medical field, as it can effectively model across multiple institutions. Additionally, conventional supervised classification methods require fully labeled data classes; e.g., binary classification requires labeling of both positive and negative samples. Nevertheless, labeling healthcare data is time-consuming and labor-intensive, leading to the possibility of mislabeling negative samples. In this study, we validate an FL framework with a naive positive-unlabeled (PU) learning strategy. The semi-supervised FL model can directly learn from a limited set of positive samples and an extensive pool of unlabeled samples. Our emphasis is on vertical FL to enhance collaboration across institutions with different medical record feature spaces. Additionally, our contribution extends to feature importance analysis, where we explore six methods and provide practical recommendations for detecting abnormal heart sounds. The study demonstrated an accuracy of 84%, comparable to outcomes in supervised learning, thereby advancing the application of FL in abnormal heart sound detection.
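The naive PU baseline referred to above can be illustrated in its simplest form: treat every unlabeled sample as a provisional negative and fit an ordinary classifier. The sketch below uses a tiny logistic model trained by gradient descent; the function names (`naive_pu_fit`, `pu_predict`) and the hyperparameters are assumptions for illustration, not the paper's exact federated setup.

```python
import numpy as np

def naive_pu_fit(X_pos, X_unl, lr=0.1, epochs=500):
    """Naive positive-unlabeled (PU) learning: label positives 1 and all
    unlabeled samples 0 (even though some may be hidden positives), then
    fit a logistic-regression model with plain gradient descent."""
    X = np.vstack([X_pos, X_unl])
    y = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_unl))])
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))       # sigmoid probabilities
        w -= lr * Xb.T @ (p - y) / len(y)       # logistic-loss gradient step
    return w

def pu_predict(w, X):
    """Threshold the fitted model's probabilities at 0.5."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (1.0 / (1.0 + np.exp(-Xb @ w)) >= 0.5).astype(int)
```

Because hidden positives in the unlabeled pool are mislabeled as negatives, this baseline biases the decision boundary; the FL framework in the study builds on this naive strategy rather than replacing it.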
Funding: Partially supported by the Ministry of Science and Technology of the People's Republic of China with the STI2030-Major Projects (2021ZD0201900); the National Natural Science Foundation of China (Nos. 62227807 and 62272044); the Teli Young Fellow Program from the Beijing Institute of Technology, China; the Natural Science Foundation of Shenzhen University General Hospital (No. SUGH2018QD013), China; the Shenzhen Science and Technology Innovation Commission Project (No. JCYJ20190808120613189), China; and the Grants-in-Aid for Scientific Research (No. 20H00569) from the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan.
Abstract: Leveraging the power of artificial intelligence to facilitate automatic analysis and monitoring of heart sounds has attracted tremendous efforts in the past decade. Nevertheless, the lack of a standard open-access database made it difficult to maintain sustainable and comparable research before the first release of the PhysioNet CinC Challenge Dataset. However, inconsistent standards for data collection, annotation, and partitioning are still restraining fair and efficient comparison between different works. To this end, we introduced and benchmarked a first version of the Heart Sounds Shenzhen (HSS) corpus. Motivated and inspired by previous works based on HSS, we redefined the tasks and made a comprehensive investigation of shallow and deep models in this study. First, we segmented the heart sound recordings into shorter recordings (10 s), which makes them more similar to the human auscultation case. Second, we redefined the classification tasks. Besides the three class categories (normal, moderate, and mild/severe) adopted in HSS, we added a binary classification task, i.e., normal vs. abnormal. In this work, we provided detailed benchmarks based on both classic machine learning and state-of-the-art deep learning technologies, which are reproducible using open-source toolkits. Last but not least, we analyzed the feature contributions of the best performance achieved by the benchmark to make the results more convincing and interpretable.
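The 10 s segmentation step described above amounts to splitting each recording into non-overlapping fixed-length windows. A minimal sketch, assuming a placeholder sample rate and the convention that a trailing remainder shorter than one window is discarded (the paper's exact handling of remainders is not stated here):

```python
import numpy as np

def segment_recording(signal, sr=1000, win_s=10):
    """Split a heart-sound recording into non-overlapping fixed-length
    segments (10 s windows, mirroring the redefined HSS tasks).
    sr is an assumed placeholder sample rate in Hz; any trailing
    remainder shorter than one full window is dropped."""
    win = int(sr * win_s)               # samples per segment
    n = len(signal) // win              # number of complete segments
    return [signal[i * win:(i + 1) * win] for i in range(n)]
```

For example, a 25 s recording at 1 kHz yields two complete 10 s segments, with the final 5 s discarded.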