Cardiovascular diseases are a prominent cause of mortality,emphasizing the need for early prevention and diagnosis.Utilizing artificial intelligence(AI)models,heart sound analysis emerges as a noninvasive and universa...Cardiovascular diseases are a prominent cause of mortality,emphasizing the need for early prevention and diagnosis.Utilizing artificial intelligence(AI)models,heart sound analysis emerges as a noninvasive and universally applicable approach for assessing cardiovascular health conditions.However,real-world medical data are dispersed across medical institutions,forming“data islands”due to data sharing limitations for security reasons.To this end,federated learning(FL)has been extensively employed in the medical field,which can effectively model across multiple institutions.Additionally,conventional supervised classification methods require fully labeled data classes,e.g.,binary classification requires labeling of positive and negative samples.Nevertheless,the process of labeling healthcare data is timeconsuming and labor-intensive,leading to the possibility of mislabeling negative samples.In this study,we validate an FL framework with a naive positive-unlabeled(PU)learning strategy.Semisupervised FL model can directly learn from a limited set of positive samples and an extensive pool of unlabeled samples.Our emphasis is on vertical-FL to enhance collaboration across institutions with different medical record feature spaces.Additionally,our contribution extends to feature importance analysis,where we explore 6 methods and provide practical recommendations for detecting abnormal heart sounds.The study demonstrated an impressive accuracy of 84%,comparable to outcomes in supervised learning,thereby advancing the application of FL in abnormal heart sound detection.展开更多
Clustering is widely exploited in data mining.It has been proved that embedding weak label prior into clustering is effective to promote its performance.Previous researches mainly focus on only one type of prior.Howev...Clustering is widely exploited in data mining.It has been proved that embedding weak label prior into clustering is effective to promote its performance.Previous researches mainly focus on only one type of prior.However,in many real scenarios,two kinds of weak label prior information,e.g.,pairwise constraints and cluster ratio,are easily obtained or already available.How to incorporate them to improve clustering performance is important but rarely studied.We propose a novel constrained Clustering with Weak Label Prior method(CWLP),which is an integrated framework.Within the unified spectral clustering model,the pairwise constraints are employed as a regularizer in spectral embedding and label proportion is added as a constraint in spectral rotation.To approximate a variant of the embedding matrix more precisely,we replace a cluster indicator matrix with its scaled version.Instead of fixing an initial similarity matrix,we propose a new similarity matrix that is more suitable for deriving clustering results.Except for the theoretical convergence and computational complexity analyses,we validate the effectiveness of CWLP through several benchmark datasets,together with its ability to discriminate suspected breast cancer patients from healthy controls.The experimental evaluation illustrates the superiority of our proposed approach.展开更多
基金partially supported by the National Natural Science Foundation of China(grant number 62272044)the Ministry of Science and Technology of the People’s Republic of China with the STI2030-Major Projects(grant number 2021ZD0201900)+5 种基金the Teli Young Fellow Program from the Beijing Institute of Technology,Chinathe Grants-in-Aid for Scientific Research(grant number 20H00569)from the Ministry of Education,Culture,Sports,Science and Technology(MEXT),Japanthe JSPS KAKENHI(grant number 20H00569),Japanthe JST Mirai Program(grant number 21473074),Japanthe JST MOONSHOT Program(grant number JPMJMS229B),Japanthe BIT Research and Innovation Promoting Project(grant number 2023YCXZ014).
文摘Cardiovascular diseases are a prominent cause of mortality,emphasizing the need for early prevention and diagnosis.Utilizing artificial intelligence(AI)models,heart sound analysis emerges as a noninvasive and universally applicable approach for assessing cardiovascular health conditions.However,real-world medical data are dispersed across medical institutions,forming“data islands”due to data sharing limitations for security reasons.To this end,federated learning(FL)has been extensively employed in the medical field,which can effectively model across multiple institutions.Additionally,conventional supervised classification methods require fully labeled data classes,e.g.,binary classification requires labeling of positive and negative samples.Nevertheless,the process of labeling healthcare data is timeconsuming and labor-intensive,leading to the possibility of mislabeling negative samples.In this study,we validate an FL framework with a naive positive-unlabeled(PU)learning strategy.Semisupervised FL model can directly learn from a limited set of positive samples and an extensive pool of unlabeled samples.Our emphasis is on vertical-FL to enhance collaboration across institutions with different medical record feature spaces.Additionally,our contribution extends to feature importance analysis,where we explore 6 methods and provide practical recommendations for detecting abnormal heart sounds.The study demonstrated an impressive accuracy of 84%,comparable to outcomes in supervised learning,thereby advancing the application of FL in abnormal heart sound detection.
基金supported by the National Key R&D Program(No.2022ZD0114803)the National Natural Science Foundation of China(Grant Nos.62136005,61922087).
文摘Clustering is widely exploited in data mining.It has been proved that embedding weak label prior into clustering is effective to promote its performance.Previous researches mainly focus on only one type of prior.However,in many real scenarios,two kinds of weak label prior information,e.g.,pairwise constraints and cluster ratio,are easily obtained or already available.How to incorporate them to improve clustering performance is important but rarely studied.We propose a novel constrained Clustering with Weak Label Prior method(CWLP),which is an integrated framework.Within the unified spectral clustering model,the pairwise constraints are employed as a regularizer in spectral embedding and label proportion is added as a constraint in spectral rotation.To approximate a variant of the embedding matrix more precisely,we replace a cluster indicator matrix with its scaled version.Instead of fixing an initial similarity matrix,we propose a new similarity matrix that is more suitable for deriving clustering results.Except for the theoretical convergence and computational complexity analyses,we validate the effectiveness of CWLP through several benchmark datasets,together with its ability to discriminate suspected breast cancer patients from healthy controls.The experimental evaluation illustrates the superiority of our proposed approach.