Sonic Hedgehog Medulloblastoma(SHH-MB)is one of the four primary molecular subgroups of Medulloblastoma.It is estimated to be responsible for nearly one-third of allMB cases.Using transcriptomic and DNA methylation pr...Sonic Hedgehog Medulloblastoma(SHH-MB)is one of the four primary molecular subgroups of Medulloblastoma.It is estimated to be responsible for nearly one-third of allMB cases.Using transcriptomic and DNA methylation profiling techniques,new developments in this field determined four molecular subtypes for SHH-MB.SHH-MB subtypes show distinct DNAmethylation patterns that allow their discrimination fromoverlapping subtypes and predict clinical outcomes.Class overlapping occurs when two or more classes share common features,making it difficult to distinguish them as separate.Using the DNA methylation dataset,a novel classification technique is presented to address the issue of overlapping SHH-MBsubtypes.Penalizedmultinomial regression(PMR),Tomek links(TL),and singular value decomposition(SVD)were all smoothly integrated into a single framework.SVD and group lasso improve computational efficiency,address the problem of high-dimensional datasets,and clarify class distinctions by removing redundant or irrelevant features that might lead to class overlap.As a method to eliminate the issues of decision boundary overlap and class imbalance in the classification task,TL enhances dataset balance and increases the clarity of decision boundaries through the elimination of overlapping samples.Using fivefold cross-validation,our proposed method(TL-SVDPMR)achieved a remarkable overall accuracy of almost 95%in the classification of SHH-MB molecular subtypes.The results demonstrate the strong performance of the proposed classification model among the various SHH-MB subtypes given a high average of the area under the curve(AUC)values.Additionally,the statistical significance test indicates that TL-SVDPMR is more accurate than both SVM and random forest algorithms in classifying the overlapping SHH-MB subtypes,highlighting its importance for precision medicine applications.Our findings emphasized the success of combining SVD,TL,and PMRtechniques to improve the classification performance for biomedical applications with many features and overlapping subtypes.展开更多
基金funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No.(DGSSR-2024-02-01137).
文摘Sonic Hedgehog Medulloblastoma(SHH-MB)is one of the four primary molecular subgroups of Medulloblastoma.It is estimated to be responsible for nearly one-third of allMB cases.Using transcriptomic and DNA methylation profiling techniques,new developments in this field determined four molecular subtypes for SHH-MB.SHH-MB subtypes show distinct DNAmethylation patterns that allow their discrimination fromoverlapping subtypes and predict clinical outcomes.Class overlapping occurs when two or more classes share common features,making it difficult to distinguish them as separate.Using the DNA methylation dataset,a novel classification technique is presented to address the issue of overlapping SHH-MBsubtypes.Penalizedmultinomial regression(PMR),Tomek links(TL),and singular value decomposition(SVD)were all smoothly integrated into a single framework.SVD and group lasso improve computational efficiency,address the problem of high-dimensional datasets,and clarify class distinctions by removing redundant or irrelevant features that might lead to class overlap.As a method to eliminate the issues of decision boundary overlap and class imbalance in the classification task,TL enhances dataset balance and increases the clarity of decision boundaries through the elimination of overlapping samples.Using fivefold cross-validation,our proposed method(TL-SVDPMR)achieved a remarkable overall accuracy of almost 95%in the classification of SHH-MB molecular subtypes.The results demonstrate the strong performance of the proposed classification model among the various SHH-MB subtypes given a high average of the area under the curve(AUC)values.Additionally,the statistical significance test indicates that TL-SVDPMR is more accurate than both SVM and random forest algorithms in classifying the overlapping SHH-MB subtypes,highlighting its importance for precision medicine applications.Our findings emphasized the success of combining SVD,TL,and PMRtechniques to improve the classification performance for biomedical applications with many features and overlapping subtypes.