Abstract: MFA-conformer methods are widely used in English and Chinese speaker recognition. Although speaker recognition is language-independent in theory, in practice its performance depends on the language, and Tibetan speaker recognition still relies on traditional models that perform poorly. To address this, we adopt MFA-conformer as the basic framework and propose three improvements: integrating 1D depth-wise separable convolution and channel attention into the conformer feed-forward network, fusing features from multiple blocks, and adding an intra-class correlation regularizer to the GE2E loss. Experiments show that the improved model reduces the equal error rate (EER) compared with the conformer baseline.
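For readers unfamiliar with the metric, the EER is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how it can be estimated from verification trial scores; it is not the authors' evaluation code, and the function name `equal_error_rate` and the example scores are hypothetical, chosen only for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping every observed score as a threshold.

    target_scores: similarity scores of same-speaker trials.
    nontarget_scores: similarity scores of different-speaker trials.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # False rejection: a same-speaker trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, for illustration only (not data from the paper).
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(eer, threshold)  # 0.25 0.6
```

A lower EER means the system separates same-speaker and different-speaker trials more cleanly, which is the sense in which the improved model is reported to outperform the baseline.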