Owing to the constraints of depth-sensing technology, images acquired by depth cameras are inevitably contaminated by various kinds of noise. For depth maps represented as gray values, this research proposes a novel denoising model, termed DGLR-GBT, that combines the graph-based transform (GBT) with dual graph Laplacian regularization (DGLR). The model specifically aims to remove Gaussian white noise by exploiting the nonlocal self-similarity (NSS) and the piecewise smoothness intrinsic to depth maps. GBT and DGLR are combined within the group sparse coding (GSC) framework. First, within each group, the graph is constructed from estimates of the true values of the averaged blocks rather than from the observations. Second, graph Laplacian regularization terms are constructed along the rows and the columns of each group of similar blocks, respectively. Finally, the solution is obtained effectively by combining the alternating direction method of multipliers (ADMM) with a weighted thresholding method in the GBT domain.
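As a rough illustration of graph Laplacian regularization, the following minimal sketch denoises a single row of depth values by minimizing ||x - y||^2 + lam * x^T L x on a path graph. It is not the DGLR-GBT model itself: the 1-D setting, the Gaussian edge weights, the Gauss-Seidel solver, and the names `edge_weights`, `glr_denoise`, `sigma`, and `lam` are all assumptions made for this example. In line with the abstract, the graph is built from a guide signal standing in for an estimate of the true values, not from the noisy observations.

```python
import math

def edge_weights(guide, sigma=10.0):
    """Gaussian weights between neighbouring samples of a 1-D guide signal (assumed form)."""
    return [math.exp(-((guide[i] - guide[i + 1]) ** 2) / (2 * sigma ** 2))
            for i in range(len(guide) - 1)]

def laplacian_quadratic(x, w):
    """x^T L x for a path graph, i.e. the sum of w_i * (x_i - x_{i+1})^2."""
    return sum(wi * (x[i] - x[i + 1]) ** 2 for i, wi in enumerate(w))

def glr_denoise(y, w, lam=2.0, iters=200):
    """Minimise ||x - y||^2 + lam * x^T L x via Gauss-Seidel sweeps on (I + lam*L) x = y."""
    x = list(y)
    n = len(y)
    for _ in range(iters):
        for i in range(n):
            num, deg = y[i], 0.0
            if i > 0:            # left neighbour
                num += lam * w[i - 1] * x[i - 1]
                deg += w[i - 1]
            if i < n - 1:        # right neighbour
                num += lam * w[i] * x[i + 1]
                deg += w[i]
            x[i] = num / (1.0 + lam * deg)
    return x

# Toy depth row: two flat surfaces separated by a sharp depth edge, plus noise.
noisy = [52, 47, 51, 49, 53, 198, 203, 199, 202, 197]
# Hypothetical stand-in for an estimate of the true values, used only to build the graph.
guide = [50] * 5 + [200] * 5
w = edge_weights(guide)
denoised = glr_denoise(noisy, w)
print([round(v, 1) for v in denoised])
```

Because the weight across the depth edge is almost zero, smoothing acts within each flat region while the discontinuity between them is preserved, which is the piecewise-smoothness behaviour the regularizer is meant to encourage.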
Emotion Recognition in Conversations (ERC) is fundamental to creating emotionally intelligent machines. Graph-Based Network (GBN) models have gained popularity for detecting conversational context in ERC tasks. However, their limited ability to collect and acquire contextual information hinders their effectiveness. To address this, we propose a Text Augmentation-based computational model for recognizing emotions using transformers (TA-MERT). The proposed model uses the Multimodal Emotion Lines Dataset (MELD), which ensures a balanced representation for recognizing human emotions. The model uses text augmentation techniques to produce more training data, improving its accuracy. The deep neural network (DNN) model is trained with transformer encoders, in particular Bidirectional Encoder (BE) representations, which capture both forward and backward contextual information. This integration improves the accuracy and robustness of the proposed model. Furthermore, we present a method for balancing the training dataset by creating augmented samples from the original dataset. Balancing the dataset across all emotion categories lessens the adverse effects of data imbalance on the model's accuracy. Experimental results on MELD show that TA-MERT outperforms earlier methods, achieving a weighted F1 score of 62.60% and an accuracy of 64.36%. Overall, the proposed TA-MERT model addresses the weaknesses of GBN models in obtaining contextual information for ERC. TA-MERT recognizes human emotions more accurately by employing text augmentation and transformer-based encoding. The balanced dataset and the additional training samples also enhance its resilience. These findings highlight the significance of transformer-based approaches for emotion recognition in conversations.
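The idea of balancing a training set with augmented samples can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the augmentation techniques used in TA-MERT, so `augment` here is a hypothetical stand-in (swapping two adjacent words), and `balance`, the toy utterances, and the seed are likewise invented for the example.

```python
import random
from collections import Counter, defaultdict

def augment(text, rng):
    """Hypothetical augmentation: swap two adjacent words (stand-in for a real method)."""
    words = text.split()
    if len(words) < 2:
        return text
    i = rng.randrange(len(words) - 1)
    words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)

def balance(samples, seed=0):
    """Add augmented copies of minority-class texts until every class matches the largest one."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append(text)
    target = max(len(texts) for texts in by_label.values())
    balanced = list(samples)  # original samples are kept unchanged
    for label, texts in by_label.items():
        for _ in range(target - len(texts)):
            balanced.append((augment(rng.choice(texts), rng), label))
    return balanced

# Toy utterances with emotion labels; not taken from MELD.
samples = [
    ("I am so happy today", "joy"),
    ("that is great news", "joy"),
    ("what a lovely surprise", "joy"),
    ("I cannot believe you did that", "anger"),
    ("this is really sad", "sadness"),
]
balanced = balance(samples)
counts = Counter(label for _, label in balanced)
print(counts)
```

After balancing, each emotion category contributes the same number of training samples, so a classifier trained on the result is less likely to be dominated by the majority class.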
Funding: National Natural Science Foundation of China (No. 62372100).