Funding: the National Natural Science Foundation of China (62071330); the National Science Fund for Distinguished Young Scholars (61425017); the Key Program of the National Natural Science Foundation (61831022); the Key Program of the Natural Science Foundation of Tianjin (18JCZDJC36300); the Open Projects Program of the National Laboratory of Pattern Recognition and the Senior Visiting Scholar Program of Tianjin Normal University; the Innovative Medicines Initiative 2 Joint Undertaking (115902), which receives support from the European Union's Horizon 2020 research and innovation program and EFPIA.
Abstract: Background: The automatic detection of emotional states from human speech is a crucial element of human-machine interaction and has long been regarded as a challenging task for machine learning models. One vital challenge in speech emotion recognition (SER) is learning robust and discriminative representations from speech. Although machine learning methods have been widely applied in SER research, the limited amount of annotated data has become a bottleneck that impedes wider application of such techniques (e.g., deep neural networks). To address this issue, we present a deep learning method that combines knowledge transfer and self-attention for SER tasks. We use the log-Mel spectrogram with deltas and delta-deltas as inputs. Moreover, given that emotions are time dependent, we apply temporal convolutional neural networks to model the variations in emotions. We further introduce an attention transfer mechanism, which is based on a self-attention algorithm, to learn long-term dependencies. The self-attention transfer network (SATN) in our proposed approach uses attention transfer to learn attention from speech recognition and then transfers this knowledge to SER. An evaluation on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset demonstrates the effectiveness of the proposed model.
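The input representation named in this abstract (a log-Mel spectrogram stacked with its deltas and delta-deltas) can be illustrated with a minimal sketch. The abstract does not say how the deltas are computed, so the code below assumes the common regression-based delta formula with replicated edge frames; the function names `delta` and `stack_with_deltas` and the window `width` are illustrative choices, not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]  # shape: frames x mel bins


def delta(features: Matrix, width: int = 2) -> Matrix:
    """Regression-based time derivative (assumed formula, not from the paper).

    d[t] = sum_{n=1..width} n * (c[t+n] - c[t-n]) / (2 * sum_{n=1..width} n^2),
    with frames beyond the edges replaced by the first or last frame.
    """
    n_frames = len(features)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out: Matrix = []
    for t in range(n_frames):
        row = []
        for b in range(len(features[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = features[min(t + n, n_frames - 1)][b]
                behind = features[max(t - n, 0)][b]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_with_deltas(log_mel: Matrix, width: int = 2) -> List[Matrix]:
    """Return three channels: static features, deltas, and delta-deltas."""
    d1 = delta(log_mel, width)
    d2 = delta(d1, width)  # delta-delta is the delta of the delta
    return [log_mel, d1, d2]


# Toy input: one mel bin whose value rises by 1.0 per frame.
ramp = [[float(t)] for t in range(6)]
channels = stack_with_deltas(ramp)
```

Away from the edges, the delta channel of this toy ramp equals its slope, which is a quick sanity check on the formula. The three channels would then be fed to the network much like the color channels of an image.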
Abstract: Facial micro-expressions are short and imperceptible expressions that involuntarily reveal the true emotions that a person may be attempting to suppress, hide, disguise, or conceal. Such expressions can reflect a person's real emotions and have a wide range of applications in public safety and clinical diagnosis. The analysis of facial micro-expressions in video sequences through computer vision is still relatively recent. This paper presents a comprehensive review of the databases and methods used for micro-expression spotting and recognition, and summarizes advanced technologies in this area. In addition, we discuss challenges that remain unresolved and future work to be completed in the field of micro-expression analysis.
Funding: the National Key Research & Development Plan of China (2017YFB1002804); the National Natural Science Foundation of China (NSFC) (61831022, 61771472, 61773379, 61901473).
Abstract: Background: Continuous emotion recognition assigns emotional values to every frame in a sequence as a function of time. Incorporating long-term temporal context information is essential for continuous emotion recognition tasks. Methods: For this purpose, we use a window of feature frames in place of a single frame as input to strengthen temporal modeling at the feature level. Frame skipping and temporal pooling are used to alleviate the resulting redundancy. At the model level, we leverage the skip recurrent neural network to model long-term temporal variability by skipping trivial information. Results: Experimental results on the AVEC 2017 database demonstrate that the proposed methods improve performance. Further, the skip long short-term memory (LSTM) model can focus on the critical emotional states during training, thereby achieving better performance than the LSTM model and other methods.
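The feature-level ideas in this abstract (a window of frames as input, with frame skipping and temporal pooling to reduce redundancy) can be sketched as follows. This is one plausible reading under stated assumptions, not the authors' implementation: the helper names `context_window` and `mean_pool`, the edge clamping, and the use of mean pooling are all illustrative choices.

```python
from typing import List

Frame = List[float]


def context_window(frames: List[Frame], center: int,
                   half_width: int, skip: int = 1) -> Frame:
    """Concatenate 2*half_width+1 frames around `center`, taking every
    `skip`-th frame (frame skipping). Indices past the ends are clamped."""
    last = len(frames) - 1
    window: Frame = []
    for offset in range(-half_width, half_width + 1):
        idx = min(max(center + offset * skip, 0), last)
        window.extend(frames[idx])
    return window


def mean_pool(frames: List[Frame], pool_size: int) -> List[Frame]:
    """Temporal pooling: average non-overlapping groups of `pool_size`
    frames. A shorter final group is averaged over its own length."""
    pooled: List[Frame] = []
    for start in range(0, len(frames), pool_size):
        group = frames[start:start + pool_size]
        dim = len(group[0])
        pooled.append([sum(f[d] for f in group) / len(group)
                       for d in range(dim)])
    return pooled


# Toy sequence: ten 1-dimensional frames holding their own index.
frames = [[float(i)] for i in range(10)]
skipped = context_window(frames, center=4, half_width=2, skip=2)
pooled = mean_pool(frames, pool_size=4)
```

With `skip=2` the window around frame 4 covers frames 0, 2, 4, 6, and 8, so it spans the same temporal context as a nine-frame window at roughly half the input size.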
Abstract: Emotion recognition aims to quantify, describe, and recognize different emotional states through the behavioral and physiological responses generated by emotional expressions. It is an important field because of its wide application in tasks such as dialogue generation, social media analysis, and intelligent systems. It builds a harmonious human-computer environment by enabling computer systems and devices to recognize and interpret human affect. Emotion recognition models are built using multimodal information such as audio, video, and text. It is important to consider the emotional characteristics of humans in the design and presentation of intelligent interaction. We have selected seven papers that provide the latest updates on the development of emotion recognition technology, covering micro-expression spotting and recognition, speech emotion recognition, physiological signal emotion recognition, and emotional dialog generation, among other topics.
Abstract: Information networks store rich information in their nodes and edges, which benefits many downstream tasks, such as recommender systems and knowledge graph completion. Information networks comprise homogeneous information networks, heterogeneous information networks, and knowledge graphs. A significant number of surveys focus on one of these three types and summarize the related research, but few surveys summarize and compare all three. In addition, in real scenarios, many information networks lack sufficient labeled data, so combining meta-learning with information networks can enable wider application. This paper concentrates on few-shot information networks and systematically presents recent work to help readers analyze and follow related research.
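The few-shot setting mentioned in this abstract is commonly organized into N-way K-shot episodes. The sketch below shows generic episode sampling over labeled nodes; it is not a method from the surveyed papers, and the function `sample_episode`, its parameters, and the toy labels are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Labeled = List[Tuple[int, str]]  # (node id, class label) pairs


def sample_episode(nodes_by_class: Dict[str, List[int]], n_way: int,
                   k_shot: int, n_query: int,
                   rng: random.Random) -> Tuple[Labeled, Labeled]:
    """Draw one N-way K-shot episode: a support set with k_shot nodes per
    class and a disjoint query set with n_query nodes per class."""
    need = k_shot + n_query
    eligible = sorted(c for c, nodes in nodes_by_class.items()
                      if len(nodes) >= need)
    if len(eligible) < n_way:
        raise ValueError("not enough classes with enough labeled nodes")
    support: Labeled = []
    query: Labeled = []
    for label in rng.sample(eligible, n_way):
        picked = rng.sample(nodes_by_class[label], need)  # no repeats
        support += [(node, label) for node in picked[:k_shot]]
        query += [(node, label) for node in picked[k_shot:]]
    return support, query


# Toy labeled nodes: four classes with five node ids each.
toy = {c: list(range(i * 5, i * 5 + 5)) for i, c in enumerate("abcd")}
support, query = sample_episode(toy, n_way=3, k_shot=2, n_query=1,
                                rng=random.Random(0))
```

A meta-learner would be trained over many such episodes so that it adapts to new classes from only the few support nodes.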