Abstract — Objective: Existing facial expression recognition (FER) methods focus on improving overall recognition accuracy, while their robustness to head pose remains insufficiently studied. In real applications, head pose varies widely and degrades recognition performance, so analyzing the influence of head pose on expression recognition and improving model robustness in this respect is particularly important. To this end, building on an in-depth analysis of how head pose affects expression recognition, a semi-supervised FER method is proposed that improves head-pose robustness using unlabeled non-frontal face data. Method: First, the typical FER dataset AffectNet is re-partitioned by head pose to construct the AffectNet-Yaw dataset, which supports accuracy testing at different yaw angles and makes model comparisons fairer. Second, a dual-consistency semi-supervised learning method for facial expression recognition (DCSSL) is proposed. Its spatial consistency module imposes a spatial constraint on the class-activation consistency between a face image and its horizontally flipped counterpart, so that training focuses on the key facial regions for expression; its semantic consistency module uses asymmetric data augmentation and self-training to continually select high-quality non-frontal samples for model optimization. Without any manual annotation of non-frontal expression data, the method learns directly from labeled frontal data and unlabeled non-frontal data. Finally, the cross-entropy loss, the spatial consistency loss, and the semantic consistency loss are jointly optimized to balance supervised and semi-supervised learning. Results: Experiments show that head pose has a significant influence on in-the-wild expression recognition; the proposed AffectNet-Yaw has a more balanced head-pose distribution and effectively supports a comprehensive evaluation of this influence; by combining spatial and semantic consistency constraints to fully exploit unlabeled non-frontal data, DCSSL significantly improves robustness under head-pose variation, raising average expression recognition accuracy by 5.40% and 17.01% over the fully supervised MA-NET (multi-scale and local attention network) and EfficientFace methods, respectively. Conclusion: The proposed dual-consistency semi-supervised method makes full use of frontal and non-frontal data and significantly improves expression recognition accuracy under head-pose variation; the new dataset effectively supports a comprehensive evaluation of the influence of head pose on expression recognition.
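The joint objective described above can be sketched as follows. This is an illustrative reconstruction, not the authors' released code: `lambda_sp`, `lambda_se`, and the confidence threshold `tau` are assumed hyperparameters, and the semantic consistency module is approximated here by FixMatch-style pseudo-labeling on weakly/strongly augmented views of unlabeled non-frontal faces.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def spatial_consistency(cam, cam_flip):
    # Spatial constraint: the class-activation map of the flipped image,
    # flipped back along the width axis, should match the original map.
    return np.mean((cam - cam_flip[..., ::-1]) ** 2)

def semantic_consistency(weak_logits, strong_logits, tau=0.95):
    # Self-training on unlabeled non-frontal faces: only confident
    # predictions on the weakly augmented view provide pseudo-labels
    # for the strongly augmented view (asymmetric augmentation).
    p_weak = softmax(weak_logits)
    mask = p_weak.max(axis=-1) >= tau
    if not mask.any():
        return 0.0
    pseudo = p_weak.argmax(axis=-1)
    return cross_entropy(strong_logits[mask], pseudo[mask])

def dcssl_loss(logits, labels, cam, cam_flip, weak_logits, strong_logits,
               lambda_sp=0.1, lambda_se=1.0):
    # Joint objective: supervised cross entropy on labeled frontal data
    # plus the two consistency terms, balancing the two learning regimes.
    return (cross_entropy(logits, labels)
            + lambda_sp * spatial_consistency(cam, cam_flip)
            + lambda_se * semantic_consistency(weak_logits, strong_logits))
```

A perfectly flip-equivariant model yields a zero spatial term, and low-confidence unlabeled samples contribute nothing to the semantic term, so only reliably pseudo-labeled non-frontal faces influence training.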
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62477023).
Abstract — Facial expression datasets commonly exhibit imbalances between categories or between difficult and simple samples. This imbalance biases feature extraction in facial expression recognition (FER) models, which hinders the algorithm's comprehension of emotional states and reduces overall recognition accuracy. A novel FER model is introduced to address these issues. It integrates rebalancing mechanisms to regulate attention consistency and focus, offering enhanced efficacy. Our approach proposes the following improvements: (i) rebalancing weights are used to enhance the consistency between the attention heatmaps of an original face sample and its horizontally flipped counterpart; (ii) coefficient factors are incorporated into the standard cross-entropy loss function, and rebalancing weights fine-tune the loss adjustment. Experimental results indicate that the model outperforms the current leading algorithm, MEK, achieving 0.69% and 2.01% increases in overall and average recognition accuracy, respectively, on the RAF-DB dataset. It further improves accuracy by 0.49% and 1.01% on the AffectNet dataset and by 0.83% and 1.23% on the FERPlus dataset, respectively. These outcomes validate the superiority and stability of the proposed FER model.
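The rebalancing idea can be sketched as follows. This is a minimal illustration, assuming inverse-class-frequency rebalancing weights; the abstract does not specify the paper's exact coefficient factors, so the weighting scheme below is a stand-in.

```python
import numpy as np

def rebalance_weights(class_counts):
    # Rarer classes receive larger weights; normalized so the mean is 1.
    w = 1.0 / np.asarray(class_counts, dtype=float)
    return w * len(w) / w.sum()

def rebalanced_cross_entropy(logits, labels, weights):
    # Standard cross entropy, with each sample's loss scaled by the
    # rebalancing weight of its ground-truth class.
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    p = e / e.sum(axis=-1, keepdims=True)
    nll = -np.log(p[np.arange(len(labels)), labels] + 1e-12)
    return np.mean(weights[labels] * nll)

def weighted_flip_consistency(heat, heat_flip, labels, weights):
    # Rebalancing-weighted consistency between the attention heatmap of
    # a face and that of its horizontally flipped counterpart (flipped
    # back along the width axis before comparison).
    per_sample = np.mean((heat - heat_flip[..., ::-1]) ** 2, axis=(1, 2))
    return np.mean(weights[labels] * per_sample)
```

Under this scheme, minority-class samples contribute more to both the classification loss and the attention-consistency loss, counteracting the dataset bias described above.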