Abstract: Background The use of micro-expression recognition to recognize human emotions is one of the most critical challenges in human-computer interaction applications. In recent years, cross-database micro-expression recognition (CDMER) has emerged as a significant challenge in micro-expression recognition and analysis. Because the training and testing data in CDMER come from different micro-expression databases, CDMER is more challenging than conventional micro-expression recognition. Methods In this paper, an adaptive spatio-temporal attention neural network (ASTANN) using an attention mechanism is presented to address this challenge. To this end, the micro-expression databases SMIC and CASME II are first preprocessed using an optical flow approach, which extracts motion information among video frames that represents discriminative features of micro-expressions. After preprocessing, a novel adaptive framework with a spatio-temporal attention module is designed to assign spatial and temporal weights that enhance the most discriminative features. The deep neural network then extracts cross-domain features, in which the second-order statistics of the sample features in the source domain are aligned with those in the target domain by minimizing the correlation alignment (CORAL) loss, such that the source and target databases share similar distributions. Results To evaluate the performance of ASTANN, experiments were conducted on the SMIC and CASME II databases under the standard experimental evaluation protocol of CDMER. The experimental results demonstrate that ASTANN outperformed other methods on the relevant cross-database tasks. Conclusions Extensive experiments were conducted on benchmark tasks, and the results show that ASTANN achieves superior performance compared with other approaches, demonstrating the effectiveness of the method in solving the CDMER problem.
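The abstract states that domain alignment is achieved by minimizing the CORAL loss between the second-order statistics of source and target features. Below is a minimal PyTorch sketch of the standard CORAL formulation (the squared Frobenius distance between batch covariance matrices, scaled by 1/(4d^2)); it illustrates the loss only and is not the authors' implementation — the batch handling and how this term is weighted against the classification loss are assumptions.

```python
import torch

def coral_loss(source: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """CORAL loss: squared Frobenius distance between the feature
    covariance matrices of a source batch and a target batch,
    scaled by 1 / (4 d^2). Shapes: (n_s, d) and (n_t, d)."""
    d = source.size(1)

    # Center each batch and form its (unbiased) covariance matrix.
    src_centered = source - source.mean(dim=0, keepdim=True)
    tgt_centered = target - target.mean(dim=0, keepdim=True)
    cov_src = src_centered.t() @ src_centered / (source.size(0) - 1)
    cov_tgt = tgt_centered.t() @ tgt_centered / (target.size(0) - 1)

    return ((cov_src - cov_tgt) ** 2).sum() / (4 * d * d)
```

In a typical setup this term is added to the supervised loss on the source domain with a trade-off weight, so the backbone learns features whose covariance structure matches across the two databases.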
Funding: The authors extend their appreciation to the Deputyship for Research and Innovation, Ministry of Education in Saudi Arabia, for funding this research work through project number (DRI−KSU−415).
Abstract: The accurate segmentation of retinal vessels is a challenging task due to the presence of various pathologies as well as the low contrast of thin vessels and non-uniform illumination. In recent years, encoder-decoder networks have achieved outstanding performance in retinal vessel segmentation at the cost of high computational complexity. To address the aforementioned challenges and to reduce the computational complexity, we propose a lightweight convolutional neural network (CNN)-based encoder-decoder deep learning model for accurate retinal vessel segmentation. The proposed deep learning model consists of an encoder-decoder architecture along with bottleneck layers that consist of depth-wise squeezing, followed by full convolution, and finally depth-wise stretching. The inspiration for the proposed model is taken from the recently developed Anam-Net model, which was tested on CT images for COVID-19 identification. For our lightweight model, we used a stack of two 3 × 3 convolution layers (without spatial pooling in between) instead of the single 3 × 3 convolution layer used in Anam-Net, to increase the receptive field and to reduce the trainable parameters. The proposed method includes fewer filters in all convolutional layers than the original Anam-Net and does not increase the number of filters as the resolution decreases. These modifications do not compromise segmentation accuracy, but they do make the architecture significantly lighter in terms of the number of trainable parameters and computation time. The proposed architecture has comparatively fewer parameters (1.01M) than Anam-Net (4.47M), U-Net (31.05M), SegNet (29.50M), and most other recent works. The proposed model does not require any problem-specific pre- or post-processing, nor does it rely on handcrafted features. In addition, being efficient in terms of segmentation accuracy as well as lightweight makes the proposed method a suitable candidate for screening platforms at the point of care. We evaluated our proposed model on the open-access datasets DRIVE, STARE, and CHASE_DB. The experimental results show that the proposed model outperforms several state-of-the-art methods, such as U-Net and its variants, the fully convolutional network (FCN), SegNet, CCNet, ResWNet, the residual connection-based encoder-decoder network (RCED-Net), and the scale-space approximation network (SSANet), in terms of {Dice coefficient, sensitivity (SN), accuracy (ACC), and area under the ROC curve (AUC)}, with scores of {0.8184, 0.8561, 0.9669, and 0.9868} on the DRIVE dataset, {0.8233, 0.8581, 0.9726, and 0.9901} on the STARE dataset, and {0.8138, 0.8604, 0.9752, and 0.9906} on the CHASE_DB dataset. Additionally, we performed cross-training experiments on the DRIVE and STARE datasets. The results of this experiment indicate the generalization ability and robustness of the proposed model.
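The abstract describes two building blocks: a stack of two 3 × 3 convolutions replacing Anam-Net's single 3 × 3 convolution, and an Anam-Net-style bottleneck of depth-wise squeezing, full convolution, and depth-wise stretching. The PyTorch sketch below illustrates both ideas under stated assumptions; the channel widths, squeeze ratio, normalization/activation choices, and the residual connection are illustrative and not the authors' exact design.

```python
import torch.nn as nn

class DoubleConv(nn.Module):
    """Two stacked 3x3 convolutions with no pooling in between,
    standing in for Anam-Net's single 3x3 convolution to enlarge the
    receptive field while keeping the filter counts small.
    BatchNorm/ReLU placement is an assumption."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class Bottleneck(nn.Module):
    """Anam-Net-style bottleneck: depth-wise squeeze (1x1), full 3x3
    convolution, then depth-wise stretch (1x1) back to the input width.
    The squeeze ratio of 4 and the residual add are assumptions."""
    def __init__(self, channels: int, squeeze: int = 4):
        super().__init__()
        mid = max(channels // squeeze, 1)
        self.block = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1),             # squeeze
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, kernel_size=3, padding=1),       # full conv
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1),             # stretch
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return x + self.block(x)
```

Composing several such blocks in an encoder-decoder layout, with a fixed (non-doubling) filter count across resolutions, is consistent with the parameter budget of roughly 1M reported in the abstract.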