Face recognition has emerged as one of the most prominent applications of image analysis and understanding, gaining considerable attention in recent years. This growing interest is driven by two key factors: its extensive applications in law enforcement and the commercial domain, and the rapid advancement of practical technologies. Despite significant advancements, modern recognition algorithms still struggle in real-world conditions such as varying lighting, occlusion, and diverse facial poses. In such scenarios, human perception remains well above the capabilities of present technology. Using a systematic mapping study, this paper presents an in-depth review of face detection and face recognition algorithms, offering a detailed survey of advancements made between 2015 and 2024. We analyze key methodologies, highlighting their strengths and limitations in the application context. Additionally, we examine the datasets used for face detection and recognition, focusing on task-specific applications, size, diversity, and complexity. By analyzing these algorithms and datasets, this survey serves as a valuable resource for researchers, identifying research gaps in the field of face detection and recognition and outlining potential directions for future research.
In recent work, adversarial stickers have been widely used to attack face recognition (FR) systems in the physical world. However, it is difficult to evaluate the performance of physical attacks because of the lack of volunteers in experiments. In this paper, a simple attack method called incomplete physical adversarial attack (IPAA) is proposed to simulate physical attacks. Unlike the process of a physical attack, when an IPAA is conducted, a photo of the adversarial sticker is embedded into a facial image as the input to attack FR systems, which yields results similar to those of physical attacks without recruiting any volunteers. The results show that IPAA has a higher similarity to physical attacks than digital attacks, indicating that IPAA is able to evaluate the performance of physical attacks. IPAA is also effective in quantitatively measuring the impact of the sticker location on attack results.
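The core simulation step above is compositing a photo of the printed sticker into a face image before querying the FR model. The sketch below shows one minimal way to do that with Pillow; the sticker location, size, and the similarity-scoring call are illustrative placeholders, not the paper's exact protocol.

```python
# A minimal sketch of the IPAA idea: paste a photo of an adversarial sticker into a
# face image, then feed the composited image to an FR model. Location/size defaults
# and the scoring function are hypothetical assumptions.
import numpy as np
from PIL import Image

def embed_sticker(face_img: Image.Image, sticker_img: Image.Image,
                  top_left=(60, 40), size=(80, 40)) -> Image.Image:
    """Paste a resized sticker photo onto a copy of the face image."""
    composited = face_img.copy()
    sticker = sticker_img.resize(size)
    composited.paste(sticker, top_left)
    return composited

# Usage (paths and the fr_model similarity call are hypothetical placeholders):
# face = Image.open("face.jpg")
# sticker = Image.open("sticker_photo.jpg")
# attacked = embed_sticker(face, sticker, top_left=(60, 40))
# score = fr_model.similarity(np.asarray(attacked), np.asarray(reference_face))
```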
Considering that the accuracy of traditional sparse representation models is not high under the influence of multiple complex environmental factors, this study focuses on improving feature extraction and model construction. First, convolutional neural network (CNN) features of the face are extracted by a trained deep learning network. Next, steady-state and dynamic classifiers for face recognition are constructed based on the CNN features and Haar features, respectively; two-stage sparse representation is introduced in constructing the steady-state classifier, and feature templates with high reliability are dynamically selected as alternative templates from the sparse representation template dictionary built from the CNN features. Finally, the face recognition result is obtained from the classification results of the steady-state and dynamic classifiers together. On this basis, the feature weights of the steady-state classifier template are adjusted in real time and the dictionary set is dynamically updated to reduce the probability of irrelevant features entering the dictionary set. The average recognition accuracy of this method is 94.45% on the CMU PIE face database and 96.58% on the AR face database, a significant improvement over traditional face recognition methods.
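The steady-state classifier above builds on sparse representation over a template dictionary. As a point of reference, the sketch below shows a generic sparse representation classifier (SRC) that codes a test feature over column-stacked class templates and assigns the class with the smallest reconstruction residual; the paper's two-stage selection and dynamic dictionary update are not reproduced, and the regularization value is an assumption.

```python
# Generic sparse representation classification (SRC) over a template dictionary.
# This is a simplified reference implementation, not the paper's method.
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(test_feat, dictionary, labels, alpha=0.01):
    """dictionary: (d, n) column-stacked templates; labels: (n,) class of each column."""
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    lasso.fit(dictionary, test_feat)                 # sparse code of the test feature
    coef = lasso.coef_
    residuals = {}
    for c in np.unique(labels):
        mask = (labels == c)
        recon = dictionary[:, mask] @ coef[mask]     # class-specific reconstruction
        residuals[c] = np.linalg.norm(test_feat - recon)
    return min(residuals, key=residuals.get)         # smallest residual wins

# Example with random data (purely illustrative):
rng = np.random.default_rng(0)
D = rng.normal(size=(128, 40))                       # 40 templates, 128-D features
y = np.repeat(np.arange(4), 10)                      # 4 identities, 10 templates each
x = D[:, 3] + 0.05 * rng.normal(size=128)            # noisy copy of a class-0 template
print(src_classify(x, D, y))                         # expected: 0
```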
The lack of facial features caused by wearing masks degrades the performance of facial recognition systems. Traditional occluded face recognition methods cannot integrate the computational resources of the edge layer and the device layer. Moreover, previous research fails to consider facial characteristics of both occluded and unoccluded parts. To solve these problems, we put forward a device-edge collaborative occluded face recognition method based on cross-domain feature fusion. Specifically, the device-edge collaborative face recognition architecture makes the most of device and edge resources for real-time occluded face recognition. Then, a cross-domain facial feature fusion method is presented which combines facial features from both the explicit domain and the implicit domain. Furthermore, a delay-optimized edge recognition task scheduling method is developed that comprehensively considers the task load, computational power, bandwidth, and delay tolerance constraints of the edge. This method can dynamically schedule face recognition tasks and minimize recognition delay while ensuring recognition accuracy. The experimental results show that the proposed method achieves an average gain of about 21% in recognition latency, while the accuracy of the face recognition task remains essentially the same as the baseline method.
In computer vision and artificial intelligence, automatic facial expression-based emotion identification of humans has become a popular research and industry problem. Recent demonstrations and applications in several fields, including computer games, smart homes, expression analysis, gesture recognition, surveillance films, depression therapy, patient monitoring, anxiety, and others, have brought attention to its significant academic and commercial importance. This study emphasizes research that has only employed facial images for facial expression recognition (FER), because facial expressions are a basic way that people communicate meaning to each other. The immense achievement of deep learning has resulted in a growing use of its many architectures to enhance efficiency. This review covers machine learning, deep learning, and hybrid methods' use of preprocessing, augmentation techniques, and feature extraction for the temporal properties of successive frames of data. The following section gives a brief summary of publicly accessible assessment criteria and then compares them with benchmark results, the most trustworthy way to assess FER-related research topics statistically. This review's brief synopsis of the subject matter may be beneficial for novices in the field of FER as well as seasoned scholars seeking fruitful avenues for further investigation. The information conveys fundamental knowledge and provides a comprehensive understanding of the most recent state-of-the-art research.
Sparse representation is an effective data classification algorithm that depends on known training samples to categorise the test sample. It has been widely used in various image classification tasks. Sparseness in sparse representation means that only a few instances selected from all training samples can effectively convey the essential class-specific information of the test sample, which is very important for classification. For deformable images such as human faces, pixels at the same location in different images of the same subject usually have different intensities. Therefore, extracting features and correctly classifying such deformable objects is very hard. Moreover, lighting, attitude and occlusion cause further difficulty. Considering the problems and challenges listed above, a novel image representation and classification algorithm is proposed. First, the authors' algorithm generates virtual samples by a non-linear variation method. This method can effectively extract the low-frequency information of the space-domain features of the original image, which is very useful for representing deformable objects. The combination of the original and virtual samples is more beneficial for improving the classification performance and robustness of the algorithm. The authors' algorithm then calculates the expression coefficients of the original and virtual samples separately using the sparse representation principle and obtains the final score by a designed efficient score fusion scheme, in which the weighting coefficients are set entirely automatically. Finally, the algorithm classifies the samples based on the final scores. The experimental results show that the method performs better classification than conventional sparse representation algorithms.
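To make the score-fusion idea concrete, the sketch below fuses two per-class residual vectors, one from the original samples and one from the virtual samples. Both the non-linear virtual-sample transform and the automatic weighting rule here are illustrative stand-ins, not the authors' formulas.

```python
# A minimal sketch of score-level fusion between the residuals of an original-sample
# and a virtual-sample sparse representation. The transform and weighting rule are
# assumptions made for illustration only.
import numpy as np

def make_virtual(img: np.ndarray) -> np.ndarray:
    """Hypothetical non-linear variation: a smooth intensity remapping."""
    x = img.astype(np.float64) / 255.0
    return np.sqrt(x)                          # emphasizes low-intensity content

def fuse_scores(res_orig: np.ndarray, res_virt: np.ndarray) -> np.ndarray:
    """Fuse per-class residuals; the lower the fused residual, the more likely the class.
    Weights are set automatically from how discriminative each residual set is."""
    def sharpness(r):
        s = np.sort(r)                         # gap between best and second-best residual
        return (s[1] - s[0]) / (s[1] + 1e-12)
    w1, w2 = sharpness(res_orig), sharpness(res_virt)
    w1, w2 = w1 / (w1 + w2 + 1e-12), w2 / (w1 + w2 + 1e-12)
    return w1 * res_orig + w2 * res_virt

res_orig = np.array([0.30, 0.80, 0.90])        # per-class residuals, original samples
res_virt = np.array([0.55, 0.50, 0.95])        # per-class residuals, virtual samples
print(fuse_scores(res_orig, res_virt).argmin())  # predicted class index
```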
Deep neural networks, especially face recognition models, have been shown to be vulnerable to adversarial examples. However, existing attack methods for face recognition systems either cannot attack black-box models, are not universal, have cumbersome deployment processes, or lack camouflage and are easily detected by the human eye. In this paper, we propose an adversarial pattern generation method for face recognition and achieve universal black-box attacks by pasting the pattern on the frame of goggles. To achieve visual camouflage, we use a generative adversarial network (GAN). The scale of the GAN's generative network is increased to balance the performance conflict between concealment and adversarial behavior, a perceptual loss function based on VGG19 is used to constrain the color style and enhance the GAN's learning ability, and a fine-grained meta-learning adversarial attack strategy is used to carry out black-box attacks. Extensive visualization results demonstrate that, compared with existing methods, the proposed method can generate samples with both camouflage and adversarial characteristics. Meanwhile, extensive quantitative experiments show that the generated samples have a high attack success rate against black-box models.
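A common way to implement the VGG19-based perceptual loss mentioned above is to compare intermediate VGG19 feature maps of the generated pattern and a style reference. The sketch below shows that pattern in PyTorch; the layer choice (roughly relu3_3), input normalization, and loss weighting are assumptions, not necessarily the paper's configuration.

```python
# A sketch of a VGG19 perceptual loss: MSE between frozen VGG19 feature maps of the
# generated image and a style/target image. Layer index and weighting are assumptions.
import torch
import torch.nn as nn
from torchvision import models

class VGG19PerceptualLoss(nn.Module):
    def __init__(self, layer_index: int = 15):        # index 15 ~ relu3_3 in vgg19.features
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
        self.features = vgg.features[: layer_index + 1].eval()
        for p in self.features.parameters():
            p.requires_grad_(False)                    # fixed feature extractor
        self.criterion = nn.MSELoss()

    def forward(self, generated: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Inputs are assumed to be ImageNet-normalized (N, 3, H, W) tensors.
        return self.criterion(self.features(generated), self.features(target))

# Usage: add to the generator objective so the pattern keeps the target color style.
# loss = adv_weight * adversarial_loss + perc_weight * VGG19PerceptualLoss()(fake, style_ref)
```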
Identifying faces in non-frontal poses presents a significant challenge for face recognition (FR) systems. In this study, we investigated the impact of yaw pose variations on these systems and devised a robust method for detecting faces across a wide range of angles from 0° to ±90°. We initially selected the most suitable feature vector size by integrating the Dlib, FaceNet (Inception-v2), and support vector machine (SVM) + K-nearest neighbors (KNN) algorithms. To train and evaluate this feature vector, we used two datasets: the Labeled Faces in the Wild (LFW) benchmark data and the Robust Shape-Based FR System (RSBFRS) real-time data, which contain face images with varying yaw poses. After selecting the best feature vector, we developed a real-time FR system to handle yaw poses. The proposed FaceNet architecture achieved recognition accuracies of 99.7% and 99.8% for the LFW and RSBFRS datasets, respectively, with 128 feature vector dimensions and minimum Euclidean distance thresholds of 0.06 and 0.12. The FaceNet + SVM and FaceNet + KNN classifiers achieved classification accuracies of 99.26% and 99.44%, respectively. The 128-dimensional embedding vector showed the highest recognition rate among all dimensions. These results demonstrate the effectiveness of our proposed approach in enhancing FR accuracy, particularly in real-world scenarios with varying yaw poses.
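The decision steps described above, thresholded Euclidean distance between 128-D embeddings for verification and a KNN classifier for identification, can be sketched as follows. The embeddings here are random toy data and the threshold value is an assumption; a real pipeline would obtain embeddings from a FaceNet-style network.

```python
# A minimal sketch of embedding-based verification and KNN identification.
# Toy embeddings and the threshold are placeholders, not the paper's trained values.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def is_same_person(emb_a: np.ndarray, emb_b: np.ndarray, threshold: float = 0.8) -> bool:
    """Verification: accept the pair if the embedding distance is below the threshold."""
    return float(np.linalg.norm(emb_a - emb_b)) < threshold

# Identification: fit KNN on gallery embeddings, then predict probe identities.
rng = np.random.default_rng(1)
gallery_embs = rng.normal(size=(50, 128))          # 50 gallery faces, 128-D embeddings
gallery_ids = rng.integers(0, 10, size=50)         # 10 identities (toy labels)
knn = KNeighborsClassifier(n_neighbors=1).fit(gallery_embs, gallery_ids)

probe_emb = gallery_embs[0] + 0.01 * rng.normal(size=128)
print(knn.predict(probe_emb.reshape(1, -1)))       # matches gallery_ids[0]
print(is_same_person(probe_emb, gallery_embs[0]))  # True for a near-duplicate embedding
```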
A framework for real-time face tracking and recognition is presented, which integrates skin-color-based tracking and PCA/BPNN (principal component analysis/back-propagation neural network) hybrid recognition techniques. The algorithm is able to track the human face against a complex background and also works well when temporary occlusion occurs. We also obtain a very high recognition rate by averaging a number of samples over a long image sequence. The proposed approach has been successfully tested in many experiments, and can operate at 20 frames/s on an 800 MHz PC.
In principal component analysis (PCA) algorithms for face recognition, to reduce the influence of the eigenvectors that relate to changes in illumination on the abstract features, a modified PCA (MPCA) algorithm is proposed. The method is based on the idea of reducing the influence of the eigenvectors associated with the large eigenvalues by normalizing each feature vector element by its corresponding standard deviation. The Yale face database and Yale face database B are used to verify the method. The simulation results show that, for frontal faces and even under limited variation in facial pose, the proposed method performs better than the conventional PCA and linear discriminant analysis (LDA) approaches, while its computational cost remains the same as that of PCA and much less than that of LDA.
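The normalization described above amounts to projecting faces with PCA and then dividing each projected component by its standard deviation, which damps the large-eigenvalue components dominated by illumination. A minimal numpy sketch of that step follows; the training details around it are generic assumptions.

```python
# A minimal sketch of the MPCA-style normalization: PCA projection followed by
# per-component division by the standard deviation (sqrt of the eigenvalue).
import numpy as np

def mpca_fit(X: np.ndarray, n_components: int = 50):
    """X: (n_samples, n_pixels) flattened face images."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Eigen-decomposition of the covariance via SVD of the centered data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvecs = Vt[:n_components].T                    # (n_pixels, k) principal axes
    eigvals = (S[:n_components] ** 2) / (X.shape[0] - 1)
    std = np.sqrt(eigvals) + 1e-12
    return mean, eigvecs, std

def mpca_transform(X, mean, eigvecs, std):
    feats = (X - mean) @ eigvecs                     # ordinary PCA features
    return feats / std                               # MPCA: per-component std normalization

# Recognition can then use nearest-neighbor matching on the normalized features.
```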
With the aim of extracting the features of face images for face recognition, a new method of face recognition that fuses global features and local features is presented. The global features are extracted using principal component analysis (PCA). An active appearance model (AAM) locates 58 facial fiducial points, from which 17 points are characterized as local features using the Gabor wavelet transform (GWT). A normalized global (local) match degree can be obtained from the global (local) features of the probe image and each gallery image. After fusion of the normalized global match degree and the normalized local match degree, the recognition result is the class containing the gallery image with the largest fused match degree. The method is evaluated by recognition rates over two face image databases (AR and SJTU-IPPR). The experimental results show that the method outperforms PCA and elastic bunch graph matching (EBGM). Moreover, it is effective and robust to expression, illumination and pose variation to some degree.
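The final decision rule above is a weighted fusion of normalized match degrees across the gallery. The sketch below shows one way that step could look; the normalization and the fusion weight are illustrative assumptions, not the paper's settings.

```python
# A minimal sketch of match-degree fusion: normalize global and local similarity
# scores over the gallery, fuse them, and pick the gallery class with the highest
# fused match degree. Weight w is an assumption.
import numpy as np

def normalize(scores: np.ndarray) -> np.ndarray:
    """Min-max normalize match degrees across the gallery to [0, 1]."""
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo + 1e-12)

def fuse_and_decide(global_scores, local_scores, gallery_labels, w: float = 0.5):
    fused = (w * normalize(np.asarray(global_scores))
             + (1 - w) * normalize(np.asarray(local_scores)))
    return gallery_labels[int(np.argmax(fused))]

gallery_labels = ["A", "B", "C"]
print(fuse_and_decide([0.62, 0.70, 0.55], [0.81, 0.60, 0.58], gallery_labels))
```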
Matrix principal component analysis (MatPCA), as an effective feature extraction method, can deal with both the matrix pattern and the vector pattern. However, like PCA, MatPCA does not use the class information of samples. As a result, the extracted features cannot provide enough useful information for distinguishing one pattern from another, which further degrades classification performance. To fully use the class information of samples, a novel method, called fuzzy within-class MatPCA (F-WMatPCA), is proposed. F-WMatPCA utilizes the fuzzy K-nearest neighbor method (FKNN) to fuzzify the class membership degrees of a training sample and then performs fuzzy MatPCA within the patterns having the same class label. Because more class information is used in feature extraction, F-WMatPCA can intuitively improve classification performance. Experimental results on face databases and some benchmark datasets show that F-WMatPCA is effective and competitive compared with MatPCA. The experimental analysis on face image databases indicates that F-WMatPCA improves the recognition accuracy and is more stable and robust in performing classification than the existing fuzzy-based F-Fisherfaces method.
Bagging is not quite suitable for stable classifiers such as nearest neighbor classifiers due to the lack of diversity, and it is difficult to apply directly to face recognition due to the small sample size (SSS) property of face recognition. To solve these two problems, local Bagging (L-Bagging) is proposed to simultaneously make Bagging applicable to both nearest neighbor classifiers and face recognition. The major difference between L-Bagging and Bagging is that L-Bagging performs bootstrap sampling on each local region partitioned from the original face image rather than on the whole face image. Since the dimensionality of a local region is usually far less than the number of samples and the component classifiers are constructed in different local regions, L-Bagging deals with the SSS problem and generates more diverse component classifiers. Experimental results on four standard face image databases (AR, Yale, ORL and Yale B) indicate that the proposed L-Bagging method is effective and robust to illumination, occlusion and slight pose variation.
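The sketch below illustrates the L-Bagging idea as described above: partition each face into local regions, draw a bootstrap sample of the training set per region, train a nearest-neighbor classifier on each region, and combine the component classifiers by majority vote. The grid size and other settings are assumptions for illustration.

```python
# A minimal sketch of local Bagging (L-Bagging) with 1-NN component classifiers.
# Labels are assumed to be non-negative integer identity IDs; the 4x4 grid is an assumption.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def split_regions(images: np.ndarray, grid=(4, 4)):
    """images: (n, H, W). Returns a list of (n, region_dim) arrays, one per region."""
    n, H, W = images.shape
    rh, rw = H // grid[0], W // grid[1]
    return [images[:, i*rh:(i+1)*rh, j*rw:(j+1)*rw].reshape(n, -1)
            for i in range(grid[0]) for j in range(grid[1])]

def l_bagging_fit(train_imgs, train_labels, grid=(4, 4), seed=0):
    rng = np.random.default_rng(seed)
    labels = np.asarray(train_labels)
    classifiers = []
    for region in split_regions(train_imgs, grid):
        idx = rng.integers(0, len(labels), size=len(labels))   # bootstrap sample
        clf = KNeighborsClassifier(n_neighbors=1).fit(region[idx], labels[idx])
        classifiers.append(clf)
    return classifiers

def l_bagging_predict(classifiers, test_imgs, grid=(4, 4)):
    votes = np.stack([clf.predict(region)                      # one vote per region
                      for clf, region in zip(classifiers, split_regions(test_imgs, grid))])
    # Majority vote across local regions for each test image
    return np.array([np.bincount(votes[:, k]).argmax() for k in range(test_imgs.shape[0])])
```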
To improve the classification performance of the kernel minimum squared error (KMSE) method, an enhanced KMSE algorithm (EKMSE) is proposed. It redefines the regular objective function by introducing a novel class label definition, and the relative class label matrix can be adaptively adjusted to the kernel matrix. Compared with common methods, the new objective function can enlarge the distance between different classes, which therefore yields better recognition rates. In addition, an iterative parameter searching technique is adopted to improve the computational efficiency. Extensive experiments on the FERET and GT face databases illustrate the feasibility and efficiency of the proposed EKMSE. It outperforms the original MSE, KMSE, some KMSE improvement methods, and even sparse representation-based techniques in face recognition, such as collaborative representation classification (CRC).
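For context, a plain KMSE baseline can be viewed as ridge-regularized regression of class-indicator targets on a kernel matrix; EKMSE replaces the fixed labels with its adaptive label scheme, which is not reproduced here. The sketch below shows only the generic baseline, with an RBF kernel width and ridge value chosen as assumptions.

```python
# A minimal numpy sketch of a plain KMSE baseline (not the enhanced EKMSE labels):
# regress one-hot targets on an RBF kernel matrix with a small ridge term, then
# classify by the largest predicted indicator.
import numpy as np

def rbf_kernel(A, B, gamma=1e-3):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kmse_fit(X, y, gamma=1e-3, ridge=1e-3):
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)         # one-hot class targets
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + ridge * np.eye(len(X)), Y)     # (n, n_classes) weights
    return X, alpha, classes, gamma

def kmse_predict(model, X_test):
    X_train, alpha, classes, gamma = model
    scores = rbf_kernel(X_test, X_train, gamma) @ alpha
    return classes[scores.argmax(axis=1)]
```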
Facial emotion recognition (FER) has become a focal point of research due to its widespread applications, ranging from human-computer interaction to affective computing. While traditional FER techniques have relied on handcrafted features and classification models trained on image or video datasets, recent strides in artificial intelligence and deep learning (DL) have ushered in more sophisticated approaches. This research aims to develop a FER system using a Faster Region-based Convolutional Neural Network (FRCNN) and to design a specialized FRCNN architecture tailored for facial emotion recognition, leveraging its ability to capture spatial hierarchies within localized regions of facial features. The proposed work enhances the accuracy and efficiency of facial emotion recognition and comprises two major components: Inception V3-based feature extraction and FRCNN-based emotion categorization. Extensive experimentation on Kaggle datasets validates the effectiveness of the proposed strategy, showcasing the FRCNN approach's resilience and accuracy in identifying and categorizing facial expressions. The model's overall performance metrics are compelling, with an accuracy of 98.4%, precision of 97.2%, and recall of 96.31%. This work introduces a perceptive deep learning-based FER method, contributing to the evolving landscape of emotion recognition technologies. The high accuracy and resilience demonstrated by the FRCNN approach underscore its potential for real-world applications. This research advances the field of FER and presents a compelling case for the practicality and efficacy of deep learning models in automating the understanding of facial emotions.
Face recognition (FR) is a practical application of pattern recognition (PR) and remains a compelling topic in the study of computer vision. However, in real-world FR systems, interferences in images, including illumination conditions, occlusion, facial expression and pose variation, make the recognition task challenging. This study explored the impact of those interferences on FR performance and attempted to alleviate it by taking face symmetry into account. A novel and robust FR method was proposed by combining multi-mirror symmetry with the local binary pattern (LBP), namely the multi-mirror local binary pattern (MMLBP). To enhance FR performance under various interferences, the MMLBP can 1) adaptively compensate lighting under heterogeneous lighting conditions, and 2) generate extracted image features that are much closer to those obtained under well-controlled conditions (i.e., frontal facial images without expression). Therefore, in contrast with later variations of LBP, with the symmetrical singular value decomposition representation (SSVDR) algorithm that utilizes facial symmetry, and with a state-of-the-art non-LBP method, the MMLBP method is shown to successfully handle various image interferences that are common in FR applications without preprocessing operations or a large number of training images. The proposed method was validated on four public data sets. According to our analysis, the MMLBP method was demonstrated to achieve robust performance regardless of image interferences.
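As background for the LBP component above, the sketch below builds a plain LBP histogram feature and, loosely in the spirit of exploiting facial symmetry, concatenates it with the histogram of the horizontally mirrored face. This is a generic baseline, not the MMLBP method itself; the LBP parameters are assumptions.

```python
# A minimal LBP feature sketch using scikit-image, with a mirrored view appended.
# This is a generic reference pipeline, not the MMLBP algorithm.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray: np.ndarray, P: int = 8, R: int = 1) -> np.ndarray:
    img = np.ascontiguousarray(gray, dtype=float)
    codes = local_binary_pattern(img, P, R, method="uniform")
    n_bins = P + 2                                   # uniform LBP code range
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins))
    return hist / (hist.sum() + 1e-12)

def mirror_lbp_feature(gray: np.ndarray) -> np.ndarray:
    """Concatenate LBP histograms of the image and its horizontal mirror."""
    return np.concatenate([lbp_histogram(gray), lbp_histogram(gray[:, ::-1])])

# Matching can then use, e.g., chi-square or Euclidean distance between features.
```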
With the continuous progress of the times and the development of technology, the rise of online social media has also brought explosive growth of image data. As one of the main ways people communicate daily, images are widely used as carriers of communication because of their rich content, intuitiveness and other advantages. Image recognition based on convolutional neural networks was the first application in the field of image recognition: a series of operations such as image eigenvalue extraction, recognition and convolution are used to identify and analyze different images. The rapid development of artificial intelligence makes machine learning more and more important in this research field, where algorithms learn from each piece of data and predict the outcome; this has become an important key to opening the door of artificial intelligence. In machine vision, image recognition is the foundation, but associating the low-level information in an image with high-level image semantics is the key problem of image recognition. Predecessors have provided many model algorithms, which have laid a solid foundation for the development of artificial intelligence and image recognition. The multi-level information fusion model proposed here is based on the VGG16 model and improves on the fully connected neural network. Unlike a fully connected network, a convolutional neural network does not fully connect every layer of neurons, but instead connects only some nodes. Although this reduces computation time, the convolutional neural network loses some useful feature information during propagation and calculation; this paper therefore improves the model into a multi-level information fusion convolution calculation method that recovers the discarded feature information, so as to improve the image recognition rate. VGG divides the network into five groups (mimicking the five layers of AlexNet), yet it uses 3×3 filters and combines them into convolution sequences; the deeper the DCNN, the larger the channel number. The recognition rate of the model was verified on the ORL Face Database, the BioID Face Database and the CASIA Face Image Database.
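To make the "five groups of 3×3 convolution sequences" concrete, the sketch below builds a VGG16-style backbone in PyTorch, with channel counts growing as the network deepens. It illustrates only the backbone pattern described above, not the paper's multi-level information fusion model.

```python
# A short PyTorch sketch of the VGG16-style pattern: five blocks of stacked 3x3
# convolutions with increasing channel counts. Backbone only, for illustration.
import torch
import torch.nn as nn

def vgg_block(in_ch: int, out_ch: int, n_convs: int) -> nn.Sequential:
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

# Five groups of 3x3 convolutions, as in VGG16 (2, 2, 3, 3, 3 convs per group).
backbone = nn.Sequential(
    vgg_block(3, 64, 2), vgg_block(64, 128, 2), vgg_block(128, 256, 3),
    vgg_block(256, 512, 3), vgg_block(512, 512, 3),
)
print(backbone(torch.randn(1, 3, 224, 224)).shape)   # torch.Size([1, 512, 7, 7])
```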
Face recognition provides a natural visual interface for human-computer interaction (HCI) applications. The process of face recognition, however, is inhibited by variations in the appearance of face images caused by changes in lighting, expression, viewpoint, aging and the introduction of occlusion. Although various algorithms have been presented for face recognition, it remains a very challenging topic. A novel approach to real-time face recognition for HCI is proposed in this paper. In view of the limits of popular approaches to foreground segmentation, background subtraction based on a wavelet multi-scale transform is developed to extract foreground objects. The optimal threshold is selected automatically, without requiring any complex supervised training or manual experimental calibration. A robust real-time face recognition algorithm is presented, which combines projection matrices obtained without iteration and kernel Fisher discriminant analysis (KFDA) to overcome some difficulties in real face recognition. Superior performance of the proposed algorithm is demonstrated by comparison with other algorithms in experiments. The proposed algorithm can also be applied to the video image sequences of natural HCI.
Support vector machines (SVM), as a novel approach in pattern recognition, have demonstrated success in face detection and face recognition. In this paper, a face recognition approach based on the SVM classifier with the nearest neighbor classifier (NNC) is proposed. Principal component analysis (PCA) is used to reduce the dimension and extract features. Then a one-against-all strategy is used to train the SVM classifiers. At the testing stage, we propose an al-...
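The pipeline described so far, PCA for feature extraction followed by one-against-all SVM classifiers, can be sketched as below. The abstract is truncated before the testing-stage details, so the NNC combination step is not reproduced; the component count and SVM settings are illustrative assumptions.

```python
# A minimal sketch of PCA feature extraction followed by one-against-all SVMs.
# Settings are assumptions; the paper's testing-stage NNC step is not included.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def build_face_classifier(n_components: int = 50):
    return make_pipeline(
        PCA(n_components=n_components),                  # eigenface-style feature extraction
        OneVsRestClassifier(SVC(kernel="linear")),       # one-against-all SVMs
    )

# Usage with flattened face images X (n_samples, n_pixels) and identity labels y:
# clf = build_face_classifier().fit(X_train, y_train)
# predictions = clf.predict(X_test)
```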