Although important progress has already been made in face detection, many false faces can still be found in detection results, and the false detection rate is influenced by factors such as rotation and tilt of the human face, complicated backgrounds, illumination, scale, cloak and hairstyle. This paper proposes a new method, the DP-Adaboost algorithm, to detect multi-angle human faces and improve the correct detection rate. An improved Adaboost algorithm that fuses a frontal face classifier and a profile face classifier is used to detect multi-angle faces. An improved horizontal differential projection algorithm is put forward to remove non-face images from the preliminary detection results of the improved Adaboost algorithm. Experimental results show that, compared with the classical Adaboost algorithm with a frontal face classifier, the proposed DP-Adaboost algorithm significantly reduces the false detection rate and improves the hit rate in multi-angle face detection.
Face liveness detection is essential for securing biometric authentication systems against spoofing attacks, including printed photos, replay videos, and 3D masks. This study systematically evaluates pre-trained CNN models (DenseNet201, VGG16, InceptionV3, ResNet50, VGG19, MobileNetV2, Xception, and InceptionResNetV2), leveraging transfer learning and fine-tuning to enhance liveness detection performance. The models were trained and tested on the NUAA and Replay-Attack datasets, with cross-dataset generalization validated on SiW-MV2 to assess real-world adaptability. Performance was evaluated using accuracy, precision, recall, FAR, FRR, HTER, and specialized spoof detection metrics (APCER, NPCER, ACER). Fine-tuning significantly improved detection accuracy, with DenseNet201 achieving the highest performance (98.5% on NUAA, 97.71% on Replay-Attack), while MobileNetV2 proved the most efficient model for real-time applications (latency: 15 ms, memory usage: 45 MB, energy consumption: 30 mJ). A statistical significance analysis (paired t-tests, confidence intervals) validated these improvements. Cross-dataset experiments identified DenseNet201 and MobileNetV2 as the most generalizable architectures, with DenseNet201 achieving 86.4% accuracy on Replay-Attack when trained on NUAA, demonstrating robust feature extraction and adaptability. In contrast, ResNet50 showed lower generalization capabilities, struggling with dataset variability and complex spoofing attacks. These findings suggest that MobileNetV2 is well-suited for low-power applications, while DenseNet201 is ideal for high-security environments requiring superior accuracy. This research provides a framework for improving real-time face liveness detection, enhancing biometric security, and guiding future advancements in AI-driven anti-spoofing techniques.
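The error rates named in the abstract above can be illustrated with a minimal sketch. It assumes binary decisions at a fixed threshold and a single attack type, in which case APCER coincides with FAR. It also assumes that the abstract's NPCER denotes the error rate on bona fide presentations (reported here under the name BPCER); that reading is an assumption, not something the abstract states. The function name `spoof_metrics` is hypothetical.

```python
def spoof_metrics(labels, predictions):
    """Compute liveness error rates from binary decisions.

    labels / predictions: 1 = bona fide (live), 0 = attack (spoof).
    Assumes a single attack type, so APCER equals FAR.
    """
    attacks = [p for y, p in zip(labels, predictions) if y == 0]
    bona_fide = [p for y, p in zip(labels, predictions) if y == 1]
    if not attacks or not bona_fide:
        raise ValueError("need at least one attack and one bona fide sample")

    # FAR: fraction of attacks wrongly accepted as live.
    far = sum(1 for p in attacks if p == 1) / len(attacks)
    # FRR: fraction of live faces wrongly rejected as attacks.
    frr = sum(1 for p in bona_fide if p == 0) / len(bona_fide)

    return {
        "FAR": far,
        "FRR": frr,
        "HTER": (far + frr) / 2,   # half total error rate
        "APCER": far,              # single attack type assumed
        "BPCER": frr,              # assumed meaning of the abstract's NPCER
        "ACER": (far + frr) / 2,   # average of APCER and BPCER
    }


if __name__ == "__main__":
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1]
    print(spoof_metrics(y_true, y_pred))
```

With several attack types, APCER is usually taken per attack type and the worst case is reported, so the single-type shortcut here would no longer hold.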
This paper presents a high-speed and robust dual-band infrared thermal camera based on an ARM CPU. The system consists of a low-resolution long-wavelength infrared detector, a digital temperature and humidity sensor, and a CMOS sensor. In view of the significant contrast between face and background in thermal infrared images, this paper explores a suitable accuracy-latency tradeoff for thermal face detection and proposes a tiny, lightweight detector named YOLO-Fastest-IR. Four YOLO-Fastest-IR models (IR0 to IR3) with different scales are designed based on YOLO-Fastest. To train and evaluate these lightweight models, a multi-user low-resolution thermal face database (RGBT-MLTF) was collected, and the four networks were trained. Experiments demonstrate that the lightweight convolutional neural network performs well in thermal infrared face detection tasks. The proposed algorithm outperforms existing face detection methods in both positioning accuracy and speed, making it more suitable for deployment on mobile platforms or embedded devices. After obtaining the region of interest (ROI) in the infrared (IR) image, the RGB camera is guided by the thermal infrared face detection results to achieve fine positioning of the RGB face. Experimental results show that YOLO-Fastest-IR achieves a frame rate of 92.9 FPS on a Raspberry Pi 4B and successfully detects 97.4% of faces in the RGBT-MLTF test set. Ultimately, an infrared temperature measurement system with low cost, strong robustness, and high real-time performance was integrated, achieving a temperature measurement accuracy of 0.3 °C.
Face recognition has emerged as one of the most prominent applications of image analysis and understanding, gaining considerable attention in recent years. This growing interest is driven by two key factors: its extensive applications in law enforcement and the commercial domain, and the rapid advancement of practical technologies. Despite these advancements, modern recognition algorithms still struggle in real-world conditions such as varying lighting, occlusion, and diverse facial postures. In such scenarios, human perception is still well above the capabilities of present technology. Using a systematic mapping study, this paper presents an in-depth review of face detection and face recognition algorithms, with a detailed survey of advancements made between 2015 and 2024. We analyze key methodologies, highlighting their strengths and restrictions in the application context. Additionally, we examine various datasets used for face detection and recognition, focusing on their task-specific applications, size, diversity, and complexity. By analyzing these algorithms and datasets, this survey serves as a valuable resource for researchers, identifying the research gaps in the field of face detection and recognition and outlining potential directions for future research.
Face detection is a critical component in modern security, surveillance, and human-computer interaction systems, with widespread applications in smartphones, biometric access control, and public monitoring. However, detecting faces with high levels of occlusion, such as those covered by masks, veils, or scarves, remains a significant challenge, as traditional models often fail to generalize under such conditions. This paper presents a hybrid approach that combines a traditional handcrafted feature extraction technique, Histogram of Oriented Gradients (HOG), and Canny edge detection with modern deep learning models. The goal is to improve face detection accuracy under occlusions. The proposed method leverages the structural strengths of HOG and edge-based object proposals while exploiting the feature extraction capabilities of Convolutional Neural Networks (CNNs). The effectiveness of the proposed model is assessed using a custom dataset containing 10,000 heavily occluded face images and a subset of the Common Objects in Context (COCO) dataset for non-face samples. The COCO dataset was selected for its variety and realism in background contexts. Experimental evaluations demonstrate significant performance improvements compared to baseline CNN models. Results indicate that DenseNet121 combined with HOG outperforms its counterparts in classification metrics, with an F1-score of 87.96% and a precision of 88.02%. Enhanced performance is achieved through reduced false positives and improved localization accuracy with the integration of object proposals based on Canny and contour detection. While the proposed method increases inference time from 33.52 to 97.80 ms, it achieves a notable improvement in precision from 80.85% to 88.02% when comparing the baseline DenseNet121 model to its hybrid counterpart. Limitations of the method include higher computational cost and the need for careful tuning of parameters across the edge detection, handcrafted features, and CNN components. These findings highlight the potential of combining handcrafted and learned features for occluded face detection tasks.
Face Presentation Attack Detection (fPAD) plays a vital role in securing face recognition systems against various presentation attacks. While supervised learning-based methods demonstrate effectiveness, they are prone to overfitting to known attack types and struggle to generalize to novel attack scenarios. Recent studies have explored formulating fPAD as an anomaly detection problem or one-class classification task, enabling the training of generalized models for unknown attack detection. However, conventional anomaly detection approaches encounter difficulties in precisely delineating the boundary between bonafide samples and unknown attacks. To address this challenge, we propose a novel framework focusing on unknown attack detection using exclusively bonafide facial data during training. The core innovation lies in our pseudo-negative sample synthesis (PNSS) strategy, which facilitates learning of compact decision boundaries between bonafide faces and potential attack variations. Specifically, PNSS generates synthetic negative samples within low-likelihood regions of the bonafide feature space to represent diverse unknown attack patterns. To overcome the inherent imbalance between positive and synthetic negative samples during iterative training, we implement a dual-loss mechanism combining focal loss for classification optimization with pairwise confusion loss as a regularizer. This architecture effectively mitigates model bias towards bonafide samples while maintaining discriminative power. Comprehensive evaluations across three benchmark datasets validate the framework’s superior performance. Notably, our PNSS achieves 8%–18% average classification error rate (ACER) reduction compared with state-of-the-art one-class fPAD methods in cross-dataset evaluations on the Idiap Replay-Attack and MSU-MFSD datasets.
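The focal loss term mentioned in the abstract above can be sketched for a single binary prediction as follows. This shows only the standard focal loss formula, not the authors' full dual-loss mechanism (the pairwise confusion regularizer is omitted). The defaults `gamma=2.0` and `alpha=0.25` are the commonly used values from the original focal loss formulation and are an assumption here; the abstract does not give the paper's settings.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one sample.

    p: predicted probability of the positive class (0..1).
    y: ground-truth label, 1 (positive) or 0 (negative).
    gamma, alpha: illustrative defaults, not taken from the abstract.
    """
    # Probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Class-balancing weight for the true class.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # The (1 - p_t)^gamma factor down-weights easy, well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(focal_loss(0.9, 1))  # easy positive: small loss
    print(focal_loss(0.6, 1))  # harder positive: larger loss
```

Setting `gamma=0` and `alpha=0.5` reduces the expression to half the ordinary cross-entropy, which is a convenient sanity check.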
Detecting faces under occlusion remains a significant challenge in computer vision due to variations caused by masks, sunglasses, and other obstructions. Addressing this issue is crucial for applications such as surveillance, biometric authentication, and human-computer interaction. This paper provides a comprehensive review of face detection techniques developed to handle occluded faces. Studies are categorized into four main approaches: feature-based, machine learning-based, deep learning-based, and hybrid methods. We analyzed state-of-the-art studies within each category, examining their methodologies, strengths, and limitations based on widely used benchmark datasets, highlighting their adaptability to partial and severe occlusions. The review also identifies key challenges, including dataset diversity, model generalization, and computational efficiency. Our findings reveal that deep learning methods dominate recent studies, benefiting from their ability to extract hierarchical features and handle complex occlusion patterns. More recently, researchers have increasingly explored Transformer-based architectures, such as Vision Transformer (ViT) and Swin Transformer, to further improve detection robustness under challenging occlusion scenarios. In addition, hybrid approaches, which aim to combine traditional and modern techniques, are emerging as a promising direction for improving robustness. This review provides valuable insights for researchers aiming to develop more robust face detection systems and for practitioners seeking to deploy reliable solutions in real-world, occlusion-prone environments. Further improvements and the proposal of broader datasets are required to develop more scalable, robust, and efficient models that can handle complex occlusions in real-world scenarios.
As the use of deepfake facial videos proliferates, the associated threats to social security and integrity cannot be overstated. Effective methods for detecting forged facial videos are thus urgently needed. While many deep learning-based facial forgery detection approaches show promise, they often fail to delve deeply into the complex relationships between image features and forgery indicators, limiting their effectiveness to specific forgery techniques. To address this challenge, we propose a dual-branch collaborative deepfake detection network. The network processes video frame images as input, where a specialized noise extraction module initially extracts the noise feature maps. Subsequently, the original facial images and corresponding noise maps are directed into two parallel feature extraction branches to concurrently learn texture and noise forgery clues. An attention mechanism is employed between the two branches to facilitate mutual guidance and enhancement of texture and noise features across four different scales. This dual-modal feature integration enhances sensitivity to forgery artifacts and boosts generalization ability across various forgery techniques. Features from both branches are then combined and processed through a multilayer perceptron to distinguish between real and forged video. Experimental results on benchmark deepfake detection datasets demonstrate that our approach outperforms existing state-of-the-art methods in terms of detection performance, accuracy, and generalization ability.
Coherent change detection (CCD) is an effective method to detect subtle scene changes that occur between temporal synthetic aperture radar (SAR) observations. Most coherence estimators are obtained from a Hermitian product based on local statistics. Increasing the number of samples in the local window can reduce the estimation bias, but at the cost of the spatial resolution of the estimated image. The limitations of these estimators lead to unclear contours of the disturbed region, and even to the omission of fine change targets. In this paper, a CCD approach is proposed to detect fine scene changes from multi-temporal and multi-angle SAR image pairs. The multi-angle CCD estimator can improve the contrast between the change target and the background clutter by jointly accumulating single-angle alternative estimator results without further loss of image resolution. The sensitivity of detection performance to image quantity and angle interval is analyzed. Theoretical analysis and experimental results verify the performance of the proposed algorithm.
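The local-window coherence estimate that the abstract above refers to can be sketched as the normalized magnitude of the Hermitian product of two co-registered complex image patches. This is the standard single-pair sample coherence only; it is not the paper's multi-angle estimator, and the function name `sample_coherence` is hypothetical.

```python
import math


def sample_coherence(f, g):
    """Sample coherence magnitude over one local window.

    f, g: equal-length sequences of complex pixels taken from the same
    window position in two co-registered SAR images.
    Returns a value in [0, 1]; low values suggest a scene change.
    """
    if len(f) != len(g) or len(f) == 0:
        raise ValueError("windows must be non-empty and of equal length")
    # Hermitian product accumulated over the window.
    cross = sum(a * b.conjugate() for a, b in zip(f, g))
    power_f = sum(abs(a) ** 2 for a in f)
    power_g = sum(abs(b) ** 2 for b in g)
    denom = math.sqrt(power_f * power_g)
    return abs(cross) / denom if denom > 0 else 0.0


if __name__ == "__main__":
    window = [1 + 1j, 2 - 1j, 0.5j]
    shifted = [a * 1j for a in window]        # same scene, constant phase offset
    print(sample_coherence(window, shifted))  # close to 1: no change
    print(sample_coherence([1, 1], [1, -1]))  # 0: fully decorrelated
```

A larger window averages more samples and lowers the estimator's bias, but each output pixel then summarizes a larger area, which is the resolution tradeoff the abstract describes.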
This paper proposes a universal framework, termed the Multi-Task Hybrid Convolutional Neural Network (MHCNN), for joint face detection, facial landmark detection, facial quality, and facial attribute analysis. MHCNN consists of a high-accuracy single stage detector (SSD) and an efficient tiny convolutional neural network (T-CNN) for joint face detection refinement, alignment and attribute analysis. Though SSD face detectors achieve promising results, we find that applying a tiny CNN on detections further improves the detected face scores and bounding boxes. Through multi-task training, our T-CNN aims to provide five facial landmarks, facial quality scores, and facial attributes such as wearing sunglasses and wearing masks. Since there are no public facial quality and facial attribute data of the kind we need, we contribute two datasets, namely FaceQ and FaceA, which are collected from the Internet. Experiments show that our MHCNN achieves face detection performance comparable to the state of the art on the Face Detection Data Set and Benchmark (FDDB), and gets reasonable results on AFLW, FaceQ and FaceA.
For face detection under complex background and illumination, a detection method that combines skin color segmentation with a cost-sensitive Adaboost algorithm is proposed in this paper. First, exploiting the fact that human skin color clusters in color space, the skin color area is extracted in the YCbCr color space and a large number of irrelevant background regions are excluded. Then, to remedy the deficiencies of the Adaboost algorithm, a cost-sensitive function is introduced into it. Finally, the skin color segmentation and the cost-sensitive Adaboost algorithm are combined for face detection. Experimental results show that the proposed detection method has a higher detection rate and detection speed, and adapts better to the actual field environment.
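The skin color segmentation step described above can be sketched as a per-pixel threshold in YCbCr space. The conversion below is the full-range BT.601 form used by JPEG. The chrominance bounds (Cb in 77..127, Cr in 133..173) are a range often cited in the skin detection literature and are an assumption here; the abstract does not state the paper's thresholds. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JPEG) RGB -> YCbCr conversion for 8-bit values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """Classify one pixel as skin using chrominance only.

    Luma (Y) is ignored so the test is less sensitive to brightness.
    The default ranges are illustrative, not the paper's values.
    """
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]


def skin_mask(image):
    """image: list of rows, each a list of (R, G, B) tuples.

    Returns a same-shaped mask of booleans; True marks candidate skin
    pixels that a later classifier stage would examine.
    """
    return [[is_skin(*pixel) for pixel in row] for row in image]


if __name__ == "__main__":
    tiny = [[(224, 172, 150), (0, 0, 255)],
            [(255, 255, 255), (0, 255, 0)]]
    print(skin_mask(tiny))
```

Running the classifier only on regions where the mask is True is what lets a pipeline of this kind discard most of the background before the more expensive detection stage.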
At present in China, coal mine in-pit personnel positioning systems can neither effectively verify the uniqueness of in-pit personnel nor identify and eliminate attendance violations such as one person carrying multiple cards or cards being swiped by someone other than their owner. This research therefore introduces a uniqueness detection system and method for in-pit coal mine personnel, integrated into the in-pit personnel positioning system, and establishes a system mode based on face recognition, recognition of the personnel positioning card, and release by automatic detection. Because in-pit personnel wear helmets and their faces are prone to staining during face recognition, the study proposes pre-processing face images using the Mallat algorithm based on the 2D wavelet transform and extracting three face features (miner's light, eyes and mouth) using an algorithm based on the generalized symmetry transform. The research carried out tests with 40 clean face images without helmets and 40 lightly stained face images, and compared the results with those of a face feature extraction method based on grey-scale transformation and edge detection. The results show that the method described in the paper can accurately detect face features in both cases, and that the face feature detection accuracy is 97.5% in the case of wearing helmets and lightly stained faces.
Skewed symmetry detection plays an important role in three-dimensional (3-D) reconstruction. A skewed symmetry depicts a real symmetry viewed from some unknown viewing direction, and detecting it can reduce the geometric constraints and the complexity of 3-D reconstruction. The detection technique for the quadric curve ellipse proposed by Sugimoto is improved to further cover quadric curves including the hyperbola and the parabola. With the parametric detection, the 3-D quadric curve projection matching is accomplished automatically. Finally, the skewed symmetry surface of the quadric surface solid is obtained. Several examples are used to verify the feasibility of the algorithm, and satisfactory results are obtained.
To automatically detect whether a person is wearing a mask properly, we propose a face mask detection algorithm based on hue-saturation-value (HSV) + histogram of oriented gradient (HOG) features and support vector machines (SVM). First, the human face and five feature points are detected with the RetinaFace face detection algorithm. The feature points are used to locate the mouth and nose region, and the HSV+HOG features of this region are extracted and input to the SVM for training, to detect whether a mask is worn. Second, RetinaFace is used to locate the nasal tip area of the face, and a YCrCb elliptical skin tone model is used to detect skin exposure in the nasal tip area; the optimal classification threshold for deciding whether the mask is worn properly is found from experimental results. Experiments show that the accuracy of detecting whether a mask is worn reaches 97.9%, and the accuracy of detecting whether the mask is worn correctly reaches 87.55%, which verifies the feasibility of the algorithm.
Face recognition technology automatically identifies an individual from image or video sources. Detection can be performed by extracting facial characteristics from the image of a subject's face. Recent developments in deep learning (DL) and computer vision (CV) techniques enable the design of automated face recognition and tracking methods. This study presents a novel Harris Hawks Optimization with deep learning-empowered automated face detection and tracking (HHODL-AFDT) method. The proposed HHODL-AFDT model involves a Faster region-based convolutional neural network (RCNN)-based face detection model and an HHO-based hyperparameter optimization process. The optimized Faster RCNN model recognizes the face, which is then passed to the face-tracking model using a regression network (REGN). Face tracking with the REGN model uses the features from neighboring frames to predict the location of the target face in succeeding frames. The application of the HHO algorithm for optimal hyperparameter selection constitutes the novelty of the work. The experimental validation of the presented HHODL-AFDT algorithm is conducted using two datasets, and the outcomes highlight the superior performance of the HHODL-AFDT model over current methodologies, with maximum accuracy of 90.60% and 88.08% on the PICS and VTB datasets, respectively.
A new kind of region pair grey difference classifier was proposed. The paired regions forming a feature were not necessarily directly connected, but were selected so that the grey transition between regions coincided with the structure of the face pattern. Fifteen brighter and darker region pairs were chosen to form region pair grey difference features with high discriminant capabilities. Instead of using both the false acceptance rate and the false rejection rate, the mutual information was used as a unified metric for evaluating classification performance. The positions, areas and grey difference bias of each region pair feature were selected by an optimization process that maximizes the mutual information between the region pair feature and the class distribution. An additional region-based feature depicting the correlation between global region grey intensity patterns was also proposed. Compared with a Viola-like approach using over 2,000 features, the proposed approach achieves similar error rates with only 16 features and 1/6 of the implementation time on controlled illumination images.
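The idea of scoring a thresholded grey difference feature by mutual information can be sketched as below. The abstract does not specify how the optimization is carried out, so the exhaustive scan over candidate bias values, the `>=` decision rule, and the function names are all assumptions made for illustration.

```python
import math


def mutual_information(feature_outputs, labels):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(labels)
    joint = {}
    for f, y in zip(feature_outputs, labels):
        joint[(f, y)] = joint.get((f, y), 0) + 1
    count_f, count_y = {}, {}
    for (f, y), c in joint.items():
        count_f[f] = count_f.get(f, 0) + c
        count_y[y] = count_y.get(y, 0) + c
    # Sum p(f, y) * log2( p(f, y) / (p(f) * p(y)) ) over observed pairs.
    return sum((c / n) * math.log2(c * n / (count_f[f] * count_y[y]))
               for (f, y), c in joint.items())


def best_bias(grey_differences, labels):
    """Pick the bias that maximizes MI between feature output and class.

    grey_differences: mean grey level of the brighter region minus that of
    the darker region, one value per training window (hypothetical input).
    labels: 1 = face, 0 = non-face.
    The feature fires (1) when the difference is at least the bias.
    """
    best = None
    for bias in sorted(set(grey_differences)):
        outputs = [1 if d >= bias else 0 for d in grey_differences]
        score = mutual_information(outputs, labels)
        if best is None or score > best[1]:
            best = (bias, score)
    return best


if __name__ == "__main__":
    diffs = [5, 7, -1, 0]
    classes = [1, 1, 0, 0]
    print(best_bias(diffs, classes))
```

A single score of this kind replaces the need to trade off false acceptance against false rejection when comparing candidate features, which matches the motivation given in the abstract.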
Due to the power of editing tools, new types of fake faces are being created and synthesized, which has attracted great attention on social media. It is reasonable to acknowledge that humans cannot reliably distinguish manipulated faces from real ones. Therefore, the detection of face manipulation has become a critical issue in digital media forensics. This paper provides an overview of recent deep learning detection models for face manipulation. Several public datasets used for face manipulation detection are introduced. On this basis, the challenges for the research and potential future directions are analyzed and discussed.
In recent years, with the rapid growth of generative adversarial networks (GANs), a photo-realistic face can be easily generated from a random vector. Moreover, the faces generated by advanced GANs are very realistic. It is reasonable to acknowledge that even a well-trained viewer has difficulty distinguishing artificial faces from real ones. Therefore, detecting faces generated by GANs is necessary. This paper introduces methods for detecting GAN-generated fake faces and analyzes the advantages and disadvantages of these models based on their network structure, evaluation indexes, and the results obtained on the respective datasets. On this basis, the challenges faced in this field and future research directions are discussed.
Funding: This work was funded by the Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and IT, University of Technology Sydney, and by the Ongoing Research Funding Program (ORF-2025-14), King Saud University, Riyadh, Saudi Arabia, under Project ORF-2025-.
Funding: Supported by the Fundamental Research Funds for the Central Universities (2024300443) and the Natural Science Foundation of Jiangsu Province (BK20241224).
Funding: Funded by A’Sharqiyah University, Sultanate of Oman, under Research Project Grant Number (BFP/RGP/ICT/22/490).
Abstract: Face detection is a critical component in modern security, surveillance, and human-computer interaction systems, with widespread applications in smartphones, biometric access control, and public monitoring. However, detecting faces with high levels of occlusion, such as those covered by masks, veils, or scarves, remains a significant challenge, as traditional models often fail to generalize under such conditions. This paper presents a hybrid approach that combines a traditional handcrafted feature extraction technique, Histogram of Oriented Gradients (HOG), and Canny edge detection with modern deep learning models. The goal is to improve face detection accuracy under occlusions. The proposed method leverages the structural strengths of HOG and edge-based object proposals while exploiting the feature extraction capabilities of Convolutional Neural Networks (CNNs). The effectiveness of the proposed model is assessed using a custom dataset containing 10,000 heavily occluded face images and a subset of the Common Objects in Context (COCO) dataset for non-face samples. The COCO dataset was selected for its variety and realism in background contexts. Experimental evaluations demonstrate significant performance improvements compared to baseline CNN models. Results indicate that DenseNet121 combined with HOG outperforms its counterparts in classification metrics, with an F1-score of 87.96% and a precision of 88.02%. Enhanced performance is achieved through reduced false positives and improved localization accuracy with the integration of object proposals based on Canny and contour detection. While the proposed method increases inference time from 33.52 to 97.80 ms, it improves precision from 80.85% to 88.02% when the baseline DenseNet121 model is compared with its hybrid counterpart. Limitations of the method include higher computational cost and the need for careful tuning of parameters across the edge detection, handcrafted feature, and CNN components. These findings highlight the potential of combining handcrafted and learned features for occluded face detection tasks.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 61972267 and 61772070, and in part by the Natural Science Foundation of Hebei Province under Grant F2024210005.
Abstract: Face Presentation Attack Detection (fPAD) plays a vital role in securing face recognition systems against various presentation attacks. While supervised learning-based methods demonstrate effectiveness, they are prone to overfitting to known attack types and struggle to generalize to novel attack scenarios. Recent studies have explored formulating fPAD as an anomaly detection problem or one-class classification task, enabling the training of generalized models for unknown attack detection. However, conventional anomaly detection approaches encounter difficulties in precisely delineating the boundary between bonafide samples and unknown attacks. To address this challenge, we propose a novel framework focusing on unknown attack detection using exclusively bonafide facial data during training. The core innovation lies in our pseudo-negative sample synthesis (PNSS) strategy, which facilitates learning of compact decision boundaries between bonafide faces and potential attack variations. Specifically, PNSS generates synthetic negative samples within low-likelihood regions of the bonafide feature space to represent diverse unknown attack patterns. To overcome the inherent imbalance between positive and synthetic negative samples during iterative training, we implement a dual-loss mechanism combining focal loss for classification optimization with pairwise confusion loss as a regularizer. This architecture effectively mitigates model bias towards bonafide samples while maintaining discriminative power. Comprehensive evaluations across three benchmark datasets validate the framework’s superior performance. Notably, our PNSS achieves an 8%–18% reduction in average classification error rate (ACER) compared with state-of-the-art one-class fPAD methods in cross-dataset evaluations on the Idiap Replay-Attack and MSU-MFSD datasets.
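The focal loss named in this abstract can be illustrated with a minimal sketch of the standard binary form, FL = -alpha_t (1 - p_t)^gamma log(p_t). The function name and the defaults gamma = 2 and alpha = 0.25 are assumptions taken from the usual formulation, not from this paper, and the pairwise confusion regularizer is not shown.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sample (illustrative sketch).

    p: predicted probability of the positive class, y: label in {0, 1}.
    gamma and alpha are assumed defaults, not values from the paper.
    """
    # Clamp to avoid log(0).
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t)^gamma factor down-weights easy, well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With gamma = 0 the expression reduces to alpha-weighted cross-entropy, which is one way to sanity-check an implementation.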
Funding: Funded by A’Sharqiyah University, Sultanate of Oman, under Research Project Grant Number (BFP/RGP/ICT/22/490).
Abstract: Detecting faces under occlusion remains a significant challenge in computer vision due to variations caused by masks, sunglasses, and other obstructions. Addressing this issue is crucial for applications such as surveillance, biometric authentication, and human-computer interaction. This paper provides a comprehensive review of face detection techniques developed to handle occluded faces. Studies are categorized into four main approaches: feature-based, machine learning-based, deep learning-based, and hybrid methods. We analyzed state-of-the-art studies within each category, examining their methodologies, strengths, and limitations based on widely used benchmark datasets, and highlighting their adaptability to partial and severe occlusions. The review also identifies key challenges, including dataset diversity, model generalization, and computational efficiency. Our findings reveal that deep learning methods dominate recent studies, benefiting from their ability to extract hierarchical features and handle complex occlusion patterns. More recently, researchers have increasingly explored Transformer-based architectures, such as Vision Transformer (ViT) and Swin Transformer, to further improve detection robustness under challenging occlusion scenarios. In addition, hybrid approaches, which aim to combine traditional and modern techniques, are emerging as a promising direction for improving robustness. This review provides insights for researchers aiming to develop more robust face detection systems and for practitioners seeking to deploy reliable solutions in real-world, occlusion-prone environments. Further improvements and broader datasets are required to develop more scalable, robust, and efficient models that can handle complex occlusions in real-world scenarios.
Funding: Funded by the Ministry of Public Security Science and Technology Program Project (No. 2023LL35) and the Key Laboratory of Smart Policing and National Security Risk Governance, Sichuan Province (No. ZHZZZD2302).
Abstract: As the use of deepfake facial videos proliferates, the associated threats to social security and integrity cannot be overstated. Effective methods for detecting forged facial videos are thus urgently needed. While many deep learning-based facial forgery detection approaches show promise, they often fail to delve deeply into the complex relationships between image features and forgery indicators, limiting their effectiveness to specific forgery techniques. To address this challenge, we propose a dual-branch collaborative deepfake detection network. The network takes video frame images as input, and a specialized noise extraction module first extracts the noise feature maps. The original facial images and corresponding noise maps are then directed into two parallel feature extraction branches to concurrently learn texture and noise forgery clues. An attention mechanism is employed between the two branches to facilitate mutual guidance and enhancement of texture and noise features across four different scales. This dual-modal feature integration enhances sensitivity to forgery artifacts and boosts generalization ability across various forgery techniques. Features from both branches are then combined and processed through a multi-layer perceptron (MLP) layer to distinguish between real and forged videos. Experimental results on benchmark deepfake detection datasets demonstrate that our approach outperforms existing state-of-the-art methods in terms of detection performance, accuracy, and generalization ability.
Abstract: Coherent change detection (CCD) is an effective method to detect subtle scene changes that occur between temporal synthetic aperture radar (SAR) observations. Most coherence estimators are obtained from a Hermitian product based on local statistics. Increasing the number of samples in the local window can reduce the estimation bias, but it degrades the spatial resolution of the estimated image. The limitations of these estimators lead to unclear contours of the disturbed region, and even to the omission of fine change targets. In this paper, a CCD approach is proposed to detect fine scene changes from multi-temporal and multi-angle SAR image pairs. The multi-angle CCD estimator can improve the contrast between the change target and the background clutter by jointly accumulating single-angle alternative estimator results without further loss of image resolution. The sensitivity of detection performance to image quantity and angle interval is analyzed. Theoretical analysis and experimental results verify the performance of the proposed algorithm.
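As a point of reference for the Hermitian-product estimators this abstract mentions, the classical single-pair sample coherence magnitude over a local window can be sketched as below. This is the textbook estimator, not the paper's multi-angle estimator, and the function name and flat-list window representation are assumptions made for illustration.

```python
import math


def sample_coherence(f, g):
    """Sample coherence magnitude of two co-registered complex windows.

    f, g: equal-length sequences of complex pixel values from the same
    local window in two SAR acquisitions (hypothetical flat layout).
    Returns |sum(f * conj(g))| / sqrt(sum|f|^2 * sum|g|^2), in [0, 1].
    """
    # Hermitian product accumulated over the local window.
    num = sum(a * b.conjugate() for a, b in zip(f, g))
    den = math.sqrt(sum(abs(a) ** 2 for a in f) * sum(abs(b) ** 2 for b in g))
    return abs(num) / den if den > 0 else 0.0
```

A larger window gives more samples per estimate, which is the bias-versus-resolution trade-off the abstract describes: a value near 1 indicates an unchanged scene, and a low value indicates a disturbance.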
Funding: Supported by ZTE Corporation and the State Key Laboratory of Mobile Network and Mobile Multimedia Technology.
Abstract: This paper proposes a universal framework, termed Multi-Task Hybrid Convolutional Neural Network (MHCNN), for joint face detection, facial landmark detection, facial quality, and facial attribute analysis. MHCNN consists of a high-accuracy single stage detector (SSD) and an efficient tiny convolutional neural network (T-CNN) for joint face detection refinement, alignment, and attribute analysis. Though SSD face detectors achieve promising results, we find that applying a tiny CNN to the detections further improves the detected face scores and bounding boxes. Through multi-task training, our T-CNN provides five facial landmarks, facial quality scores, and facial attributes such as wearing sunglasses and wearing masks. Because no public facial quality or facial attribute data meet our needs, we contribute two datasets, FaceQ and FaceA, which are collected from the Internet. Experiments show that our MHCNN achieves face detection performance comparable to the state of the art on the Face Detection Data Set and Benchmark (FDDB), and obtains reasonable results on AFLW, FaceQ, and FaceA.
Funding: Supported by the National Basic Research Program of China (973 Program) under Grant No. 2012CB215202 and the National Natural Science Foundation of China under Grant No. 51205046.
Abstract: For face detection under complex background and illumination, a detection method that combines skin color segmentation with a cost-sensitive Adaboost algorithm is proposed in this paper. First, exploiting the clustering of human skin color in color space, the skin color area is extracted in the YCbCr color space and a large amount of irrelevant background is excluded; then, to remedy the deficiencies of the Adaboost algorithm, a cost-sensitive function is introduced into it; finally, the skin color segmentation and the cost-sensitive Adaboost algorithm are combined for face detection. Experimental results show that the proposed method achieves a higher detection rate and detection speed and is better suited to real field environments.
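The skin color segmentation step can be sketched as a fixed-range threshold in the Cb and Cr channels. The conversion below uses the full-range BT.601 (JPEG) coefficients, and the ranges 77 to 127 for Cb and 133 to 173 for Cr are commonly cited values assumed here for illustration; the abstract does not state which thresholds or skin model the authors used.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (BT.601 / JPEG coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """Return True if the pixel falls in the assumed skin chrominance box.

    The default ranges are an assumption, not taken from the paper.
    Luminance Y is ignored, which gives some tolerance to illumination.
    """
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]


def skin_mask(image):
    """image: rows of (r, g, b) tuples. Returns a boolean mask of skin pixels."""
    return [[is_skin(*px) for px in row] for row in image]
```

In a pipeline like the one described, only regions where the mask is True would be passed to the Adaboost classifier, which is how irrelevant background gets excluded before classification.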
Funding: Financial support from the National Natural Science Foundation of China (No. 51134024) and the National High Technology Research and Development Program of China (No. 2012AA062203) is gratefully acknowledged.
Abstract: At present, coal-mine in-pit personnel positioning systems in China can neither verify the uniqueness of in-pit personnel effectively nor identify and eliminate attendance-management violations, such as one person carrying multiple cards or one person's card being swiped by others. This research therefore introduces a uniqueness detection system and method for in-pit coal-mine personnel, integrated into the in-pit personnel positioning system, and establishes a system mode based on face recognition + recognition of the personnel positioning card + release by automatic detection. Because in-pit personnel wear helmets and their faces are prone to staining during face recognition, the study proposes pre-processing face images with the Mallat algorithm based on the 2D wavelet transform and extracting three face features (miner's lamp, eyes, and mouth) with an algorithm based on the generalized symmetry transform. Tests were carried out on 40 clean face images without helmets and 40 lightly stained face images, and the results were compared with those of a face feature extraction method based on grey-scale transformation and edge detection. The results show that the proposed method detects face features accurately in both cases, and that the face feature detection accuracy is 97.5% in the case of helmets and lightly stained faces.
Funding: Supported by the National Natural Science Foundation of China (10377007).
Abstract: Skewed symmetry detection plays an important role in three-dimensional (3-D) reconstruction. A skewed symmetry depicts a real symmetry viewed from some unknown viewing direction. Skewed symmetry detection can reduce the geometric constraints and the complexity of 3-D reconstruction. The detection technique for the quadric curve ellipse proposed by Sugimoto is improved to further cover quadric curves including the hyperbola and parabola. With the parametric detection, the 3-D quadric curve projection matching is accomplished automatically. Finally, the skewed symmetry surface of the quadric surface solid is obtained. Several examples are used to verify the feasibility of the algorithm, and satisfactory results are obtained.
Funding: National Natural Science Foundation of China (No. 519705449).
Abstract: To automatically detect whether a person is wearing a mask properly, we propose a face mask detection algorithm based on hue-saturation-value (HSV) + histogram of oriented gradient (HOG) features and support vector machines (SVM). First, the human face and five feature points are detected with the RetinaFace face detection algorithm. The feature points are used to locate the mouth and nose region, and HSV+HOG features of this region are extracted and input to an SVM for training to detect whether a mask is worn. Second, RetinaFace is used to locate the nasal tip area of the face, and a YCrCb elliptical skin tone model is used to detect skin exposure in the nasal tip area; the optimal classification threshold for determining whether the mask is worn properly is found from experimental results. Experiments show that the accuracy of detecting whether a mask is worn reaches 97.9%, and the accuracy of detecting whether a mask is worn correctly reaches 87.55%, which verifies the feasibility of the algorithm.
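The HSV half of the HSV+HOG descriptor can be sketched as a normalized per-channel histogram over the pixels of the mouth and nose region. The bin count, the concatenation order, and the function name are assumptions for illustration; the abstract does not specify how the authors build the HSV feature, and the HOG part and the SVM are omitted.

```python
import colorsys


def hsv_histogram(pixels, bins=8):
    """Concatenated H, S, V histograms of a region (illustrative sketch).

    pixels: iterable of (r, g, b) tuples with 8-bit channel values.
    Returns a list of 3 * bins floats; each channel's block sums to 1.
    The bin count of 8 is an assumed value.
    """
    hist = [0.0] * (3 * bins)
    count = 0
    for r, g, b in pixels:
        # colorsys expects and returns values in [0, 1].
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for channel, value in enumerate(hsv):
            # Clamp so that a value of exactly 1.0 lands in the last bin.
            idx = min(int(value * bins), bins - 1)
            hist[channel * bins + idx] += 1.0
        count += 1
    if count:
        hist = [h / count for h in hist]
    return hist
```

A vector like this would typically be concatenated with a HOG descriptor of the same region before being passed to the classifier.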
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2023R349), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. This study is supported via funding from Prince Sattam bin Abdulaziz University Project Number (PSAU/2023/R/1444).
Abstract: Face recognition technology automatically identifies an individual from image or video sources. The detection process can be done by obtaining facial characteristics from the image of a subject's face. Recent developments in deep learning (DL) and computer vision (CV) techniques enable the design of automated face recognition and tracking methods. This study presents a novel Harris Hawks Optimization with deep learning-empowered automated face detection and tracking (HHODL-AFDT) method. The proposed HHODL-AFDT model involves a Faster region-based convolutional neural network (RCNN)-based face detection model and an HHO-based hyperparameter optimization process. The optimized Faster RCNN model precisely recognizes the face, which is then passed to the face-tracking model using a regression network (REGN). Face tracking with the REGN model uses the features from neighboring frames and predicts the location of the target face in succeeding frames. The application of the HHO algorithm for optimal hyperparameter selection constitutes the novelty of the work. The experimental validation of the presented HHODL-AFDT algorithm is conducted using two datasets, and the experimental outcomes highlight the superior performance of the HHODL-AFDT model over current methodologies, with maximum accuracy of 90.60% and 88.08% on the PICS and VTB datasets, respectively.
Funding: Supported by the Joint Research Funds of Dalian University of Technology and Shenyang Automation Institute, Chinese Academy of Sciences.
Abstract: A new kind of region pair grey difference classifier was proposed. The paired regions that form a feature were not necessarily directly connected, but were selected specifically so that the grey transition between regions coincides with the face pattern structure. Fifteen brighter and darker region pairs were chosen to form region pair grey difference features with high discriminant capabilities. Instead of using both false acceptance rate and false rejection rate, mutual information was used as a unified metric for evaluating classification performance. The parameters of position, area, and grey difference bias for each single region pair feature were selected by an optimization process that maximizes the mutual information between the region pair feature and the classification distribution. An additional region-based feature depicting the correlation between global region grey intensity patterns was also proposed. Compared with the result of a Viola-like approach using over 2,000 features, the proposed approach can achieve similar error rates with only 16 features and 1/6 of the implementation time on controlled illumination images.
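A single region pair grey difference feature of the kind described above can be sketched with an integral image, so that each region mean costs four lookups. The region encoding as (top, left, height, width), the use of mean intensity, and the function names are assumptions for illustration; the mutual-information-based parameter selection is not shown.

```python
def integral_image(img):
    """Summed-area table with a zero border; img is a list of rows of numbers."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def region_mean(ii, top, left, height, width):
    """Mean grey level of a rectangle, using four integral-image lookups."""
    bottom, right = top + height, left + width
    total = ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
    return total / (height * width)


def region_pair_feature(ii, bright, dark, bias=0.0):
    """Grey difference between two (possibly non-adjacent) regions, minus a bias.

    bright, dark: (top, left, height, width) tuples (hypothetical encoding).
    A positive value means the expected brighter region is indeed brighter.
    """
    return region_mean(ii, *bright) - region_mean(ii, *dark) - bias
```

Because the two rectangles need not touch, a pair such as forehead versus eye band can be expressed directly, which is the difference from adjacent-rectangle Haar-like features.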
Funding: This work is supported by the National Natural Science Foundation of China (62072251).
Abstract: Due to the power of editing tools, new types of fake faces are being created and synthesized, which has attracted great attention on social media. It is reasonable to acknowledge that humans cannot reliably distinguish manipulated faces from real ones. Therefore, the detection of face manipulation has become a critical issue in digital media forensics. This paper provides an overview of recent deep learning detection models for face manipulation. Several public datasets used for face manipulation detection are introduced. On this basis, the challenges for research and potential future directions are analyzed and discussed.
Funding: Supported by the National Natural Science Foundation of China (62072251).
Abstract: In recent years, with the rapid growth of generative adversarial networks (GANs), a photo-realistic face can be easily generated from a random vector. Moreover, the faces generated by advanced GANs are very realistic. It is reasonable to acknowledge that even a well-trained viewer has difficulty distinguishing artificial faces from real ones. Therefore, detecting faces generated by GANs is necessary work. This paper introduces several methods for detecting GAN-generated fake faces and analyzes the advantages and disadvantages of these models based on their network structures, evaluation indexes, and the results obtained on their respective datasets. On this basis, the challenges faced in this field and future research directions are discussed.