This paper presents a high-speed and robust dual-band infrared thermal camera based on an ARM CPU. The system consists of a low-resolution long-wavelength infrared detector, a digital temperature and humidity sensor, and a CMOS sensor. In view of the significant contrast between face and background in thermal infrared images, this paper explores a suitable accuracy-latency tradeoff for thermal face detection and proposes a tiny, lightweight detector named YOLO-Fastest-IR. Four YOLO-Fastest-IR models (IR0 to IR3) with different scales are designed based on YOLO-Fastest. To train and evaluate these lightweight models, a multi-user low-resolution thermal face database (RGBT-MLTF) was collected, and the four networks were trained. Experiments demonstrate that the lightweight convolutional neural network performs well in thermal infrared face detection tasks. The proposed algorithm outperforms existing face detection methods in both positioning accuracy and speed, making it more suitable for deployment on mobile platforms or embedded devices. After obtaining the region of interest (ROI) in the infrared (IR) image, the RGB camera is guided by the thermal infrared face detection results to achieve fine positioning of the RGB face. Experimental results show that YOLO-Fastest-IR achieves a frame rate of 92.9 FPS on a Raspberry Pi 4B and successfully detects 97.4% of faces in the RGBT-MLTF test set. Ultimately, an infrared temperature measurement system with low cost, strong robustness, and high real-time performance was integrated, achieving a temperature measurement accuracy of 0.3℃.
Detecting faces under occlusion remains a significant challenge in computer vision due to variations caused by masks, sunglasses, and other obstructions. Addressing this issue is crucial for applications such as surveillance, biometric authentication, and human-computer interaction. This paper provides a comprehensive review of face detection techniques developed to handle occluded faces. Studies are categorized into four main approaches: feature-based, machine learning-based, deep learning-based, and hybrid methods. We analyzed state-of-the-art studies within each category, examining their methodologies, strengths, and limitations based on widely used benchmark datasets, highlighting their adaptability to partial and severe occlusions. The review also identifies key challenges, including dataset diversity, model generalization, and computational efficiency. Our findings reveal that deep learning methods dominate recent studies, benefiting from their ability to extract hierarchical features and handle complex occlusion patterns. More recently, researchers have increasingly explored Transformer-based architectures, such as Vision Transformer (ViT) and Swin Transformer, to further improve detection robustness under challenging occlusion scenarios. In addition, hybrid approaches, which aim to combine traditional and modern techniques, are emerging as a promising direction for improving robustness. This review provides valuable insights for researchers aiming to develop more robust face detection systems and for practitioners seeking to deploy reliable solutions in real-world, occlusion-prone environments. Further improvements and the proposal of broader datasets are required to develop more scalable, robust, and efficient models that can handle complex occlusions in real-world scenarios.
Face detection is a critical component in modern security, surveillance, and human-computer interaction systems, with widespread applications in smartphones, biometric access control, and public monitoring. However, detecting faces with high levels of occlusion, such as those covered by masks, veils, or scarves, remains a significant challenge, as traditional models often fail to generalize under such conditions. This paper presents a hybrid approach that combines a traditional handcrafted feature extraction technique, the Histogram of Oriented Gradients (HOG), and Canny edge detection with modern deep learning models. The goal is to improve face detection accuracy under occlusions. The proposed method leverages the structural strengths of HOG and edge-based object proposals while exploiting the feature extraction capabilities of Convolutional Neural Networks (CNNs). The effectiveness of the proposed model is assessed using a custom dataset containing 10,000 heavily occluded face images and a subset of the Common Objects in Context (COCO) dataset for non-face samples. The COCO dataset was selected for its variety and realism in background contexts. Experimental evaluations demonstrate significant performance improvements compared to baseline CNN models. Results indicate that DenseNet121 combined with HOG outperforms its counterparts in classification metrics, with an F1-score of 87.96% and precision of 88.02%. Enhanced performance is achieved through reduced false positives and improved localization accuracy with the integration of object proposals based on Canny and contour detection. While the proposed method increases inference time from 33.52 to 97.80 ms, it achieves a notable improvement in precision, from 80.85% to 88.02%, when comparing the baseline DenseNet121 model to its hybrid counterpart. Limitations of the method include higher computational cost and the need for careful tuning of parameters across the edge detection, handcrafted features, and CNN components. These findings highlight the potential of combining handcrafted and learned features for occluded face detection tasks.
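As a rough illustration of the handcrafted half of such a hybrid, the sketch below computes a single HOG-style cell histogram in plain NumPy. The 8x8 cell and 9-bin unsigned-orientation layout are generic HOG defaults, not the configuration reported in the paper:

```python
import numpy as np

def hog_cell_histogram(cell, bins=9):
    """Unsigned gradient-orientation histogram for one HOG cell.

    Gradients are central differences; each pixel votes its gradient
    magnitude into one of `bins` orientation bins over [0, 180) degrees.
    """
    cell = cell.astype(np.float64)
    gx = np.zeros_like(cell)
    gy = np.zeros_like(cell)
    gx[:, 1:-1] = cell[:, 2:] - cell[:, :-2]   # horizontal gradient
    gy[1:-1, :] = cell[2:, :] - cell[:-2, :]   # vertical gradient
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    hist = np.zeros(bins)
    idx = (ang / (180.0 / bins)).astype(int) % bins
    np.add.at(hist, idx.ravel(), mag.ravel())     # magnitude-weighted vote
    return hist
```

A full HOG descriptor would additionally normalize histograms over overlapping blocks before concatenation.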
Face recognition has emerged as one of the most prominent applications of image analysis and understanding, gaining considerable attention in recent years. This growing interest is driven by two key factors: its extensive applications in law enforcement and the commercial domain, and the rapid advancement of practical technologies. Despite the significant advancements, modern recognition algorithms still struggle in real-world conditions such as varying lighting, occlusion, and diverse facial postures. In such scenarios, human perception remains well above the capabilities of present technology. Using a systematic mapping study, this paper presents an in-depth review of face detection and face recognition algorithms, providing a detailed survey of advancements made between 2015 and 2024. We analyze key methodologies, highlighting their strengths and restrictions in the application context. Additionally, we examine various datasets used for face detection and recognition, focusing on task-specific applications, size, diversity, and complexity. By analyzing these algorithms and datasets, this survey serves as a valuable resource for researchers, identifying the research gaps in the field of face detection and recognition and outlining potential directions for future research.
Although important progress has already been made in face detection, many false faces appear in detection results, and the false detection rate is influenced by factors such as rotation and tilt of the human face, complicated backgrounds, illumination, scale, clothing, and hairstyle. This paper proposes a new method called the DP-Adaboost algorithm to detect multi-angle human faces and improve the correct detection rate. An improved Adaboost algorithm fusing a frontal face classifier and a profile face classifier is used to detect multi-angle faces. An improved horizontal differential projection algorithm is put forward to remove non-face images from the preliminary detection results of the improved Adaboost algorithm. Experimental results show that, compared with the classical Adaboost algorithm with a frontal face classifier, the proposed DP-Adaboost algorithm can reduce the false rate significantly and improve the hit rate in multi-angle face detection.
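The differential-projection idea can be sketched as a per-row edge-energy profile: rows crossing the eyes and mouth of a true face show strong intensity transitions, while flat profiles suggest a false detection. This is a simplified generic form for illustration, not the paper's improved variant or its thresholding rule:

```python
import numpy as np

def horizontal_diff_projection(gray):
    """Per-row horizontal differential projection of a grayscale patch.

    For each row, returns the mean absolute difference between
    horizontally adjacent pixels; candidate windows whose profiles
    lack the expected peaks can be rejected as non-faces.
    """
    d = np.abs(np.diff(gray.astype(np.float64), axis=1))
    return d.mean(axis=1)
```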
For face detection under complex background and illumination, a detection method that combines skin color segmentation and a cost-sensitive Adaboost algorithm is proposed in this paper. First, exploiting the clustering of human skin color in color space, the skin color area in the YCbCr color space is extracted and a large number of irrelevant background regions are excluded. Then, to remedy the deficiencies of the Adaboost algorithm, a cost-sensitive function is introduced into it. Finally, the skin color segmentation and the cost-sensitive Adaboost algorithm are combined for face detection. Experimental results show that the proposed method achieves a higher detection rate and detection speed, and adapts better to real field environments.
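The skin-color segmentation step can be sketched as a fixed-threshold mask in YCbCr space. The Cb/Cr ranges below are common literature values, not the clusters used in this paper, and the conversion follows the BT.601 full-range coefficients:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an HxWx3 uint8 RGB image to YCbCr (ITU-R BT.601, full range)."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =         0.299    * r + 0.587    * g + 0.114    * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128.0 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def skin_mask(rgb, cb_range=(77, 127), cr_range=(133, 173)):
    """Boolean mask of skin-like pixels by chrominance thresholding.

    The default Cb/Cr ranges are widely cited heuristics, assumed here
    for illustration rather than taken from the paper.
    """
    ycbcr = rgb_to_ycbcr(rgb)
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    return ((cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))
```

Connected components of the mask then serve as face candidates for the boosted classifier, sparing it most of the background.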
Face recognition technology automatically identifies an individual from image or video sources. The detection process can be carried out by extracting facial characteristics from an image of the subject's face. Recent developments in deep learning (DL) and computer vision (CV) techniques enable the design of automated face recognition and tracking methods. This study presents a novel Harris Hawks Optimization with deep learning-empowered automated face detection and tracking (HHODL-AFDT) method. The proposed HHODL-AFDT model involves a Faster region-based convolutional neural network (RCNN) face detection model and an HHO-based hyperparameter optimization process. The presented optimal Faster RCNN model precisely recognizes the face and is passed into the face-tracking model using a regression network (REGN). The face tracking using the REGN model uses features from neighboring frames and foresees the location of the target face in succeeding frames. The application of the HHO algorithm for optimal hyperparameter selection shows the novelty of the work. The experimental validation of the presented HHODL-AFDT algorithm was conducted using two datasets, and the experimental outcomes highlighted the superior performance of the HHODL-AFDT model over current methodologies, with maximum accuracies of 90.60% and 88.08% on the PICS and VTB datasets, respectively.
A new kind of region pair grey difference classifier was proposed. The regions paired to form a feature were not necessarily directly connected, but were selected deliberately so that the grey transition between the regions coincides with the face pattern structure. Fifteen brighter and darker region pairs were chosen to form region pair grey difference features with high discriminant capability. Instead of using both the false acceptance rate and the false rejection rate, mutual information was used as a unified metric for evaluating classification performance. The parameters of position, area, and grey difference bias for each single region pair feature were selected by an optimization process aiming at maximizing the mutual information between the region pair feature and the classification distribution. An additional region-based feature depicting the correlation between global region grey intensity patterns was also proposed. Compared with the result of a Viola-like approach using over 2,000 features, the proposed approach achieves similar error rates with only 16 features and 1/6 of the implementation time on controlled-illumination images.
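Mutual information as a unified classifier metric can be computed directly from a 2x2 contingency table of binary feature output versus class label; this is a generic sketch of the metric itself, not the paper's feature-optimization loop:

```python
import math

def mutual_information(table):
    """Mutual information I(F;C) in bits from a 2x2 contingency table.

    table[f][c] holds counts: f is the binary feature output,
    c the true class (e.g. face / non-face). I(F;C) = 0 when the
    feature is useless, up to 1 bit when it determines the class.
    """
    total = sum(sum(row) for row in table)
    pf = [sum(row) / total for row in table]                      # P(F=f)
    pc = [sum(table[f][c] for f in range(2)) / total
          for c in range(2)]                                      # P(C=c)
    mi = 0.0
    for f in range(2):
        for c in range(2):
            p = table[f][c] / total                               # P(F=f, C=c)
            if p > 0:
                mi += p * math.log2(p / (pf[f] * pc[c]))
    return mi
```

Unlike a (FAR, FRR) pair, this yields a single number to maximize when selecting region positions, areas, and bias.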
Biometric applications widely use the face as a component for recognition and automatic detection. Face rotation is a variable component and makes face detection a complex and challenging task with varied angles and rotation. This problem has been investigated, and a novel algorithm, RIFDS (Rotation Invariant Face Detection System), has been devised. The objective of the paper is to implement a robust method for detecting faces captured at various angles, and to achieve better results than known face detection algorithms. In RIFDS, the Polar Harmonic Transforms (PHT) technique is combined with Multi-Block Local Binary Pattern (MB-LBP) in a hybrid manner. The MB-LBP is used to extract texture patterns from the digital image, and the PHT is used to manage invariant rotation characteristics. In this manner, RIFDS can detect human faces at different rotations and with different facial expressions. The RIFDS performance is validated on different face databases, including LFW, ORL, CMU, MIT-CBCL, the JAFFE face database, and Lena images. The results show that the RIFDS algorithm can detect faces at varying angles and different image resolutions with an accuracy of 99.9%. The RIFDS algorithm outperforms previous methods such as Viola-Jones, Multi-Block Local Binary Pattern (MB-LBP), and Polar Harmonic Transforms (PHTs). The RIFDS approach has further scope with a genetic algorithm to detect faces (by approximation) even from shadows.
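The MB-LBP descriptor can be sketched compactly: average the nine blocks of a 3x3 grid, then threshold the eight neighbour means against the centre mean to form an 8-bit code. The block size and neighbour ordering below are illustrative assumptions, not RIFDS's exact parameters:

```python
import numpy as np

def mblbp_code(patch, block=3):
    """Multi-Block LBP code for a (3*block)x(3*block) grayscale patch.

    Each of the 9 blocks is reduced to its mean intensity; the 8
    neighbour means are compared with the centre mean, reading
    neighbours clockwise from the top-left block, to give one byte.
    """
    s = block
    means = [[patch[i*s:(i+1)*s, j*s:(j+1)*s].mean() for j in range(3)]
             for i in range(3)]
    center = means[1][1]
    # clockwise neighbour order starting at the top-left block
    order = [(0, 0), (0, 1), (0, 2), (1, 2),
             (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (i, j) in enumerate(order):
        if means[i][j] >= center:
            code |= 1 << bit
    return code
```

Averaging over blocks rather than single pixels is what makes MB-LBP robust to noise and usable at multiple scales.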
Security access control systems and automatic video surveillance systems have become increasingly important recently, and detecting human faces is one of the indispensable processes. In this paper, an approach is presented to detect faces in video surveillance. First, both skin-color and motion components are applied to extract skin-like regions. The skin-color segmentation algorithm is based on a BPNN (back-propagation neural network), and the motion component is obtained with a frame difference algorithm. Second, the image is clustered into separate face candidates using a region growing technique. Finally, the face candidates are further verified by a rule-based algorithm. Experimental results demonstrate that both the accuracy and the processing speed are very promising, and the approach can be applied in practice.
Intelligent environments need Human-Computer Interaction (HCI) technology, and in such environments a projector projects a screen onto a wall. We propose frontal-face detection from four captured images for an intelligent room for the deaf. Our purpose is to let a deaf user face a wall display anywhere in the room. The system acquires images from the four cameras, detects the user region from a silhouette image, detects and crops the moving body region from a difference image, and crops the vertex-chest region from the cropped body image. The system then attempts to find the frontal face using Haar-like features and selects the detected frontal-face image from the vertex-chest region. We evaluate the recognition rate of frontal-face detection, which is reasonably successful.
The spread of social media has increased contacts among members of communities on the Internet. Members of these communities often use account names instead of real names. When they meet in the real world, they will find it useful to have a tool that enables them to associate the faces in front of them with the account names they know. This paper proposes a method that enables a person to identify the account name of the person ("target") in front of him/her using a smartphone. The attendees of a meeting exchange their identifiers (i.e., account names) and GPS information using smartphones. When the user points his/her smartphone towards a target, the target's identifier is displayed near the target's head on the camera screen using AR (augmented reality). The position where the identifier is displayed is calculated from the differences in longitude and latitude between the user and the target and the azimuth direction of the target from the user. The target is identified based on this information, the face detection coordinates, and the distance between the two. The proposed method has been implemented on Android terminals, and its identification accuracy has been examined through experiments.
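The geometric core of such an AR overlay, the azimuth of the target from the user derived from latitude/longitude differences, can be sketched with the standard initial-bearing formula; the paper's exact screen-placement calculation is not reproduced here:

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from true north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360.0
```

Comparing this bearing against the phone's compass heading gives the horizontal offset at which the identifier should be drawn on the camera view.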
An automatic sweeping robot under development needs to estimate whether anyone is within a certain range of the road ahead and automatically adjust its running speed accordingly, in order to ensure work efficiency and operating safety. This paper proposes a method that applies face detection to the image sensor data for this purpose. The experimental results show that the proposed algorithm is practical and reliable, and good outcomes have been achieved in the instruction-robot application.
Face detection is applied to many tasks such as auto-focus control, surveillance, user interfaces, and face recognition. The processing speed and detection accuracy of face detection have been improved continuously. This paper describes a novel method for fast face detection with multi-scale window search that is free from image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small differences in pixel value. These characteristics enable multi-scale window search without image resizing. Experimental results show that the processing speed of our method is 3.66 times faster than a conventional method adopting HOG features combined with an SVM classifier, without accuracy degradation.
Background: Several face detection and recognition methods with excellent performance have been proposed in the past decades. The conventional face recognition pipeline comprises (1) face detection, (2) face alignment, (3) feature extraction, and (4) similarity, which are independent of each other. The separate facial analysis stages lead to redundant model calculations and are difficult to use in end-to-end training. Methods: In this paper, we propose a novel end-to-end trainable convolutional network framework for face detection and recognition, in which a geometric transformation matrix is directly learned to align the faces rather than predicting the facial landmarks. In the training stage, our single CNN model is supervised only by face bounding boxes and personal identities, which are publicly available in the WIDER FACE and CASIA-WebFace datasets. Our model is tested on the Face Detection Dataset and Benchmark (FDDB) and Labeled Faces in the Wild (LFW) datasets. Results: The results show 89.24% recall for face detection and 98.63% accuracy for face recognition.
Face liveness detection is essential for securing biometric authentication systems against spoofing attacks, including printed photos, replay videos, and 3D masks. This study systematically evaluates pre-trained CNN models (DenseNet201, VGG16, InceptionV3, ResNet50, VGG19, MobileNetV2, Xception, and InceptionResNetV2), leveraging transfer learning and fine-tuning to enhance liveness detection performance. The models were trained and tested on the NUAA and Replay-Attack datasets, with cross-dataset generalization validated on SiW-MV2 to assess real-world adaptability. Performance was evaluated using accuracy, precision, recall, FAR, FRR, HTER, and specialized spoof detection metrics (APCER, NPCER, ACER). Fine-tuning significantly improved detection accuracy, with DenseNet201 achieving the highest performance (98.5% on NUAA, 97.71% on Replay-Attack), while MobileNetV2 proved the most efficient model for real-time applications (latency: 15 ms, memory usage: 45 MB, energy consumption: 30 mJ). A statistical significance analysis (paired t-tests, confidence intervals) validated these improvements. Cross-dataset experiments identified DenseNet201 and MobileNetV2 as the most generalizable architectures, with DenseNet201 achieving 86.4% accuracy on Replay-Attack when trained on NUAA, demonstrating robust feature extraction and adaptability. In contrast, ResNet50 showed lower generalization capability, struggling with dataset variability and complex spoofing attacks. These findings suggest that MobileNetV2 is well suited for low-power applications, while DenseNet201 is ideal for high-security environments requiring superior accuracy. This research provides a framework for improving real-time face liveness detection, enhancing biometric security, and guiding future advancements in AI-driven anti-spoofing techniques.
As the use of deepfake facial videos proliferates, the associated threats to social security and integrity cannot be overstated. Effective methods for detecting forged facial videos are thus urgently needed. While many deep learning-based facial forgery detection approaches show promise, they often fail to delve deeply into the complex relationships between image features and forgery indicators, limiting their effectiveness to specific forgery techniques. To address this challenge, we propose a dual-branch collaborative deepfake detection network. The network processes video frame images as input, where a specialized noise extraction module initially extracts the noise feature maps. Subsequently, the original facial images and corresponding noise maps are directed into two parallel feature extraction branches to concurrently learn texture and noise forgery clues. An attention mechanism is employed between the two branches to facilitate mutual guidance and enhancement of texture and noise features across four different scales. This dual-modal feature integration enhances sensitivity to forgery artifacts and boosts generalization across various forgery techniques. Features from both branches are then effectively combined and processed through a multi-layer perceptron to distinguish between real and forged videos. Experimental results on benchmark deepfake detection datasets demonstrate that our approach outperforms existing state-of-the-art methods in terms of detection performance, accuracy, and generalization ability.
Face Presentation Attack Detection (fPAD) plays a vital role in securing face recognition systems against various presentation attacks. While supervised learning-based methods demonstrate effectiveness, they are prone to overfitting to known attack types and struggle to generalize to novel attack scenarios. Recent studies have explored formulating fPAD as an anomaly detection problem or a one-class classification task, enabling the training of generalized models for unknown attack detection. However, conventional anomaly detection approaches encounter difficulties in precisely delineating the boundary between bonafide samples and unknown attacks. To address this challenge, we propose a novel framework focusing on unknown attack detection using exclusively bonafide facial data during training. The core innovation lies in our pseudo-negative sample synthesis (PNSS) strategy, which facilitates learning of compact decision boundaries between bonafide faces and potential attack variations. Specifically, PNSS generates synthetic negative samples within low-likelihood regions of the bonafide feature space to represent diverse unknown attack patterns. To overcome the inherent imbalance between positive and synthetic negative samples during iterative training, we implement a dual-loss mechanism combining focal loss for classification optimization with pairwise confusion loss as a regularizer. This architecture effectively mitigates model bias towards bonafide samples while maintaining discriminative power. Comprehensive evaluations across three benchmark datasets validate the framework's superior performance. Notably, our PNSS achieves an 8%–18% average classification error rate (ACER) reduction compared with state-of-the-art one-class fPAD methods in cross-dataset evaluations on the Idiap Replay-Attack and MSU-MFSD datasets.
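The focal loss in such a dual-loss mechanism is standard and easy to sketch for the binary case; the gamma and alpha defaults below are the common values from the focal loss literature, not necessarily those used in this paper:

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    p is the predicted probability of the positive class, y in {0, 1}
    the label. The (1 - pt)^gamma factor down-weights easy, confidently
    correct examples, so training focuses on the hard (here: scarce
    synthetic-negative) samples; alpha rebalances the two classes.
    """
    pt = p if y == 1 else 1.0 - p          # probability of the true class
    a = alpha if y == 1 else 1.0 - alpha   # class-balance weight
    return -a * (1.0 - pt) ** gamma * math.log(pt)
```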
Locating multi-view faces in images with a complex background remains a challenging problem. In this paper, an integrated method for real-time multi-view face detection and pose estimation is presented. A simple-to-complex and coarse-to-fine view-based detector architecture has been designed to detect multi-view faces and estimate their poses efficiently. Both the pose estimators and the view-based face/non-face detectors are trained by a cost-sensitive AdaBoost algorithm to improve the generalization ability. Experimental results show that the proposed multi-view face detector, which can be constructed easily, gives more robust face detection and pose estimation and has a faster real-time detection speed compared with other conventional methods.
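A cost-sensitive AdaBoost variant differs from the classical algorithm mainly in its reweighting step, which penalizes misclassified positives and negatives asymmetrically (e.g. making a missed face costlier than a false alarm). The sketch below shows one illustrative form of that step; the exact cost formulation in this paper may differ:

```python
import numpy as np

def cost_sensitive_update(w, y, pred, alpha, cost_fn=2.0, cost_fp=1.0):
    """One AdaBoost-style reweighting step with asymmetric costs.

    w     : current sample weights (sum to 1)
    y     : true labels in {+1, -1};  pred : weak-learner outputs
    alpha : weight of the current weak learner
    Misclassified positives are up-weighted by cost_fn, misclassified
    negatives by cost_fp, biasing later rounds against false negatives.
    """
    miss = (y != pred)                       # which samples were wrong
    cost = np.where(y == 1, cost_fn, cost_fp)
    w = w * np.exp(alpha * miss * cost)      # boost only the mistakes
    return w / w.sum()                       # renormalize
```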
In this paper, we present a strategy for multi-pose face detection in the compressed domain. The strategy first extracts feature vectors from the DCT domain and then uses a boosting algorithm to build classifiers that distinguish faces from non-faces. Moreover, to obtain more accurate detection results, we present a kernel function and a linear combination to incrementally build the strong classifiers from the weak classifiers. By comparing and analyzing the results of experiments on synthetic and natural data, we find that the strong classifiers give more satisfactory results than the weak classifiers.
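Working in the compressed domain means starting from the 2-D DCT-II coefficients that block-based codecs such as JPEG already compute per 8x8 tile, with the low-frequency (top-left) coefficients typically forming the feature vector. A naive reference implementation, for illustration only (real decoders use fast factorized transforms):

```python
import numpy as np

def dct2(block):
    """Naive orthonormal 2-D DCT-II of a square block.

    Builds the DCT matrix C with C[u, m] = s_u * cos(pi*(2m+1)*u / 2n)
    and returns C @ block @ C.T; block[0, 0] of the output is the DC
    term, and the top-left corner holds the low-frequency features.
    """
    n = block.shape[0]
    k = np.arange(n)
    basis = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)              # orthonormal DC scaling
    c = scale[:, None] * basis
    return c @ block @ c.T
```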
Funding: Supported by the Fundamental Research Funds for the Central Universities (2024300443) and the Natural Science Foundation of Jiangsu Province (BK20241224).
Funding: Funded by A'Sharqiyah University, Sultanate of Oman, under Research Project grant number BFP/RGP/ICT/22/490.
Funding: Funded by A’Sharqiyah University, Sultanate of Oman, under Research Project grant number BFP/RGP/ICT/22/490.
Abstract: Face detection is a critical component in modern security, surveillance, and human-computer interaction systems, with widespread applications in smartphones, biometric access control, and public monitoring. However, detecting faces with high levels of occlusion, such as those covered by masks, veils, or scarves, remains a significant challenge, as traditional models often fail to generalize under such conditions. This paper presents a hybrid approach that combines traditional handcrafted feature extraction, namely the Histogram of Oriented Gradients (HOG) and Canny edge detection, with modern deep learning models. The goal is to improve face detection accuracy under occlusions. The proposed method leverages the structural strengths of HOG and edge-based object proposals while exploiting the feature extraction capabilities of Convolutional Neural Networks (CNNs). The effectiveness of the proposed model is assessed using a custom dataset containing 10,000 heavily occluded face images and a subset of the Common Objects in Context (COCO) dataset for non-face samples; COCO was selected for its variety and realism in background contexts. Experimental evaluations demonstrate significant performance improvements compared to baseline CNN models. Results indicate that DenseNet121 combined with HOG outperforms its counterparts in classification metrics, with an F1-score of 87.96% and precision of 88.02%. Enhanced performance is achieved through reduced false positives and improved localization accuracy with the integration of object proposals based on Canny and contour detection. While the proposed method increases inference time from 33.52 to 97.80 ms, it achieves a notable improvement in precision from 80.85% to 88.02% when comparing the baseline DenseNet121 model to its hybrid counterpart. Limitations of the method include higher computational cost and the need for careful tuning of parameters across the edge detection, handcrafted feature, and CNN components. These findings highlight the potential of combining handcrafted and learned features for occluded face detection tasks.
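The HOG component of the hybrid pipeline above can be illustrated with a minimal single-cell orientation histogram. This is a textbook-style sketch under assumptions (9 unsigned-gradient bins, central differences, L2 normalization), not the paper's exact feature extractor.

```python
import numpy as np

def hog_cell_histogram(patch, n_bins=9):
    """Orientation histogram for one HOG cell: unsigned gradients
    binned over 0-180 degrees, weighted by gradient magnitude."""
    gx = np.zeros_like(patch, dtype=float)
    gy = np.zeros_like(patch, dtype=float)
    # Central differences; border pixels keep zero gradient.
    gx[:, 1:-1] = patch[:, 2:] - patch[:, :-2]
    gy[1:-1, :] = patch[2:, :] - patch[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_w = 180.0 / n_bins
    idx = np.minimum((ang // bin_w).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    for b in range(n_bins):
        hist[b] = mag[idx == b].sum()
    return hist / (np.linalg.norm(hist) + 1e-6)
```

In a full HOG descriptor, cell histograms are grouped into overlapping blocks and concatenated before being fed to the classifier.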
Abstract: Face recognition has emerged as one of the most prominent applications of image analysis and understanding, gaining considerable attention in recent years. This growing interest is driven by two key factors: its extensive applications in law enforcement and the commercial domain, and the rapid advancement of practical technologies. Despite significant advancements, modern recognition algorithms still struggle in real-world conditions such as varying lighting, occlusion, and diverse facial poses; in such scenarios, human perception remains well above the capabilities of present technology. Using a systematic mapping study, this paper presents an in-depth review of face detection and face recognition algorithms, surveying advancements made between 2015 and 2024. We analyze key methodologies, highlighting their strengths and limitations in the application context. Additionally, we examine various face detection and recognition datasets, focusing on task-specific applications, size, diversity, and complexity. By analyzing these algorithms and datasets, this survey serves as a valuable resource for researchers, identifying research gaps in the field of face detection and recognition and outlining potential directions for future research.
Abstract: Although important progress has already been made in face detection, many false faces can be found in detection results, and the false detection rate is influenced by factors such as rotation and tilt of the human face, complicated backgrounds, illumination, scale, clothing, and hairstyle. This paper proposes a new method called the DP-Adaboost algorithm to detect multi-angle human faces and improve the correct detection rate. An improved Adaboost algorithm fusing a frontal face classifier and a profile face classifier is used to detect multi-angle faces. An improved horizontal differential projection algorithm is put forward to remove non-face images from the preliminary detection results of the improved Adaboost algorithm. Experiment results show that, compared with the classical Adaboost algorithm with a frontal face classifier, the proposed DP-Adaboost algorithm can significantly reduce the false rate and improve the hit rate in multi-angle face detection.
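The horizontal differential projection idea can be sketched in its basic form: rows containing facial structure (eyes, mouth) produce stronger horizontal grey-level transitions than flat background, so a candidate window whose projection profile lacks such peaks can be rejected. The paper's improved variant is not specified here; the function below shows only the underlying projection, with the name chosen for illustration.

```python
import numpy as np

def horizontal_diff_projection(gray):
    """Row-wise sum of absolute horizontal grey-level differences.

    Returns one value per row; rows crossing textured facial
    features score higher than uniform background rows."""
    d = np.abs(np.diff(gray.astype(int), axis=1))
    return d.sum(axis=1)
```

A verifier would compare this profile (peak positions, peak-to-background ratio) against what a typical face window produces.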
Funding: Supported by the National Basic Research Program of China (973 Program) under Grant No. 2012CB215202 and the National Natural Science Foundation of China under Grant No. 51205046.
Abstract: For face detection under complex backgrounds and illumination, a detection method that combines skin color segmentation and a cost-sensitive Adaboost algorithm is proposed in this paper. First, by using the clustering characteristic of human skin color in the color space, the skin color area in the YCbCr color space is extracted and a large amount of irrelevant background is excluded. Then, to remedy the deficiencies of the Adaboost algorithm, a cost-sensitive function is introduced into it. Finally, the skin color segmentation and cost-sensitive Adaboost algorithm are combined for face detection. Experimental results show that the proposed detection method has a higher detection rate and detection speed, making it better suited to actual field environments.
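The YCbCr skin-color segmentation step can be sketched as a fixed-threshold test on the chrominance channels. The BT.601 RGB-to-YCbCr conversion coefficients are standard, but the Cb/Cr bounds below are a commonly cited range, assumed here rather than taken from the paper, and would need tuning per camera and lighting.

```python
import numpy as np

def skin_mask(rgb):
    """Binary skin mask from fixed Cb/Cr thresholds.

    rgb -- H x W x 3 uint8 image.
    The bounds 77<=Cb<=127, 133<=Cr<=173 are a frequently used
    heuristic range, not a universal constant."""
    rgb = rgb.astype(float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```

Clustering in CbCr works because chrominance of skin varies far less than luminance, which is why the Y channel is ignored here.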
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2023R349), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. This study is also supported via funding from Prince Sattam bin Abdulaziz University, Project Number (PSAU/2023/R/1444).
Abstract: Face recognition technology automatically identifies an individual from image or video sources. The detection process can be done by attaining facial characteristics from the image of a subject's face. Recent developments in deep learning (DL) and computer vision (CV) techniques enable the design of automated face recognition and tracking methods. This study presents a novel Harris Hawks Optimization with deep learning-empowered automated face detection and tracking (HHODL-AFDT) method. The proposed HHODL-AFDT model involves a Faster region-based convolutional neural network (RCNN) face detection model and an HHO-based hyperparameter optimization process. The presented optimal Faster RCNN model precisely recognizes the face and is passed into the face-tracking model using a regression network (REGN). Face tracking with the REGN model uses features from neighboring frames and forecasts the location of the target face in succeeding frames. The application of the HHO algorithm for optimal hyperparameter selection shows the novelty of the work. The experimental validation of the presented HHODL-AFDT algorithm is conducted using two datasets, and the experimental outcomes highlight the superior performance of the HHODL-AFDT model over current methodologies, with maximum accuracies of 90.60% and 88.08% on the PICS and VTB datasets, respectively.
Funding: Supported by the Joint Research Funds of Dalian University of Technology and the Shenyang Institute of Automation, Chinese Academy of Sciences.
Abstract: A new kind of region pair grey difference classifier was proposed. The regions in each pair associated to form a feature were not necessarily directly connected, but were deliberately selected so that the grey transition between the regions coincides with the face pattern structure. Fifteen brighter and darker region pairs were chosen to form region pair grey difference features with high discriminant capability. Instead of using both the false acceptance rate and the false rejection rate, mutual information was used as a unified metric for evaluating classification performance. The parameters of each single region pair feature, specifying positions, areas, and grey difference bias, were selected by an optimization process aiming at maximizing the mutual information between the region pair feature and the classification distribution. An additional region-based feature depicting the correlation between global region grey intensity patterns was also proposed. Compared with the result of a Viola-like approach using over 2,000 features, the proposed approach can achieve similar error rates with only 16 features and 1/6 of the implementation time on controlled-illumination images.
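The mutual-information metric used to evaluate each region pair feature can be computed directly from the joint distribution of the binary feature response and the face/non-face label. A minimal sketch, assuming a 2x2 joint probability table (the paper's exact estimation procedure is not given here):

```python
import math

def mutual_information(joint):
    """Mutual information in bits between a binary feature and a
    binary class label, given the 2x2 joint probability table
    joint[feature][label] (entries sum to 1)."""
    px = [joint[0][0] + joint[0][1], joint[1][0] + joint[1][1]]
    py = [joint[0][0] + joint[1][0], joint[0][1] + joint[1][1]]
    mi = 0.0
    for i in range(2):
        for j in range(2):
            p = joint[i][j]
            if p > 0:
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi
```

An independent feature scores 0 bits; a feature perfectly aligned with the label scores 1 bit, which is why MI serves as a single unified metric in place of separate FAR/FRR figures.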
Funding: The authors would like to thank the Deanship of Scientific Research at Majmaah University for supporting this work under Project Number No-R-2021-154.
Abstract: Biometric applications widely use the face as a component for recognition and automatic detection. Face rotation is a variable component and makes face detection a complex and challenging task with varied angles and rotation. This problem has been investigated, and a novel algorithm, namely RIFDS (Rotation Invariant Face Detection System), has been devised. The objective of the paper is to implement a robust method for detecting faces captured at various angles, and further to achieve better results than known face detection algorithms. In RIFDS, the Polar Harmonic Transforms (PHT) technique is combined with Multi-Block Local Binary Patterns (MBLBP) in a hybrid manner. MBLBP is used to extract texture patterns from the digital image, and PHT is used to manage invariant rotation characteristics. In this manner, RIFDS can detect human faces at different rotations and with different facial expressions. The RIFDS performance is validated on different face databases, including LFW, ORL, CMU, MIT-CBCL, and JAFFE, as well as Lena images. The results show that the RIFDS algorithm can detect faces at varying angles and different image resolutions, with an accuracy of 99.9%. The RIFDS algorithm outperforms previous methods like Viola-Jones, Multi-Block Local Binary Patterns (MBLBP), and Polar Harmonic Transforms (PHTs). The RIFDS approach has further scope with a genetic algorithm to detect (approximate) faces even from shadows.
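The MBLBP operator used in RIFDS compares block means rather than single pixels, which makes the texture code robust to noise. A minimal sketch of one MBLBP code, assuming square 3x3 neighboring blocks and a fixed clockwise bit ordering (the exact ordering in the paper is not specified here):

```python
import numpy as np

def mblbp_code(gray, x, y, block):
    """Multi-Block LBP code at (x, y): compare the mean grey level of
    each of the 8 neighboring block-sized regions against the mean of
    the central block; each comparison contributes one bit."""
    def block_mean(bx, by):
        return gray[by:by + block, bx:bx + block].mean()
    center = block_mean(x, y)
    neighbors = [(-1, -1), (0, -1), (1, -1), (1, 0),
                 (1, 1), (0, 1), (-1, 1), (-1, 0)]
    code = 0
    for bit, (dx, dy) in enumerate(neighbors):
        if block_mean(x + dx * block, y + dy * block) >= center:
            code |= 1 << bit
    return code
```

With block = 1 this degenerates to the classic pixel-level LBP; larger blocks capture coarser structure.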
Funding: This work is supported by the National Natural Science Foundation of China.
Abstract: Security access control systems and automatic video surveillance systems have become increasingly important recently, and detecting human faces is one of the indispensable processes. In this paper, an approach is presented to detect faces in video surveillance. Firstly, both skin-color and motion components are applied to extract skin-like regions. The skin-color segmentation algorithm is based on a BPNN (back-error-propagation neural network), and the motion component is obtained with a frame difference algorithm. Secondly, the image is clustered into separate face candidates by using a region growing technique. Finally, the face candidates are further verified by a rule-based algorithm. Experimental results demonstrate that both the accuracy and processing speed are promising, and the approach can be applied in practical use.
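The motion component obtained with the frame difference algorithm can be sketched as a thresholded absolute difference between consecutive grey frames. The threshold value is an assumption, and in the paper's pipeline this mask would be combined (e.g., by logical AND) with the skin-color mask before region growing.

```python
import numpy as np

def motion_mask(prev_gray, cur_gray, thresh=25):
    """Frame-difference motion component: True where the absolute
    grey-level change between frames exceeds `thresh` (value is an
    illustrative assumption; tune for the camera and scene)."""
    diff = np.abs(cur_gray.astype(int) - prev_gray.astype(int))
    return diff > thresh
```

Restricting the skin-like regions to moving pixels suppresses static skin-colored background such as wood or walls.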
Funding: Supported by the Ministry of Knowledge Economy, Korea, under the ITRC (Information Technology Research Center) support program (NIA-2009-(C1090-0902-0007)) and the Contents Technology Research Center support program.
Abstract: Intelligent environments need human-computer interaction (HCI) technology, and in such environments a projector can project a screen onto any wall. We propose front-face detection from four captured images for an intelligent room for the deaf; the aim is to let a deaf user face a wall display anywhere in the room. The system acquires images from four cameras, detects the user region in a silhouette image using a difference method, detects and crops a moving body region from a difference image, and then crops the vertex-chest region from the cropped body region. The system finds the front face using Haar-like features and selects the detected front-face image from the vertex-chest region. We evaluate the recognition rate of the front-face detection, which shows it to be reasonably successful.
Abstract: The spread of social media has increased contacts between members of communities on the Internet. Members of these communities often use account names instead of real names. When they meet in the real world, they will find it useful to have a tool that enables them to associate the faces in front of them with the account names they know. This paper proposes a method that enables a person to identify the account name of the person ("target") in front of him/her using a smartphone. The attendees at a meeting exchange their identifiers (i.e., account names) and GPS information using smartphones. When the user points his/her smartphone towards a target, the target's identifier is displayed near the target's head on the camera screen using AR (augmented reality). The position where the identifier is displayed is calculated from the differences in longitude and latitude between the user and the target and the azimuth direction of the target from the user. The target is identified based on this information, the face detection coordinates, and the distance between the two. The proposed method has been implemented on Android terminals, and identification accuracy has been examined through experiments.
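The azimuth computation from longitude/latitude differences can be sketched with a small-distance equirectangular approximation, which is adequate at meeting-room scale. The function name and the approximation choice are assumptions; the paper's exact formula is not given here.

```python
import math

def target_bearing(user_lat, user_lon, tgt_lat, tgt_lon):
    """Approximate azimuth of the target from the user, in degrees
    clockwise from north, using an equirectangular (flat-earth)
    approximation valid for short distances."""
    dlat = math.radians(tgt_lat - user_lat)
    # Scale the longitude difference by cos(latitude) so east-west
    # distances are comparable to north-south ones.
    dlon = math.radians(tgt_lon - user_lon) * math.cos(math.radians(user_lat))
    return math.degrees(math.atan2(dlon, dlat)) % 360.0
```

Comparing this bearing with the phone's compass heading gives the horizontal screen offset at which to draw the AR label, which is then snapped to the nearest face detection box.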
Abstract: An automatic sweeping robot under development needs to estimate whether anyone is within a certain range of the road ahead and then automatically adjust its running speed, in order to ensure work efficiency and operational safety. This paper proposes a method that applies face detection to the image sensor data for this purpose. The experimental results show that the proposed algorithm is practical and reliable, and good outcomes have been achieved in the robot application.
Abstract: Face detection is applied to many tasks such as auto-focus control, surveillance, user interfaces, and face recognition. The processing speed and detection accuracy of face detection have been improved continuously. This paper describes a novel method for fast face detection with a multi-scale window search that is free from image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small differences in pixel value. These characteristics enable the multi-scale window search without image resizing. Experimental results show that the processing speed of our method is 3.66 times faster than that of a conventional method adopting HOG features combined with an SVM classifier, without accuracy degradation.
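The multi-scale window search without image resizing can be sketched as a generator that grows the detection window instead of shrinking the image, which is what a scale-invariant feature such as SGI permits. The base size, scale factor, and stride fraction below are illustrative assumptions, not the paper's settings.

```python
def multiscale_windows(img_w, img_h, base=24, scale=1.25, stride_frac=0.25):
    """Yield (x, y, size) square windows at geometrically growing
    scales, scanning the original image rather than an image pyramid."""
    size = base
    while size <= min(img_w, img_h):
        stride = max(1, int(size * stride_frac))
        for y in range(0, img_h - size + 1, stride):
            for x in range(0, img_w - size + 1, stride):
                yield x, y, size
        size = int(size * scale)
```

Skipping the pyramid removes the repeated resampling cost, which is one source of the reported speedup.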
Abstract: Background: Several face detection and recognition methods proposed in the past decades have shown excellent performance. The conventional face recognition pipeline comprises (1) face detection, (2) face alignment, (3) feature extraction, and (4) similarity computation, which are independent of each other. The separate facial analysis stages lead to redundant model calculations and are difficult to use in end-to-end training. Methods: In this paper, we propose a novel end-to-end trainable convolutional network framework for face detection and recognition, in which a geometric transformation matrix is directly learned to align the faces rather than predicting the facial landmarks. In the training stage, our single CNN model is supervised only by face bounding boxes and personal identities, which are publicly available from the WIDER FACE and CASIA-WebFace datasets. Our model is tested on the Face Detection Dataset and Benchmark (FDDB) and Labeled Faces in the Wild (LFW) datasets. Results: The results show 89.24% recall for face detection tasks and 98.63% accuracy for face recognition tasks.
Funding: Funded by the Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and IT, University of Technology Sydney, and by the Ongoing Research Funding Program (ORF-2025-14), King Saud University, Riyadh, Saudi Arabia.
Abstract: Face liveness detection is essential for securing biometric authentication systems against spoofing attacks, including printed photos, replay videos, and 3D masks. This study systematically evaluates pre-trained CNN models (DenseNet201, VGG16, InceptionV3, ResNet50, VGG19, MobileNetV2, Xception, and InceptionResNetV2), leveraging transfer learning and fine-tuning to enhance liveness detection performance. The models were trained and tested on the NUAA and Replay-Attack datasets, with cross-dataset generalization validated on SiW-MV2 to assess real-world adaptability. Performance was evaluated using accuracy, precision, recall, FAR, FRR, HTER, and specialized spoof detection metrics (APCER, NPCER, ACER). Fine-tuning significantly improved detection accuracy, with DenseNet201 achieving the highest performance (98.5% on NUAA, 97.71% on Replay-Attack), while MobileNetV2 proved the most efficient model for real-time applications (latency: 15 ms, memory usage: 45 MB, energy consumption: 30 mJ). A statistical significance analysis (paired t-tests, confidence intervals) validated these improvements. Cross-dataset experiments identified DenseNet201 and MobileNetV2 as the most generalizable architectures, with DenseNet201 achieving 86.4% accuracy on Replay-Attack when trained on NUAA, demonstrating robust feature extraction and adaptability. In contrast, ResNet50 showed lower generalization capability, struggling with dataset variability and complex spoofing attacks. These findings suggest that MobileNetV2 is well-suited for low-power applications, while DenseNet201 is ideal for high-security environments requiring superior accuracy. This research provides a framework for improving real-time face liveness detection, enhancing biometric security, and guiding future advancements in AI-driven anti-spoofing techniques.
Funding: Funded by the Ministry of Public Security Science and Technology Program Project (No. 2023LL35) and the Key Laboratory of Smart Policing and National Security Risk Governance, Sichuan Province (No. ZHZZZD2302).
Abstract: As the use of deepfake facial videos proliferates, the associated threats to social security and integrity cannot be overstated. Effective methods for detecting forged facial videos are thus urgently needed. While many deep learning-based facial forgery detection approaches show promise, they often fail to delve deeply into the complex relationships between image features and forgery indicators, limiting their effectiveness to specific forgery techniques. To address this challenge, we propose a dual-branch collaborative deepfake detection network. The network processes video frame images as input, where a specialized noise extraction module initially extracts the noise feature maps. Subsequently, the original facial images and corresponding noise maps are directed into two parallel feature extraction branches to concurrently learn texture and noise forgery clues. An attention mechanism is employed between the two branches to facilitate mutual guidance and enhancement of texture and noise features across four different scales. This dual-modal feature integration enhances sensitivity to forgery artifacts and boosts generalization across various forgery techniques. Features from both branches are then effectively combined and processed through a multi-layer perceptron to distinguish between real and forged videos. Experimental results on benchmark deepfake detection datasets demonstrate that our approach outperforms existing state-of-the-art methods in terms of detection performance, accuracy, and generalization ability.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 61972267 and 61772070, and in part by the Natural Science Foundation of Hebei Province under Grant F2024210005.
Abstract: Face Presentation Attack Detection (fPAD) plays a vital role in securing face recognition systems against various presentation attacks. While supervised learning-based methods demonstrate effectiveness, they are prone to overfitting to known attack types and struggle to generalize to novel attack scenarios. Recent studies have explored formulating fPAD as an anomaly detection problem or a one-class classification task, enabling the training of generalized models for unknown attack detection. However, conventional anomaly detection approaches encounter difficulties in precisely delineating the boundary between bonafide samples and unknown attacks. To address this challenge, we propose a novel framework focusing on unknown attack detection using exclusively bonafide facial data during training. The core innovation lies in our pseudo-negative sample synthesis (PNSS) strategy, which facilitates learning of compact decision boundaries between bonafide faces and potential attack variations. Specifically, PNSS generates synthetic negative samples within low-likelihood regions of the bonafide feature space to represent diverse unknown attack patterns. To overcome the inherent imbalance between positive and synthetic negative samples during iterative training, we implement a dual-loss mechanism combining focal loss for classification optimization with pairwise confusion loss as a regularizer. This architecture effectively mitigates model bias towards bonafide samples while maintaining discriminative power. Comprehensive evaluations across three benchmark datasets validate the framework's superior performance. Notably, our PNSS achieves an 8%-18% average classification error rate (ACER) reduction compared with state-of-the-art one-class fPAD methods in cross-dataset evaluations on the Idiap Replay-Attack and MSU-MFSD datasets.
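The focal loss half of the dual-loss mechanism can be written out for a single binary sample; it down-weights easy, well-classified examples so the synthetic negatives do not dominate training. The gamma and alpha defaults below are the commonly used values from the focal loss literature, assumed here rather than taken from the paper.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sample.

    p -- predicted probability of the positive (bonafide) class
    y -- ground-truth label, 0 or 1
    The (1 - pt)^gamma factor shrinks the loss of confident,
    correct predictions; alpha rebalances the two classes."""
    pt = p if y == 1 else 1.0 - p
    a = alpha if y == 1 else 1.0 - alpha
    return -a * (1.0 - pt) ** gamma * math.log(max(pt, 1e-12))
```

With gamma = 0 and alpha = 0.5 this reduces (up to a constant) to ordinary cross-entropy, which makes the down-weighting effect easy to ablate.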
Abstract: Locating multi-view faces in images with a complex background remains a challenging problem. In this paper, an integrated method for real-time multi-view face detection and pose estimation is presented. A simple-to-complex and coarse-to-fine view-based detector architecture has been designed to detect multi-view faces and estimate their poses efficiently. Both the pose estimators and the view-based face/non-face detectors are trained by a cost-sensitive AdaBoost algorithm to improve the generalization ability. Experimental results show that the proposed multi-view face detector, which can be constructed easily, gives more robust face detection and pose estimation and has a faster real-time detection speed compared with other conventional methods.
Funding: Supported by the National 863 Program (2002AA11101) and the Open Fund of the State Technology Center of Multimedia Software Engineering (621-273128).
Abstract: In this paper, we present a strategy to implement multi-pose face detection in the compressed domain. The strategy first extracts feature vectors from the DCT domain, and then uses a boosting algorithm to build classifiers that distinguish faces from non-faces. Moreover, to obtain more accurate face detection results, we present a kernel function and a linear combination to incrementally build the strong classifiers from the weak classifiers. By comparing and analyzing the results of experiments on synthetic and natural data, we find that the strong classifiers yield more satisfactory results than the weak classifiers.