Automatically detecting learners’ engagement levels helps develop more effective online teaching and assessment programs, allowing teachers to provide timely feedback and make personalized adjustments based on students’ needs to enhance teaching effectiveness. Traditional approaches mainly rely on single-frame multimodal facial spatial information, neglecting temporal emotional and behavioural features, and their accuracy suffers under significant pose variations. Additionally, convolutional padding can erode feature maps, weakening the representational capacity of feature extraction. To address these issues, we propose a hybrid neural network architecture, the redistributing facial features and temporal convolutional network (RefEIP). This network consists of three key components: first, the spatial attention mechanism large kernel attention (LKA) automatically captures local patches and mitigates the effects of pose variations; second, the feature organization and weight distribution (FOWD) module redistributes feature weights, eliminating the impact of white features and enhancing the representation of facial feature maps; finally, the modern temporal convolutional network (ModernTCN) module analyses temporal changes across video frames to detect engagement levels. We constructed a near-infrared engagement video dataset (NEVD) to better validate the efficiency of the RefEIP network. Through extensive experiments and in-depth studies, we evaluated these methods on NEVD and on the Dataset for Affective States in E-Environments (DAiSEE), achieving an accuracy of 90.8% on NEVD and 61.2% on DAiSEE in the four-class classification task, indicating significant advantages in engagement video analysis.
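The abstract describes ModernTCN only at a high level; the core building block in the TCN family is a per-channel (depthwise) temporal convolution over the sequence of per-frame feature vectors. The sketch below is a hypothetical NumPy-only illustration of that operation, not the paper's RefEIP code; the function name, shapes, and kernels are assumptions.

```python
import numpy as np

def depthwise_temporal_conv(x, kernels):
    """Apply an independent 1-D convolution to each feature channel.

    x       : (T, C) array of per-frame feature vectors.
    kernels : (C, K) array, one temporal kernel per channel.
    Returns a (T, C) array ('same' padding).
    """
    T, C = x.shape
    out = np.empty_like(x, dtype=float)
    for c in range(C):
        out[:, c] = np.convolve(x[:, c], kernels[c], mode="same")
    return out

# 3 frames, 2 channels: an identity kernel leaves channel 0 unchanged,
# a [1, 1, 1] kernel computes a moving sum over channel 1.
x = np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]])
k = np.array([[0.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
y = depthwise_temporal_conv(x, k)
```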
Air gun arrays are often used in marine energy exploration and marine geological surveys. The study of single-bubble dynamics and of the interaction among multiple bubbles produced by air guns is helpful in understanding pressure signals. We used the van der Waals air gun model to simulate the wavelets of a sleeve gun at various offsets and arrival angles. Several factors were taken into account, such as heat transfer, the thermodynamically open quasi-static system, the vertical rise of the bubble, and air gun port throttling. Marine vertical cables are located on the seafloor, but their hydrophones are located in seawater, far from the air gun array vertically. This situation conforms to the acquisition conditions of the air gun far-field wavelet and thus avoids the problems of ship noise, ocean surges, and coupling. High-quality 3D wavelet data of air gun arrays were collected during a vertical cable test in the South China Sea in 2017. We proposed an evaluation method based on multidimensional facial features, including zero-peak amplitude, peak-peak amplitude, bubble period, primary-to-bubble ratio, frequency spectrum, instantaneous amplitude, instantaneous phase, and instantaneous frequency, to characterize the 3D air gun wave field. The match between the facial features of the field data and the simulated data provides confidence in using the van der Waals air gun model to predict air gun wavelets, and in using facial features to evaluate air gun arrays.
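The instantaneous amplitude, phase, and frequency attributes listed above are conventionally derived from the analytic signal of the wavelet. As a hedged illustration (not the authors' code), the sketch below builds the analytic signal with the standard frequency-domain Hilbert construction in NumPy and derives the three attributes from it:

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the frequency-domain Hilbert construction
    (zeroing negative frequencies, doubling positive ones)."""
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0
    if N % 2 == 0:
        h[N // 2] = 1.0
        h[1:N // 2] = 2.0
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

def instantaneous_attributes(x, dt):
    z = analytic_signal(x)
    amp = np.abs(z)                               # instantaneous amplitude (envelope)
    phase = np.unwrap(np.angle(z))                # instantaneous phase
    freq = np.gradient(phase, dt) / (2 * np.pi)   # instantaneous frequency (Hz)
    return amp, phase, freq

t = np.arange(0, 1, 1e-3)
x = np.cos(2 * np.pi * 50 * t)   # 50 Hz test signal standing in for a wavelet
amp, phase, freq = instantaneous_attributes(x, 1e-3)
```

For a pure cosine the envelope is flat at 1 and the instantaneous frequency is constant at 50 Hz, which makes the construction easy to sanity-check before applying it to field wavelets.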
Deepfake technology can be used to replace people’s faces in videos or pictures to show them saying or doing things they never said or did. Deepfake media are often used to extort, defame, and manipulate public opinion. However, despite deepfake technology’s risks, current deepfake detection methods lack generalization and are inconsistent when applied to unknown videos, i.e., videos on which they have not been trained. The purpose of this study is to develop a generalizable deepfake detection model by training convolutional neural networks (CNNs) to classify human facial features in videos. The study formulated the research question: “How effectively does the developed model provide reliable generalizations?” A CNN model was trained to distinguish between real and fake videos using the facial features of human subjects in videos. The model was trained, validated, and tested using the FaceForensics++ dataset, which contains more than 500,000 frames, and subsets of the DFDC dataset, totaling more than 22,000 videos. The study demonstrated high generalizability, as the accuracy on the unknown dataset was only marginally (about 1%) lower than that on the known dataset. The findings of this study indicate that detection systems can be made more generalizable, lighter, and faster by focusing on just a small region (the human face) of an entire video.
Facial expression recognition consists of determining what kind of emotional content is presented in a human face. The problem presents a complex area for exploration, since it encompasses face acquisition, facial feature tracking, and facial expression classification. Facial feature tracking is of most interest here. The Active Appearance Model (AAM) enables accurate tracking of facial features in real time, but it lacks robustness to occlusions and self-occlusions. In this paper we propose a solution to improve the accuracy of the fitting technique. The idea is to include occluded images in the AAM training data. We demonstrate the results by running experiments using a gradient descent algorithm for fitting the AAM. Our experiments show that fitting with occluded training data improves the fitting quality of the algorithm.
An efficient algorithm for facial feature extraction is proposed. The facial features we segment are the two eyes, the nose, and the mouth. The algorithm is based on an improved Gabor wavelet edge detector, a morphological approach to detect the face region and the facial feature regions, and an improved T-shape face mask to locate the exact position of the facial features. The experimental results show that the proposed method is robust against facial expression and illumination changes, and remains effective when the person is wearing glasses.
This paper presents a set of algorithms capable of locating the main facial features automatically and effectively. Based on integral projection of local binary image pixels and pixel clustering techniques, a set of a priori knowledge based algorithms succeeds in locating the eyes, nose, and mouth, and in uprighting tilted faces. The proposed approach is superior to other methods in that it accounts for photos with glasses and shadows, and is therefore suitable for processing real ID-type photos.
Objective To explore the feasibility of constructing a lung cancer early-warning risk model based on facial image features, providing novel insights into the early screening of lung cancer. Methods This study included patients with pulmonary nodules diagnosed at the Physical Examination Center of Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine from November 1, 2019 to December 31, 2024, as well as patients with lung cancer diagnosed in the Oncology Departments of Yueyang Hospital of Integrated Traditional Chinese and Western Medicine and Longhua Hospital during the same period. The facial image information of patients with pulmonary nodules and lung cancer was collected using the TFDA-1 tongue and facial diagnosis instrument, and facial diagnosis features were extracted from it by deep learning technology. Statistical analysis was conducted on the objective facial diagnosis characteristics of the two groups of participants to explore the differences in their facial image characteristics, and least absolute shrinkage and selection operator (LASSO) regression was used to screen the characteristic variables. Based on the screened feature variables, four machine learning methods, random forest, logistic regression, support vector machine (SVM), and gradient boosting decision tree (GBDT), were used to establish lung cancer classification models independently. Meanwhile, model performance was evaluated by indicators such as sensitivity, specificity, F1 score, precision, accuracy, the area under the receiver operating characteristic (ROC) curve (AUC), and the area under the precision-recall curve (AP). Results A total of 1275 patients with pulmonary nodules and 1623 patients with lung cancer were included in this study. After propensity score matching (PSM) to adjust for gender and age, 535 patients were finally included in the pulmonary nodule group and the lung cancer group, respectively. There were significant differences in multiple color space metrics (such as R, G, B, V, L, a, b, Cr, H, Y, and Cb) and texture metrics [such as gray-level co-occurrence matrix (GLCM)-contrast (CON) and GLCM-inverse difference moment (IDM)] between the two groups of individuals with pulmonary nodules and lung cancer (P < 0.05). To construct a classification model, LASSO regression was used to select 63 key features from the initial 136 facial features. Based on this feature set, the SVM model demonstrated the best performance after 10-fold stratified cross-validation. The model achieved an average AUC of 0.8729 and an average accuracy of 0.7990 on the internal test set. Further validation on an independent test set confirmed the model’s robust performance (AUC = 0.8233, accuracy = 0.7290), indicating its good generalization ability. Feature importance analysis demonstrated that color space indicators and the whole/lip Cr components (including color-B-0, wholecolor-Cr, and lipcolor-Cr) were the core factors in the model’s classification decisions, while texture indicators [GLCM-angular second moment (ASM)_2, GLCM-IDM_1, GLCM-CON_1, GLCM-entropy (ENT)_2] played an important auxiliary role. Conclusion The facial image features of patients with lung cancer and pulmonary nodules show significant differences in color and texture characteristics in multiple areas. The various models constructed based on facial image features all demonstrate good performance, indicating that facial image features can serve as potential biomarkers for lung cancer risk prediction, providing a non-invasive and feasible new approach for early lung cancer screening.
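Among the texture metrics above, GLCM-contrast (CON) and GLCM-inverse difference moment (IDM) have standard definitions over a normalized gray-level co-occurrence matrix. The following NumPy sketch is an illustration rather than the study's implementation; the single-pixel offset and the small level count are assumptions for the toy example.

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    P = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            P[img[y, x], img[y + dy, x + dx]] += 1
    if symmetric:
        P = P + P.T
    return P / P.sum()

def glcm_contrast(P):
    i, j = np.indices(P.shape)
    return float(((i - j) ** 2 * P).sum())          # GLCM-CON

def glcm_idm(P):
    i, j = np.indices(P.shape)
    return float((P / (1.0 + (i - j) ** 2)).sum())  # GLCM-IDM

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [2, 2, 3, 3],
                [2, 2, 3, 3]])
P = glcm(img, levels=4)
```

A perfectly uniform patch gives the limiting values: contrast 0 and IDM 1, which is a useful sanity check before computing these statistics on real facial regions.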
Objective To determine the correlation between traditional Chinese medicine (TCM) inspection of spirit classification and the severity grade of depression based on facial features, offering insights for intelligent integrated TCM and Western medicine diagnosis of depression. Methods Using the Audio-Visual Emotion Challenge and Workshop (AVEC 2014) public dataset on depression, which includes 150 interview videos, the samples were classified according to the TCM inspection of spirit classification: Deshen (得神, presence of spirit), Shaoshen (少神, insufficiency of spirit), and Shenluan (神乱, confusion of spirit). Meanwhile, based on the Beck Depression Inventory-II (BDI-II) score for the severity grade of depression, the samples were divided into minimal (0-13, Q1), mild (14-19, Q2), moderate (20-28, Q3), and severe (29-63, Q4). Sixty-eight facial landmarks were extracted with a ResNet-50 network, and the feature extraction pipeline was standardized. Random forest and support vector machine (SVM) classifiers were used to predict the TCM inspection of spirit classification and the severity grade of depression, respectively. A Chi-square test and Apriori association rule mining were then applied to quantify and explore the relationships. Results The analysis revealed a statistically significant and moderately strong association between TCM spirit classification and the severity grade of depression, as confirmed by a Chi-square test (χ² = 14.04, P = 0.029) with a Cramer’s V effect size of 0.243. Further exploration using association rule mining identified the most compelling rule: “moderate depression (Q3) → Shenluan”. This rule demonstrated a support of 5%, indicating that this specific co-occurrence was present in 5% of the cohort. Crucially, it achieved a high confidence of 86%, meaning that among patients diagnosed with Q3, 86% exhibited the Shenluan pattern according to TCM assessment. The substantial lift of 2.37 signifies that the observed likelihood of Shenluan manifesting in Q3 patients is 2.37 times higher than would be expected by chance if these states were independent, compelling evidence of a highly non-random association. Consequently, Shenluan emerges as a distinct and core TCM diagnostic manifestation strongly linked to Q3, forming a clinically significant phenotype within this patient subgroup.
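The support, confidence, and lift figures quoted for the “Q3 → Shenluan” rule follow the standard association rule definitions. A minimal pure-Python sketch, using an invented toy cohort rather than the study's data:

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent.

    transactions: list of sets of items (here: one set of labels per patient).
    """
    n = len(transactions)
    n_a = sum(antecedent <= t for t in transactions)          # antecedent count
    n_c = sum(consequent <= t for t in transactions)          # consequent count
    n_ac = sum((antecedent | consequent) <= t for t in transactions)
    support = n_ac / n
    confidence = n_ac / n_a
    lift = confidence / (n_c / n)
    return support, confidence, lift

# Toy cohort of 20 "patients": 5 are Q3 (4 of whom also show Shenluan),
# and Shenluan appears in 8 patients overall.
cohort = ([{"Q3", "Shenluan"}] * 4 + [{"Q3"}] * 1 +
          [{"Shenluan"}] * 4 + [{"Q1"}] * 11)
s, c, l = rule_metrics(cohort, {"Q3"}, {"Shenluan"})
```

Here support is 4/20 = 0.2, confidence 4/5 = 0.8, and lift 0.8 / (8/20) = 2.0, mirroring how the paper's 5% / 86% / 2.37 figures are computed.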
Objective To construct a precise model for identifying traditional Chinese medicine (TCM) constitutions, thereby offering optimized guidance for clinical diagnosis and treatment planning, and ultimately enhancing medical efficiency and treatment outcomes. Methods First, TCM full-body inspection data acquisition equipment was employed to collect full-body standing images of healthy people, from which the constitutions were labelled and defined in accordance with the Constitution in Chinese Medicine Questionnaire (CCMQ), and a dataset encompassing labelled constitutions was constructed. Second, the hue-saturation-value (HSV) color space and an improved local binary patterns (LBP) algorithm were leveraged to extract features such as facial complexion and body shape. In addition, a dual-branch deep network was employed to collect deep features from the full-body standing images. Last, the random forest (RF) algorithm was utilized to learn the extracted multifeatures, which were subsequently employed to establish a TCM constitution identification model. Accuracy, precision, and F1 score were the three measures selected to assess the performance of the model. Results It was found that the accuracy, precision, and F1 score of the proposed model based on multifeatures for identifying TCM constitutions were 0.842, 0.868, and 0.790, respectively. In comparison with identification models based on a single feature, either a single facial complexion feature, a body shape feature, or deep features, the accuracy of the model incorporating all the aforementioned features was elevated by 0.105, 0.105, and 0.079, the precision increased by 0.164, 0.164, and 0.211, and the F1 score rose by 0.071, 0.071, and 0.084, respectively. Conclusion The research findings affirmed the viability of the proposed model, which incorporated multifeatures including the facial complexion feature, the body shape feature, and the deep feature. In addition, by employing the proposed model, the objectification and intelligence of constitution identification in TCM practice could be optimized.
Plasma spark sources are widely used in high-resolution seismic exploration. However, research on the excitation mechanism and propagation characteristics of plasma spark sources is very limited. In this study, we elaborated on the excitation process of a corona discharge plasma spark source based on indoor experimental data. The electrode spacing has a direct impact on the movement of the bubbles: as the spacing between bubbles decreases, they collapse and fuse, thereby suppressing the secondary pulse process. Under the premise of a linear arrangement and equal-energy synchronous excitation, the motion equation of multiple bubbles was derived, and a calculation method for the near-field wavelet model of the plasma spark source was established. We simulated the source signals received in different directions and constructed a spatial wavelet face spectrum. Compared with traditional far-field wavelets, the spatial wavelet facial feature representation method provides a more comprehensive display of the variation characteristics and propagation properties of source wavelets in three-dimensional space. Analysis of the spatial wavelet variation process of the plasma spark source shows that the source depth and the virtual reflection path are the main factors affecting the wavelet. The high-frequency nature of plasma spark source wavelets makes them sensitive to factors such as wave fluctuations, position changes, and environmental noise; minor changes in acquisition parameters may cause significant changes in the recorded waveform and the final data resolution. Thus, the facial feature method provides more effective technical support for wavelet evaluation.
In recent years, the country has devoted a significant workforce and material resources to preventing traffic accidents, particularly those caused by fatigued driving. Current studies mainly concentrate on driver physiological signals, driving behavior, and vehicle information. However, most of these approaches are computationally intensive and inconvenient for real-time detection. Therefore, this paper designs a network that combines precision, speed, and a lightweight design, and proposes an algorithm for facial fatigue detection based on multi-feature fusion. Specifically, the face detection model takes YOLOv8 (You Only Look Once version 8) as the basic framework and replaces its backbone network with MobileNetv3. To focus on the significant regions in the image, CPCA (Channel Prior Convolution Attention) is adopted to enhance the network’s capacity for feature extraction. Meanwhile, the network training phase employs the Focal-EIOU (Focal and Efficient Intersection Over Union) loss function, which keeps the network lightweight and increases the accuracy of target detection. Ultimately, the Dlib toolkit was employed to annotate 68 facial feature points. This study established an evaluation metric for facial fatigue and developed a novel fatigue detection algorithm to assess the driver’s condition. A series of comparative experiments was carried out on the self-built dataset. The suggested method’s mAP (mean Average Precision) values for object detection and fatigue detection are 96.71% and 95.75%, respectively, and the detection speed is 47 FPS (Frames Per Second). This method balances the trade-off between computational complexity and model accuracy. Furthermore, it can be deployed on an NVIDIA Jetson Orin NX to quickly detect the driver’s state while maintaining a high degree of accuracy, contributing to the development of automobile safety systems and reducing the occurrence of traffic accidents.
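The Focal-EIOU loss used above extends the plain Intersection-over-Union measure between predicted and ground-truth boxes. For reference, a minimal sketch of the underlying IoU computation (an illustration, not the paper's loss implementation):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

EIOU-style losses add penalty terms on center distance and on width/height differences on top of this quantity.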
Abnormal driving behavior includes distraction, fatigue, road rage, phone use, and an exceptionally happy mood. Detecting abnormal driving behavior in advance can avoid traffic accidents and reduce the risk of traffic conflicts. Traditional methods of detecting abnormal driving behavior include using wearable devices to monitor blood pressure, pulse, heart rate, blood oxygen, and other vital signs, and using eye trackers to monitor eye activity (such as eye closure, blinking frequency, etc.) to estimate whether the driver is excited, anxious, or distracted. Traditional monitoring methods can detect abnormal driving behavior to a certain extent, but they interfere with the driver’s normal driving state, thereby introducing additional driving risks. This research uses a combination of a support vector machine and the Dlib algorithm to extract 68 facial feature points from the human face, and uses an SVM model as a strong classifier to distinguish different abnormal driving statuses. The combined method reaches high accuracy in detecting road rage and fatigue and can be used in an intelligent vehicle cabin to improve driving safety.
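A common way to turn the 68 Dlib landmarks into a fatigue cue, used by many drowsiness detectors (the abstract does not spell out its exact classifier features, so this is an assumption), is the eye aspect ratio (EAR) over the six landmarks of each eye:

```python
import math

def eye_aspect_ratio(eye):
    """EAR from the six Dlib landmarks of one eye (p1..p6).

    EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|); it drops toward 0 as the
    eyelids close, giving a simple per-frame drowsiness cue.
    """
    p1, p2, p3, p4, p5, p6 = eye
    return (math.dist(p2, p6) + math.dist(p3, p5)) / (2.0 * math.dist(p1, p4))

# Toy landmark coordinates for an open and a nearly closed eye
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]
```

Thresholding the EAR over a run of consecutive frames (rather than one frame) is the usual way such a cue feeds a fatigue classifier.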
Local binary pattern (LBP) is an important method for texture feature extraction in facial expression recognition. However, it also has shortcomings: high dimensionality, slow feature extraction, and no effective extraction of local or global features. To solve these problems, a facial expression feature extraction method based on improved LBP is proposed. Firstly, LBP is converted into the double local binary pattern (DLBP). Then, by combining Taylor expansion (TE) with DLBP, the DLBP-TE algorithm is obtained. Finally, the DLBP-TE algorithm combined with an extreme learning machine (ELM) is applied to seven kinds of facial expression images, and the corresponding experiments are carried out on the Japanese adult female facial expression (JAFFE) database. The results show that the proposed method can significantly improve the facial expression recognition rate.
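For reference, the basic (non-double) LBP operator that DLBP extends replaces each interior pixel with an 8-bit neighbour-comparison code. A minimal NumPy sketch, assuming the common "neighbour >= centre" convention:

```python
import numpy as np

def lbp_image(img):
    """Basic 3x3 LBP: each interior pixel becomes an 8-bit code whose
    k-th bit is set if the k-th neighbour (clockwise from the top-left)
    is >= the centre pixel."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    centre = img[1:h - 1, 1:w - 1]
    code = np.zeros((h - 2, w - 2), dtype=int)
    for k, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code += (neigh >= centre).astype(int) << k
    return code

# Single 3x3 patch: only the bottom-right neighbour (bit 4) exceeds the centre
patch = np.array([[10, 10, 10],
                  [10, 20, 10],
                  [10, 10, 30]])
```

Histograms of these codes over image blocks form the texture feature vector; DLBP and the Taylor-expansion variant modify how the comparisons are taken, not this overall pipeline.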
In this paper, a facial feature extraction method is proposed to transform three-dimensional (3D) head images of infants with deformational plagiocephaly for assessment of asymmetry. The features of the 3D point cloud of an infant’s cranium can be identified by local feature analysis and a two-phase k-means classification algorithm. The 3D images of infants with asymmetric crania can then be aligned to the same pose. The mirrored head model obtained from the symmetry plane is compared with the original model for the measurement of asymmetry. Numerical data on the cranial volume can be reviewed by a pediatrician to adjust the treatment plan. The system can also be used to demonstrate treatment progress.
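The mirrored-model comparison can be sketched as reflecting the point cloud across the estimated symmetry plane and scoring the mismatch. The NumPy illustration below assumes a known plane normal and a simple nearest-point distance score; the real system estimates the plane from the data and reports volume-based measures, so this is only a schematic of the idea.

```python
import numpy as np

def mirror_across_plane(points, normal, d=0.0):
    """Reflect an (N, 3) point cloud across the plane n.x = d."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    signed = points @ n - d                 # signed distance of each point
    return points - 2.0 * signed[:, None] * n

def asymmetry_score(points, normal, d=0.0):
    """Mean distance from each mirrored point to its nearest original point."""
    mirrored = mirror_across_plane(points, normal, d)
    dists = np.linalg.norm(mirrored[:, None, :] - points[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())

# A point set symmetric about the x = 0 plane scores (numerically) zero
sym = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0],
                [2.0, 1.0, 1.0], [-2.0, 1.0, 1.0]])
score = asymmetry_score(sym, normal=[1.0, 0.0, 0.0])
```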
The “facial composite” is one of the major fields in forensic science that helps criminal investigators carry out their investigation process. A survey conducted by United States law enforcement agencies confirms that 80% of them use computer-automated composite systems, whereas Sri Lanka still lags far behind in the facial composite process, with many inefficiencies in the current manual procedure. Hence, this research introduces a novel approach that eliminates the inefficiencies of the manual procedure in Sri Lanka: an automated, image-processing-based software solution with 2D facial feature templates targeting the Sri Lankan population. This is the first approach to create 2D facial feature templates by incorporating both medically defined indexes and relevant aesthetic aspects. The study therefore comprises two separate analyses, on anthropometric indices and on facial feature shapes, carried out on the local population. Several evaluation techniques were utilized to evaluate the methodology, yielding an overall success rate of 70.19%. The ultimate goal of this research is to provide law enforcement agencies with a system for an efficient and effective facial composite process that can increase the success rate of suspect identification.
The Active Shape Model (ASM) is a powerful statistical tool for extracting the facial features of a face image under frontal view. It mainly relies on Principal Component Analysis (PCA) to statistically model the variability in the training set of example shapes. Independent Component Analysis (ICA) has been proven more efficient than PCA at extracting face features. In this paper, we combine PCA and ICA in a consecutive strategy to form a novel ASM. Firstly, an initial model, which captures the global shape variability in the training set, is generated by the PCA-based ASM. Then, the final shape model, which contains more local character, is established by the ICA-based ASM. Experimental results verify that the accuracy of facial feature extraction is statistically significantly improved by applying the ICA modes after the PCA modes.
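The PCA stage of such an ASM can be sketched directly with an SVD of the centred, aligned shape vectors (the ICA stage is omitted here; this is an illustration on synthetic shapes, not the authors' implementation):

```python
import numpy as np

def pca_shape_model(shapes, n_modes):
    """Fit a linear shape model x ~ mean + P @ b from aligned training shapes.

    shapes: (N, 2L) array, each row a flattened landmark vector.
    Returns (mean, P) with P holding the first n_modes principal directions.
    """
    mean = shapes.mean(axis=0)
    X = shapes - mean
    # Rows of Vt are the principal directions of the centred data
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return mean, Vt[:n_modes].T

def project(shape, mean, P):
    b = P.T @ (shape - mean)      # mode parameters
    return mean + P @ b           # reconstruction from the model

rng = np.random.default_rng(0)
base = rng.normal(size=10)
d1, d2 = rng.normal(size=10), rng.normal(size=10)
# Synthetic "shapes" varying along exactly two directions
shapes = np.array([base + a * d1 + b * d2
                   for a, b in rng.normal(size=(50, 2))])
mean, P = pca_shape_model(shapes, n_modes=2)
```

Because the synthetic data lie in a two-dimensional subspace, a two-mode model reconstructs each training shape essentially exactly, mirroring how the PCA-based ASM captures global shape variability before ICA refines local detail.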
Despite the progress made in face recognition algorithms over the last decades, changing lighting conditions and different face orientations still remain a challenging problem. A standard face recognition system identifies a person by comparing the input picture against the pictures of all faces in a database and finding the best match. Face matching is usually carried out in two steps: first, a face is detected by finding its exact position in a complex background (under various lighting conditions); second, face identification is performed using the gathered databases. In reality, detected faces can appear in different positions and can be rotated, and these disturbances reduce the quality of recognition algorithms dramatically. In this paper, to increase identification accuracy, we propose an original geometric normalization of the face based on the extracted position of facial features such as the eyes. For eye localization, the following methods were used: a color-based method, a mean eye template, and the SVM (Support Vector Machine) technique. Experimental investigation has shown that the best results for eye center detection are achieved using the SVM technique. The recognition rate increases by 28% when face orientation normalization based on the eye positions is used.
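The proposed geometric normalization, rotating the face so that the line between the detected eye centers becomes horizontal, can be sketched as follows (a hypothetical illustration; the paper's exact transform may differ):

```python
import math

def eye_alignment_transform(left_eye, right_eye):
    """Return a point transform that rotates the image plane about the
    eye midpoint so the inter-eye line becomes horizontal."""
    cx = (left_eye[0] + right_eye[0]) / 2.0
    cy = (left_eye[1] + right_eye[1]) / 2.0
    angle = math.atan2(right_eye[1] - left_eye[1],
                       right_eye[0] - left_eye[0])
    c, s = math.cos(-angle), math.sin(-angle)
    def transform(p):
        x, y = p[0] - cx, p[1] - cy
        return (c * x - s * y + cx, s * x + c * y + cy)
    return transform

# Eye centers detected on a face tilted by 45 degrees
t = eye_alignment_transform((0.0, 0.0), (2.0, 2.0))
```

In practice the same rotation (plus a scale fixing the inter-eye distance) is applied to the whole image before feature comparison, which is what removes the orientation disturbance.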
This paper presents a user-friendly approach to localizing the pupil center with a single web camera. Several methods have been proposed to determine the coordinates of the pupil center in an image, but they have practical limitations. The proposed method can track the user’s eye movements in real time under normal image resolution and lighting conditions using a regular webcam, without special equipment such as infrared illuminators. After pre-processing steps that deal with illumination variations, the pupil center is detected using iterative thresholding with geometric constraints. Experimental results demonstrate robustness and speed in determining the pupil’s location in real time for users of various ethnicities, under various lighting conditions, at different distances from the webcam, and with standard-resolution images.
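One standard form of iterative thresholding (the abstract does not specify which variant is used, so this is an assumption) is the Ridler-Calvard / isodata scheme, where the threshold converges to the midpoint of the two class means. A pure-Python sketch on toy pixel intensities:

```python
def iterative_threshold(values, tol=0.5):
    """Ridler-Calvard style iterative threshold: repeatedly set the
    threshold to the midpoint of the two class means until it settles."""
    t = sum(values) / len(values)         # start at the global mean
    while True:
        low = [v for v in values if v <= t]
        high = [v for v in values if v > t]
        if not low or not high:
            return t
        t_new = (sum(low) / len(low) + sum(high) / len(high)) / 2.0
        if abs(t_new - t) < tol:
            return t_new
        t = t_new

# Dark pupil pixels (~20) vs. brighter iris/sclera pixels (~180)
pixels = [18, 20, 22, 25, 19, 175, 180, 185, 178, 182]
t = iterative_threshold(pixels)
```

The resulting threshold lands between the two intensity clusters, after which geometric constraints (roughly circular, expected size) pick the pupil blob among the below-threshold regions.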
Race classification is a long-standing challenge in the field of face image analysis. The investigation of salient facial features is an important task that avoids processing all face parts. Face segmentation strongly benefits several face analysis tasks, including ethnicity and race classification. We propose a race classification algorithm using a prior face segmentation framework. A deep convolutional neural network (DCNN) was used to construct a face segmentation model. For training the DCNN, we label face images according to seven classes: nose, skin, hair, eyes, brows, background, and mouth. The DCNN model developed in the first phase was used to create segmentation results. A probabilistic classification method is used, and probability maps (PMs) are created for each semantic class. We investigated the five salient facial features, from among the seven, that help in race classification. Features are extracted from the PMs of these five classes, and a new model is trained based on the DCNN. We assessed the performance of the proposed race classification method on four standard face datasets, reporting superior results compared with previous studies.
In this paper, a novel face recognition method, named the wavelet-curvelet-fractal technique, is proposed. Based on the similarities embedded in the images, we utilize the wavelet-curvelet-fractal technique to extract facial features. We thus have the wavelet details in the diagonal, vertical, and horizontal directions, and the eight curvelet details at different angles. We then adopt the Euclidean minimum distance classifier to recognize different faces. Extensive comparison tests on different data sets were carried out, and a higher recognition rate was obtained by the proposed technique.
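The Euclidean minimum distance classifier assigns a probe feature vector to the enrolled identity whose reference feature vector is nearest. A minimal sketch with invented feature values (the real features would be the wavelet-curvelet-fractal coefficients):

```python
import math

def min_distance_classify(feature, class_means):
    """Assign a feature vector to the class whose reference vector is
    nearest in Euclidean distance."""
    return min(class_means, key=lambda c: math.dist(feature, class_means[c]))

# Hypothetical 3-D feature vectors for two enrolled faces
means = {"alice": [0.9, 0.1, 0.4], "bob": [0.2, 0.8, 0.6]}
print(min_distance_classify([0.85, 0.2, 0.5], means))  # -> alice
```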
Funding: supported by the National Natural Science Foundation of China (No. 62367006) and the Graduate Innovative Fund of Wuhan Institute of Technology (Grant No. CX2023551).
基金the National Natural Science Foundation of China (Nos. 91958206, 91858215), the National Key Research and Development Program Pilot Project (Nos. 2018YFC1405901, 2017YFC0307401) (+1 more grant), the Fundamental Research Funds for the Central Universities (No. 201964016), and the Marine Geological Survey Program of China Geological Survey (No. DD20190819).
文摘Air gun arrays are often used in marine energy exploration and marine geological surveys. The study of single-bubble dynamics and of the interaction of multiple bubbles produced by air guns is helpful in understanding pressure signals. We used the van der Waals air gun model to simulate the wavelets of a sleeve gun at various offsets and arrival angles. Several factors were taken into account, such as heat transfer, the thermodynamically open quasi-static system, the vertical rise of the bubble, and air gun port throttling. Marine vertical cables are located on the seafloor, but the hydrophones are located in seawater and are far away from the air gun array vertically. This situation conforms to the acquisition conditions of the air gun far-field wavelet and thus avoids the problems of ship noise, ocean swell, and coupling. High-quality 3D wavelet data of air gun arrays were collected during a vertical cable test in the South China Sea in 2017. We propose an evaluation method of multidimensional facial features, including zero-peak amplitude, peak-to-peak amplitude, bubble period, primary-to-bubble ratio, frequency spectrum, instantaneous amplitude, instantaneous phase, and instantaneous frequency, to characterize the 3D air gun wave field. The match between the facial features in the field and simulated data provides confidence for the use of the van der Waals air gun model to predict air gun wavelets and of the facial features to evaluate air gun arrays.
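Several of the listed wavelet measures (zero-peak amplitude, bubble period, primary-to-bubble ratio) can be read directly off a pressure signature. A minimal sketch, assuming a synthetic decaying-oscillation signature in which the primary is the global maximum:

```python
import numpy as np

def bubble_metrics(sig, dt):
    """Primary peak, bubble period, and primary-to-bubble ratio.

    sig: 1D pressure signature sampled every dt seconds. Assumes the
    primary pulse is the global maximum and the first bubble pulse is
    the largest maximum after the trough that follows the primary
    (a simplification of real signature picking).
    """
    primary_i = int(np.argmax(sig))
    # trough after the primary, then the bubble pulse after the trough
    trough_i = primary_i + int(np.argmin(sig[primary_i:]))
    bubble_i = trough_i + int(np.argmax(sig[trough_i:]))
    return {
        "zero_peak": sig[primary_i],                 # zero-peak amplitude
        "pbr": sig[primary_i] / sig[bubble_i],       # primary-to-bubble ratio
        "period": (bubble_i - primary_i) * dt,       # bubble period [s]
    }
```

Real signatures need more careful peak picking, but the definitions of the metrics are exactly these.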
文摘Deepfake technology can be used to replace people’s faces in videos or pictures to show them saying or doing things they never said or did. Deepfake media are often used to extort, defame, and manipulate public opinion. However, despite deepfake technology’s risks, current deepfake detection methods lack generalization and are inconsistent when applied to unknown videos, i.e., videos on which they have not been trained. The purpose of this study is to develop a generalizable deepfake detection model by training convolutional neural networks (CNNs) to classify human facial features in videos. The study formulated the research question: “How effectively does the developed model provide reliable generalizations?” A CNN model was trained to distinguish between real and fake videos using the facial features of human subjects in videos. The model was trained, validated, and tested using the FaceForensics++ dataset, which contains more than 500,000 frames, and subsets of the DFDC dataset, totaling more than 22,000 videos. The study demonstrated high generalizability, as the accuracy on the unknown dataset was only marginally (about 1%) lower than that on the known dataset. The findings of this study indicate that detection systems can be made more generalizable, lighter, and faster by focusing on just a small region (the human face) of an entire video.
文摘Facial expression recognition consists of determining what kind of emotional content is presented in a human face. The problem presents a complex area for exploration, since it encompasses face acquisition, facial feature tracking, and facial expression classification. Facial feature tracking is of the most interest here. The Active Appearance Model (AAM) enables accurate tracking of facial features in real time, but it does not handle occlusions and self-occlusions. In this paper we propose a solution to improve the accuracy of the fitting technique. The idea is to include occluded images in the AAM training data. We demonstrate the results by running experiments using a gradient descent algorithm for fitting the AAM. Our experiments show that using the fitting algorithm with occluded training data improves fitting quality.
基金Sponsored by the National Natural Science Foundation of China (60772066)
文摘An efficient algorithm for facial feature extraction is proposed. The facial features we segment are the two eyes, the nose, and the mouth. The algorithm is based on an improved Gabor wavelet edge detector, a morphological approach to detect the face region and facial feature regions, and an improved T-shape face mask to locate the exact positions of the facial features. The experimental results show that the proposed method is robust against variations in facial expression and illumination, and remains effective when the person is wearing glasses.
文摘This paper presents a set of algorithms capable of locating the main facial features automatically and effectively. Based on integral projection of local binary image pixels and pixel clustering techniques, a set of a priori knowledge-based algorithms succeeds in locating the eyes, nose, and mouth, and in uprighting a tilted face. The proposed approach is superior to other methods in that it accounts for photos with glasses and shadows, and is therefore suitable for processing real ID-type photos.
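The integral-projection idea can be sketched compactly: summing binary feature pixels along rows gives a profile whose peaks mark horizontal feature bands (eyes, mouth). The toy image and band positions below are invented for illustration:

```python
import numpy as np

def feature_rows(binary, k=2):
    """Row-wise integral projection of a binary (0/1) feature image.

    Rows containing many foreground (feature) pixels are candidate
    feature bands; returns the k strongest rows in ascending order.
    """
    profile = binary.sum(axis=1)                       # one count per row
    return sorted(np.argsort(profile)[::-1][:k].tolist())
```

A full locator would add the column-wise projection and the clustering step described in the abstract.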
基金National Natural Science Foundation of China (82305090), Shanghai Municipal Health Commission (20234Y0168), and National Key Research and Development Program of China (2017YFC1703301).
文摘Objective To explore the feasibility of constructing a lung cancer early-warning risk model based on facial image features, providing novel insights into the early screening of lung cancer. Methods This study included patients with pulmonary nodules diagnosed at the Physical Examination Center of Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine from November 1, 2019 to December 31, 2024, as well as patients with lung cancer diagnosed in the Oncology Departments of Yueyang Hospital of Integrated Traditional Chinese and Western Medicine and Longhua Hospital during the same period. The facial image information of patients with pulmonary nodules and lung cancer was collected using the TFDA-1 tongue and facial diagnosis instrument, and facial diagnosis features were extracted from it by deep learning technology. Statistical analysis was conducted on the objective facial diagnosis characteristics of the two groups of participants to explore the differences in their facial image characteristics, and least absolute shrinkage and selection operator (LASSO) regression was used to screen the characteristic variables. Based on the screened feature variables, four machine learning methods, random forest, logistic regression, support vector machine (SVM), and gradient boosting decision tree (GBDT), were used to establish lung cancer classification models independently. Meanwhile, model performance was evaluated by indicators such as sensitivity, specificity, F1 score, precision, accuracy, the area under the receiver operating characteristic (ROC) curve (AUC), and the area under the precision-recall curve (AP). Results A total of 1275 patients with pulmonary nodules and 1623 patients with lung cancer were included in this study. After propensity score matching (PSM) to adjust for gender and age, 535 patients were finally included in the pulmonary nodule group and the lung cancer group, respectively. There were significant differences in multiple color space metrics (such as R, G, B, V, L, a, b, Cr, H, Y, and Cb) and texture metrics [such as gray-level co-occurrence matrix (GLCM) contrast (CON) and GLCM inverse difference moment (IDM)] between the two groups of individuals with pulmonary nodules and lung cancer (P < 0.05). To construct a classification model, LASSO regression was used to select 63 key features from the initial 136 facial features. Based on this feature set, the SVM model demonstrated the best performance after 10-fold stratified cross-validation. The model achieved an average AUC of 0.8729 and an average accuracy of 0.7990 on the internal test set. Further validation on an independent test set confirmed the model’s robust performance (AUC = 0.8233, accuracy = 0.7290), indicating its good generalization ability. Feature importance analysis demonstrated that color space indicators and the whole-face/lip Cr components (including color-B-0, wholecolor-Cr, and lipcolor-Cr) were the core factors in the model’s classification decisions, while texture indicators [GLCM angular second moment (ASM)_2, GLCM-IDM_1, GLCM-CON_1, GLCM entropy (ENT)_2] played an important auxiliary role. Conclusion The facial image features of patients with lung cancer and pulmonary nodules show significant differences in color and texture characteristics in multiple areas. The various models constructed based on facial image features all demonstrate good performance, indicating that facial image features can serve as potential biomarkers for lung cancer risk prediction, providing a non-invasive and feasible new approach for early lung cancer screening.
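LASSO screening, as used above to reduce 136 facial features to 63, zeroes out the weights of uninformative features via an L1 penalty. A bare-bones coordinate-descent sketch (not the study's pipeline; the penalty level and data are illustrative):

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """LASSO via cyclic coordinate descent with soft-thresholding.

    Minimizes 0.5/n * ||y - Xw||^2 + lam * ||w||_1.
    Columns of X are assumed roughly standardized; features whose
    weight is driven to zero are screened out.
    """
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ w + X[:, j] * w[j]          # partial residual
            rho = X[:, j] @ r / n
            # soft-threshold: small correlations collapse to exactly zero
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w
```

In practice one would use a tuned penalty path (e.g. via cross-validation) rather than a single hand-picked `lam`.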
基金Research and Development Plan of Key Areas of Hunan Science and Technology Department (2022SK2044)Clinical Research Center for Depressive Disorder in Hunan Province (2021SK4022)。
文摘Objective To determine the correlation between traditional Chinese medicine (TCM) inspection of spirit classification and the severity grade of depression based on facial features, offering insights for intelligent integrated TCM and Western medicine diagnosis of depression. Methods Using the Audio-Visual Emotion Challenge and Workshop (AVEC 2014) public dataset on depression, which includes 150 interview videos, the samples were classified according to the TCM inspection of spirit classification: Deshen (得神, presence of spirit), Shaoshen (少神, insufficiency of spirit), and Shenluan (神乱, confusion of spirit). Meanwhile, based on the Beck Depression Inventory-II (BDI-II) score for the severity grade of depression, the samples were divided into minimal (0-13, Q1), mild (14-19, Q2), moderate (20-28, Q3), and severe (29-63, Q4). Sixty-eight landmarks were extracted with a ResNet-50 network, and the feature extraction pipeline was standardized. Random forest and support vector machine (SVM) classifiers were used to predict the TCM inspection of spirit classification and the severity grade of depression, respectively. A Chi-square test and Apriori association rule mining were then applied to quantify and explore the relationships. Results The analysis revealed a statistically significant and moderately strong association between TCM spirit classification and the severity grade of depression, as confirmed by a Chi-square test (χ² = 14.04, P = 0.029) with a Cramer’s V effect size of 0.243. Further exploration using association rule mining identified the most compelling rule: “moderate depression (Q3) → Shenluan”. This rule demonstrated a support level of 5%, indicating that this specific co-occurrence was present in 5% of the cohort. Crucially, it achieved a high confidence of 86%, meaning that among patients diagnosed with Q3, 86% exhibited the Shenluan pattern according to TCM assessment. The substantial lift of 2.37 signifies that the observed likelihood of Shenluan manifesting in Q3 patients is 2.37 times higher than would be expected by chance if these states were independent, which is compelling evidence of a highly non-random association. Consequently, Shenluan emerges as a distinct and core TCM diagnostic manifestation strongly linked to Q3, forming a clinically significant phenotype within this patient subgroup.
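The reported effect size (Cramér's V = 0.243 alongside χ² = 14.04) follows a standard formula over the contingency table of spirit class versus severity grade. A small sketch with an invented table:

```python
import numpy as np

def cramers_v(table):
    """Chi-square statistic and Cramér's V for an r x c contingency table.

    V = sqrt(chi2 / (n * (min(r, c) - 1))), in [0, 1]; 0 means
    independence, 1 means perfect association.
    """
    table = np.asarray(table, dtype=float)
    n = table.sum()
    # expected counts under independence: outer product of the margins
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
    return chi2, v
```

Support, confidence, and lift for a rule A → B are likewise simple ratios of cell counts: P(A∧B), P(B|A), and P(B|A)/P(B).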
基金National Key Research and Development Program of China(2022YFC3502302)National Natural Science Foundation of China(82074580)Graduate Research Innovation Program of Jiangsu Province(KYCX23_2078).
文摘Objective To construct a precise model for identifying traditional Chinese medicine (TCM) constitutions, thereby offering optimized guidance for clinical diagnosis and treatment planning, and ultimately enhancing medical efficiency and treatment outcomes. Methods First, TCM full-body inspection data acquisition equipment was employed to collect full-body standing images of healthy people, from which the constitutions were labelled and defined in accordance with the Constitution in Chinese Medicine Questionnaire (CCMQ), and a dataset encompassing labelled constitutions was constructed. Second, the hue-saturation-value (HSV) color space and an improved local binary patterns (LBP) algorithm were leveraged for the extraction of features such as facial complexion and body shape. In addition, a dual-branch deep network was employed to collect deep features from the full-body standing images. Last, the random forest (RF) algorithm was utilized to learn the extracted multifeatures, which were subsequently employed to establish a TCM constitution identification model. Accuracy, precision, and F1 score were the three measures selected to assess the performance of the model. Results It was found that the accuracy, precision, and F1 score of the proposed model based on multifeatures for identifying TCM constitutions were 0.842, 0.868, and 0.790, respectively. In comparison with identification models that encompass a single feature, either a single facial complexion feature, a body shape feature, or deep features, the accuracy of the model incorporating all the aforementioned features was elevated by 0.105, 0.105, and 0.079; the precision increased by 0.164, 0.164, and 0.211; and the F1 score rose by 0.071, 0.071, and 0.084, respectively. Conclusion The research findings affirmed the viability of the proposed model, which incorporated multifeatures, including the facial complexion feature, the body shape feature, and the deep feature. In addition, by employing the proposed model, the objectification and intelligence of identifying constitutions in TCM practices could be optimized.
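The LBP texture features mentioned above threshold each pixel's 8 neighbours against the centre and pack the results into an 8-bit code; a histogram of codes is the texture descriptor. A plain-LBP sketch (the paper uses an improved variant not reproduced here):

```python
import numpy as np

def lbp_image(gray):
    """Basic 8-neighbour LBP codes for the interior pixels of a
    grayscale image. Each neighbour >= centre contributes one bit."""
    c = gray[1:-1, 1:-1]                     # centre pixels
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        # neighbour plane shifted by (dy, dx), same shape as c
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code
```

The complexion side of the pipeline would similarly reduce to converting face-region pixels to HSV and summarizing the channels.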
基金supported by the Key Laboratory of Marine Mineral Resources, Ministry of Natural Resources, Guangzhou (No. KLMMR-20220K02) and the Marine Geological Survey Program of China Geological Survey (No. DD20191003).
文摘Plasma spark sources are widely used in high-resolution seismic exploration. However, research on the excitation mechanism and propagation characteristics of plasma spark sources is very limited. In this study, we elaborated on the excitation process of a corona discharge plasma spark source based on indoor experimental data. The electrode spacing has a direct impact on the movement of bubbles: as the spacing between bubbles decreases, they collapse and fuse, thereby suppressing the secondary pulse process. On the premise of a linear arrangement and equal-energy synchronous excitation, the equation of motion of multiple bubbles under these conditions was derived, and a calculation method for the near-field wavelet model of a plasma spark source was established. We simulated the source signals received in different directions and constructed a spatial wavelet face spectrum. Compared with traditional far-field wavelets, the spatial wavelet facial feature representation method provides a more comprehensive display of the variation characteristics and propagation properties of source wavelets in three-dimensional space. The spatial wavelet variation process of the plasma spark source was analyzed; the source depth and the virtual reflection path are the main factors affecting the wavelet. The high-frequency nature of plasma spark source wavelets makes them sensitive to factors such as wave fluctuations, position changes, and environmental noise, so minor changes in acquisition parameters may result in significant changes in the recorded waveform and final data resolution. Thus, the facial feature method provides more effective technical support for wavelet evaluation.
基金supported by the Science and Technology Bureau of Xi’an project (24KGDW0049), the Key Research and Development Program of Shaanxi (2023-YBGY-264), and the Key Research and Development Program of Guangxi (GK-AB20159032).
文摘In recent years, significant manpower and material resources have been invested in preventing traffic accidents, particularly those caused by fatigued driving. Current studies mainly concentrate on driver physiological signals, driving behavior, and vehicle information. However, most approaches are computationally intensive and inconvenient for real-time detection. Therefore, this paper designs a network that combines precision, speed, and a lightweight architecture, and proposes an algorithm for facial fatigue detection based on multi-feature fusion. Specifically, the face detection model takes YOLOv8 (You Only Look Once version 8) as the basic framework and replaces its backbone network with MobileNetv3. To focus on the significant regions in the image, CPCA (Channel Prior Convolution Attention) is adopted to enhance the network’s capacity for feature extraction. Meanwhile, the network training phase employs the Focal-EIOU (Focal and Efficient Intersection Over Union) loss function, which keeps the network lightweight and increases the accuracy of target detection. Ultimately, the Dlib toolkit was employed to annotate 68 facial feature points. This study established an evaluation metric for facial fatigue and developed a novel fatigue detection algorithm to assess the driver’s condition. A series of comparative experiments were carried out on the self-built dataset. The suggested method’s mAP (mean Average Precision) values for object detection and fatigue detection are 96.71% and 95.75%, respectively, and the detection speed is 47 FPS (Frames Per Second). This method balances the trade-off between computational complexity and model accuracy. Furthermore, it can be deployed on an NVIDIA Jetson Orin NX to quickly detect the driver’s state while maintaining a high degree of accuracy. It contributes to the development of automobile safety systems and reduces the occurrence of traffic accidents.
文摘Abnormal driving behavior includes driving distraction,fatigue,road anger,phone use,and an exceptionally happy mood.Detecting abnormal driving behavior in advance can avoid traffic accidents and reduce the risk of traffic conflicts.Traditional methods of detecting abnormal driving behavior include using wearable devices to monitor blood pressure,pulse,heart rate,blood oxygen,and other vital signs,and using eye trackers to monitor eye activity(such as eye closure,blinking frequency,etc.)to estimate whether the driver is excited,anxious,or distracted.Traditional monitoring methods can detect abnormal driving behavior to a certain extent,but they will affect the driver’s normal driving state,thereby introducing additional driving risks.This research uses the combined method of support vector machine and dlib algorithm to extract 68 facial feature points from the human face,and uses an SVM model as a strong classifier to classify different abnormal driving statuses.The combined method reaches high accuracy in detecting road anger and fatigue status and can be used in an intelligent vehicle cabin to improve the driving safety level.
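One common way to turn dlib-style 68-landmark output into a drowsiness cue (a sketch of the general approach, not necessarily either paper's exact metric) is the eye aspect ratio, which drops sharply when the eye closes:

```python
import numpy as np

def eye_aspect_ratio(eye):
    """EAR from the six landmark points of one eye.

    Assumes the dlib ordering p1..p6: outer corner, two upper-lid
    points, inner corner, two lower-lid points. A persistently low
    EAR across consecutive frames suggests a closed eye (fatigue cue).
    """
    p = np.asarray(eye, dtype=float)
    v1 = np.linalg.norm(p[1] - p[5])   # first vertical lid distance
    v2 = np.linalg.norm(p[2] - p[4])   # second vertical lid distance
    h = np.linalg.norm(p[0] - p[3])    # horizontal eye width
    return (v1 + v2) / (2.0 * h)
```

A detector would threshold the EAR (e.g. around 0.2, an assumed value) and require the low reading to persist for several frames before flagging fatigue.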
文摘Local binary pattern (LBP) is an important method for texture feature extraction in facial expression recognition. However, it also has shortcomings: high dimensionality, slow feature extraction, and no effective extraction of local or global features. To solve these problems, a facial expression feature extraction method is proposed based on an improved LBP. Firstly, LBP is converted into double local binary pattern (DLBP). Then, by combining Taylor expansion (TE) with DLBP, the DLBP-TE algorithm is obtained. Finally, the DLBP-TE algorithm combined with extreme learning machine (ELM) is applied to seven kinds of facial expression images, and the corresponding experiments are carried out on the Japanese adult female facial expression (JAFFE) database. The results show that the proposed method can significantly improve the facial expression recognition rate.
文摘In this paper, a facial feature extraction method is proposed to transform three-dimensional (3D) head images of infants with deformational plagiocephaly for assessment of asymmetry. The features of 3D point clouds of an infant’s cranium can be identified by local feature analysis and a two-phase k-means classification algorithm. The 3D images of infants with an asymmetric cranium can then be aligned to the same pose. The mirrored head model obtained from the symmetry plane is compared with the original model for the measurement of asymmetry. Numerical data on cranial volume can be reviewed by a pediatrician to adjust the treatment plan. The system can also be used to demonstrate treatment progress.
文摘The “facial composite” is one of the major fields in forensic science that helps criminal investigators carry out their investigation process. A survey conducted by United States law enforcement agencies confirms that 80% of them use computer-automated composite systems, whereas Sri Lanka is still far behind, with many inefficiencies in its current manual process. Hence this research introduces a novel approach to the facial composite process that eliminates the inefficiencies of the manual procedure in Sri Lanka. This study introduces an automated, image-processing-based software solution with 2D facial feature templates targeting the Sri Lankan population. It is the first approach to create 2D facial feature templates by incorporating both medically defined indexes and relevant aesthetic aspects. Hence, this research study comprises two separate analyses, of anthropometric indices and of facial feature shapes, carried out on the local population. Subsequently, several evaluation techniques were utilized to evaluate this methodology, and we obtained an overall success rate of 70.19%. The ultimate goal of this research is to provide law enforcement agencies with a system for an efficient and effective facial composite process, which can increase the success rate of suspect identification.
文摘Active Shape Model (ASM) is a powerful statistical tool for extracting the facial features of a face image under frontal view. It mainly relies on Principal Component Analysis (PCA) to statistically model the variability in the training set of example shapes. Independent Component Analysis (ICA) has been proven to be more efficient than PCA for extracting face features. In this paper, we combine PCA and ICA in a consecutive strategy to form a novel ASM. Firstly, an initial model, which shows the global shape variability in the training set, is generated by the PCA-based ASM. Then, the final shape model, which contains more local characteristics, is established by the ICA-based ASM. Experimental results verify that the accuracy of facial feature extraction is statistically significantly improved by applying the ICA modes after the PCA modes.
文摘Despite the progress in face recognition algorithms over the last decades, changing lighting conditions and varying face orientation remain a challenging problem. A standard face recognition system identifies a person by comparing the input picture against pictures of all faces in a database and finding the best match. Usually face matching is carried out in two steps: first, a face is detected by finding its exact position against a complex background (under varying lighting conditions); second, face identification is performed using the gathered databases. In reality, detected faces can appear in different positions and can be rotated, and these disturbances reduce the quality of recognition algorithms dramatically. In this paper, to increase identification accuracy, we propose an original geometric normalization of the face based on extracted facial feature positions such as the eyes. For eye localization the following methods were used: a color-based method, a mean eye template, and the SVM (Support Vector Machine) technique. Experimental investigation has shown that the best results for eye center detection are achieved using the SVM technique. The recognition rate increases statistically by 28% using face orientation normalization based on the eye positions.
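The geometric normalization described above can be sketched as a rigid rotation about the mid-eye point that levels the inter-eye line (the coordinates in the test are illustrative):

```python
import numpy as np

def upright_by_eyes(points, left_eye, right_eye):
    """Rotate 2D landmark points about the mid-eye point so that the
    line through the two eye centers becomes horizontal."""
    left_eye = np.asarray(left_eye, dtype=float)
    right_eye = np.asarray(right_eye, dtype=float)
    mid = (left_eye + right_eye) / 2.0
    dx, dy = right_eye - left_eye
    theta = np.arctan2(dy, dx)             # current tilt of the eye line
    c, s = np.cos(-theta), np.sin(-theta)  # rotate by the opposite angle
    R = np.array([[c, -s], [s, c]])
    return (np.asarray(points, dtype=float) - mid) @ R.T + mid
```

In a full pipeline the same transform would also be applied to the image pixels, followed by scaling so the inter-eye distance is constant.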
文摘This paper presents a user-friendly approach to localizing the pupil center with a single web camera. Several methods have been proposed to determine the coordinates of the pupil center in an image, but with practical limitations. The proposed method can track the user’s eye movements in real time under normal image resolution and lighting conditions using a regular webcam, without special equipment such as infrared illuminators. After pre-processing steps to deal with illumination variations, the pupil center is detected using iterative thresholding with geometric constraints. Experimental results show robustness and speed in determining the pupil’s location in real time for users of various ethnicities, under various lighting conditions, at different distances from the webcam, and with standard-resolution images.
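The iterative-thresholding idea can be sketched as lowering a darkness threshold in steps and taking the centroid of the surviving dark pixels, so the estimate tightens onto the darkest blob (synthetic image; real use would add the paper's pre-processing and geometric constraints):

```python
import numpy as np

def pupil_center(gray, steps=(0.5, 0.35, 0.2)):
    """Estimate the pupil center as the centroid of dark pixels.

    The darkness threshold is lowered over a few iterations, keeping
    only progressively darker pixels; the pupil, being the darkest
    compact region, dominates the final centroid.
    """
    g = np.asarray(gray, dtype=float)
    lo, hi = g.min(), g.max()
    cy = cx = None
    for frac in steps:
        mask = g <= lo + frac * (hi - lo)    # tighter threshold each pass
        ys, xs = np.nonzero(mask)
        if len(xs):
            cy, cx = ys.mean(), xs.mean()
    return cy, cx
```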
基金This work was partially supported by a National Research Foundation of Korea(NRF)grant(No.2019R1F1A1062237)under the ITRC(Information Technology Research Center)support program(IITP-2021-2018-0-01431)supervised by the IITP(Institute for Information and Communications Technology Planning and Evaluation)funded by the Ministry of Science and ICT(MSIT),Korea.
文摘Race classification is a long-standing challenge in the field of face image analysis. The investigation of salient facial features is an important task to avoid processing all face parts. Face segmentation strongly benefits several face analysis tasks, including ethnicity and race classification. We propose a race-classification algorithm using a prior face segmentation framework. A deep convolutional neural network (DCNN) was used to construct a face segmentation model. For training the DCNN, we label face images according to seven different classes, that is, nose, skin, hair, eyes, brows, back, and mouth. The DCNN model developed in the first phase was used to create segmentation results. The probabilistic classification method is used, and probability maps (PMs) are created for each semantic class. We investigated five salient facial features from among the seven that help in race classification. Features are extracted from the PMs of the five classes, and a new model is trained based on the DCNN. We assessed the performance of the proposed race classification method on four standard face datasets, reporting superior results compared with previous studies.
基金Supported by the College of Heilongjiang Province Electronic Engineering Key Lab Project (dzzd200602) and the Heilongjiang Province Educational Bureau Scientific Technology Important Project (11531z18)
文摘In this paper, a novel face recognition method, named the wavelet-curvelet-fractal technique, is proposed. Based on the similarities embedded in the images, we propose to utilize the wavelet-curvelet-fractal technique to extract facial features. Thus we have the wavelet’s details in the diagonal, vertical, and horizontal directions, and the eight curvelet details at different angles. Then we adopt the Euclidean minimum distance classifier to recognize different faces. Extensive comparison tests on different data sets are carried out, and a higher recognition rate is obtained by the proposed technique.
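The Euclidean minimum distance classifier used in this abstract assigns a feature vector to the class whose mean (template) is nearest. A compact sketch with made-up feature vectors:

```python
import numpy as np

def min_distance_classify(features, class_means):
    """Assign each feature vector to the class with the nearest mean.

    features: (n, d) array of feature vectors.
    class_means: (k, d) array, one mean vector per class.
    Returns an (n,) array of class indices.
    """
    F = np.asarray(features, dtype=float)
    M = np.asarray(class_means, dtype=float)
    # (n, k) matrix of Euclidean distances via broadcasting
    d = np.linalg.norm(F[:, None, :] - M[None, :, :], axis=2)
    return d.argmin(axis=1)
```

In the paper's setting the feature vectors would be the concatenated wavelet and curvelet detail coefficients per face.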