Abstract: This paper provides a comprehensive bibliometric exposition on deepfake research, exploring the intersection of artificial intelligence and deepfakes as well as international collaborations, prominent researchers, organizations, institutions, publications, and key themes. We performed a search on the Web of Science (WoS) database, focusing on Artificial Intelligence and Deepfakes, and filtered the results across 21 research areas, yielding 1412 articles. Using the VOSviewer visualization tool, we analyzed this WoS data through keyword co-occurrence graphs, emphasizing four prominent research themes. Going beyond existing bibliometric papers on deepfakes, this paper identifies and discusses some of the highly cited papers within these themes: deepfake detection, feature extraction, face recognition, and forensics. The discussion highlights key challenges and advancements in deepfake research. Furthermore, this paper also discusses pressing issues surrounding deepfakes such as security, regulation, and datasets. We also provide an analysis of another exhaustive search on the Scopus database focusing solely on deepfakes (while not excluding AI), revealing deep learning as the predominant keyword and underscoring AI’s central role in deepfake research. This comprehensive analysis, encompassing over 500 keywords from 8790 articles, uncovered a wide range of methods, implications, applications, concerns, requirements, challenges, models, tools, datasets, and modalities related to deepfakes. Finally, a discussion of recommendations for policymakers, researchers, and other stakeholders is also provided.
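As a minimal illustration of the keyword co-occurrence analysis this abstract describes (the paper itself relies on VOSviewer over WoS exports), the sketch below counts keyword pairs that appear together in the same article; the sample keyword lists and function name are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample: author keywords per article (a WoS export would supply these).
articles = [
    ["deepfake", "deep learning", "face recognition"],
    ["deepfake", "forensics", "feature extraction"],
    ["deep learning", "feature extraction", "deepfake detection"],
]

def keyword_cooccurrence(keyword_lists):
    """Count how often each unordered keyword pair appears in the same article."""
    pair_counts = Counter()
    for kws in keyword_lists:
        for a, b in combinations(sorted(set(kws)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

for (a, b), n in keyword_cooccurrence(articles).most_common(5):
    print(f"{a} -- {b}: {n}")
```

The resulting pair counts are exactly the edge weights a co-occurrence graph tool such as VOSviewer visualizes.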
Funding: Supported by the National Fund Cultivation Project from China People’s Police University (Grant No. JJPY202402) and the National Natural Science Foundation of China (Grant No. 62172165).
Abstract: With the rapid advancement of visual generative models such as Generative Adversarial Networks (GANs) and Stable Diffusion, the creation of highly realistic Deepfakes through automated forgery has progressed significantly. This paper examines advancements in Deepfake detection and defense technologies, emphasizing the shift from passive detection methods to proactive digital watermarking techniques. Passive detection methods, which extract features from images or videos to identify forgeries, encounter challenges such as poor performance against unknown manipulation techniques and susceptibility to counter-forensic tactics. In contrast, proactive digital watermarking techniques embed specific markers into images or videos, facilitating real-time detection and traceability and thereby providing a preemptive defense against Deepfake content. We offer a comprehensive analysis of digital watermarking-based forensic techniques, discussing their advantages over passive methods and highlighting four key benefits: real-time detection, embedded defense, resistance to tampering, and provision of legal evidence. Additionally, the paper identifies gaps in the literature concerning proactive forensic techniques and suggests future research directions, including cross-domain watermarking and adaptive watermarking strategies. By systematically classifying and comparing existing techniques, this review aims to contribute valuable insights for the development of more effective proactive defense strategies in Deepfake forensics.
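The review contrasts passive detection with proactive watermarking without prescribing a particular algorithm; the following sketch illustrates the proactive idea with a simple keyed spread-spectrum watermark embedded in and detected from pixel values. The strength, threshold, and keying scheme are illustrative assumptions, not techniques drawn from the surveyed literature.

```python
import numpy as np

def embed_watermark(image, key, strength=5.0):
    """Add a keyed pseudo-random pattern to a grayscale image (float array)."""
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1.0, 1.0], size=image.shape)
    return np.clip(image + strength * pattern, 0, 255)

def detect_watermark(image, key, threshold=2.5):
    """Correlate the image with the keyed pattern; high correlation => watermark present."""
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1.0, 1.0], size=image.shape)
    score = float(np.mean((image - image.mean()) * pattern))
    return score > threshold, score

original = np.random.default_rng(0).uniform(0, 255, (128, 128))
marked = embed_watermark(original, key=42)
print(detect_watermark(marked, key=42))    # expected: watermark detected
print(detect_watermark(original, key=42))  # expected: no watermark
```

Because only the key holder can regenerate the pattern, detection doubles as traceability evidence, which is the "provision of legal evidence" benefit the review highlights.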
Abstract: Recent advances in artificial intelligence and the availability of large-scale benchmarks have made deepfake video generation and manipulation easier. Therefore, developing reliable and robust deepfake video detection mechanisms is paramount. This research introduces a novel real-time deepfake video detection framework that analyzes gaze and blink patterns, addressing the spatial-temporal challenges unique to gaze and blink anomalies using the TimeSformer and hybrid Transformer-CNN models. The TimeSformer architecture leverages spatial-temporal attention mechanisms to capture fine-grained blinking intervals and gaze-direction anomalies. Compared to state-of-the-art traditional convolutional models such as MesoNet and EfficientNet, which primarily focus on global facial features, our approach emphasizes localized eye-region analysis, significantly enhancing detection accuracy. We evaluate our framework on four standard datasets: FaceForensics, CelebDF-V2, DFDC, and FakeAVCeleb. The results reveal high accuracy, with the TimeSformer model achieving 97.5%, 96.3%, 95.8%, and 97.1%, and the hybrid Transformer-CNN model achieving 92.8%, 91.5%, 90.9%, and 93.2%, on the FaceForensics, CelebDF-V2, DFDC, and FakeAVCeleb datasets, respectively, showing robustness in distinguishing manipulated from authentic videos. Our research provides a robust, state-of-the-art framework for real-time deepfake video detection. This study contributes significantly to video forensics, presenting scalable and accurate solutions for real-world applications.
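Since the framework hinges on localized eye-region analysis feeding a spatial-temporal model, the sketch below shows one simplified version of that preprocessing step: cropping a fixed eye region from aligned face frames and stacking the crops into the clip tensor a TimeSformer-style model typically consumes. The crop box, clip length, and tensor layout are assumptions rather than the authors’ pipeline.

```python
import numpy as np
import torch

def eye_region_clip(frames, box=(60, 40, 164, 104), size=(224, 224)):
    """Crop a fixed eye-region box (x1, y1, x2, y2) from each aligned face frame
    and stack the crops into a (T, C, H, W) clip tensor."""
    x1, y1, x2, y2 = box
    crops = []
    for frame in frames:                      # frame: (H, W, 3) uint8, face-aligned
        eye = frame[y1:y2, x1:x2]
        eye = torch.from_numpy(eye).permute(2, 0, 1).float() / 255.0
        eye = torch.nn.functional.interpolate(eye[None], size=size,
                                              mode="bilinear", align_corners=False)[0]
        crops.append(eye)
    return torch.stack(crops)                 # (T, 3, 224, 224)

# Dummy 16-frame face clip; a real pipeline would use a face/landmark detector
# to locate the eye region in each frame before cropping.
frames = [np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8) for _ in range(16)]
clip = eye_region_clip(frames)
print(clip.shape)   # torch.Size([16, 3, 224, 224])
```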
Funding: Supported by King Saud University, Riyadh, Saudi Arabia, through the Researchers Supporting Project under Grant RSP2025R493.
Abstract: With the rapid advancement of AI-driven facial manipulation techniques, particularly deepfake technology, there is growing concern over its potential misuse. Deepfakes pose a significant threat to society, particularly by infringing on individuals’ privacy. Despite significant efforts to build systems for identifying deepfake fabrications, existing methodologies often struggle to adapt to novel forgery techniques and are vulnerable to variations in image and video quality, which limits their applicability to content produced by unfamiliar technologies. In this paper, we adopt resilient training strategies to improve generalization. In adversarial training, models are trained on deliberately crafted samples designed to deceive classifiers, which significantly enhances their generalization ability. Building on this idea, we propose a hybrid adversarial training framework that integrates Virtual Adversarial Training (VAT) with Two-Generated Blurred Adversarial Training. This combined framework strengthens the model’s resilience in detecting deepfakes made with unfamiliar deep learning technologies, prompting the model to learn more versatile features. Experimental studies demonstrate that our model achieves higher accuracy than existing models.
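Virtual Adversarial Training (VAT) is one named ingredient of the hybrid framework; as a generic reference point, the snippet below implements a standard Miyato-style VAT regularizer in PyTorch. It is not the authors’ exact combination with blurred adversarial training, and epsilon, xi, and the loss weighting are placeholder choices.

```python
import torch
import torch.nn.functional as F

def vat_loss(model, x, xi=1e-6, epsilon=2.0, n_power=1):
    """Virtual adversarial loss: KL divergence between predictions on x and on
    x perturbed in the most prediction-changing direction (found by power iteration)."""
    with torch.no_grad():
        p = F.softmax(model(x), dim=1)
    d = torch.randn_like(x)
    for _ in range(n_power):
        d = xi * F.normalize(d.flatten(1), dim=1).view_as(x)
        d.requires_grad_(True)
        p_hat = F.log_softmax(model(x + d), dim=1)
        adv_dist = F.kl_div(p_hat, p, reduction="batchmean")
        d = torch.autograd.grad(adv_dist, d)[0].detach()
    r_adv = epsilon * F.normalize(d.flatten(1), dim=1).view_as(x)
    p_hat = F.log_softmax(model(x + r_adv), dim=1)
    return F.kl_div(p_hat, p, reduction="batchmean")

# Usage sketch: total_loss = cross_entropy_loss + lambda_vat * vat_loss(model, images)
```

Because the perturbation is derived from the model’s own predictions rather than labels, VAT smooths the decision boundary around every input, which is what makes it attractive for generalizing to unseen forgery types.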
Funding: Funded by the National Natural Science Foundation of China (Grant No. 62376212) and the Shaanxi Science Foundation of China (Grant No. 2022GY-087), and supported by the Open Fund of Intelligent Control Laboratory.
Abstract: The majority of current deepfake detection methods are constrained to identifying one or two specific types of counterfeit images, which limits their ability to keep pace with the rapid advancements in deepfake technology. Therefore, in this study, we propose a novel algorithm, the Stereo Mixture Density Network (SMNDNet), which can detect multiple types of deepfake face manipulations within a single network framework. SMNDNet is an end-to-end CNN-based network specially designed for detecting various manipulation types of deepfake face images. First, we design a Subtle Distinguishable Feature Enhancement Module to emphasize the differentiation between authentic and forged features. Second, we introduce a Multi-Scale Forged Region Adaptive Module that dynamically adapts to extract forged features from images of varying synthesis scales. Third, we integrate a Nonlinear Expression Capability Enhancement Module to augment the model’s capacity for capturing intricate nonlinear patterns across various types of deepfakes. Collectively, these modules empower our model to efficiently extract forgery features from diverse manipulation types, ensuring more satisfactory performance in multi-type deepfake detection. Experiments show that the proposed method outperforms alternative approaches in detection accuracy and AUC across all four types of deepfake images. It also demonstrates strong generalization in cross-dataset and cross-type detection, along with robust performance against post-processing manipulations.
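The Multi-Scale Forged Region Adaptive Module is described only at a high level; the sketch below gives one plausible reading, parallel convolutions at several receptive fields fused with learned per-scale weights, purely as an interpretation rather than the published SMNDNet design.

```python
import torch
import torch.nn as nn

class MultiScaleForgedRegionModule(nn.Module):
    """Hypothetical multi-scale block: parallel 3x3/5x5/7x7 branches whose outputs
    are fused with softmax-normalized, learnable scale weights."""
    def __init__(self, channels):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=k, padding=k // 2)
            for k in (3, 5, 7)
        )
        self.scale_logits = nn.Parameter(torch.zeros(3))

    def forward(self, x):
        weights = torch.softmax(self.scale_logits, dim=0)   # adaptive per-scale weights
        out = sum(w * branch(x) for w, branch in zip(weights, self.branches))
        return torch.relu(out) + x                          # residual connection

feats = torch.randn(2, 64, 56, 56)
print(MultiScaleForgedRegionModule(64)(feats).shape)   # torch.Size([2, 64, 56, 56])
```

The learnable weights let the block emphasize small or large receptive fields depending on the synthesis scale of the forgery, which matches the adaptivity the abstract claims.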
Abstract: Pre-reading: What is a Deepfake, and how is it created? What are Deepfakes? A Deepfake is a video, image, or audio clip that has been created using artificial intelligence. The idea is to make it as realistic as possible.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62477026, 62177029, 61807020), the Humanities and Social Sciences Research Program of the Ministry of Education of China (No. 23YJAZH047), and the Startup Foundation for Introducing Talent of Nanjing University of Posts and Communications under Grant NY222034.
Abstract: As Deepfake technology continues to evolve, the distinction between real and fake content becomes increasingly blurred. Most existing Deepfake video detection methods rely on single-frame facial image features, which limits their ability to capture temporal differences between frames. Current methods also exhibit limited generalization capabilities, struggling to detect content generated by unknown forgery algorithms. Moreover, the diversity and complexity of forgery techniques introduced by Artificial Intelligence Generated Content (AIGC) present significant challenges for traditional detection frameworks, which must balance high detection accuracy with robust performance. To address these challenges, we propose a novel Deepfake detection framework that combines a two-stream convolutional network with a Vision Transformer (ViT) module to enhance spatio-temporal feature representation. The ViT model extracts spatial features from the forged video, while the 3D convolutional network captures temporal features. The 3D convolution enables cross-frame feature extraction, allowing the model to detect subtle facial changes between frames. The confidence scores from both the ViT and 3D convolution sub-models are fused at the decision layer, enabling the model to effectively handle unknown forgery techniques. Focusing on Deepfake videos and GAN-generated images, the proposed approach is evaluated on two widely used public face forgery datasets. Compared to existing state-of-the-art methods, it achieves higher detection accuracy and better generalization performance, offering a robust solution for deepfake detection in real-world scenarios.
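The framework fuses the ViT and 3D-convolution confidence scores at the decision layer, but the fusion rule is not specified in the abstract; the sketch below uses a learnable weighted average of the two sub-models’ fake probabilities as one possible instantiation.

```python
import torch
import torch.nn as nn

class DecisionFusion(nn.Module):
    """Fuse two sub-model confidence scores (fake probabilities) with a learnable weight.
    The weighting scheme is an assumption; the paper only states that scores are fused."""
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(0.5))

    def forward(self, p_vit, p_3dconv):
        w = torch.sigmoid(self.alpha)          # keep the mixing weight in (0, 1)
        return w * p_vit + (1 - w) * p_3dconv  # fused fake probability

fusion = DecisionFusion()
p_vit = torch.tensor([0.92, 0.10])       # spatial (ViT) branch confidences
p_3d = torch.tensor([0.85, 0.30])        # temporal (3D conv) branch confidences
print(fusion(p_vit, p_3d))               # per-video fused scores
```

Training the single mixing weight jointly with the backbones lets the spatial or temporal branch dominate depending on which cue is more reliable for a given dataset.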
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61962010).
Abstract: Deepfake-generated fake faces, commonly utilized in identity-related activities such as political propaganda, celebrity impersonations, evidence forgery, and familiar fraud, pose new societal threats. Although current deepfake generators strive for high realism in visual effects, they do not replicate the biometric signals indicative of cardiac activity. Addressing this gap, many researchers have developed detection methods focusing on biometric characteristics. These methods use classification networks to analyze both temporal- and spectral-domain features of the remote photoplethysmography (rPPG) signal, achieving high detection accuracy. However, in the spectral analysis, existing approaches often consider only the power spectral density and neglect the amplitude spectrum, both of which are crucial for assessing cardiac activity. We introduce a novel method that extracts rPPG signals from multiple regions of interest and processes them using the Fast Fourier Transform (FFT). The resulting time-frequency domain signal samples are organized into matrices to create Matrix Visualization Heatmaps (MVHM), which are then used to train an image classification network. Additionally, we explore various combinations of time-frequency domain representations of rPPG signals and the impact of attention mechanisms. Our experimental results show that our algorithm achieves a remarkable detection accuracy of 99.22% in identifying fake videos, significantly outperforming mainstream algorithms and demonstrating the effectiveness of the Fourier transform and attention mechanisms in detecting fake faces.
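As a simplified sketch of the signal-processing step described above, the code below applies the FFT to rPPG traces from several regions of interest and stacks the amplitude spectra into a matrix suitable for image-style classification. The synthetic sinusoidal traces, frame rate, and ROI count are illustrative assumptions.

```python
import numpy as np

fs = 30                                   # frames per second (assumed video frame rate)
t = np.arange(0, 10, 1 / fs)              # 10-second clip
heart_rate_hz = 1.2                       # ~72 bpm

# Synthetic rPPG traces from 5 regions of interest; real traces would come from
# skin-color averaging over tracked face regions.
rois = [np.sin(2 * np.pi * heart_rate_hz * t + phase) + 0.3 * np.random.randn(t.size)
        for phase in np.linspace(0, np.pi, 5)]

def roi_spectra_matrix(signals):
    """Return an (n_rois, n_freq_bins) matrix of amplitude spectra, one row per ROI."""
    spectra = []
    for s in signals:
        amp = np.abs(np.fft.rfft(s - s.mean()))   # amplitude spectrum per ROI
        spectra.append(amp)
    return np.stack(spectra)

matrix = roi_spectra_matrix(rois)
freqs = np.fft.rfftfreq(t.size, d=1 / fs)
print(matrix.shape, "dominant frequency:", freqs[matrix.sum(axis=0).argmax()], "Hz")
```

For a real face the dominant row pattern sits near the heart-rate frequency across all ROIs; a deepfake lacking a consistent cardiac signal produces a flatter, less coherent matrix, which is the cue an image classifier can learn from such heatmaps.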
Abstract: Deepfake has emerged as a stubborn challenge in today’s digitally dominated world. Here, the authors introduce a new deepfake detection method based on the Xception architecture. The model is tested exhaustively on millions of frames and diverse video clips, with accuracy levels as high as 99.65% reported. This high efficacy stems mainly from the Xception model’s superior feature extraction capabilities and stable training mechanisms such as early stopping. The methodology also employs advanced data preprocessing, applying state-of-the-art techniques to ensure consistent performance. With the ever-rising threat from fake media, this research places great emphasis on stringent testing to keep the spread of manipulated content at bay. It also argues for better explanation methods that justify the model’s decisions, building greater trust and reliability. Ensemble models, being more accurate, are studied and examined to establish whether combining various detection frameworks could together produce superior results. Further, the study underlines the need for real-time detection tools that remain effective across different social media sites and digital environments. Ethics, privacy protection, and public awareness are important considerations in the fight against the proliferation of deepfakes. By contributing to advances in detection technology, this work strengthens the safety and integrity of the cyber world with a robust defense against ever-evolving deepfake threats. Overall, the findings constitute a crucial step toward ensuring information authenticity and societal trust in the digital world.
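The abstract credits the Xception backbone and safeguards such as early stopping for the reported efficacy; the sketch below shows a generic early-stopping fine-tuning loop around an Xception classifier loaded from the timm library, as an illustration of those two ingredients rather than the authors’ training setup. The model name, patience, and data loaders are assumptions.

```python
import torch
import timm   # assumes timm's pretrained 'xception' variant is available

def train_with_early_stopping(train_loader, val_loader, patience=3, max_epochs=50):
    """Fine-tune a binary real/fake Xception classifier, stopping when validation
    loss fails to improve for `patience` consecutive epochs."""
    model = timm.create_model("xception", pretrained=True, num_classes=2)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = torch.nn.CrossEntropyLoss()
    best_val, epochs_without_improvement = float("inf"), 0

    for epoch in range(max_epochs):
        model.train()
        for images, labels in train_loader:
            optimizer.zero_grad()
            criterion(model(images), labels).backward()
            optimizer.step()

        model.eval()
        with torch.no_grad():
            val_loss = sum(criterion(model(x), y).item() for x, y in val_loader)

        if val_loss < best_val:                      # improvement: remember best weights
            best_val, epochs_without_improvement = val_loss, 0
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
        else:                                        # no improvement: count toward patience
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                                # early stopping
    model.load_state_dict(best_state)
    return model
```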
Funding: Supported by the 2023 Open Project of the Key Laboratory of the Ministry of Public Security for Artificial Intelligence Security (RGZNAQ-2304) and the Fundamental Research Funds for the Central Universities of PPSUC (2023JKF01ZK08).
Abstract: With the rapid development of deepfake technology, the realism of various types of fake synthetic content is increasing rapidly, which brings potential security threats to people’s daily life and social stability. Currently, most algorithms define deepfake detection as a binary classification problem: global features are first extracted using a backbone network and then fed into a binary classifier to discriminate true from false. However, the differences between real and fake samples are often subtle and local, so such global feature-based detection algorithms are not optimal in efficiency and accuracy. To enhance the extraction of forgery details in deepfake samples, we therefore propose a multi-branch deepfake detection algorithm based on fine-grained features, approached from the perspective of fine-grained classification. First, to address the critical problem of locating discriminative feature regions in fine-grained classification tasks, we investigate a method for locating multiple distinct discriminative regions and design a lightweight feature localization module that obtains crucial feature representations by augmenting the most significant parts of the feature map. Second, using information complementation, we introduce a correlation-guided fusion module to enhance the discriminative feature information of the different branches. Finally, we use a global attention module in the multi-branch model to improve the cross-dimensional interaction of spatial-domain and channel-domain information and to increase the weights of crucial feature regions and feature channels. We conduct extensive ablation and comparative experiments. The results show that, compared with representative detection algorithms from recent years, the proposed algorithm achieves better detection accuracy and effectiveness on the FaceForensics++ and Celeb-DF-v2 datasets.
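The global attention module is not specified beyond coupling spatial- and channel-domain information; the sketch below implements a common channel-then-spatial attention pattern (in the spirit of CBAM-style blocks) as one plausible stand-in rather than the paper’s exact design.

```python
import torch
import torch.nn as nn

class GlobalAttention(nn.Module):
    """Channel attention (squeeze-excite style) followed by spatial attention.
    A generic stand-in for the paper's global attention module, not its exact design."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        ch_w = self.channel_mlp(x.mean(dim=(2, 3))).view(b, c, 1, 1)   # channel weights
        x = x * ch_w
        sp_in = torch.cat([x.mean(dim=1, keepdim=True),
                           x.amax(dim=1, keepdim=True)], dim=1)        # avg & max maps
        return x * self.spatial_conv(sp_in)                            # spatial weights

feats = torch.randn(2, 128, 28, 28)
print(GlobalAttention(128)(feats).shape)   # torch.Size([2, 128, 28, 28])
```

Reweighting channels first and spatial positions second is one simple way to realize the "cross-dimensional interaction" and the up-weighting of crucial regions and channels described in the abstract.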
Funding: Supported by the National Natural Science Foundation of China (Nos. U2001202, 62072480, U1736118), the National Key R&D Program of China (Nos. 2019QY2202, 2019QY(Y)0207), the Key Areas R&D Program of Guangdong (No. 2019B010136002), and the Key Scientific Research Program of Guangzhou (No. 201804020068).
Abstract: In recent years, with the rapid development of deep learning technologies, some neural network models have been applied to generate fake media. DeepFakes, a deep learning based forgery technology, can easily tamper with faces and generate fake videos that are difficult for human eyes to distinguish. The spread of face-manipulation videos can easily disseminate false information. Therefore, it is important to develop effective detection methods to verify the authenticity of videos. Because it is still challenging for current forgery technologies to generate all facial details, and because blending operations are used in the forgery process, the texture details of fake faces are insufficient. Therefore, in this paper, a new method is proposed to detect DeepFake videos. Firstly, texture features are constructed based on the gradient domain, standard deviation, gray-level co-occurrence matrix, and wavelet transform of the face region. Then, the features are processed by a feature selection method to form a discriminant feature vector, which is finally fed to an SVM for classification at the frame level. The experimental results on mainstream DeepFake datasets demonstrate that the proposed method achieves strong performance, proving its effectiveness for DeepFake video detection.
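As a rough sketch of the hand-crafted pipeline above (gradient, standard deviation, gray-level co-occurrence matrix, and wavelet statistics of the face region, classified per frame with an SVM), the code below computes a small feature vector with scikit-image, PyWavelets, and scikit-learn. The feature choices, parameters, and toy data are simplified assumptions, not the paper’s exact configuration.

```python
import numpy as np
import pywt
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC

def texture_features(face_gray):
    """Extract gradient, std, GLCM, and wavelet statistics from a grayscale face crop."""
    gy, gx = np.gradient(face_gray.astype(float))
    glcm = graycomatrix(face_gray, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    cA, (cH, cV, cD) = pywt.dwt2(face_gray.astype(float), "haar")
    return np.array([
        np.mean(np.hypot(gx, gy)),                    # mean gradient magnitude
        face_gray.std(),                              # global standard deviation
        graycoprops(glcm, "contrast").mean(),         # GLCM contrast
        graycoprops(glcm, "homogeneity").mean(),      # GLCM homogeneity
        cH.std() + cV.std() + cD.std(),               # wavelet detail energy proxy
    ])

# Toy frame-level training: real/fake labels would come from a DeepFake dataset.
rng = np.random.default_rng(0)
faces = rng.integers(0, 256, size=(20, 64, 64), dtype=np.uint8)
labels = rng.integers(0, 2, size=20)
X = np.stack([texture_features(f) for f in faces])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict(X[:3]))
```

A feature selection step (not shown) would then prune this vector before the frame-level SVM, mirroring the pipeline the abstract describes.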