Fund: Supported in part by the Natural Science Foundation of Hubei Province of China under Grant 2023AFB016, the 2022 Opening Fund for the Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering under Grant 2022SDSJ02, and the Construction Fund for the Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering under Grant 2019ZYYD007.
Abstract: Images and videos play an increasingly vital role in daily life and are widely used as key evidentiary sources in judicial investigations and forensic analysis. At the same time, advances in image and video processing have made powerful editing tools, such as Deepfakes, widely available, enabling anyone to easily create manipulated or fake visual content; this poses a serious threat to social security and public trust. Numerous approaches have been proposed to verify the authenticity and integrity of images and videos, but most are based on content analysis, and their effectiveness is susceptible to interference from various image and video post-processing operations. Recent research has highlighted file container analysis as a promising forensic approach that offers efficient and interpretable results. However, review articles on this class of approach are still lacking. To fill this gap, this paper presents a comprehensive review of file container-based image and video forensics. Specifically, we categorize the existing methods into two distinct stages: qualitative analysis and quantitative analysis. In addition, we propose an overall framework to organize the existing approaches, and we discuss the advantages and disadvantages of the schemes used across different forensic tasks. Finally, we outline the trends in this research area, aiming to provide valuable insights and technical guidance for future research.
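To make the idea of file container analysis concrete, the sketch below walks the top-level box (atom) structure of an MP4/ISO-BMFF file and extracts the ordered list of box types, which some container-based forensic methods use as a qualitative "signature" (re-encoded or edited files often show reordered or extra boxes). This is an illustrative minimal sketch, not a method from the surveyed literature; the `container_signature` helper and its use as a forensic feature are assumptions for illustration, and 64-bit and to-end box sizes are deliberately not handled.

```python
import struct

def parse_boxes(data, offset=0, end=None):
    """Walk top-level MP4/ISO-BMFF boxes and return (type, size) pairs.

    Each box starts with a 4-byte big-endian size followed by a 4-byte
    ASCII type code (e.g. b'ftyp', b'moov', b'mdat')."""
    if end is None:
        end = len(data)
    boxes = []
    while offset + 8 <= end:
        size, = struct.unpack_from(">I", data, offset)
        box_type = data[offset + 4:offset + 8].decode("ascii")
        if size < 8:  # size 0 (to end) or 1 (64-bit) not handled in this sketch
            break
        boxes.append((box_type, size))
        offset += size
    return boxes

def container_signature(data):
    """Ordered top-level box types, usable as a simple container signature."""
    return [box_type for box_type, _ in parse_boxes(data)]

# Toy two-box file: a 16-byte 'ftyp' box followed by an empty 'moov' box.
toy = (struct.pack(">I", 16) + b"ftyp" + b"isom" + b"\x00" * 4
       + struct.pack(">I", 8) + b"moov")
print(container_signature(toy))  # ['ftyp', 'moov']
```

Comparing such signatures against references recorded from known camera models or editing tools is the kind of efficient, interpretable check the survey attributes to container-based forensics.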
Abstract: Recent advances in artificial intelligence and the availability of large-scale benchmarks have made deepfake video generation and manipulation easier. Developing reliable and robust deepfake video detection mechanisms is therefore paramount. This research introduces a novel real-time deepfake video detection framework that analyzes gaze and blink patterns, addressing the spatial-temporal challenges unique to gaze and blink anomalies using a TimeSformer model and a hybrid Transformer-CNN model. The TimeSformer architecture leverages spatial-temporal attention to capture fine-grained blinking intervals and gaze-direction anomalies. Compared with state-of-the-art convolutional models such as MesoNet and EfficientNet, which primarily focus on global facial features, our approach emphasizes localized eye-region analysis, significantly improving detection accuracy. We evaluate the framework on four standard datasets: FaceForensics, CelebDF-V2, DFDC, and FakeAVCeleb. The TimeSformer model achieves accuracies of 97.5%, 96.3%, 95.8%, and 97.1%, and the hybrid Transformer-CNN model achieves 92.8%, 91.5%, 90.9%, and 93.2%, on FaceForensics, CelebDF-V2, DFDC, and FakeAVCeleb, respectively, demonstrating robustness in distinguishing manipulated from authentic videos. This study contributes to video forensics by presenting a scalable, accurate, state-of-the-art solution for real-time, real-world deepfake detection.
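The blink signal that the abstract's Transformer models learn from can be illustrated with the classical eye-aspect-ratio (EAR) formulation: the EAR computed from six eye landmarks drops sharply while the eye is closed, so runs of low-EAR frames mark blink intervals, and unnaturally long, short, or absent closures are the kind of anomaly a detector can score. This is a hedged stdlib-only sketch of that classical signal, not the paper's TimeSformer or Transformer-CNN pipeline; the landmark ordering follows the common 68-point layout, and the 0.21 threshold is an assumed illustrative value.

```python
from math import dist

def eye_aspect_ratio(eye):
    """EAR from six (x, y) eye landmarks ordered p1..p6: p1/p4 are the
    horizontal corners, (p2, p6) and (p3, p5) are vertical pairs."""
    p1, p2, p3, p4, p5, p6 = eye
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

def blink_intervals(ear_series, threshold=0.21):
    """Return (start, end) frame-index pairs where EAR stays below the
    threshold, i.e. candidate blink (eye-closure) intervals."""
    intervals, start = [], None
    for i, ear in enumerate(ear_series):
        if ear < threshold and start is None:
            start = i                       # closure begins
        elif ear >= threshold and start is not None:
            intervals.append((start, i))    # closure ends
            start = None
    if start is not None:                   # closure runs to the last frame
        intervals.append((start, len(ear_series)))
    return intervals

# Per-frame EAR values: two dips below threshold → two blinks.
ears = [0.30, 0.28, 0.15, 0.12, 0.29, 0.31, 0.10, 0.30]
print(blink_intervals(ears))  # [(2, 4), (6, 7)]
```

In a learned detector such as the one described above, per-frame eye-region crops (rather than hand-computed EAR values) are fed to the spatial-temporal attention layers, which pick up the same interval statistics implicitly.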