Abstract: To address the tendency of traditional methods to ignore tampering in the audio modality, this study explores a deep forgery video detection technique that improves detection precision and reliability by fusing lip images and audio signals. The method is lip-audio matching detection based on a Siamese neural network, combined with band-pass-filtered MFCC (Mel-Frequency Cepstral Coefficient) feature extraction, an improved dual-branch Siamese network structure, and a two-stream network design. First, the video stream is preprocessed to extract lip images, and the audio stream is preprocessed to extract MFCC features. These features are then processed separately by the two branches of the Siamese network. Finally, the model is trained and optimized through fully connected layers and loss functions. Experimental results show that on the LRW (Lip Reading in the Wild) dataset the model reaches a test accuracy of 92.3%, a recall of 94.3%, and an F1 score of 93.3%, significantly better than CNN (Convolutional Neural Network) and LSTM (Long Short-Term Memory) models. In the validation of multi-resolution image streams, dual-resolution image streams reach a highest accuracy of 94%. Band-pass filtering effectively improves the signal-to-noise ratio across different types of audio signals. The model also performs well in real-time processing and achieves an average score of up to 5 in the user study. These results indicate that the proposed method effectively fuses visual and audio information in deep forgery video detection and accurately identifies inconsistencies between video and audio, verifying the effectiveness of lip-audio modality fusion in improving detection performance.
Funding: This study is supported by the Fundamental Research Funds for the Central Universities of PPSUC under Grant 2022JKF02009.
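The two ingredients named in the lip-audio matching abstract above, band-pass filtering of the audio and a distance-based comparison of the two Siamese branch outputs, can be illustrated with a minimal sketch. It is not the authors' implementation: the FFT-mask filter, the 300-3400 Hz speech band, the contrastive loss form, and the margin value are all assumptions chosen for illustration, and MFCC extraction and the branch networks themselves are omitted.

```python
import numpy as np

def bandpass_fft(signal, sample_rate, low_hz, high_hz):
    """Idealized band-pass: zero every frequency bin outside [low_hz, high_hz]."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

def contrastive_loss(lip_emb, audio_emb, is_match, margin=1.0):
    """Contrastive loss on the distance between the two branch embeddings.

    Matching pairs are pulled together; mismatched pairs are pushed apart
    until they are at least `margin` away (margin is a hypothetical default).
    """
    dist = float(np.linalg.norm(np.asarray(lip_emb) - np.asarray(audio_emb)))
    if is_match:
        return dist ** 2
    return max(0.0, margin - dist) ** 2

# Demo: a 440 Hz tone (stand-in for speech) corrupted by 50 Hz hum.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
speech_like = np.sin(2 * np.pi * 440 * t)
hum = np.sin(2 * np.pi * 50 * t)

# Assumed pass band of 300-3400 Hz; the abstract does not state the band used.
filtered = bandpass_fft(speech_like + hum, sample_rate, 300.0, 3400.0)
residual_error = float(np.max(np.abs(filtered - speech_like)))

# Stand-in embeddings; in the paper these would come from the two branches.
loss_match = contrastive_loss([0.2, 0.4], [0.2, 0.4], is_match=True)
loss_mismatch_close = contrastive_loss([0.0, 0.0], [0.5, 0.0], is_match=False)
loss_mismatch_far = contrastive_loss([0.0, 0.0], [2.0, 0.0], is_match=False)
```

In this toy setup the hum lies outside the pass band and is removed, and a mismatched lip-audio pair is penalized only while its embeddings remain closer than the margin. At inference, a threshold on the same distance would flag a clip as inconsistent.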
Abstract: Face forgery detection is drawing ever-increasing attention in the academic community owing to security concerns. Despite the considerable progress of existing methods, we note that previous works overlooked fine-grained forgery cues with high transferability. Such cues positively impact the model’s accuracy and generalizability. Moreover, relying on a single modality often causes the model to overfit, and using the Red-Green-Blue (RGB) modality alone is not conducive to extracting more detailed forgery traces. To cope with these issues, we propose a novel framework for mining fine-grained forgery cues with fused modalities. First, we propose two functional modules to reveal and locate deeper forged features: a dual-modality progressive fusion module and a noise adaptive enhancement module, which excavate the association between the dual-modal space and channels and enhance the learning of subtle noise features. On this foundation, a sensitive patch branch is introduced to enhance the mining of subtle forgery traces under the fused modality. The experimental results demonstrate that the proposed framework can effectively explore the differences between authentic and forged images with supervised learning. Comprehensive evaluations on several mainstream datasets show that our method outperforms state-of-the-art detection methods, with remarkable detection ability and generalizability.
Funding: This work was funded by the PSDSARC seed project (PSDSARC Project ID: PID-000085_01_02); the APC was funded by PSU.
Abstract: Navigation without Global Navigation Satellite Systems (GNSS) poses a significant challenge in aerospace engineering, particularly in environments where satellite signals are obstructed or unavailable. This paper offers an in-depth review of methods, sensors, and algorithms for Unmanned Aerial Vehicle (UAV) localization in outdoor environments where GNSS signals are unavailable or denied. A key contribution of this study is a critical classification system that divides GNSS-denied navigation techniques into two primary categories: absolute and relative localization. This classification enhances the understanding of the strengths and weaknesses of different strategies in various operational contexts. Vision-based localization is identified as the most effective approach in GNSS-denied environments. Nonetheless, no single-sensor-based localization algorithm can fulfill all the needs of a comprehensive navigation system in outdoor environments. It is therefore vital to implement a hybrid strategy that merges various algorithms and sensors. This multi-faceted analysis highlights the challenges, complexities, and possible solutions for achieving reliable and efficient outdoor UAV localization in environments where GNSS is unreliable or unavailable.
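The hybrid strategy argued for in the review above can be illustrated with a toy example that combines a relative estimate (which drifts) with occasional absolute fixes (which bound the drift). This is a minimal sketch under stated assumptions, not a method from the paper: the function name `fuse_position`, the fixed blending `gain`, and the choice of visual odometry and map matching as example sources are all hypothetical.

```python
import numpy as np

def fuse_position(relative_steps, absolute_fixes, gain=0.5):
    """Dead-reckon from relative steps and blend in absolute fixes when present.

    relative_steps: sequence of (dx, dy) displacement estimates, for example
                    from visual odometry (assumed example of relative localization).
    absolute_fixes: dict mapping step index -> (x, y) position estimate, for
                    example from matching camera images against a georeferenced
                    map (assumed example of absolute localization).
    gain: weight in [0, 1] given to an absolute fix (hypothetical tuning constant).
    """
    position = np.zeros(2)
    track = []
    for i, step in enumerate(relative_steps):
        # Relative update: accumulate displacement, so errors accumulate too.
        position = position + np.asarray(step, dtype=float)
        if i in absolute_fixes:
            # Absolute update: pull the estimate toward the drift-free fix.
            fix = np.asarray(absolute_fixes[i], dtype=float)
            position = (1.0 - gain) * position + gain * fix
        track.append(position.copy())
    return np.array(track)

# True motion is 1 m per step along x; the odometry overestimates by 10%.
steps = [(1.1, 0.0)] * 10
fixes = {4: (5.0, 0.0), 9: (10.0, 0.0)}  # absolute fixes after steps 5 and 10

drift_only = fuse_position(steps, {})
hybrid = fuse_position(steps, fixes)

drift_error = abs(drift_only[-1, 0] - 10.0)
hybrid_error = abs(hybrid[-1, 0] - 10.0)
```

With relative updates alone the final position error is 1.0 m, while the two absolute fixes reduce it to 0.375 m. A practical system would typically replace the fixed gain with a filter that weights each source by its uncertainty, such as a Kalman filter.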