Abstract: Neuropsychological tests, such as the Rey-Osterrieth complex figure (ROCF) test, help detect mild cognitive impairment (MCI) in adults by assessing cognitive abilities such as planning, organization, and memory. Furthermore, they are inexpensive and minimally invasive, making them excellent tools for early screening. In this paper, we propose the use of image analysis models to characterize the relationship between an individual’s ROCF drawing and their cognitive state. This task is usually framed as a classification problem and solved using deep learning models, owing to their success over the last decade. To perform well, these models need to be trained on a large number of examples. Because our data are limited, we instead treat the task as a similarity learning problem, performing pairwise ROCF drawing comparisons to define groups that represent different cognitive states. This approach could lead to better data utilization and improved model performance. To solve the similarity learning problem, we propose a Siamese neural network (SNN) that exploits the distances of arbitrary ROCF drawings to the ideal representation of the ROCF. Our proposal is compared against various deep learning models designed for classification using a public dataset of 528 ROCF copy drawings, each associated with either a healthy individual or an individual with MCI. Quantitative results are derived from a scheme involving multiple rounds of evaluation, employing both a dedicated test set and 14-fold cross-validation. Our SNN achieves superior validation performance and test results comparable to those of the classification-based deep learning models.
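The similarity-learning idea in this abstract can be illustrated with a minimal sketch. It assumes that drawings have already been mapped to embedding vectors by some shared network, that pairs are trained with the standard contrastive loss, and that a drawing is labeled by thresholding its distance to the embedding of the ideal figure. The function names, the margin, and the threshold rule are hypothetical illustrations, not details taken from the paper.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(dist, same_group, margin=1.0):
    """Standard contrastive loss for one pair (an assumed choice of loss).

    Same-group pairs are pulled together; different-group pairs are
    pushed until they are at least `margin` apart.
    """
    if same_group:
        return dist ** 2
    return max(0.0, margin - dist) ** 2

def classify_by_reference(embedding, reference, threshold):
    """Hypothetical decision rule: near the ideal figure means 'healthy'."""
    return "healthy" if euclidean(embedding, reference) <= threshold else "MCI"

# Toy 2-D embeddings; a real system would obtain these from the network.
reference = [0.0, 0.0]        # embedding of the ideal ROCF (assumed)
close_drawing = [0.3, 0.4]    # about 0.5 away from the reference
far_drawing = [3.0, 4.0]      # 5.0 away from the reference

label_close = classify_by_reference(close_drawing, reference, threshold=1.0)
label_far = classify_by_reference(far_drawing, reference, threshold=1.0)
```

Because every drawing can be paired with every other drawing, a dataset of a few hundred images yields many more training pairs than training images, which is one reason pairwise training can make better use of limited data.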
Abstract: Traditional methods ignore tampering in the audio modality. To address this problem, this study explores a deepfake video detection technique that improves detection precision and reliability by fusing lip images and audio signals. The main method is lip-audio matching detection based on a Siamese neural network, combined with MFCC (Mel-frequency cepstral coefficient) feature extraction using band-pass filters, an improved dual-branch Siamese network structure, and a two-stream network design. First, the video stream is preprocessed to extract lip images, and the audio stream is preprocessed to extract MFCC features. Then, these features are processed separately by the two branches of the Siamese network. Finally, the model is trained and optimized through fully connected layers and loss functions. The experimental results show that, on the LRW (Lip Reading in the Wild) dataset, the model reaches a testing accuracy of 92.3%, a recall of 94.3%, and an F1 score of 93.3%, significantly better than the results of CNN (convolutional neural network) and LSTM (long short-term memory) models. In the validation of multi-resolution image streams, the highest accuracy with dual-resolution image streams reaches 94%. Band-pass filters effectively improve the signal-to-noise ratio when processing different types of audio signals for deepfake video detection. The real-time processing performance of the model is also excellent, and it achieves an average score of up to 5 in a user study. These results demonstrate that the proposed method can effectively fuse visual and audio information in deepfake video detection and accurately identify inconsistencies between video and audio, verifying the effectiveness of lip-audio modality fusion in improving detection performance.
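As a consistency check on the reported metrics, the precision implied by the stated recall and F1 score can be computed, assuming the standard definition F1 = 2PR / (P + R). The helper name below is hypothetical, and the check is arithmetic on the published figures only, not a reproduction of the experiment.

```python
def implied_precision(recall, f1):
    """Solve F1 = 2*P*R / (P + R) for the precision P."""
    denom = 2.0 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision exists for these values")
    return f1 * recall / denom

# Values reported in the abstract: recall 94.3%, F1 score 93.3%.
precision = implied_precision(recall=0.943, f1=0.933)
print(round(precision, 3))  # 0.923
```

The implied precision of about 92.3% coincides with the figure the abstract labels as testing accuracy, so that figure may in fact be a precision; the abstract alone does not settle this.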
Funding: Supported by the Science and Technology Support Project of Sichuan Science and Technology Department (2018SZ0357) and China Scholarship.
Abstract: A person’s eye gaze can effectively express that person’s intentions. Thus, gaze estimation is an important approach in intelligent manufacturing for analyzing a person’s intentions. Many gaze estimation methods regress the direction of the gaze by analyzing images of the eyes, also known as eye patches. However, because of individual differences, it is very difficult to construct a person-independent model that can estimate an accurate gaze direction for every person. In this paper, we hypothesize that the difference in the appearance of each of a person’s eyes is related to the difference in the corresponding gaze directions. Based on this hypothesis, a differential eyes’ appearances network (DEANet) is trained on public datasets to predict the gaze differences of pairwise eye patches belonging to the same individual. Our proposed DEANet is based on a Siamese neural network (SNNet) framework, which has two identical branches. A multi-stream architecture is fed into each branch of the SNNet. The two branches of the DEANet, which share the same weights, extract the features of the patches; the features are then concatenated to obtain the difference of the gaze directions. Once the differential gaze model is trained, a new person’s gaze direction can be estimated when a few calibrated eye patches for that person are provided. Because person-specific calibrated eye patches are involved in the testing stage, the estimation accuracy is improved. Furthermore, the need for a large amount of data to train a person-specific model is effectively avoided. A reference grid strategy is also proposed to select a few references as some of the DEANet’s inputs directly based on the estimation values, thereby further improving the estimation accuracy. Experiments on public datasets show that our proposed approach outperforms the state-of-the-art methods.
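The calibration step described here can be sketched as follows. The sketch assumes that each calibrated reference contributes one estimate, namely its known gaze plus the predicted gaze difference, and that the estimates are combined by simple averaging. The averaging rule, the function names, and the toy stand-in for the trained network are assumptions for illustration; the abstract does not specify how references are combined.

```python
def estimate_gaze(query_patch, references, predict_difference):
    """Estimate (yaw, pitch) for a query patch from calibrated references.

    references: list of (patch, (yaw, pitch)) pairs from the same person.
    predict_difference: callable standing in for the trained differential
    network; it returns (d_yaw, d_pitch) = gaze(query) - gaze(reference).
    """
    if not references:
        raise ValueError("at least one calibrated reference is required")
    yaw_sum = pitch_sum = 0.0
    for ref_patch, (ref_yaw, ref_pitch) in references:
        d_yaw, d_pitch = predict_difference(query_patch, ref_patch)
        yaw_sum += ref_yaw + d_yaw
        pitch_sum += ref_pitch + d_pitch
    n = len(references)
    return yaw_sum / n, pitch_sum / n

def toy_difference(query, ref):
    """Toy stand-in: each 'patch' is its own true gaze, so the difference is exact."""
    return query[0] - ref[0], query[1] - ref[1]

# Two calibrated references for one person (toy values, in degrees).
refs = [((10.0, 0.0), (10.0, 0.0)), ((-5.0, 5.0), (-5.0, 5.0))]
estimate = estimate_gaze((2.0, 3.0), refs, toy_difference)
```

With a real network the per-reference estimates would disagree slightly, which is why selecting good references (the reference grid strategy mentioned in the abstract) can affect accuracy.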
Funding: This work has received funding from the European Community’s H2020 Programme under grant agreement No. 826183 (SPHINX).
Abstract: Intrusion detection systems (IDS) can play a significant role in detecting security threats or malicious attacks that aim to steal information and/or corrupt network protocols. To deal with the dynamic and complex nature of cyber-attacks, advanced intelligent tools have been applied, resulting in powerful and automated IDS that rely on the latest advances in machine learning (ML) and deep learning (DL). Most of the reported effort has been devoted to building complex ML/DL architectures, adopting a brute-force approach to maximizing their detection capacity. However, only a limited number of studies have focused on identifying or extracting user-friendly risk indicators that could be easily used by security experts. Many papers have explored various dimensionality reduction algorithms; however, a large number of selected features is still required to detect attacks successfully, and humans cannot intuitively or immediately understand them. To enhance users’ trust in and understanding of the data without sacrificing accuracy, this paper contributes to the transformation of the data collected by IDS into a single actionable and easy-to-understand risk indicator. To achieve this, a novel feature extraction pipeline was implemented, consisting of the following components: (i) a fuzzy allocation scheme that transforms raw data into fuzzy class memberships, (ii) a novel modality transformation mechanism for converting feature vectors to images (Vec2im), and (iii) a dimensionality reduction module that uses Siamese convolutional neural networks to reduce the input data dimensionality to a 1-D feature space. The performance of the proposed methodology was validated with respect to detection accuracy, dimensionality reduction performance, and execution time on the NSL-KDD dataset via a thorough comparative analysis that demonstrated its effectiveness (86.64% testing accuracy using only one feature) over a number of well-known feature selection (FS) and extraction techniques. The output of the proposed feature extraction pipeline could potentially be used by security experts as an indicator of malicious activity, whereas the generated images could be further utilized and/or integrated as a visual analytics tool in existing IDS.
Funding: Supported by the National Natural Science Foundation of China (U19B2016) and the Zhejiang Provincial Key Lab of Data Storage and Transmission Technology, Hangzhou Dianzi University.
Abstract: To improve the recognition of communication jamming signals, Siamese Neural Network-based Open World Recognition (SNN-OWR) is proposed. The algorithm can recognize known jamming classes, detect new (unknown) jamming classes, and cluster the new classes without supervision. The SNN-OWR network is trained in a supervised manner on paired input data consisting of two samples from a known dataset. On the one hand, the network is required to distinguish whether two samples are from the same class. On the other hand, the latent distribution of each known class is forced to approach its own unique Gaussian distribution, in preparation for the subsequent open-set testing. During testing, an unknown-class detection process based on a Gaussian probability density function threshold is designed, and unsupervised clustering of the unknown jamming is realized by using prior knowledge of the known classes. The simulation results show that when the jamming-to-noise ratio is above 0 dB, the accuracy of the SNN-OWR algorithm for known jamming class recognition, unknown jamming detection, and unsupervised clustering of unknown jamming is about 95%. This indicates that the SNN-OWR algorithm can recognize unknown jamming almost as well as known jamming.
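The density-threshold test for unknown classes can be sketched in a few lines. The sketch assumes each known class is modeled in latent space by a diagonal Gaussian with given means and standard deviations, and that a sample is declared unknown when its density under every known class falls below a fixed threshold. The class names, parameters, and threshold value are hypothetical; the abstract does not give them.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def detect(latent, class_params, threshold):
    """Return the best-matching known class, or 'unknown'.

    latent: list of latent features for one sample.
    class_params: {name: (means, stds)}, one diagonal Gaussian per class.
    The sample is 'unknown' if its highest class density is below threshold.
    """
    best_name, best_density = None, 0.0
    for name, (means, stds) in class_params.items():
        density = 1.0
        for x, m, s in zip(latent, means, stds):
            density *= gaussian_pdf(x, m, s)
        if density > best_density:
            best_name, best_density = name, density
    return best_name if best_density >= threshold else "unknown"

# Two hypothetical known jamming classes in a 2-D latent space.
known = {
    "tone": ([0.0, 0.0], [1.0, 1.0]),
    "sweep": ([5.0, 5.0], [1.0, 1.0]),
}
near_tone = detect([0.1, -0.1], known, threshold=1e-3)
near_sweep = detect([5.0, 5.2], known, threshold=1e-3)
in_between = detect([2.5, 2.5], known, threshold=1e-3)
```

Training the latent space so that each known class approaches its own Gaussian is what makes a simple density threshold meaningful at test time: samples far from all class centers receive low density under every class.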
Funding: This study was co-supported by the National Natural Science Foundation of China (Nos. 61673017 and 61403398).
Abstract: This paper addresses the problem of visual object tracking for Unmanned Aerial Vehicles (UAVs). Most Siamese trackers treat object tracking as classification and regression problems. However, it is difficult for these trackers to classify accurately in the face of similar objects, background clutter, and other common challenges in UAV scenes. A reliable classifier is therefore the key to improving UAV tracking performance. In this paper, a simple yet efficient tracker following the basic architecture of the Siamese neural network is proposed, which improves classification ability in three stages. First, a frequency channel attention module is introduced to enhance the target features via frequency-domain learning. Second, a template-guided attention module is designed to promote information exchange between the template branch and the search branch, which yields reliable classification response maps. Third, an adaptive cross-entropy loss is proposed to make the tracker focus on hard samples that contribute more to the training process, addressing the data imbalance between positive and negative samples. To evaluate the performance of the proposed tracker, comprehensive experiments are conducted on two challenging aerial datasets, UAV123 and UAVDT. Experimental results demonstrate that the proposed tracker achieves favorable tracking performance on aerial benchmarks at more than 41 frames/s. We also conducted experiments in real UAV scenes to further verify the efficiency of our tracker in the real world.
Funding: This work was funded by the Deanship of Scientific Research at Jouf University under Grant No. DSR-2022-RG-0104.
Abstract: Many studies focus on using palmprint recognition for user identification and authentication. The palmprint is a biometric authentication factor (something you are) that remains invariant throughout a person’s life and needs careful protection during enrollment into biometric authentication systems. Accuracy and irreversibility are critical requirements for securing the palmprint template during enrollment and verification. This paper proposes an innovative HAMTE neural network model that contains Hetero-Associative Memory for palmprint template translation and projection using matrix multiplication and dot-product multiplication. A HAMTE-Siamese network is constructed, which accepts two palmprint templates and predicts whether they belong to the same user or to different users. The HAMTE is generated for each user during the enrollment phase and is responsible for generating a secure template for the enrolled user. The proposed network secures the person’s palmprint template by translating it into an irreversible template (in a different feature space). The secure template can be stored safely in a trusted or untrusted third-party authentication system, which protects the person’s original template from being stolen. Experiments were conducted on the CASIA database, where the proposed network achieved accuracy close to the original accuracy for the unprotected palmprint templates. The recognition accuracy deviated by around 3% and the equal error rate (EER) by approximately 0.02 compared to the original data, with acceptable run-time performance (approximately 13 ms), while preserving the irreversibility property of the secure template. Moreover, the brute-force attack has been analyzed under the new palmprint protection scheme.