Funding: Supported by the National Natural Science Foundation of China (62473341) and the Key Research and Development Special Project of Henan Province (221111210500, 242102211071, 242102210142, 232102211053).
Abstract: The rapid development and widespread adoption of Internet technology have significantly increased Internet traffic, highlighting the growing importance of network security. Intrusion Detection Systems (IDS) are essential for safeguarding network integrity. To address the low accuracy of existing intrusion detection models in identifying network attacks, this paper proposes an intrusion detection method based on the fusion of a Spatial Attention mechanism and a Residual Neural Network (SA-ResNet). Residual connections effectively capture local features in the data, while the spatial attention mechanism extracts the global dependency relationships among intrusion features. This strengthens the model's focus on the global features of intrusions and improves recognition accuracy. The proposed model was experimentally verified on the NSL-KDD dataset. The results show that the SA-ResNet method reaches an intrusion recognition accuracy of 99.86%, and its overall accuracy is 0.41% higher than that of traditional Convolutional Neural Network (CNN) models.
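To make the combination of a residual connection and a spatial attention gate concrete, here is a minimal sketch in plain Python. It is not the SA-ResNet architecture from the paper: the pooling-based attention score, the fixed weights `w_avg` and `w_max` (standing in for a learned convolution), and the ReLU `transform` are all assumptions chosen for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def spatial_attention(x, w_avg=1.0, w_max=1.0):
    """Return one weight in (0, 1) per position.

    x is a list of C channels, each a list of N positions.
    w_avg and w_max are hypothetical fixed weights standing in
    for a learned convolution over the pooled descriptors.
    """
    weights = []
    for j in range(len(x[0])):
        column = [channel[j] for channel in x]
        avg_pool = sum(column) / len(column)  # average across channels
        max_pool = max(column)                # maximum across channels
        weights.append(sigmoid(w_avg * avg_pool + w_max * max_pool))
    return weights


def residual_attention_block(x, transform):
    """Compute y = x + attention * transform(x), position by position."""
    fx = [[transform(v) for v in channel] for channel in x]
    attn = spatial_attention(fx)
    return [
        [xv + a * fv for xv, fv, a in zip(x_channel, f_channel, attn)]
        for x_channel, f_channel in zip(x, fx)
    ]


def relu(v):
    return max(0.0, v)


# Two channels, three positions (illustrative values only).
features = [[0.1, 2.0, -0.5], [0.3, 1.5, -0.2]]
attention = spatial_attention(features)
output = residual_attention_block(features, relu)
```

The skip term `x` passes the input through unchanged, so the attention-weighted branch only has to model a correction on top of it; positions with stronger activations receive weights closer to 1.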
Funding: Supported by the Henan Provincial Science and Technology Research Project under Grants 232102211006, 232102210044, 232102211017, 232102210055 and 222102210214; the Science and Technology Innovation Project of Zhengzhou University of Light Industry under Grant 23XNKJTD0205; the Undergraduate Universities Smart Teaching Special Research Project of Henan Province under Grant Jiao Gao [2021] No. 489-29; and the Doctor Natural Science Foundation of Zhengzhou University of Light Industry under Grants 2021BSJJ025 and 2022BSJJZK13.
Abstract: Multispectral pedestrian detection uses infrared images to supplement visible-light images with reliable information, which offers significant advantages in low-light conditions and under background occlusion. However, maintaining detection speed while continuing to improve cross-modal feature extraction and fusion remains a challenge. We have devised a deep learning network for cross-modal pedestrian detection based on ResNet50 that aims to focus on more reliable features and improve detection efficiency. The model employs a spatial attention mechanism to reweight the input visible-light and infrared image data, which strengthens its focus on different spatial positions, and it shares the weighted feature data across modalities to reduce interference between multi-modal features. Lightweight modules with depthwise separable convolution are then incorporated to reduce the parameter count and computational load through channel-wise and point-wise convolutions. The proposed network was validated experimentally on the publicly available KAIST dataset and compared with existing methods. The results show that the approach performs favorably in a variety of complex environments, which supports the effectiveness of the proposed multispectral pedestrian detection method.
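The parameter saving from depthwise separable convolution can be checked with simple arithmetic. The sketch below counts weights only (bias terms ignored); the layer sizes, a 3x3 kernel with 64 input and 128 output channels, are assumed for illustration and are not taken from the paper.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: every output channel
    has its own kernel x kernel filter over every input channel."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution."""
    depthwise = kernel * kernel * c_in  # one spatial filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only to show the ratio.
standard = conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
ratio = separable / standard

print(standard, separable, round(ratio, 3))  # 73728 8768 0.119
```

In general the ratio equals 1/c_out + 1/kernel^2, so for 3x3 kernels the separable form needs roughly one eighth to one ninth of the weights once the number of output channels is large.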
Funding: This work was supported by the Sichuan Science and Technology Program (2021YFQ0003).
Abstract: Visual question answering (VQA) has attracted increasing attention in computer vision and natural language processing. Researchers have studied how to better integrate image features and text features to achieve better results on VQA tasks. Analyzing all features may cause information redundancy and a heavy computational burden, and attention mechanisms are an effective way to address this problem. However, a single attention mechanism may attend to the features incompletely. This paper improves on existing attention methods and proposes a hybrid attention mechanism that combines spatial attention and channel attention. Because attention can discard part of the original features, a small portion of the image features is added back as compensation. For text features, a self-attention mechanism is introduced to strengthen the internal structural features of sentences and improve the overall model. The results show that the attention mechanism and feature compensation improve the accuracy of the multimodal low-rank bilinear pooling network by 6.1%.
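The idea of combining channel attention, spatial attention, and feature compensation can be sketched as follows. This is only an illustration under simplifying assumptions, not the paper's model: both gates are computed from the raw features with average pooling and a sigmoid (no learned layers), and the compensation factor `alpha` is a hypothetical value.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def hybrid_attention(x, alpha=0.1):
    """Apply channel and spatial gates, then add back a fraction of x.

    x is a list of C channels, each a list of N spatial positions.
    alpha is a hypothetical compensation factor, not a value from the paper.
    Returns (attended, compensated).
    """
    channels, positions = len(x), len(x[0])

    # Channel attention: one gate per channel from its average activation.
    channel_gate = [sigmoid(sum(channel) / positions) for channel in x]

    # Spatial attention: one gate per position from the average over channels.
    spatial_gate = [
        sigmoid(sum(x[c][j] for c in range(channels)) / channels)
        for j in range(positions)
    ]

    attended = [
        [x[c][j] * channel_gate[c] * spatial_gate[j] for j in range(positions)]
        for c in range(channels)
    ]

    # Feature compensation: re-inject a small portion of the original features.
    compensated = [
        [attended[c][j] + alpha * x[c][j] for j in range(positions)]
        for c in range(channels)
    ]
    return attended, compensated


image_features = [[1.0, 0.2, 0.0], [0.5, 0.1, 2.0]]
attended, compensated = hybrid_attention(image_features, alpha=0.1)
```

Because both gates lie strictly between 0 and 1, attention alone can only shrink each feature; the `alpha * x` term restores part of the signal that the gates suppressed.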
Funding: This research was supported by the NSFC (61672495), the Scientific Research Fund of the Hunan Provincial Education Department (16A208), and the Project of the Hunan Provincial Science and Technology Department (2017SK2405), and in part by the construction program of the key discipline in Hunan Province and the CERNET Innovation Project (NGII20170715).
Abstract: With the improvement of the national economy, the number of vehicles continues to increase year by year. According to the National Bureau of Statistics, there were approximately 327 million vehicles in China by the end of 2018. This growth keeps raising urban traffic pressure and increasingly disrupts urban traffic order. Illegal parking, a common problem in transportation security, urgently needs to be solved. Traditional methods rely mainly on ground loops and manual supervision, which can miss violations and require considerable manpower. With deep learning developing rapidly in recent years, object detection methods that rely on background segmentation can no longer meet the speed and precision requirements of complex and varied scenes. We therefore propose an improved Single Shot MultiBox Detector (SSD) based on deep learning. We introduce an attention mechanism through a spatial transformer module, which gives neural networks the ability to actively spatially transform feature maps, and we add contextual information transmission in a specified layer. Through repeated experiments, we identified the best connection layer in the detection model, especially for small objects, and increased precision by 1.5% over the baseline SSD without extra training cost. We also designed an illegal parking detection method based on the improved SSD that reaches a precision of up to 97.3% at a speed of 40 FPS, which is superior to most vehicle detection methods and helps relieve the negative impact of illegal parking.
Funding: Supported by the National Natural Science Foundation of China (Grant Number: 61962010).
Abstract: Deepfake-generated fake faces, commonly used in identity-related activities such as political propaganda, celebrity impersonation, evidence forgery, and familiar fraud, pose new societal threats. Although current deepfake generators strive for high visual realism, they do not replicate the biometric signals that indicate cardiac activity. To exploit this gap, many researchers have developed detection methods that focus on biometric characteristics. These methods use classification networks to analyze both temporal and spectral domain features of the remote photoplethysmography (rPPG) signal and achieve high detection accuracy. In the spectral analysis, however, existing approaches often consider only the power spectral density and neglect the amplitude spectrum, although both are crucial for assessing cardiac activity. We introduce a method that extracts rPPG signals from multiple regions of interest and processes them with the Fast Fourier Transform (FFT). The resulting time-frequency domain signal samples are organized into matrices to create Matrix Visualization Heatmaps (MVHM), which are then used to train an image classification network. We also explored various combinations of time-frequency domain representations of rPPG signals and the impact of attention mechanisms. Our experimental results show that the algorithm achieves a detection accuracy of 99.22% in identifying fake videos, significantly outperforming mainstream algorithms and demonstrating the effectiveness of the Fourier Transform and attention mechanisms in detecting fake faces.
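The difference between the amplitude spectrum and the power spectral density can be illustrated on a synthetic pulse signal. The sketch below uses a direct discrete Fourier transform (an FFT computes the same coefficients faster) and a simple periodogram-style power estimate. The sampling rate, window length, and 1.5 Hz pulse frequency are assumptions for illustration, not settings from the paper.

```python
import cmath
import math


def dft(signal):
    """Direct O(n^2) discrete Fourier transform of a real sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def spectra(signal):
    """Return (amplitude, power) over the non-negative frequency bins."""
    n = len(signal)
    coeffs = dft(signal)[: n // 2 + 1]
    amplitude = [abs(c) for c in coeffs]       # |X[k]|
    power = [abs(c) ** 2 / n for c in coeffs]  # |X[k]|^2 / n
    return amplitude, power


# Hypothetical settings: 30 frames per second, a 2-second window,
# and a clean 1.5 Hz (90 beats per minute) pulse.
FS = 30.0
N = 60
PULSE_HZ = 1.5
rppg = [math.sin(2 * math.pi * PULSE_HZ * t / FS) for t in range(N)]

amplitude, power = spectra(rppg)
peak_bin = max(range(len(amplitude)), key=amplitude.__getitem__)
peak_hz = peak_bin * FS / N
```

Both spectra peak at the same frequency bin, but the power estimate squares the magnitudes and therefore emphasizes the dominant peak much more strongly than the amplitude spectrum does. Stacking such rows for several regions of interest would give a matrix that can be rendered as a heatmap.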
Funding: The National Natural Science Foundation of China (No. 61673269).
Abstract: Visual object tracking is an important problem that has received long-term attention in computer vision. The ability to handle occlusion effectively, especially severe occlusion, is an important criterion for evaluating object tracking algorithms in long-term tracking and is of great significance for improving their robustness. However, most object tracking algorithms lack a mechanism designed specifically for occlusion. Under occlusion, target information is missing, so the target position must be predicted from the motion trajectory. Kalman filtering and particle filtering can effectively predict the target motion state from historical motion information. Probabilistic discriminative model prediction (PrDiMP) is a single-object tracking method based on a spatial attention mechanism for complex scenes and occlusions. To improve the performance of PrDiMP, Kalman filtering, particle filtering and linear filtering are introduced. First, for occlusion, Kalman filtering and particle filtering are each used to predict the object position; the prediction replaces the detection result of the original tracking algorithm, and the recursive update of the target model is stopped. Second, for the detection-jump problem caused by similar objects in complex scenes, a linear filtering window is added. Evaluation results on three datasets (GOT-10k, UAV123 and LaSOT) and visualization results on several videos show that the algorithms improve tracking performance under occlusion and effectively suppress detection jumps.
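The occlusion strategy described above, where a motion model takes over when no detection is available, can be sketched with a one-dimensional constant-velocity Kalman filter. This is a generic textbook filter, not the paper's integration with PrDiMP; the noise values `q` and `r`, the unit time step, and the use of `None` to mark occluded frames are assumptions made for illustration.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state [position, velocity] and a time step of 1."""

    def __init__(self, pos, vel=0.0, p=1.0, q=1e-2, r=1.0):
        self.x = [pos, vel]               # state estimate
        self.P = [[p, 0.0], [0.0, p]]     # state covariance
        self.q = q                        # process noise (hypothetical value)
        self.r = r                        # measurement noise (hypothetical value)

    def predict(self):
        pos, vel = self.x
        self.x = [pos + vel, vel]
        # P = F P F^T + Q with F = [[1, 1], [0, 1]] and Q = q * I
        (a, b), (c, d) = self.P
        self.P = [[a + b + c + d + self.q, b + d],
                  [c + d, d + self.q]]
        return self.x[0]

    def update(self, z):
        # Only the position is measured: H = [1, 0]
        (a, b), (c, d) = self.P
        s = a + self.r                    # innovation covariance
        k0, k1 = a / s, c / s             # Kalman gain
        y = z - self.x[0]                 # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
        return self.x[0]


def track(measurements):
    """Return filtered positions; None marks an occluded frame."""
    kf = ConstantVelocityKalman1D(measurements[0])
    positions = [measurements[0]]
    for z in measurements[1:]:
        predicted = kf.predict()
        # During occlusion the prediction replaces the missing detection.
        positions.append(predicted if z is None else kf.update(z))
    return positions, kf


positions, kf = track([0.0, 1.0, 2.0, 3.0, None, None])
```

While detections are available the filter corrects its state with each measurement. On occluded frames it skips the update and extrapolates along the last estimated velocity, which matches the idea of replacing the tracker output with a motion prediction.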
Funding: Supported in part by projects of the National Natural Science Foundation of China (62376059, 41971340), the Fujian Provincial Department of Science and Technology (2023XQ008, 2023I0024, 2021Y4019), the Fujian Provincial Department of Finance (GY-Z230007, GYZ23012), and the Fujian Key Laboratory of Automotive Electronics and Electric Drive (KF-19-22001).
Abstract: Autonomous driving has advanced rapidly; however, ensuring safe and efficient driving in intricate scenarios remains a critical challenge. Traffic roundabouts in particular are difficult for autonomous vehicles because of the unpredictable entry and exit of vehicles, their susceptibility to traffic flow bottlenecks, and imperfect data when perceiving the environment, which makes them a key issue in the practical application of autonomous driving. To address these challenges, this work focuses on complex multi-lane roundabouts and proposes a Perception Enhanced Deep Deterministic Policy Gradient (PE-DDPG) for autonomous driving in roundabouts. Specifically, the model combines an enhanced variational autoencoder featuring an integrated spatial attention mechanism with the Deep Deterministic Policy Gradient framework, which improves the vehicle's ability to comprehend complex roundabout environments and make decisions. Furthermore, the PE-DDPG model incorporates a dynamic path optimization strategy for roundabout scenarios, which mitigates traffic bottlenecks and increases throughput efficiency. Extensive experiments were conducted on a collaborative simulation platform built from CARLA and SUMO. The results show that the proposed PE-DDPG outperforms the baseline methods in training convergence, driving smoothness, and traffic efficiency under diverse traffic flow patterns and penetration rates of autonomous vehicles (AVs). In general, the proposed PE-DDPG model could be employed for autonomous driving in complex scenarios with imperfect data.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 62162012, 62173278, and 62072061); the Science and Technology Support Program of Guizhou Province, China (No. QKHZC2021YB531); the Natural Science Research Project of the Department of Education of Guizhou Province, China (Nos. QJJ2022015 and QJJ2022047); the Science and Technology Foundation of Guizhou Province, China (Nos. QKHJCZK2022YB195, QKHJCZK2022YB197, and QKHJCZK2023YB143); the Scientific Research Platform Project of Guizhou Minzu University, China (No. GZMUSYS202104); and the 7th Batch High-Level Innovative Talent Project of Guizhou Province, China.
Abstract: To address the imbalance between taxi supply and passenger demand, this paper proposes a distributed model on Spark for accurate passenger hotspot prediction: ensemble empirical mode decomposition with normalization combined with a spatial attention mechanism based bi-directional gated recurrent unit (EEMDN-SABiGRU). The model aims to reduce blind cruising costs, improve carrying efficiency, and maximize income. Specifically, the EEMDN method is put forward to process the passenger hotspot data in the grid, which addresses non-smooth sequences and the degradation of prediction accuracy caused by excessive numerical differences while dealing with the eigenmodal EMD. Next, a spatial attention mechanism is constructed to capture the characteristics of passenger hotspots in each grid, taking passenger boarding and alighting hotspots as weights and emphasizing the spatial regularity of passengers in the grid. Furthermore, a bi-directional GRU is incorporated because a standard GRU captures only forward information and ignores backward information; this improves the accuracy of feature extraction. Finally, passenger hotspots are predicted with the EEMDN-SABiGRU model using real-world taxi GPS trajectory data in the Spark parallel computing framework. The experimental results on the four datasets in the 00-grid show that, compared with LSTM, EMD-LSTM, EEMD-LSTM, GRU, EMD-GRU, EEMD-GRU, EMDN-GRU, CNN, and BP, the mean absolute percentage error, mean absolute error, root mean square error, and maximum error of EEMDN-SABiGRU decrease by at least 43.18%, 44.91%, 55.04%, and 39.33%, respectively.
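For readers who want to reproduce this kind of comparison, the four reported error measures and the relative decrease between two models can be computed as in the sketch below. The sample values are made up for illustration and are unrelated to the paper's datasets; the function names are hypothetical.

```python
import math


def error_metrics(actual, predicted):
    """Return MAPE (in percent), MAE, RMSE and maximum absolute error.

    MAPE divides by the actual values, so they must be non-zero.
    """
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    n = len(errors)
    return {
        "mape": 100.0 * sum(e / abs(a) for e, a in zip(errors, actual)) / n,
        "mae": sum(errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "max_error": max(errors),
    }


def percent_decrease(baseline, proposed):
    """Relative reduction of an error value with respect to a baseline."""
    return 100.0 * (baseline - proposed) / baseline


# Made-up hotspot counts, for illustration only.
actual = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 40.0]
metrics = error_metrics(actual, predicted)
```

A statement such as "RMSE decreases by at least 55.04%" then corresponds to the smallest `percent_decrease(baseline_rmse, proposed_rmse)` across all baseline models and datasets.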