Funding: Supported by the National Natural Science Foundation of China (62272049, 62236006, 62172045) and the Key Projects of Beijing Union University (ZKZD202301).
Abstract: In recent years, gait-based emotion recognition has been widely applied in the field of computer vision. However, existing gait emotion recognition methods typically rely on complete human skeleton data, and their accuracy declines significantly when the data are occluded. To enhance the accuracy of gait emotion recognition under occlusion, this paper proposes a Multi-scale Suppression Graph Convolutional Network (MS-GCN). The MS-GCN consists of three main components: a Joint Interpolation Module (JI Module), a Multi-scale Temporal Convolution Network (MS-TCN), and a Suppression Graph Convolutional Network (SGCN). The JI Module completes spatially occluded skeletal joints using K-Nearest Neighbors (KNN) interpolation. The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in gait, compensating for temporal occlusion of gait information. The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body-part features, thereby reducing the negative impact of occlusion on emotion recognition results. The proposed method is evaluated on two comprehensive datasets: Emotion-Gait, which contains 4227 real gaits from sources such as BML, ICT-Pollick, and ELMD plus 1000 synthetic gaits generated with STEP-Gen technology, and ELMB, which consists of 3924 gaits, 1835 of them labeled with emotions such as "Happy," "Sad," "Angry," and "Neutral." On the standard Emotion-Gait and ELMB datasets, the proposed method achieved accuracies of 0.900 and 0.896, respectively, comparable to other state-of-the-art methods. Furthermore, on the occluded datasets, the proposed method significantly mitigates the performance degradation caused by occlusion, achieving accuracy substantially higher than that of competing methods.
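The KNN-based joint completion described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a per-frame rule in which an occluded joint is filled with the mean position of up to k of its observed skeleton neighbors, and the function name, the `neighbors` adjacency table, and the leave-as-missing fallback are all illustrative assumptions.

```python
def knn_fill(positions, neighbors, k=2):
    """Sketch of KNN-style joint completion for one frame.

    positions : list of (x, y) tuples, with None for occluded joints
    neighbors : dict mapping joint index -> list of adjacent joint indices
    An occluded joint is replaced by the mean position of up to k of its
    observed skeleton neighbors; if none are observed, it stays None.
    """
    filled = list(positions)
    for i, p in enumerate(positions):
        if p is not None:
            continue
        # gather up to k observed neighbors of the occluded joint
        obs = [positions[j] for j in neighbors[i] if positions[j] is not None][:k]
        if obs:
            # coordinate-wise mean of the observed neighbor positions
            filled[i] = tuple(sum(c) / len(obs) for c in zip(*obs))
    return filled
```

For a three-joint chain with the middle joint occluded, the midpoint of its two observed neighbors is recovered; in practice a temporal variant (nearest frames rather than nearest joints) would be a natural extension.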
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62472149, 62376089, 62202147) and the Hubei Provincial Science and Technology Plan Project (2023BCB04100).
Abstract: Accurate traffic flow prediction has a profound impact on modern traffic management. Traffic flow exhibits complex spatial-temporal correlations and periodicity, which makes precise prediction difficult. To address this problem, a Multi-head Self-attention and Spatial-Temporal Graph Convolutional Network (MSSTGCN) for multi-scale traffic flow prediction is proposed. Firstly, to capture the hidden periodicity of traffic flow, the data are divided into three kinds of periods: hourly, daily, and weekly. Secondly, a graph attention residual layer is constructed to learn global spatial features across regions, while local spatial-temporal dependence is captured by a T-GCN module. Thirdly, a transformer layer is introduced to learn long-term temporal dependence, and a position embedding mechanism labels position information for all traffic sequences, so that the multi-head self-attention mechanism can recognize sequence order and allocate weights to different time nodes. Experimental results on four real-world datasets show that MSSTGCN outperforms the baseline methods and can be successfully adapted to traffic prediction tasks.
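The three-period decomposition can be sketched as a simple slicing step over a flow series. This is a hedged illustration rather than the MSSTGCN code: the window size, the variable names, and the convention of taking a same-length window exactly one day and one week earlier are assumptions.

```python
def periodic_inputs(series, t, win, steps_per_day, steps_per_week):
    """Extract the three periodic context windows for predicting index t.

    recent : the `win` steps immediately before t
    daily  : the same-length window ending one day before t
    weekly : the same-length window ending one week before t
    """
    recent = series[t - win:t]                                    # most recent window
    daily = series[t - steps_per_day - win:t - steps_per_day]     # same window, previous day
    weekly = series[t - steps_per_week - win:t - steps_per_week]  # same window, previous week
    return recent, daily, weekly
```

With 5-minute sampling, `steps_per_day` would be 288 and `steps_per_week` 2016; the toy values below just keep the example short.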
Funding: Supported by the Guangxi University Young and Middle-aged Teachers Basic Ability Improvement Project (No. 2023KY1692) and the Guilin University of Information Technology 2022 Research Project (No. XJ202207).
Abstract: Purpose: Human behavior recognition poses a pivotal challenge in intelligent computing and cybernetics, significantly impacting engineering and management systems. With the rapid advancement of autonomous systems and intelligent manufacturing, there is an increasing demand for precise and efficient human behavior recognition technologies. However, traditional methods often suffer from insufficient accuracy and limited generalization ability when dealing with complex and diverse human actions. Therefore, this study aims to enhance the precision of human behavior recognition by proposing an innovative framework, dynamic graph convolutional networks with multi-scale position attention (DGCN-MPA). Design/methodology/approach: The primary applications are in autonomous systems and intelligent manufacturing. The main objective of this study is to develop an efficient human behavior recognition framework that leverages advanced techniques to improve the prediction and interpretation of human actions. The framework addresses the shortcomings of existing methods in handling the complexity and variability of human actions, providing more reliable and precise solutions for practical applications. The proposed DGCN-MPA framework integrates the strengths of convolutional neural networks and graph-based models. It innovatively incorporates the wavelet packet transform to extract time-frequency characteristics and an MPA module to enhance the representation of skeletal node positions. The core innovation lies in the fusion of dynamic graph convolution with hierarchical attention mechanisms, which selectively attend to relevant features and spatial relationships, adjusting their importance across scales to address the variability in human actions. Findings: To validate the effectiveness of the DGCN-MPA framework, rigorous evaluations were conducted on benchmark datasets such as NTU-RGB+D and Kinetics-Skeleton. The results demonstrate that the framework achieves an F1 score of 62.18% and an accuracy of 75.93% on NTU-RGB+D, and an F1 score of 69.34% and an accuracy of 76.86% on Kinetics-Skeleton, outperforming existing models. These findings underscore the framework's capability to capture complex behavior patterns with high precision. Originality/value: By introducing a dynamic graph convolutional approach combined with multi-scale position attention mechanisms, this study represents a significant advancement in human behavior recognition technologies. The innovative design and superior performance of the DGCN-MPA framework contribute to its potential for real-world applications, particularly in integrating behavior recognition into engineering and autonomous systems. In the future, this framework has the potential to further propel the development of intelligent computing, cybernetics, and related fields.
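The idea of letting joint positions bias the attention over skeletal nodes can be sketched in a few lines. This toy example is our own illustration under stated assumptions (a single head and scale, raw dot-product similarity, and pairwise distance subtracted as a negative bias), not the actual MPA module.

```python
import math

def position_attention(features, positions):
    """Toy position-aware attention over joints.

    features  : list of per-joint feature tuples
    positions : list of per-joint (x, y) coordinates
    Attention logits are dot products of joint features minus the
    Euclidean distance between joints, so spatially close joints
    attend to each other more strongly. Returns re-weighted features.
    """
    n = len(features)
    out = []
    for i in range(n):
        # logit(i, j) = <f_i, f_j> - dist(p_i, p_j)
        logits = [
            sum(a * b for a, b in zip(features[i], features[j]))
            - math.hypot(positions[i][0] - positions[j][0],
                         positions[i][1] - positions[j][1])
            for j in range(n)
        ]
        # numerically stable softmax over the logits
        m = max(logits)
        w = [math.exp(x - m) for x in logits]
        s = sum(w)
        w = [x / s for x in w]
        # weighted sum of joint features
        out.append(tuple(sum(w[j] * features[j][d] for j in range(n))
                         for d in range(len(features[0]))))
    return out
```

A real multi-scale variant would compute this at several neighborhood scales and learn how to combine them; here the distance bias is fixed for brevity.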
Funding: Supported in part by the National Natural Science Foundation of China under Grant No. 61976127, the Shandong Provincial Natural Science Foundation under Grant No. ZR2024MF030, the Taishan Scholar Program of Shandong Province of China under Grant No. tsqn202306150, and the Key Research and Development Program of Shandong Province of China under Grant No. 2025CXPT096.
Abstract: Graph convolutional networks (GCNs) have become a dominant approach for skeleton-based action recognition tasks. Although GCNs have made significant progress in modeling skeletons as spatial-temporal graphs, they often require stacking multiple graph convolution layers to effectively capture long-distance relationships among nodes. This stacking not only increases the computational burden but also raises the risk of over-smoothing, which can lead to the neglect of crucial local action features. To address this issue, we propose a novel multi-scale adaptive large kernel graph convolutional network (MSLK-GCN) to effectively aggregate local and global spatio-temporal correlations while maintaining computational efficiency. The core components of the network are two multi-scale large kernel graph convolution (LKGC) modules, a multi-channel adaptive graph convolution (MAGC) module, and a multi-scale temporal self-attention convolution (MSTC) module. The LKGC module adaptively focuses on active motion regions by using a large convolution kernel and a gating mechanism, effectively capturing long-distance dependencies within the skeleton sequence. Meanwhile, the MAGC module dynamically learns relationships between different joints by adjusting the connection weights between nodes. To further enhance the ability to capture temporal dynamics, the MSTC module effectively aggregates temporal information by integrating Efficient Channel Attention (ECA) with multi-scale convolution. In addition, we use a multi-stream fusion strategy to make full use of different skeleton data modalities, including bone, joint, joint motion, and bone motion. Exhaustive experiments on three scale-varying datasets, i.e., NTU-60, NTU-120, and NW-UCLA, demonstrate that our MSLK-GCN achieves state-of-the-art performance with fewer parameters.
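The four skeleton modalities used in the multi-stream fusion can all be derived from raw joint coordinates. The sketch below is a plausible derivation rather than the authors' code; the function name, the parent-indexed bone definition, and the zero-padding of the final motion frame are assumptions.

```python
def derive_streams(joints, parents):
    """Derive bone, joint-motion, and bone-motion modalities.

    joints  : nested list, T frames x V joints x D-dim coordinate tuples
    parents : parent joint index per joint, -1 for the root
    Bone vectors are joint minus parent (zero at the root); motion
    streams are frame-to-frame displacements, zero-padded at the end.
    """
    T, V = len(joints), len(joints[0])
    D = len(joints[0][0])
    zero = tuple(0.0 for _ in range(D))
    # bone vector: joint position minus its parent's position
    bones = [[tuple(joints[t][v][d] - joints[t][parents[v]][d] for d in range(D))
              if parents[v] >= 0 else zero
              for v in range(V)] for t in range(T)]

    def motion(seq):
        # frame-to-frame displacement, zero-padded at the final frame
        return [[tuple(seq[t + 1][v][d] - seq[t][v][d] for d in range(D))
                 for v in range(V)] for t in range(T - 1)] + [[zero] * V]

    return bones, motion(joints), motion(bones)
```

In a multi-stream setup, the raw joint stream plus these three derived streams would each feed a copy of the network, with the class scores fused at the end.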