Abstract
To address the challenges of feature extraction and real-time processing in traffic police pose estimation, this paper proposes an improved YOLOv11-pose network for traffic police gesture recognition. The C3K2 module in the backbone network is replaced with an enhanced C3K2-Star-CAA module to extract traffic police posture features efficiently. Its multi-branch star topology enables cross-level feature fusion and multi-scale information propagation, improving the model’s sensitivity to fine posture details and its robustness to complex backgrounds. A CAA attention mechanism embedded at the key feature layer uses contextual anchors to model critical locations and their spatial context, strengthening key-point feature representation while suppressing background interference. Experimental results show that the improved model achieves 78.6% mAP on a self-built dataset at a detection speed of 186.9 fps, outperforming the comparison models in both accuracy and real-time performance. These findings indicate that the approach offers a robust, real-time solution for traffic police gesture recognition.