The need to accurately detect and recognize target pedestrians in unprocessed real-world imagery motivates person search, an integrated framework that unifies pedestrian detection and person re-identification (Re-ID). However, two problems form a critical bottleneck for person search: the optimization objectives of coarse-grained localization in pedestrian detection and fine-grained discriminative learning in Re-ID are inherently mismatched, and the Faster R-CNN-based branch substantially degrades Re-ID performance during joint training. In this work, we propose SeqXt, a cascaded person search model based on SeqNet and ConvNeXt. SeqXt adopts a sequential end-to-end network as its core architecture and combines the design logic of the two-step and one-step frameworks, retaining the two-step method's advantage in handling each subtask efficiently while preserving the one-step method's efficiency in end-to-end training. First, we use ConvNeXt-Base as the feature extraction module, which incorporates part of the design concept of the Transformer, takes greater account of global context information, and boosts feature discrimination through an implicit self-attention mechanism. Second, we introduce prototype-guided normalization, which uses the prototype features of individual identities to calibrate the feature distribution. This prevents features from being biased towards frequently occurring IDs and notably improves the intra-class compactness and inter-class separability of person identities. Finally, we propose a new loss function, the Dynamic Online Instance Matching (DOIM) loss, which uses a hard-sample assistance method to adaptively update the lookup table (LUT) and the circular queue (CQ) and aims to further enhance the distinctiveness of features between classes. Experimental results on the public datasets CUHK-SYSU and PRW and the private dataset UESTC-PS show that the proposed method achieves state-of-the-art results.
Funding: Supported by the Major Science and Technology Special Projects of Xinjiang (No. 2024B03041) and the Scientific and Technological Projects of Kashgar (No. KS2024024).
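To make the roles of the lookup table (LUT) and circular queue (CQ) concrete, the following is a minimal sketch of the widely used baseline Online Instance Matching (OIM) formulation, not the DOIM loss proposed here: the abstract does not specify the hard-sample assistance rule, so it is omitted. The class name `OIMMemory`, its methods, and the `momentum` and `temperature` values are illustrative assumptions.

```python
import math
from collections import deque


def l2_normalize(v):
    # Scale a vector to unit length; an all-zero vector is returned unchanged.
    norm = math.sqrt(sum(a * a for a in v)) or 1.0
    return [a / norm for a in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class OIMMemory:
    """Hypothetical helper: baseline OIM memory (assumed, not the paper's DOIM)."""

    def __init__(self, num_ids, dim, queue_size, momentum=0.5, temperature=0.1):
        # LUT: one prototype feature per labeled identity.
        self.lut = [[0.0] * dim for _ in range(num_ids)]
        # CQ: most recent features of unlabeled persons; oldest entries drop out.
        self.cq = deque(maxlen=queue_size)
        self.momentum = momentum
        self.temperature = temperature

    def loss(self, feat, label):
        # Negative log-probability of `label` under a softmax over LUT and CQ.
        x = l2_normalize(feat)
        logits = [dot(v, x) / self.temperature for v in self.lut]
        logits += [dot(u, x) / self.temperature for u in self.cq]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(z - m) for z in logits))
        return log_z - logits[label]

    def update(self, feat, label=None):
        x = l2_normalize(feat)
        if label is None:
            # Unlabeled person: push into the circular queue.
            self.cq.append(x)
        else:
            # Labeled person: momentum update of the identity's LUT entry.
            mixed = [self.momentum * a + (1 - self.momentum) * b
                     for a, b in zip(self.lut[label], x)]
            self.lut[label] = l2_normalize(mixed)
```

In this baseline, the update rate is a fixed constant; an adaptive scheme such as the one described in the abstract would presumably replace that constant with a sample-dependent rule.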