Funding: This work was supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 62072334, U1803264).
Abstract: Human interaction recognition is an essential task in video surveillance. Current work on human interaction recognition mainly focuses on scenarios that contain only close-contact interacting subjects, with no other people present. In this paper, we handle more practical but more challenging scenarios in which the interacting subjects are contactless and other subjects not involved in the interactions of interest are also present in the scene. To address this problem, we propose an Interactive Relation Embedding Network (IRE-Net) to simultaneously identify the subjects involved in the interaction and recognize their interaction category. Because this is a new problem, we also build a new dataset with annotations and metrics for performance evaluation. Experimental results on this dataset show significant improvements of the proposed method over current methods developed for human interaction recognition and group activity recognition.
Funding: This work was supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 62402490 and 62072334.
Abstract: We study the novel problem of weakly supervised instance action recognition (WSiAR) in multi-person (crowd) scenes. We specifically aim to recognize the action of each subject in the crowd, and we propose a weakly supervised method for this purpose because large-scale annotations for training are expensive. This problem is of great practical value for video surveillance and sports scene analysis. To this end, we investigate and design a series of weak annotations for supervising WSiAR. We propose two categories of weak label settings, bag labels and sparse labels, to significantly reduce the number of labels. Based on the former, we propose a novel sub-block-aware multi-instance learning (MIL) loss to obtain more effective information from weak labels during training. With respect to the latter, we propose a pseudo-label generation strategy for extending sparse labels. This enables our method to achieve results comparable to those of fully supervised methods but with significantly fewer annotations. The experimental results on two benchmarks verify the soundness of the problem definition and the effectiveness of the proposed weakly supervised training method in solving our problem.
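To make the bag-label setting concrete, the following is a minimal sketch of a generic multi-instance learning loss. It is not the sub-block-aware loss proposed in the paper, whose details the abstract does not give. It assumes that a bag is one crowd scene, that the bag label only records which action classes occur somewhere in the scene, and that per-person scores are aggregated by max pooling. The names `sigmoid` and `mil_bag_loss` are hypothetical.

```python
import math


def sigmoid(x):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mil_bag_loss(instance_logits, bag_labels, eps=1e-7):
    """Generic max-pooling MIL loss (illustrative assumption, not the paper's loss).

    instance_logits: per-person logits, shape [num_people][num_classes].
    bag_labels: one 0/1 flag per class; 1 if anyone in the scene performs it.
    Returns the mean binary cross-entropy over classes.
    """
    num_classes = len(bag_labels)
    loss = 0.0
    for c in range(num_classes):
        # Max pooling: the bag is positive for class c if at least
        # one person in it is predicted to perform class c.
        bag_prob = max(sigmoid(person[c]) for person in instance_logits)
        # Clamp to avoid log(0).
        bag_prob = min(max(bag_prob, eps), 1.0 - eps)
        y = bag_labels[c]
        loss -= y * math.log(bag_prob) + (1 - y) * math.log(1.0 - bag_prob)
    return loss / num_classes


# Two people, two action classes; only class 0 occurs in the scene.
labels = [1, 0]
good_loss = mil_bag_loss([[5.0, -5.0], [-5.0, -5.0]], labels)
bad_loss = mil_bag_loss([[-5.0, 5.0], [-5.0, 5.0]], labels)
```

In this sketch, predictions that agree with the bag label give a much lower loss than predictions that contradict it, even though no individual person is ever labeled.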