Funding: Supported by the "Human Resources Program in Energy Technology" of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20204010600090).
Abstract: Human action recognition (HAR) attempts to understand a subject's behavior and assign a label to each action performed. It is appealing because it has a wide range of applications in computer vision, such as video surveillance and smart cities. Many attempts have been made in the literature to develop an effective and robust framework for HAR. Still, the process remains difficult and may result in reduced accuracy due to several challenges, such as similarity among actions, extraction of essential features, and reduction of irrelevant features. In this work, we propose an end-to-end framework using deep learning and an improved tree seed optimization algorithm for accurate HAR. The proposed design consists of a few significant steps. In the first step, frame preprocessing is performed. In the second step, two pre-trained deep learning models are fine-tuned and trained through deep transfer learning using the preprocessed video frames. In the next step, the deep learning features of both fine-tuned models are fused using a new Parallel Standard Deviation Padding Max Value approach. The fused features are further optimized using an improved tree seed algorithm, and the selected best features are finally classified using machine learning classifiers. The experiments were carried out on five publicly available datasets (UT-Interaction, Weizmann, KTH, Hollywood, and IXMAS) and achieved higher accuracy than previous techniques.
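
The abstract names the fusion step but does not define it. The sketch below is one possible reading of "standard deviation padding" followed by "max value" and should not be taken as the authors' method. It assumes each model yields one feature vector per sample, that the narrower matrix is padded with each row's standard deviation, and that fusion is an element-wise maximum. The function name `fuse_pad_max` and the padding rule are hypothetical.

```python
import numpy as np


def fuse_pad_max(feats_a, feats_b):
    """Fuse two (n_samples, n_features) matrices of possibly different width.

    Hypothetical reading of the fusion step, not the paper's definition.
    """
    a = np.asarray(feats_a, dtype=float)
    b = np.asarray(feats_b, dtype=float)
    if a.shape[0] != b.shape[0]:
        raise ValueError("both models must describe the same samples")
    width = max(a.shape[1], b.shape[1])

    def pad(x):
        # Assumption: missing columns are filled with the row's own
        # (population) standard deviation.
        fill = x.std(axis=1, keepdims=True)
        extra = np.repeat(fill, width - x.shape[1], axis=1)
        return np.hstack([x, extra])

    # Assumption: "max value" means an element-wise maximum of the
    # two equally sized matrices.
    return np.maximum(pad(a), pad(b))


# Toy features: model A gives 4 values per sample, model B gives 2.
model_a = np.array([[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0]])
model_b = np.array([[5.0, 1.0], [1.0, 1.0]])
fused = fuse_pad_max(model_a, model_b)
print(fused.shape)
```

The fused matrix keeps the width of the larger feature set, so a later selection stage (the improved tree seed algorithm in the abstract) would operate on a single vector per sample.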