Funding: Financed by the Ministry of Education, Science and Technological Development of the Republic of Serbia.
Abstract: Human hand detection in uncontrolled environments is a challenging visual recognition task because of the wide variation in hand poses and the presence of background clutter. To achieve high accuracy together with real-time execution, we propose a deep transfer learning approach built on a state-of-the-art deep learning object detector. Our method, denoted YOLOHANDS, is built on top of the You Only Look Once (YOLO) deep learning architecture, which is modified to suit the single-class hand detection task. The model transfer is performed by modifying the higher convolutional layers, including the last fully connected layer, while initializing the lower, unmodified layers with generic pre-trained weights. To improve robustness, we introduce a comprehensive augmentation procedure over the training image dataset, specifically adapted to the hand detection problem. Experimental evaluation of the proposed method on a challenging public dataset demonstrates highly accurate results, comparable to those of state-of-the-art methods.
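The model transfer described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the original YOLO grid formulation, in which the output vector has S * S * (B * 5 + C) entries, with the commonly cited defaults S = 7, B = 2 and C = 20; the abstract does not state which YOLO version or settings were used. The layer names (`conv1`, `conv2`, `fc_out`), the toy weight lists, and the helpers `yolo_output_size` and `transfer_weights` are hypothetical and exist only for illustration.

```python
import random


def yolo_output_size(grid=7, boxes=2, classes=20):
    """Length of a YOLO (v1-style) output vector: S * S * (B * 5 + C).

    Assumption: original YOLO formulation; not confirmed by the abstract.
    """
    return grid * grid * (boxes * 5 + classes)


def transfer_weights(pretrained, target_sizes, modified, seed=0):
    """Copy pre-trained weights into unmodified layers, re-initialize the rest.

    pretrained   : dict mapping layer name -> flat list of weights
    target_sizes : dict mapping layer name -> weight count in the new model
    modified     : set of layer names whose structure was changed
    """
    rng = random.Random(seed)
    new_weights = {}
    for name, size in target_sizes.items():
        reusable = (
            name not in modified
            and name in pretrained
            and len(pretrained[name]) == size
        )
        if reusable:
            # Lower, unmodified layer: keep the generic pre-trained features.
            new_weights[name] = list(pretrained[name])
        else:
            # Modified layer: start from small random values and train it.
            new_weights[name] = [rng.gauss(0.0, 0.01) for _ in range(size)]
    return new_weights


# Output size for a generic 20-class detector versus a single-class one.
generic_out = yolo_output_size(classes=20)
hands_out = yolo_output_size(classes=1)

# Hypothetical toy network: two convolutional layers and an output layer.
pretrained = {
    "conv1": [0.1] * 8,
    "conv2": [0.2] * 8,
    "fc_out": [0.3] * generic_out,
}
target_sizes = {"conv1": 8, "conv2": 8, "fc_out": hands_out}
weights = transfer_weights(pretrained, target_sizes, modified={"conv2", "fc_out"})
```

Under these assumptions, reducing the class count from 20 to 1 shrinks the output layer, so its pre-trained weights cannot be reused and must be re-initialized, while the untouched lower layer keeps its generic weights.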