Funding: supported by the "Human Resources Program in Energy Technology" of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20204010600090).
Abstract: Gait recognition is an active research area that uses a person's walking pattern to identify the subject. Human Gait Recognition (HGR) is performed without any cooperation from the individual. In practice, however, it remains a challenging task across diverse walking sequences because of covariant factors such as normal walking versus walking while wearing a coat. Over the years, researchers have identified subjects successfully using different techniques, but these covariant factors still leave room for improvement in accuracy. This paper proposes an automated, model-free framework for human gait recognition. The proposed method consists of a few critical steps. First, optical flow-based motion region estimation and dynamic coordinates-based cropping are performed. Second, a fine-tuned, pre-trained MobileNetV2 model is trained on both the original frames and the optical-flow-cropped frames using static hyperparameters. Third, a fusion technique based on normal-distribution serial fusion is proposed. Fourth, an improved optimization algorithm selects the best features, which are then classified using a bi-layered neural network. Three publicly available datasets, CASIA A, CASIA B, and CASIA C, were used in the experiments, yielding average accuracies of 99.6%, 91.6%, and 95.02%, respectively. The proposed framework achieves improved accuracy compared with other methods.
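The cropping step in the first stage can be illustrated with a minimal sketch: given a per-pixel optical-flow magnitude map, threshold it to find the moving region and take the bounding box used to crop the frame. The function names and the threshold value are illustrative assumptions, not the authors' exact implementation.

```python
def motion_bounding_box(flow_magnitude, threshold=1.0):
    """Return (top, left, bottom, right) of pixels whose optical-flow
    magnitude exceeds the threshold, or None if no motion is found.
    The threshold is a hypothetical tuning parameter."""
    rows = [r for r, row in enumerate(flow_magnitude)
            if any(v > threshold for v in row)]
    cols = [c for c in range(len(flow_magnitude[0]))
            if any(row[c] > threshold for row in flow_magnitude)]
    if not rows or not cols:
        return None
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)

def crop(frame, box):
    """Crop a 2-D frame (list of rows) to the motion bounding box."""
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]

# Toy 4x4 flow-magnitude map with motion confined to the centre 2x2 block.
mag = [[0.0, 0.0, 0.0, 0.0],
       [0.0, 2.0, 3.0, 0.0],
       [0.0, 2.5, 1.8, 0.0],
       [0.0, 0.0, 0.0, 0.0]]
box = motion_bounding_box(mag)
```

In the full pipeline, both the original frame and this optical-flow-based crop would then be fed to the fine-tuned MobileNetV2 feature extractor.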
Funding: funded by Huanggang Normal University, China, Self-Type Project of 2021 (No. 30120210103) and 2022 (No. 2042021008).
Abstract: Object detection in images is a critical research area in computer vision and image processing. Research has produced several novel methods for determining an object's location and category in an image, but there is still room for improvement in detection efficiency. This study aims to develop a technique for detecting objects in images. To enhance overall detection performance, we treat object detection as a two-fold problem comprising localization and classification. The proposed method generates class-independent, high-quality, precise proposals using an agglomerative clustering technique. We then combine these proposals with the corresponding input image to train our network on convolutional features. Next, a network refinement module reduces the number of generated proposals, producing a smaller set of high-quality candidates. Finally, the refined candidate proposals are passed to the network's detection stage to determine the object category. The algorithm's performance is evaluated on the publicly available PASCAL Visual Object Classes Challenge 2007 (VOC2007), VOC2012, and Microsoft Common Objects in Context (MS-COCO) datasets. Using only 100 proposals per image at intersection over union (IoU) thresholds of 0.5 and 0.7, the proposed method attains Detection Recall (DR) rates of 93.17% and 79.35% on VOC2007 and 69.4% and 58.35% on MS-COCO, with Mean Average Best Overlap (MABO) values of 79.25% and 62.65%, respectively. It also achieves a Mean Average Precision (mAP) of 84.7% and 81.5% on the two VOC datasets. The experimental findings show that our method surpasses previous approaches in overall detection performance, demonstrating its effectiveness.
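The proposal-generation idea can be sketched as agglomerative clustering over bounding boxes: repeatedly merge the most-overlapping pair of boxes (by IoU) into their enclosing box until no pair overlaps enough. This greedy single-linkage variant is an assumption for illustration only; the paper's actual clustering criterion and features are not specified here.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])

    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def agglomerate(boxes, min_iou=0.5):
    """Greedily merge the highest-IoU pair of boxes until no pair
    exceeds min_iou; each merge is replaced by the enclosing box."""
    boxes = list(boxes)
    while True:
        best, pair = min_iou, None
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                o = iou(boxes[i], boxes[j])
                if o > best:
                    best, pair = o, (i, j)
        if pair is None:
            return boxes
        a, b = boxes[pair[0]], boxes[pair[1]]
        merged = (min(a[0], b[0]), min(a[1], b[1]),
                  max(a[2], b[2]), max(a[3], b[3]))
        boxes = [x for k, x in enumerate(boxes) if k not in pair] + [merged]

# Two overlapping boxes collapse into one proposal; the distant box survives.
proposals = agglomerate([(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)])
```

The same IoU function also matches the evaluation thresholds (0.5 and 0.7) used in the reported DR and MABO figures.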
Abstract: In this paper we consider the problem of "end-to-end" digital camera identification from sequences of images obtained from the cameras. Identifying a digital camera is harder than identifying its analog counterpart, since analog-to-digital conversion smooths out the intrinsic noise in the analog signal. However, it is known that a digital camera can be identified by analyzing the intrinsic sensor artifacts introduced into images and videos during capture; such methods are computationally intensive and require expensive pre-processing steps. In this paper we propose an end-to-end deep feature learning framework for identifying cameras from the images they produce. We conduct experiments on three custom datasets: the first contains two cameras in an indoor environment, where each camera may observe different scenes with no overlapping features; the second contains images from four cameras in an outdoor setting, where the cameras observe scenes with overlapping features; and the third contains images from two cameras observing the same checkerboard pattern in an indoor setting. Our results show that the intrinsic hardware signatures of the cameras can be captured with deep feature representations in an end-to-end framework, and these deep feature maps can in turn be used to disambiguate the cameras from one another. Our system is end-to-end, requires no complicated pre-processing, and the trained model is computationally efficient at test time, paving the way to near-instantaneous decisions for digital camera identification in production environments. Finally, we present comparisons against the current state of the art in digital camera identification, which clearly establish the superiority of the end-to-end solution.
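Once deep feature maps have been learned, the test-time decision reduces to comparing a query image's embedding against per-camera reference embeddings. A minimal sketch, assuming cosine similarity and one averaged prototype vector per camera (both illustrative choices, not necessarily the paper's classifier):

```python
def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = sum(x * x for x in u) ** 0.5
    nv = sum(y * y for y in v) ** 0.5
    return dot / (nu * nv)

def identify(feature, prototypes):
    """Return the camera whose prototype embedding is most similar to
    the query feature. `prototypes` maps camera name -> embedding."""
    return max(prototypes, key=lambda cam: cosine(feature, prototypes[cam]))

# Toy 3-D embeddings standing in for learned deep features.
prototypes = {"cam_a": [1.0, 0.0, 0.0], "cam_b": [0.0, 1.0, 0.0]}
label = identify([0.9, 0.1, 0.0], prototypes)
```

This matching step is cheap, which is consistent with the claim that the trained model supports near-instantaneous decisions at test time.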
Funding: Project supported by the National Natural Science Foundation of China (No. 61379074) and the Zhejiang Provincial Natural Science Foundation of China (Nos. LZ12F02003 and LY15F020035).
Abstract: In some image classification tasks, the degree of similarity between categories varies, and samples are often misclassified into highly similar categories. Distinguishing highly similar categories requires more specific features so that the classifier can improve its performance. In this paper, we propose a novel two-level hierarchical feature learning framework based on a deep convolutional neural network (CNN), which is simple and effective. First, deep feature extractors at different levels are trained by transfer learning, fine-tuning a pre-trained deep CNN model toward the new target dataset. Second, the general features extracted from all categories and the specific features extracted from highly similar categories are fused into a single feature vector, and this final representation is fed into a linear classifier. Finally, experiments on the Caltech-256, Oxford Flower-102, and Tasmania Coral Point Count (CPC) datasets demonstrate the expressive power of the deep features obtained by two-level hierarchical feature learning. Our proposed method effectively increases classification accuracy compared with flat multi-class classification methods.
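The fusion and classification steps described above can be sketched as concatenating the general-level and specific-level feature vectors and scoring the result with a linear model. The vector sizes, class names, and weights below are toy illustrations, not values from the paper.

```python
def fuse(general, specific):
    """Concatenate general-level and specific-level features into one vector."""
    return list(general) + list(specific)

def linear_classify(feature, weights, biases):
    """Score each class with a linear model (dot product plus bias) and
    return the highest-scoring label."""
    scores = {label: sum(w * x for w, x in zip(ws, feature)) + biases[label]
              for label, ws in weights.items()}
    return max(scores, key=scores.get)

# Toy fused feature: 2 general dimensions + 1 specific dimension.
fused = fuse([0.2, 0.7], [0.9])
weights = {"rose": [1.0, 0.0, 0.0], "tulip": [0.0, 1.0, 1.0]}
biases = {"rose": 0.0, "tulip": 0.0}
prediction = linear_classify(fused, weights, biases)
```

The specific-level dimensions give the linear classifier extra leverage exactly where categories are hard to tell apart, which is the motivation for the two-level design.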