Although much progress has been made in the recognition of handwritten characters, researchers still face difficulties with the problem, especially with the advent of new datasets such as the Extended Modified National Institute of Standards and Technology dataset (EMNIST). The EMNIST dataset is challenging for both machine-learning and deep-learning techniques because of inter-class similarity and intra-class variability. Inter-class similarity arises because certain characters in the dataset have similar shapes. Intra-class variability arises mainly because different writers give the same character different shapes. In this research, we optimized a deep residual network to achieve higher accuracy than the published state-of-the-art results. The approach is based on the prebuilt deep residual network model ResNet18, whose architecture was enhanced by using the optimal number of residual blocks and the optimal receptive-field size for the first convolutional filter, replacing the first max-pooling filter with an average-pooling filter, and adding a dropout layer before the fully connected layer. A distinctive modification replaces the final addition layer with a depth concatenation layer, which yields a novel deep architecture with higher accuracy than the pure residual architecture. Moreover, the sizes of the dataset images were adjusted to optimize their visibility in the network. Finally, by tuning the training hyperparameters and using rotation and shear augmentations, the proposed model outperformed the state-of-the-art models, achieving average accuracies of 95.91% and 90.90% for the Letters and Balanced dataset sections, respectively. Furthermore, using a group of 5 instances of the trained model and averaging their output class probabilities brought the average accuracies to 95.9% and 91.06% for the Letters and Balanced sections, respectively.
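The averaging of output class probabilities across several trained model instances can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the example probability values are hypothetical, and only the general technique (an element-wise mean of the per-model probability vectors, followed by an argmax) is taken from the abstract.

```python
from typing import List, Sequence


def average_probabilities(per_model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Return the element-wise mean of several class-probability vectors."""
    if not per_model_probs:
        raise ValueError("need at least one model output")
    n_classes = len(per_model_probs[0])
    if any(len(p) != n_classes for p in per_model_probs):
        raise ValueError("all models must output the same number of classes")
    n_models = len(per_model_probs)
    return [sum(p[c] for p in per_model_probs) / n_models for c in range(n_classes)]


def ensemble_predict(per_model_probs: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    avg = average_probabilities(per_model_probs)
    return max(range(len(avg)), key=avg.__getitem__)


# Hypothetical softmax outputs of five model instances for one image
# and three classes (illustrative values, not results from the paper).
example_outputs = [
    [0.50, 0.45, 0.05],
    [0.40, 0.55, 0.05],
    [0.30, 0.60, 0.10],
    [0.55, 0.40, 0.05],
    [0.35, 0.60, 0.05],
]

averaged = average_probabilities(example_outputs)
predicted_class = ensemble_predict(example_outputs)
```

In this made-up example two of the five instances prefer class 0 on their own, yet the averaged vector favors class 1, which shows how averaging can smooth out the disagreement between individual instances.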
Recognition of human gait is a difficult task, particularly for unobtrusive video surveillance and human identification from a large distance. Therefore, a method is proposed for the classification and recognition of different types of human gait. The proposed approach consists of two phases. In phase I, a new model named convolutional bidirectional long short-term memory (Conv-BiLSTM) is proposed to classify video frames of human gait. In this model, features are derived through a convolutional neural network (CNN), ResNet-18, and supplied as input to the LSTM model, which provides more distinguishable temporal information. In phase II, the YOLOv2-squeezeNet model is designed, in which deep features are extracted using the fireconcat-02 layer and passed to the tinyYOLOv2 model to recognize and localize human gaits with predicted scores. The proposed method achieved up to 90% correct prediction scores on the CASIA-A, CASIA-B, and CASIA-C benchmark datasets, improving on the prediction scores of recent existing works.
Funding: Supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0012724, The Competency Development Program for Industry Specialist) and the Soonchunhyang University Research Fund.