Activity recognition is a challenging topic in the field of computer vision that has various applications, including surveillance systems, industrial automation, and human-computer interaction. Today, the demand for automation has greatly increased across industries worldwide. Real-time detection on edge devices must operate within limited computational time. This study proposes a novel hybrid deep learning system for human activity recognition (HAR), aiming to enhance the recognition accuracy and reduce the computational time. The proposed system combines a pretrained image classification model with a sequence analysis model. First, the dataset was divided into a training set (70%), validation set (10%), and test set (20%). Second, all the videos were converted into frames and deep features were extracted from each frame using convolutional neural networks (CNNs) with a vision transformer. Following that, bidirectional long short-term memory (BiLSTM)- and temporal convolutional network (TCN)-based models were trained using the training set, and their performances were evaluated using the validation set and test set. Four benchmark datasets (UCF11, UCF50, UCF101, and JHMDB) were used to evaluate the performance of the proposed HAR-based system. The experimental results showed that the combination of ConvNeXt and the TCN-based model achieved recognition accuracies of 97.73% for UCF11, 98.81% for UCF50, 98.46% for UCF101, and 83.38% for JHMDB. This represents improvements in the recognition accuracy of 4%, 2.67%, 3.67%, and 7.08% for the UCF11, UCF50, UCF101, and JHMDB datasets, respectively, over existing models. Moreover, the proposed HAR-based system obtained superior recognition accuracy, shorter computational times, and minimal memory usage compared to the existing models.
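A TCN is generally built from dilated causal 1-D convolutions, in which the output at step t depends only on the inputs at step t and earlier. The sketch below shows that single operation on a scalar sequence in plain Python. The function name, kernel weights, dilation, and sample values are illustrative assumptions; the abstract does not state the layer configuration the authors used.

```python
from typing import List


def causal_dilated_conv1d(x: List[float], weights: List[float], dilation: int = 1) -> List[float]:
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation], treating x as zero before t = 0."""
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation  # looks only at the present and the past
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


# Hypothetical per-frame feature values (one scalar per frame for brevity).
frame_features = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed = causal_dilated_conv1d(frame_features, weights=[1.0, 1.0], dilation=2)
print(smoothed)
```

Stacking such layers with increasing dilation widens the temporal receptive field without looking at future frames, which is the usual motivation for using a TCN on frame-level features.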
This research addresses the performance challenges of ontology-based context-aware and activity recognition techniques in complex environments and for abnormal activities, and proposes an optimized ontology framework to improve recognition accuracy and computational efficiency. The method adopts an event sequence segmentation technique, combines location awareness with time interval reasoning, and improves human activity recognition through ontology reasoning. Compared with existing methods, the framework performs better when dealing with uncertain data and complex scenes, and the experimental results show that its recognition accuracy is improved by 15.6% and processing time is reduced by 22.4%. In addition, it is found that as context complexity increases, the traditional ontology inference model has limitations in abnormal behavior recognition, especially in the case of high data redundancy, which tends to lead to a decrease in recognition accuracy. This study effectively mitigates this problem by optimizing the ontology matching algorithm and combining parallel computing and deep learning techniques to enhance the activity recognition capability in complex environments.
Human Activity Recognition (HAR) represents a rapidly advancing research domain, propelled by continuous developments in sensor technologies and the Internet of Things (IoT). Deep learning has become the dominant paradigm in sensor-based HAR systems, offering significant advantages over traditional machine learning methods by eliminating manual feature extraction, enhancing recognition accuracy for complex activities, and enabling the exploitation of unlabeled data through generative models. This paper provides a comprehensive review of recent advancements and emerging trends in deep learning models developed for sensor-based HAR systems. We begin with an overview of fundamental HAR concepts in sensor-driven contexts, followed by a systematic categorization and summary of existing research. Our survey encompasses a wide range of deep learning approaches, including Multi-Layer Perceptrons (MLP), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory networks (LSTM), Gated Recurrent Units (GRU), Transformers, Deep Belief Networks (DBN), and hybrid architectures. A comparative evaluation of these models is provided, highlighting their performance, architectural complexity, and contributions to the field. Beyond centralized deep learning models, we examine the role of Federated Learning (FL) in HAR, highlighting current applications and research directions. Finally, we discuss the growing importance of Explainable Artificial Intelligence (XAI) in sensor-based HAR, reviewing recent studies that integrate interpretability methods to enhance transparency and trustworthiness in deep learning-based HAR systems.
This research investigates the application of multisource data fusion using a Multi-Layer Perceptron (MLP) for Human Activity Recognition (HAR). The study integrates four distinct open-source datasets (WISDM, DaLiAc, MotionSense, and PAMAP2) to develop a generalized MLP model for classifying six human activities. Performance analysis of the fused model for each dataset reveals accuracy rates of 95.83 for WISDM, 97 for DaLiAc, 94.65 for MotionSense, and 98.54 for PAMAP2. A comparative evaluation was conducted between the fused MLP model and the individual dataset models, with the latter tested on separate validation sets. The results indicate that the MLP model, trained on the fused dataset, exhibits superior performance relative to the models trained on individual datasets. This finding suggests that multisource data fusion significantly enhances the generalization and accuracy of HAR systems. The improved performance underscores the potential of integrating diverse data sources to create more robust and comprehensive models for activity recognition.
Human group activity recognition (GAR) has attracted significant attention from computer vision researchers due to its wide practical applications in security surveillance, social role understanding, and sports video analysis. In this paper, we give a comprehensive overview of the advances in group activity recognition in videos during the past 20 years. First, we provide a summary and comparison of 11 GAR video datasets in this field. Second, we survey the group activity recognition methods, including those based on handcrafted features and those based on deep learning networks. For better understanding of the pros and cons of these methods, we compare various models from the past to the present. Finally, we outline several challenging issues and possible directions for future research. From this comprehensive literature review, readers can obtain an overview of progress in group activity recognition for future studies.
Human activity tracking plays a vital role in human-computer interaction. Traditional human activity recognition (HAR) methods adopt special devices, such as cameras and sensors, to track both macro- and micro-activities. Recently, wireless signals have been exploited to track human motion and activities in indoor environments without additional equipment. This study proposes a device-free WiFi-based micro-activity recognition method that leverages the channel state information (CSI) of wireless signals. Unlike existing CSI-based micro-activity recognition methods, the proposed method extracts both amplitude and phase information from CSI, thereby providing more information and increasing detection accuracy. The proposed method harnesses an effective signal processing technique to reveal the unique patterns of each activity. We applied a machine learning algorithm to recognize the micro-activities. The proposed method has been evaluated in both line-of-sight (LOS) and non-line-of-sight (NLOS) scenarios, and the empirical results demonstrate the effectiveness of the proposed method with several users.
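CSI is reported as one complex channel coefficient per subcarrier, so amplitude and phase are simply the magnitude and argument of each coefficient. The sketch below illustrates that step only, using made-up sample values and a hypothetical function name. In practice raw phase usually needs calibration before use, and the abstract does not say which signal processing technique the authors applied.

```python
import cmath
from typing import List, Tuple


def amplitude_and_phase(csi: List[complex]) -> Tuple[List[float], List[float]]:
    """Split complex CSI coefficients into magnitudes and phases (radians, in [-pi, pi])."""
    amplitudes = [abs(h) for h in csi]
    phases = [cmath.phase(h) for h in csi]
    return amplitudes, phases


# Hypothetical CSI values for three subcarriers of one received packet.
csi_sample = [complex(3, 4), complex(0, 1), complex(-1, 0)]
amps, phases = amplitude_and_phase(csi_sample)
print(amps)
print(phases)
```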
Human activity recognition is commonly used in several Internet of Things applications to recognize different contexts and respond to them. Deep learning has gained momentum for identifying activities through sensors, smartphones, or even surveillance cameras. However, it is often difficult to train deep learning models on constrained IoT devices. The focus of this paper is to propose an alternative model by constructing a Deep Learning-based Human Activity Recognition framework for edge computing, which we call DL-HAR. The goal of this framework is to exploit the capabilities of cloud computing to train a deep learning model and deploy it on less-powerful edge devices for recognition. The idea is to conduct the training of the model in the cloud and distribute it to the edge nodes. We demonstrate how DL-HAR can perform human activity recognition at the edge while improving efficiency and accuracy. To evaluate the proposed framework, we conducted a comprehensive set of experiments to validate the applicability of DL-HAR. Experimental results on the benchmark dataset show a significant increase in performance compared with the state-of-the-art models.
This paper proposes a hybrid approach for recognizing human activities from trajectories. First, an improved hidden Markov model (HMM) parameter learning algorithm, HMM-PSO, is proposed, which achieves a better balance between global and local exploitation through a nonlinear update strategy and a repulsion operation. Then, the event probability sequence (EPS), which consists of a series of events, is computed to describe the unique characteristics of human activities. The analysis of EPS indicates that it is robust to changes in viewing direction and contributes to improving the recognition rate. Finally, the effectiveness of the proposed approach is evaluated by experiments on current popular datasets.
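HMM-based recognition commonly scores an observation sequence under one trained model per activity and picks the most likely one. The sketch below shows the standard forward algorithm that computes that likelihood for a discrete HMM. It is not the HMM-PSO learning procedure the paper proposes, and the two-state parameters are made-up illustrative values.

```python
from typing import List


def forward_likelihood(obs: List[int], start: List[float],
                       trans: List[List[float]], emit: List[List[float]]) -> float:
    """Return P(obs | model) for a discrete HMM using the forward recursion."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(start)
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
                 for s in range(n)]
    return sum(alpha)


# Hypothetical two-state model with two observation symbols (0 and 1).
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
print(forward_likelihood([0, 1, 0], start, trans, emit))
```

With one such model per activity, classification reduces to taking the activity whose model gives the largest likelihood for the observed sequence.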
Human Activity Recognition (HAR) is an active research area due to its applications in pervasive computing, human-computer interaction, artificial intelligence, health care, and social sciences. Moreover, dynamic environments and anthropometric differences between individuals make it harder to recognize actions. This study focuses on human activity in video sequences acquired with an RGB camera because of its vast range of real-world applications. It uses a two-stream ConvNet to extract spatial and temporal information and proposes a fine-tuned deep neural network. Moreover, the transfer learning paradigm is adopted to extract varied and fixed frames while reusing object identification information. Six state-of-the-art pre-trained models are exploited to find the best model for spatial feature extraction. For the temporal sequence, this study uses dense optical flow following the two-stream ConvNet and Bidirectional Long Short-Term Memory (BiLSTM) to capture long-term dependencies. Two state-of-the-art datasets, UCF101 and HMDB51, are used for evaluation purposes. In addition, seven state-of-the-art optimizers are used to fine-tune the proposed network parameters. Furthermore, this study utilizes an ensemble mechanism to aggregate spatial-temporal features using a four-stream Convolutional Neural Network (CNN), where two streams use RGB data while the other streams use optical flow images. Finally, the proposed ensemble approach using max hard voting outperforms state-of-the-art methods with 96.30% and 90.07% accuracies on the UCF101 and HMDB51 datasets.
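Hard voting combines streams by counting each stream's predicted label and keeping the most frequent one. The sketch below assumes plain majority voting over four streams, since the abstract does not define "max hard voting" beyond its name. The stream names, labels, and tie-breaking rule are illustrative assumptions.

```python
from collections import Counter
from typing import List, Sequence


def hard_vote(stream_predictions: Sequence[Sequence[str]]) -> List[str]:
    """Majority vote per sample; ties go to the label that appears first among the streams."""
    n_samples = len(stream_predictions[0])
    voted = []
    for i in range(n_samples):
        votes = [preds[i] for preds in stream_predictions]
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Hypothetical per-sample labels from two RGB streams and two optical-flow streams.
rgb_a = ["run", "jump", "walk"]
rgb_b = ["run", "walk", "walk"]
flow_a = ["run", "jump", "run"]
flow_b = ["walk", "jump", "walk"]
print(hard_vote([rgb_a, rgb_b, flow_a, flow_b]))
```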
A new method for complex activity recognition in videos by key frames was presented. The progressive bisection strategy (PBS) was employed to divide the complex activity into a series of simple activities, and the key frames representing the simple activities were extracted by the self-splitting competitive learning (SSCL) algorithm. A new similarity criterion of complex activities was defined. Besides the regular visual factor, the order factor and the interference factor, which measure the timing matching relationship and the discontinuous matching relationship of the simple activities respectively, were considered. On these bases, complex human activity recognition could be achieved by calculating their similarities. The recognition error was reduced compared with other methods when ignoring the recognition of simple activities. The proposed method was tested and evaluated on the self-built broadcast gymnastic database and the dancing database. The experimental results prove the superior efficiency.
We study the problem of human activity recognition from RGB-Depth (RGBD) sensors when the skeletons are not available. The skeleton tracking in the Kinect SDK works well when the human subject is facing the camera and there are no occlusions. In surveillance or nursing home monitoring scenarios, however, the camera is usually mounted higher than human subjects, and there may be occlusions. The interest-point based approach is widely used in RGB-based activity recognition, and it can be used in both RGB and depth channels. Whether we should extract interest points independently from each channel or extract interest points from only one of the channels is discussed in this paper. The goal of this paper is to compare the performances of different methods of extracting interest points. In addition, we have developed a depth map-based descriptor and built an RGBD dataset, called RGBD-SAR, for senior activity recognition. We show that the best performance is achieved when we extract interest points solely from RGB channels and combine the RGB-based descriptors with the depth map-based descriptors. We also present a baseline performance of the RGBD-SAR dataset.
Human Activity Recognition (HAR) has been made simple in recent years, thanks to recent advancements in Artificial Intelligence (AI) techniques. These techniques are applied in several areas such as security, surveillance, healthcare, human-robot interaction, and entertainment. Since a wearable sensor-based HAR system includes built-in sensors, human activities can be categorized based on sensor values. Further, it can also be employed in other applications such as gait diagnosis, observation of children's and adults' cognitive nature, stroke-patient hospital direction, and epilepsy and Parkinson's disease examination. Recently developed AI techniques, especially Deep Learning (DL) models, can be deployed to accomplish effective outcomes in the HAR process. With this motivation, the current research paper focuses on designing an Intelligent Hyperparameter Tuned Deep Learning-based HAR (IHPTDL-HAR) technique for the healthcare environment. The proposed IHPTDL-HAR technique aims at recognizing human actions in the healthcare environment and helps patients in managing their healthcare service. In addition, the presented model makes use of a Hierarchical Clustering (HC)-based outlier detection technique to remove outliers. The IHPTDL-HAR technique incorporates a DL-based Deep Belief Network (DBN) model to recognize the activities of users. Moreover, the Harris Hawks Optimization (HHO) algorithm is used for hyperparameter tuning of the DBN model. Finally, a comprehensive experimental analysis was conducted on a benchmark dataset and the results were examined under different aspects. The experimental results demonstrate that the proposed IHPTDL-HAR technique is a superior performer compared to other recent techniques under different measures.
With the rapid advancement of wearable devices, Human Activity Recognition (HAR) based on these devices has emerged as a prominent research field. The objective of this study is to enhance the recognition performance of HAR by proposing an LSTM-1DCNN recognition algorithm that utilizes a single triaxial accelerometer. This algorithm comprises two branches: one branch consists of a Long Short-Term Memory network (LSTM), while the other, parallel branch incorporates a one-dimensional Convolutional Neural Network (1DCNN). The parallel architecture of LSTM-1DCNN first extracts spatial and temporal features from the accelerometer data separately, which are then concatenated and fed into a fully connected neural network for information fusion. In the LSTM-1DCNN architecture, the 1DCNN branch primarily focuses on extracting spatial features during convolution operations, whereas the LSTM branch mainly captures temporal features. Nine sets of accelerometer data from five publicly available HAR datasets are employed for training and evaluation purposes. The performance of the proposed LSTM-1DCNN model is compared with five other HAR algorithms, including Decision Tree, Random Forest, Support Vector Machine, 1DCNN, and LSTM, on these five public datasets. Experimental results demonstrate that the F1-score achieved by the proposed LSTM-1DCNN ranges from 90.36% to 99.68%, with a mean value of 96.22% and a standard deviation of 0.03 across all evaluated metrics on these five public datasets, significantly outperforming the other HAR algorithms in terms of the evaluation metrics used in this study. Finally, the proposed LSTM-1DCNN is validated in real-world applications by collecting acceleration data of seven human activities for training and testing purposes. Subsequently, the trained HAR algorithm is deployed on Android phones to evaluate its performance. Experimental results demonstrate that the proposed LSTM-1DCNN algorithm achieves an F1-score of 97.67% on our self-built dataset. In conclusion, the fusion of temporal and spatial information in the measured data contributes to the excellent HAR performance and robustness exhibited by the proposed LSTM-1DCNN architecture.
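The F1-score reported above is the harmonic mean of precision and recall, which per class equals 2TP / (2TP + FP + FN). The sketch below computes it per class and then averages with equal class weights (macro-F1). The abstract does not say whether the reported figures are macro- or weighted-averaged, so the averaging choice and the sample labels are assumptions.

```python
from typing import Dict, List


def per_class_f1(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """F1 per class, computed as 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical ground-truth and predicted activity labels.
y_true = ["walk", "walk", "sit", "sit", "run", "run"]
y_pred = ["walk", "sit", "sit", "sit", "run", "walk"]
print(per_class_f1(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```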
Artificial intelligence (AI) technology has become integral to medicine and healthcare, particularly in human activity recognition (HAR) applications such as fitness and rehabilitation tracking. This study introduces a robust coupling analysis framework that integrates four AI-enabled models, combining both machine learning (ML) and deep learning (DL) approaches, to evaluate their effectiveness in HAR. The analytical dataset comprises 561 features sourced from the UCI-HAR database, forming the foundation for training the models. Additionally, the MHEALTH database is employed to replicate the modeling process for comparative purposes, while inclusion of the WISDM database, known for its challenging features, supports the framework's resilience and adaptability. The ML-based models employ an adaptive neuro-fuzzy inference system (ANFIS), a support vector machine (SVM), and a random forest (RF) for data training. In contrast, the DL-based model utilizes a one-dimensional convolutional neural network (1dCNN) to automate feature extraction. Furthermore, the recursive feature elimination (RFE) algorithm, which drives an ML-based estimator to eliminate low-participation features, helps identify the optimal features for enhancing model performance. With careful feature selection, the best accuracies of the ANFIS, SVM, RF, and 1dCNN models are around 90%, 96%, 91%, and 93%, respectively. Comparative analysis using the MHEALTH dataset shows the 1dCNN model reaching 100% accuracy, while the RF, SVM, and ANFIS models equipped with selected features achieve accuracies of 99.8%, 99.7%, and 96.5%, respectively. Finally, when applied to the WISDM dataset, the DL-based and ML-based models attain accuracies of 91.4% and 87.3%, respectively, aligning with prior research findings. In conclusion, the proposed framework yields HAR models with commendable performance metrics, demonstrating its suitability for integration into the healthcare services system through AI-driven applications.
At present, Human Activity Recognition (HAR) is of considerable aid in health monitoring and recovery. The exploitation of machine learning with an intelligent agent on health informatics data gathered using HAR improves decision-making quality and significance. Although many research works have been conducted on Smart Healthcare Monitoring, a number of pitfalls remain, such as the time, overhead, and falsification involved during analysis. Therefore, this paper proposes Statistical Partial Regression and Support Vector Intelligent Agent Learning (SPR-SVIAL) for Smart Healthcare Monitoring. First, the Statistical Partial Regression Feature Extraction model is used for data preprocessing along with the extraction of dimensionality-reduced features. Here, the input dataset, comprising continuous beat-to-beat heart data, triaxial accelerometer data, and psychological characteristics, was acquired from IoT wearable devices. To attain highly accurate Smart Healthcare Monitoring in less time, Partial Least Squares helps extract the dimensionality-reduced features. After that, with these resulting features, SVIAL is proposed for Smart Healthcare Monitoring with the help of Machine Learning and Intelligent Agents to minimize both analysis falsification and overhead. Experimental evaluation is carried out for factors such as time, overhead, and false positive rate over several instances. The quantitatively analyzed results indicate the better performance of the proposed SPR-SVIAL method when compared with two state-of-the-art methods.
Human Action Recognition (HAR) and pose estimation from videos have gained significant attention among research communities due to their application in several areas, namely intelligent surveillance, human-robot interaction, robot vision, etc. Though considerable improvements have been made recently, the design of an effective and accurate action recognition model is still a difficult process owing to obstacles such as variations in camera angle, occlusion, background, movement speed, and so on. From the literature, it is observed that the temporal dimension is hard to deal with in the action recognition process. Convolutional neural network (CNN) models can be widely used to address this. With this motivation, this study designs a novel key point extraction with deep convolutional neural networks based pose estimation (KPE-DCNN) model for activity recognition. The KPE-DCNN technique initially converts the input video into a sequence of frames, followed by a three-stage process, namely key point extraction, hyperparameter tuning, and pose estimation. In the key point extraction process, an OpenPose model is designed to compute accurate key points in the human pose. Then, an optimal DCNN model is developed to classify the human activity labels based on the extracted key points. To improve the training process of the DCNN technique, the RMSProp optimizer is used to optimally adjust hyperparameters such as learning rate, batch size, and epoch count. The experimental results on a benchmark dataset, the UCF sports dataset, showed that the KPE-DCNN technique is able to achieve good results compared with benchmark algorithms such as CNN, DBN, SVM, STAL, T-CNN, and so on.
Smoking is a major cause of cancer, heart disease, and other afflictions that lead to early mortality. An effective smoking classification mechanism that provides insights into individual smoking habits would assist in implementing addiction treatment initiatives. Smoking activities often accompany other activities such as drinking or eating. Consequently, smoking activity recognition can be a challenging topic in human activity recognition (HAR). A deep learning framework for smoking activity recognition (SAR) employing smartwatch sensors was proposed, together with a deep residual network combined with squeeze-and-excitation modules (ResNetSE) to increase the effectiveness of the SAR framework. The proposed model was tested against basic convolutional neural networks (CNNs) and recurrent neural networks (LSTM, BiLSTM, GRU, and BiGRU) to recognize smoking and other similar activities such as drinking, eating, and walking using the UT-Smoke dataset. Three different scenarios were investigated for their recognition performances using standard HAR metrics (accuracy, F1-score, and the area under the ROC curve). Our proposed ResNetSE outperformed the other basic deep learning networks, with a maximum accuracy of 98.63%.
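A squeeze-and-excitation module reweights feature channels in three steps: average each channel over time (squeeze), pass the averages through a small bottleneck network ending in a sigmoid (excitation), and multiply each channel by its resulting gate. The sketch below shows those steps on a channels-by-time feature map in plain Python. Bias terms are omitted, and the weights, shapes, and function names are hypothetical; in ResNetSE the weights would be learned, and the abstract does not give the reduction ratio.

```python
import math
from typing import List

Matrix = List[List[float]]


def matvec(m: Matrix, v: List[float]) -> List[float]:
    """Multiply matrix m (rows x cols) by vector v (cols)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def squeeze_excite(features: Matrix, w_reduce: Matrix, w_expand: Matrix) -> Matrix:
    """Apply channel-wise gating to a (channels x time) feature map."""
    squeezed = [sum(ch) / len(ch) for ch in features]                 # squeeze: mean over time
    hidden = [max(0.0, h) for h in matvec(w_reduce, squeezed)]        # bottleneck + ReLU
    gates = [1.0 / (1.0 + math.exp(-g)) for g in matvec(w_expand, hidden)]  # sigmoid gates
    return [[g * v for v in ch] for g, ch in zip(gates, features)]    # scale each channel


# Hypothetical 2-channel, 2-step feature map with a 1-unit bottleneck.
features = [[1.0, 3.0], [0.0, 2.0]]
w_reduce = [[0.5, -0.5]]      # shape: 1 x 2
w_expand = [[1.0], [-1.0]]    # shape: 2 x 1
print(squeeze_excite(features, w_reduce, w_expand))
```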
With the improvement of people's living standards, the demand for health monitoring and exercise detection is increasing. It is of great significance to study human activity recognition (HAR) methods that differ from traditional feature extraction methods. This article uses convolutional neural network (CNN) algorithms in deep learning to automatically extract features of activities related to human life. We used a stochastic gradient descent algorithm to optimize the parameters of the CNN. The trained network model is compressed on STM32CubeMX-AI. Finally, this article introduces the use of neural networks on embedded devices to recognize six human activities of daily life: sitting, standing, walking, jogging, going upstairs, and going downstairs. An acceleration sensor is used to capture the relevant characteristics of each activity, thereby solving the HAR problem. By drawing the accuracy curve, loss function curve, and confusion matrix diagram of the training model, the recognition effect of the convolutional neural network can be seen more intuitively. The best model is selected after comparing the average accuracy of each set of experiments and the test-set accuracy of the best model obtained from each.
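A confusion matrix counts, for each true activity, how often each activity was predicted, so the diagonal holds the correct predictions and accuracy is the diagonal sum over the total. The sketch below builds one for the six activities named above. The label strings and sample predictions are illustrative assumptions, not data from the article.

```python
from typing import List, Sequence

ACTIVITIES = ["sitting", "standing", "walking", "jogging", "upstairs", "downstairs"]


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str]) -> List[List[int]]:
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of samples on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical test-set labels and model predictions.
y_true = ["sitting", "standing", "walking", "jogging",
          "upstairs", "downstairs", "walking", "upstairs"]
y_pred = ["sitting", "sitting", "walking", "jogging",
          "downstairs", "downstairs", "walking", "upstairs"]
cm = confusion_matrix(y_true, y_pred, ACTIVITIES)
for label, row in zip(ACTIVITIES, cm):
    print(f"{label:>10}: {row}")
print("accuracy:", accuracy(cm))
```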
Recognition of human activity based on convolutional neural networks (CNNs) has attracted the interest of researchers in recent years due to its significant improvement in accuracy. A large number of algorithms based on the deep learning approach have been proposed for activity recognition. However, with the growing number of technologies that have limited computational resources, there is a need to design efficient deep learning-based approaches that make better use of computational resources. This paper presents a simple and efficient 2-dimensional CNN (2-D CNN) architecture with a very small convolutional kernel for human activity recognition. The merit of the proposed CNN architecture over standard deep learning architectures is that it has fewer trainable parameters and a lower memory requirement, which enables it to be trained on devices with low GPU memory and to work well with both smaller and larger datasets. The proposed approach consists of four main stages: (1) creation of the dataset and data augmentation, (2) design of the 2-D CNN architecture, (3) training of the proposed 2-D CNN architecture from scratch up to the optimum stage, and (4) evaluation of the trained 2-D CNN architecture. To illustrate the effectiveness of the proposed architecture, several extensive experiments are conducted on three publicly available datasets, namely IXMAS, YouTube, and UCF101. The results of the proposed method and its comparison with other state-of-the-art methods demonstrate the usefulness of the proposed method.
We present a novel model for recognizing long-term complex activities involving multiple persons. The proposed model, named 'decomposed hidden Markov model' (DHMM), combines spatial decomposition and hierarchical abstraction to capture multi-modal, long-term dependent and multi-scale characteristics of activities. Decomposition in space and time offers conceptual advantages of compaction and clarity, and greatly reduces the size of state space as well as the number of parameters. DHMMs are efficient even when the number of persons is variable. We also introduce an efficient approximation algorithm for inference and parameter estimation. Experiments on multi-person activities and multi-modal individual activities demonstrate that DHMMs are more efficient and reliable than familiar models, such as coupled HMMs, hierarchical HMMs, and multi-observation HMMs.
Funding: Funded by the Ongoing Research Funding Program (ORF-2025-890), King Saud University, Riyadh, Saudi Arabia.
Funding: Supported by the BK21 FOUR program of the National Research Foundation of Korea funded by the Ministry of Education (NRF5199991014091). Seok-Won Lee's work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Artificial Intelligence Convergence Innovation Human Resources Development (IITP-2024-RS-2023-00255968) grant funded by the Korea government (MSIT).
Abstract: This research addresses the performance challenges of ontology-based context-aware and activity recognition techniques in complex environments and abnormal activities, and proposes an optimized ontology framework to improve recognition accuracy and computational efficiency. The method adopts the event sequence segmentation technique, combines location awareness with time interval reasoning, and improves human activity recognition through ontology reasoning. Compared with existing methods, the framework performs better when dealing with uncertain data and complex scenes, and the experimental results show that its recognition accuracy is improved by 15.6% and its processing time is reduced by 22.4%. In addition, it is found that as context complexity increases, the traditional ontology inference model has limitations in abnormal behavior recognition, especially in the case of high data redundancy, which tends to lead to a decrease in recognition accuracy. This study effectively mitigates this problem by optimizing the ontology matching algorithm and combining parallel computing and deep learning techniques to enhance the activity recognition capability in complex environments.
Abstract: Human Activity Recognition (HAR) represents a rapidly advancing research domain, propelled by continuous developments in sensor technologies and the Internet of Things (IoT). Deep learning has become the dominant paradigm in sensor-based HAR systems, offering significant advantages over traditional machine learning methods by eliminating manual feature extraction, enhancing recognition accuracy for complex activities, and enabling the exploitation of unlabeled data through generative models. This paper provides a comprehensive review of recent advancements and emerging trends in deep learning models developed for sensor-based HAR systems. We begin with an overview of fundamental HAR concepts in sensor-driven contexts, followed by a systematic categorization and summary of existing research. Our survey encompasses a wide range of deep learning approaches, including Multi-Layer Perceptrons (MLP), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory networks (LSTM), Gated Recurrent Units (GRU), Transformers, Deep Belief Networks (DBN), and hybrid architectures. A comparative evaluation of these models is provided, highlighting their performance, architectural complexity, and contributions to the field. Beyond centralized deep learning models, we examine the role of Federated Learning (FL) in HAR, highlighting current applications and research directions. Finally, we discuss the growing importance of Explainable Artificial Intelligence (XAI) in sensor-based HAR, reviewing recent studies that integrate interpretability methods to enhance transparency and trustworthiness in deep learning-based HAR systems.
Funding: Supported by the Royal Golden Jubilee (RGJ) Ph.D. Programme (Grant No. PHD/0079/2561) through the National Research Council of Thailand (NRCT) and Thailand Research Fund (TRF).
Abstract: This research investigates the application of multisource data fusion using a Multi-Layer Perceptron (MLP) for Human Activity Recognition (HAR). The study integrates four distinct open-source datasets (WISDM, DaLiAc, MotionSense, and PAMAP2) to develop a generalized MLP model for classifying six human activities. Performance analysis of the fused model for each dataset reveals accuracy rates of 95.83 for WISDM, 97 for DaLiAc, 94.65 for MotionSense, and 98.54 for PAMAP2. A comparative evaluation was conducted between the fused MLP model and the individual dataset models, with the latter tested on separate validation sets. The results indicate that the MLP model trained on the fused dataset outperforms the models trained on individual datasets. This finding suggests that multisource data fusion significantly enhances the generalization and accuracy of HAR systems. The improved performance underscores the potential of integrating diverse data sources to create more robust and comprehensive models for activity recognition.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61976010, 61802011); the Beijing Postdoctoral Research Foundation (No. ZZ2019-63); the Beijing excellent young talent cultivation project (No. 2017000020124G075); and the “Ri xin” Training Programme Foundation for the Talents by Beijing University of Technology.
Abstract: Human group activity recognition (GAR) has attracted significant attention from computer vision researchers due to its wide practical applications in security surveillance, social role understanding and sports video analysis. In this paper, we give a comprehensive overview of the advances in group activity recognition in videos during the past 20 years. First, we provide a summary and comparison of 11 GAR video datasets in this field. Second, we survey the group activity recognition methods, including those based on handcrafted features and those based on deep learning networks. For better understanding of the pros and cons of these methods, we compare various models from the past to the present. Finally, we outline several challenging issues and possible directions for future research. From this comprehensive literature review, readers can obtain an overview of progress in group activity recognition for future studies.
Abstract: Human activity tracking plays a vital role in human-computer interaction. Traditional human activity recognition (HAR) methods adopt special devices, such as cameras and sensors, to track both macro- and micro-activities. Recently, wireless signals have been exploited to track human motion and activities in indoor environments without additional equipment. This study proposes a device-free WiFi-based micro-activity recognition method that leverages the channel state information (CSI) of wireless signals. Unlike existing CSI-based micro-activity recognition methods, the proposed method extracts both amplitude and phase information from CSI, thereby providing more information and increasing detection accuracy. The proposed method harnesses an effective signal processing technique to reveal the unique patterns of each activity. We applied a machine learning algorithm to recognize the micro-activities. The proposed method has been evaluated in both line-of-sight (LOS) and non-line-of-sight (NLOS) scenarios, and the empirical results demonstrate the effectiveness of the proposed method with several users.
Abstract: Human activity recognition is commonly used in several Internet of Things applications to recognize different contexts and respond to them. Deep learning has gained momentum for identifying activities through sensors, smartphones or even surveillance cameras. However, it is often difficult to train deep learning models on constrained IoT devices. The focus of this paper is to propose an alternative model by constructing a Deep Learning-based Human Activity Recognition framework for edge computing, which we call DL-HAR. The goal of this framework is to exploit the capabilities of cloud computing to train a deep learning model and deploy it on less-powerful edge devices for recognition. The idea is to conduct the training of the model in the Cloud and distribute it to the edge nodes. We demonstrate how DL-HAR can perform human activity recognition at the edge while improving efficiency and accuracy. To evaluate the proposed framework, we conducted a comprehensive set of experiments to validate the applicability of DL-HAR. Experimental results on the benchmark dataset show a significant increase in performance compared with the state-of-the-art models.
Funding: Supported by the National Natural Science Foundation of China (60573159) and the Guangdong High Technique Project (201100000514).
Abstract: This paper proposes a hybrid approach for recognizing human activities from trajectories. First, an improved hidden Markov model (HMM) parameter learning algorithm, HMM-PSO, is proposed, which achieves a better balance between the global and local exploitation by the nonlinear update strategy and repulsion operation. Then, the event probability sequence (EPS), which consists of a series of events, is computed to describe the unique characteristics of human activities. The analysis of EPS indicates that it is robust to changes in viewing direction and contributes to improving the recognition rate. Finally, the effectiveness of the proposed approach is evaluated by experiments on current popular datasets.
Funding: This work was supported by Universiti Sains Malaysia (USM) under FRGS grant number FRGS/1/2020/TK03/USM/02/1 and by the School of Computer Sciences, USM.
Abstract: Human Activity Recognition (HAR) is an active research area due to its applications in pervasive computing, human-computer interaction, artificial intelligence, health care, and social sciences. Moreover, dynamic environments and anthropometric differences between individuals make it harder to recognize actions. This study focused on human activity in video sequences acquired with an RGB camera because of its vast range of real-world applications. It uses a two-stream ConvNet to extract spatial and temporal information and proposes a fine-tuned deep neural network. Moreover, the transfer learning paradigm is adopted to extract varied and fixed frames while reusing object identification information. Six state-of-the-art pre-trained models are exploited to find the best model for spatial feature extraction. For the temporal sequence, this study uses dense optical flow following the two-stream ConvNet and Bidirectional Long Short-Term Memory (BiLSTM) to capture long-term dependencies. Two state-of-the-art datasets, UCF101 and HMDB51, are used for evaluation purposes. In addition, seven state-of-the-art optimizers are used to fine-tune the proposed network parameters. Furthermore, this study utilizes an ensemble mechanism to aggregate spatial-temporal features using a four-stream Convolutional Neural Network (CNN), where two streams use RGB data while the others use optical flow images. Finally, the proposed ensemble approach using max hard voting outperforms state-of-the-art methods with 96.30% and 90.07% accuracies on the UCF101 and HMDB51 datasets.
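Hard voting over several streams can be sketched as below. This is an assumption-laden illustration, not the authors' implementation: the abstract does not define "max hard voting" in detail, so the sketch takes it to mean picking the label predicted by the largest number of streams. The function names, the tie-breaking rule, and the example labels are all hypothetical.

```python
from collections import Counter


def hard_vote(stream_labels):
    """Return the label predicted by the most streams.

    Ties are resolved in favor of the label seen first in stream
    order (an arbitrary choice made for this sketch).
    """
    return Counter(stream_labels).most_common(1)[0][0]


def ensemble_predict(per_stream_predictions):
    """Combine per-stream label lists into one label per clip."""
    # zip(*...) regroups the predictions so each tuple holds the
    # labels that all streams assigned to the same clip.
    return [hard_vote(votes) for votes in zip(*per_stream_predictions)]


# Hypothetical predictions from four streams for three clips.
streams = [
    ["Archery", "Biking", "Diving"],  # RGB stream 1
    ["Archery", "Biking", "Biking"],  # RGB stream 2
    ["Archery", "Diving", "Diving"],  # optical-flow stream 1
    ["Biking", "Biking", "Diving"],   # optical-flow stream 2
]
print(ensemble_predict(streams))
```

Hard voting uses only the predicted labels; an alternative (soft voting) would average the class probabilities from each stream before taking the maximum.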
Funding: Project (50808025) supported by the National Natural Science Foundation of China; Project (20090162110057) supported by the Doctoral Fund of Ministry of Education, China.
Abstract: A new method for complex activity recognition in videos by key frames was presented. The progressive bisection strategy (PBS) was employed to divide the complex activity into a series of simple activities, and the key frames representing the simple activities were extracted by the self-splitting competitive learning (SSCL) algorithm. A new similarity criterion of complex activities was defined. Besides the regular visual factor, the order factor and the interference factor, which measure the timing matching relationship and the discontinuous matching relationship of the simple activities respectively, were considered. On these bases, complex human activity recognition could be achieved by calculating their similarities. The recognition error was reduced compared with other methods when ignoring the recognition of simple activities. The proposed method was tested and evaluated on the self-built broadcast gymnastic database and the dancing database. The experimental results prove the superior efficiency.
Funding: Supported by the National Natural Science Foundation of China under Grants No. 61075045 and No. 61273256; the Program for New Century Excellent Talents in University under Grant No. NECT-10-0292; the National Key Basic Research Program of China (973 Program) under Grant No. 2011-CB707000; and the Fundamental Research Funds for the Central Universities.
Abstract: We study the problem of human activity recognition from RGB-Depth (RGBD) sensors when the skeletons are not available. The skeleton tracking in Kinect SDK works well when the human subject is facing the camera and there are no occlusions. In surveillance or nursing home monitoring scenarios, however, the camera is usually mounted higher than human subjects, and there may be occlusions. The interest-point based approach is widely used in RGB based activity recognition, and it can be used in both RGB and depth channels. Whether we should extract interest points independently of each channel or extract interest points from only one of the channels is discussed in this paper. The goal of this paper is to compare the performances of different methods of extracting interest points. In addition, we have developed a depth map-based descriptor and built an RGBD dataset, called RGBD-SAR, for senior activity recognition. We show that the best performance is achieved when we extract interest points solely from RGB channels, and combine the RGB-based descriptors with the depth map-based descriptors. We also present a baseline performance of the RGBD-SAR dataset.
Funding: Supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0012724, The Competency Development Program for Industry Specialist) and by the Soonchunhyang University Research Fund.
Abstract: Human Activity Recognition (HAR) has been made simple in recent years, thanks to recent advancements made in Artificial Intelligence (AI) techniques. These techniques are applied in several areas like security, surveillance, healthcare, human-robot interaction, and entertainment. Since a wearable sensor-based HAR system includes in-built sensors, human activities can be categorized based on sensor values. Further, it can also be employed in other applications such as gait diagnosis, observation of children/adult's cognitive nature, stroke-patient hospital direction, Epilepsy and Parkinson's disease examination, etc. Recently developed AI techniques, especially Deep Learning (DL) models, can be deployed to accomplish effective outcomes in the HAR process. With this motivation, the current research paper focuses on designing an Intelligent Hyperparameter Tuned Deep Learning-based HAR (IHPTDL-HAR) technique for the healthcare environment. The proposed IHPTDL-HAR technique aims at recognizing human actions in the healthcare environment and helps patients in managing their healthcare service. In addition, the presented model makes use of a Hierarchical Clustering (HC)-based outlier detection technique to remove the outliers. The IHPTDL-HAR technique incorporates a DL-based Deep Belief Network (DBN) model to recognize the activities of users. Moreover, the Harris Hawks Optimization (HHO) algorithm is used for hyperparameter tuning of the DBN model. Finally, a comprehensive experimental analysis was conducted on a benchmark dataset and the results were examined under different aspects. The experimental results demonstrate that the proposed IHPTDL-HAR technique is a superior performer compared to other recent techniques under different measures.
Funding: Supported by the Guangxi University of Science and Technology, Liuzhou, China, under the Researchers Supporting Project (No. XiaoKeBo21Z27, The Construction of Electronic Information Team supported by Artificial Intelligence Theory and Three-dimensional Visual Technology, Yuesheng Zhao); by the 2022 Laboratory Fund Project of the Key Laboratory of Space-Based Integrated Information System (No. SpaceInfoNet20221120, Research on the Key Technologies of Intelligent Spatiotemporal Data Engine Based on Space-Based Information Network, Yuesheng Zhao); and by the 2023 Guangxi University Young and Middle-Aged Teachers' Basic Scientific Research Ability Improvement Project (No. 2023KY0352, Research on the Recognition of Psychological Abnormalities in College Students Based on the Fusion of Pulse and EEG Techniques, Yutong Luo).
Abstract: With the rapid advancement of wearable devices, Human Activities Recognition (HAR) based on these devices has emerged as a prominent research field. The objective of this study is to enhance the recognition performance of HAR by proposing an LSTM-1DCNN recognition algorithm that utilizes a single triaxial accelerometer. This algorithm comprises two branches: one branch consists of a Long and Short-Term Memory Network (LSTM), while the other parallel branch incorporates a one-dimensional Convolutional Neural Network (1DCNN). The parallel architecture of LSTM-1DCNN initially extracts spatial and temporal features from the accelerometer data separately, which are then concatenated and fed into a fully connected neural network for information fusion. In the LSTM-1DCNN architecture, the 1DCNN branch primarily focuses on extracting spatial features during convolution operations, whereas the LSTM branch mainly captures temporal features. Nine sets of accelerometer data from five publicly available HAR datasets are employed for training and evaluation purposes. The performance of the proposed LSTM-1DCNN model is compared with five other HAR algorithms, including Decision Tree, Random Forest, Support Vector Machine, 1DCNN, and LSTM, on these five public datasets. Experimental results demonstrate that the F1-score achieved by the proposed LSTM-1DCNN ranges from 90.36% to 99.68%, with a mean value of 96.22% and standard deviation of 0.03 across all evaluated metrics on these five public datasets, significantly outperforming the other HAR algorithms on the evaluation metrics used in this study. Finally, the proposed LSTM-1DCNN is validated in real-world applications by collecting acceleration data of seven human activities for training and testing purposes. Subsequently, the trained HAR algorithm is deployed on Android phones to evaluate its performance. Experimental results demonstrate that the proposed LSTM-1DCNN algorithm achieves an F1-score of 97.67% on our self-built dataset. In conclusion, the fusion of temporal and spatial information in the measured data contributes to the excellent HAR performance and robustness exhibited by the proposed LSTM-1DCNN architecture.
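Since this abstract reports its results as F1-scores, a small sketch of how a per-class and macro-averaged F1-score is computed may help. It assumes the standard definition F1 = 2TP / (2TP + FP + FN); the abstract does not state which averaging the authors used, and the function names and activity labels below are hypothetical.

```python
def f1_per_class(y_true, y_pred):
    """Return a dict mapping each label to its F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A label with no true or predicted samples gets 0.0 by convention here.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical ground truth and predictions for four samples.
y_true = ["walk", "walk", "sit", "sit"]
y_pred = ["walk", "sit", "sit", "sit"]
print(f1_per_class(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Macro averaging weights every activity equally, which matters when some activities have far fewer samples than others.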
Funding: Funded by the National Science and Technology Council, Taiwan (Grant No. NSTC 112-2121-M-039-001) and by China Medical University (Grant No. CMU112-MF-79).
Abstract: Artificial intelligence (AI) technology has become integral in the realm of medicine and healthcare, particularly in human activity recognition (HAR) applications such as fitness and rehabilitation tracking. This study introduces a robust coupling analysis framework that integrates four AI-enabled models, combining both machine learning (ML) and deep learning (DL) approaches to evaluate their effectiveness in HAR. The analytical dataset comprises 561 features sourced from the UCI-HAR database, forming the foundation for training the models. Additionally, the MHEALTH database is employed to replicate the modeling process for comparative purposes, while inclusion of the WISDM database, renowned for its challenging features, supports the framework's resilience and adaptability. The ML-based models employ an adaptive neuro-fuzzy inference system (ANFIS), a support vector machine (SVM), and a random forest (RF) for data training. In contrast, the DL-based model utilizes a one-dimensional convolutional neural network (1dCNN) to automate feature extraction. Furthermore, the recursive feature elimination (RFE) algorithm, which drives an ML-based estimator to eliminate low-participation features, helps identify the optimal features for enhancing model performance. With this feature selection process, the best accuracies of the ANFIS, SVM, RF, and 1dCNN models are around 90%, 96%, 91%, and 93%, respectively. Comparative analysis using the MHEALTH dataset shows that the 1dCNN model reaches 100% accuracy, while the RF, SVM, and ANFIS models equipped with selected features achieve accuracies of 99.8%, 99.7%, and 96.5%, respectively. Finally, when applied to the WISDM dataset, the DL-based and ML-based models attain accuracies of 91.4% and 87.3%, respectively, aligning with prior research findings. In conclusion, the proposed framework yields HAR models with commendable performance metrics, exhibiting its suitability for integration into the healthcare services system through AI-driven applications.
Funding: Supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R194), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: At present, Human Activity Recognition (HAR) is of considerable aid in health monitoring and recovery. The exploitation of machine learning with an intelligent agent in the area of health informatics gathered using HAR augments the decision-making quality and significance. Although many research works have been conducted on Smart Healthcare Monitoring, there remain a certain number of pitfalls such as time, overhead, and falsification involved during analysis. Therefore, this paper proposes a Statistical Partial Regression and Support Vector Intelligent Agent Learning (SPR-SVIAL) method for Smart Healthcare Monitoring. At first, the Statistical Partial Regression Feature Extraction model is used for data preprocessing along with the dimensionality-reduced feature extraction process. Here, the input dataset, comprising continuous beat-to-beat heart data, triaxial accelerometer data, and psychological characteristics, was acquired from IoT wearable devices. To attain highly accurate Smart Healthcare Monitoring in less time, Partial Least Square helps extract the dimensionality-reduced features. After that, with these resulting features, SVIAL is proposed for Smart Healthcare Monitoring with the help of Machine Learning and Intelligent Agents to minimize both analysis falsification and overhead. Experimental evaluation is carried out for factors such as time, overhead, and false positive rate accuracy concerning several instances. The quantitatively analyzed results indicate the better performance of our proposed SPR-SVIAL method when compared with two state-of-the-art methods.
Abstract: Human Action Recognition (HAR) and pose estimation from videos have gained significant attention among research communities due to their application in several areas, namely intelligent surveillance, human-robot interaction, robot vision, etc. Though considerable improvements have been made in recent days, the design of an effective and accurate action recognition model is still a difficult process owing to the existence of different obstacles such as variations in camera angle, occlusion, background, movement speed, and so on. From the literature, it is observed that it is hard to deal with the temporal dimension in the action recognition process. Convolutional neural network (CNN) models could be used widely to solve this. With this motivation, this study designs a novel key point extraction with deep convolutional neural networks based pose estimation (KPE-DCNN) model for activity recognition. The KPE-DCNN technique initially converts the input video into a sequence of frames, followed by a three-stage process, namely key point extraction, hyperparameter tuning, and pose estimation. In the key point extraction process, an OpenPose model is designed to compute the accurate key points in the human pose. Then, an optimal DCNN model is developed to classify the human activity labels based on the extracted key points. For improving the training process of the DCNN technique, the RMSProp optimizer is used to optimally adjust the hyperparameters such as learning rate, batch size, and epoch count. The experimental results, tested using a benchmark dataset (the UCF sports dataset), showed that the KPE-DCNN technique is able to achieve good results compared with benchmark algorithms like CNN, DBN, SVM, STAL, T-CNN and so on.
Funding: Support provided by the Thammasat University Research fund under the TSRI, Contract No. TUFF19/2564 and TUFF24/2565, for the project of “AI Ready City Networking in RUN”, based on the RUN Digital Cluster collaboration scheme. This research project was also supported by the Thailand Science Research and Innovation fund, the University of Phayao (Grant No. FF65-RIM041), and by King Mongkut's University of Technology North Bangkok, Contract No. KMUTNB-65-KNOW-02.
Abstract: Smoking is a major cause of cancer, heart disease and other afflictions that lead to early mortality. An effective smoking classification mechanism that provides insights into individual smoking habits would assist in implementing addiction treatment initiatives. Smoking activities often accompany other activities such as drinking or eating. Consequently, smoking activity recognition can be a challenging topic in human activity recognition (HAR). A deep learning framework for smoking activity recognition (SAR) employing smartwatch sensors was proposed together with a deep residual network combined with squeeze-and-excitation modules (ResNetSE) to increase the effectiveness of the SAR framework. The proposed model was tested against basic convolutional neural networks (CNNs) and recurrent neural networks (LSTM, BiLSTM, GRU and BiGRU) to recognize smoking and other similar activities such as drinking, eating and walking using the UT-Smoke dataset. Three different scenarios were investigated for their recognition performances using standard HAR metrics (accuracy, F1-score and the area under the ROC curve). Our proposed ResNetSE outperformed the other basic deep learning networks, with a maximum accuracy of 98.63%.
Abstract: With the improvement of people's living standards, the demand for health monitoring and exercise detection is increasing. It is of great significance to study human activity recognition (HAR) methods that are different from traditional feature extraction methods. This article uses convolutional neural network (CNN) algorithms in deep learning to automatically extract features of activities related to human life. We used a stochastic gradient descent algorithm to optimize the parameters of the CNN. The trained network model is compressed on STM32CubeMX-AI. Finally, this article introduces the use of neural networks on embedded devices to recognize six human activities of daily life: sitting, standing, walking, jogging, going upstairs, and going downstairs. The acceleration sensor related to human activity information is used to obtain the relevant characteristics of the activity, thereby solving the HAR problem. By drawing the accuracy curve, loss function curve, and confusion matrix diagram of the training model, the recognition effect of the convolutional neural network can be seen more intuitively. After comparing the average accuracy of each set of experiments and the test set of the best model obtained from it, the best model is then selected.
Abstract: Recognition of human activity based on convolutional neural networks (CNNs) has received the interest of researchers in recent years due to its significant improvement in accuracy. A large number of algorithms based on the deep learning approach have been proposed for activity recognition. However, with the increasing advancement of technologies that have limited computational resources, there is a need to design efficient deep learning-based approaches that make better use of computational resources. This paper presents a simple and efficient 2-dimensional CNN (2-D CNN) architecture with a very small convolutional kernel for human activity recognition. The merit of the proposed CNN architecture over standard deep learning architectures is fewer trainable parameters and a lower memory requirement, which enables it to be trained on devices with low GPU memory and to work well with both smaller and larger datasets. The proposed approach consists of four main stages: (1) creation of the dataset and data augmentation, (2) design of the 2-D CNN architecture, (3) training of the proposed 2-D CNN architecture from scratch up to the optimum stage, and (4) evaluation of the trained 2-D CNN architecture. To illustrate the effectiveness of the proposed architecture, extensive experiments are conducted on three publicly available datasets, namely IXMAS, YouTube, and UCF101. The results of the proposed method and its comparison with other state-of-the-art methods demonstrate the usefulness of the proposed method.
Funding: Project (No. 60772050) supported by the National Natural Science Foundation of China.
Abstract: We present a novel model for recognizing long-term complex activities involving multiple persons. The proposed model, named ‘decomposed hidden Markov model’ (DHMM), combines spatial decomposition and hierarchical abstraction to capture multi-modal, long-term dependent and multi-scale characteristics of activities. Decomposition in space and time offers conceptual advantages of compaction and clarity, and greatly reduces the size of state space as well as the number of parameters. DHMMs are efficient even when the number of persons is variable. We also introduce an efficient approximation algorithm for inference and parameter estimation. Experiments on multi-person activities and multi-modal individual activities demonstrate that DHMMs are more efficient and reliable than familiar models, such as coupled HMMs, hierarchical HMMs, and multi-observation HMMs.