Activity recognition is a challenging topic in the field of computer vision that has various applications,including surveillance systems,industrial automation,and human-computer interaction.Today,the demand for automa...Activity recognition is a challenging topic in the field of computer vision that has various applications,including surveillance systems,industrial automation,and human-computer interaction.Today,the demand for automation has greatly increased across industries worldwide.Real-time detection requires edge devices with limited computational time.This study proposes a novel hybrid deep learning system for human activity recognition(HAR),aiming to enhance the recognition accuracy and reduce the computational time.The proposed system combines a pretrained image classification model with a sequence analysis model.First,the dataset was divided into a training set(70%),validation set(10%),and test set(20%).Second,all the videos were converted into frames and deep-based features were extracted from each frame using convolutional neural networks(CNNs)with a vision transformer.Following that,bidirectional long short-term memory(BiLSTM)-and temporal convolutional network(TCN)-based models were trained using the training set,and their performances were evaluated using the validation set and test set.Four benchmark datasets(UCF11,UCF50,UCF101,and JHMDB)were used to evaluate the performance of the proposed HAR-based system.The experimental results showed that the combination of ConvNeXt and the TCN-based model achieved a recognition accuracy of 97.73%for UCF11,98.81%for UCF50,98.46%for UCF101,and 83.38%for JHMDB,respectively.This represents improvements in the recognition accuracy of 4%,2.67%,3.67%,and 7.08%for the UCF11,UCF50,UCF101,and JHMDB datasets,respectively,over existing models.Moreover,the proposed HAR-based system obtained superior recognition accuracy,shorter computational times,and minimal memory usage compared to the existing models.展开更多
This research addresses the performance challenges of ontology-based context-aware and activity recognition techniques in complex environments and abnormal activities,and proposes an optimized ontology framework to im...This research addresses the performance challenges of ontology-based context-aware and activity recognition techniques in complex environments and abnormal activities,and proposes an optimized ontology framework to improve recognition accuracy and computational efficiency.The method in this paper adopts the event sequence segmentation technique,combines location awareness with time interval reasoning,and improves human activity recognition through ontology reasoning.Compared with the existing methods,the framework performs better when dealing with uncertain data and complex scenes,and the experimental results show that its recognition accuracy is improved by 15.6%and processing time is reduced by 22.4%.In addition,it is found that with the increase of context complexity,the traditional ontology inferencemodel has limitations in abnormal behavior recognition,especially in the case of high data redundancy,which tends to lead to a decrease in recognition accuracy.This study effectively mitigates this problem by optimizing the ontology matching algorithm and combining parallel computing and deep learning techniques to enhance the activity recognition capability in complex environments.展开更多
This research investigates the application of multisource data fusion using a Multi-Layer Perceptron (MLP) for Human Activity Recognition (HAR). The study integrates four distinct open-source datasets—WISDM, DaLiAc, ...This research investigates the application of multisource data fusion using a Multi-Layer Perceptron (MLP) for Human Activity Recognition (HAR). The study integrates four distinct open-source datasets—WISDM, DaLiAc, MotionSense, and PAMAP2—to develop a generalized MLP model for classifying six human activities. Performance analysis of the fused model for each dataset reveals accuracy rates of 95.83 for WISDM, 97 for DaLiAc, 94.65 for MotionSense, and 98.54 for PAMAP2. A comparative evaluation was conducted between the fused MLP model and the individual dataset models, with the latter tested on separate validation sets. The results indicate that the MLP model, trained on the fused dataset, exhibits superior performance relative to the models trained on individual datasets. This finding suggests that multisource data fusion significantly enhances the generalization and accuracy of HAR systems. The improved performance underscores the potential of integrating diverse data sources to create more robust and comprehensive models for activity recognition.展开更多
Molecular recognition of bioreceptors and enzymes relies on orthogonal interactions with small molecules within their cavity. To date, Chinese scientists have developed three types of strategies for introducing active...Molecular recognition of bioreceptors and enzymes relies on orthogonal interactions with small molecules within their cavity. To date, Chinese scientists have developed three types of strategies for introducing active sites inside the cavity of macrocyclic arenes to better mimic molecular recognition of bioreceptors and enzymes.The editorial aims to enlighten scientists in this field when they develop novel macrocycles for molecular recognition, supramolecular assembly, and applications.展开更多
Efficient recognition and selective capture of NH_(3)is not only beneficial for increasing the productivity of the synthetic NH_(3)industry but also for reducing air pollution.For this purpose,a group of deep eutectic...Efficient recognition and selective capture of NH_(3)is not only beneficial for increasing the productivity of the synthetic NH_(3)industry but also for reducing air pollution.For this purpose,a group of deep eutectic solvents(DESs)consisting of glycolic acid(GA)and phenol(PhOH)with low viscosities and multiple active sites was rationally designed in this work.Experimental results show that the GA^(+)PhOH DESs display extremely fast NH_(3)absorption rates(within 51 s for equilibrium)and high NH_(3)solubility.At 313.2 K,the NH_(3)absorption capacities of GA^(+)PhOH(1:1)reach 6.75 mol/kg(at 10.7 kPa)and 14.72 mol/kg(at 201.0 kPa).The NH_(3)solubility of GA^(+)PhOH DESs at low pressures were minimally changed after more than 100 days of air exposure.In addition,the NH_(3)solubility of GA^(+)PhOH DESs remain highly stable in 10 consecutive absorption-desorption cycles.More importantly,NH_(3)can be selectively captured by GA^(+)PhOH DESs from NH_(3)/CO_(2)/N_(2)and NH_(3)/N_(2)/H_(2)mixtures.1H-NMR,Fourier transform infrared and theoretical calculations were performed to reveal the intrinsic mechanism for the efficient recognition of NH_(3)by GA^(+)PhOH DESs.展开更多
Activity and motion recognition using Wi-Fi signals,mainly channel state information(CSI),has captured the interest of many researchers in recent years.Many research studies have achieved splendid results with the hel...Activity and motion recognition using Wi-Fi signals,mainly channel state information(CSI),has captured the interest of many researchers in recent years.Many research studies have achieved splendid results with the help of machine learning models from different applications such as healthcare services,sign language translation,security,context awareness,and the internet of things.Nevertheless,most of these adopted studies have some shortcomings in the machine learning algorithms as they rely on recurrence and convolutions and,thus,precluding smooth sequential computation.Therefore,in this paper,we propose a deep-learning approach based solely on attention,i.e.,the sole Self-Attention Mechanism model(Sole-SAM),for activity and motion recognition using Wi-Fi signals.The Sole-SAM was deployed to learn the features representing different activities and motions from the raw CSI data.Experiments were carried out to evaluate the performance of the proposed Sole-SAM architecture.The experimental results indicated that our proposed system took significantly less time to train than models that rely on recurrence and convolutions like Long Short-Term Memory(LSTM)and Recurrent Neural Network(RNN).Sole-SAM archived a 0.94%accuracy level,which is 0.04%better than RNN and 0.02%better than LSTM.展开更多
Artificial intelligence(AI)technology has become integral in the realm of medicine and healthcare,particularly in human activity recognition(HAR)applications such as fitness and rehabilitation tracking.This study intr...Artificial intelligence(AI)technology has become integral in the realm of medicine and healthcare,particularly in human activity recognition(HAR)applications such as fitness and rehabilitation tracking.This study introduces a robust coupling analysis framework that integrates four AI-enabled models,combining both machine learning(ML)and deep learning(DL)approaches to evaluate their effectiveness in HAR.The analytical dataset comprises 561 features sourced from the UCI-HAR database,forming the foundation for training the models.Additionally,the MHEALTH database is employed to replicate the modeling process for comparative purposes,while inclusion of the WISDM database,renowned for its challenging features,supports the framework’s resilience and adaptability.The ML-based models employ the methodologies including adaptive neuro-fuzzy inference system(ANFIS),support vector machine(SVM),and random forest(RF),for data training.In contrast,a DL-based model utilizes one-dimensional convolution neural network(1dCNN)to automate feature extraction.Furthermore,the recursive feature elimination(RFE)algorithm,which drives an ML-based estimator to eliminate low-participation features,helps identify the optimal features for enhancing model performance.The best accuracies of the ANFIS,SVM,RF,and 1dCNN models with meticulous featuring process achieve around 90%,96%,91%,and 93%,respectively.Comparative analysis using the MHEALTH dataset showcases the 1dCNN model’s remarkable perfect accuracy(100%),while the RF,SVM,and ANFIS models equipped with selected features achieve accuracies of 99.8%,99.7%,and 96.5%,respectively.Finally,when applied to the WISDM dataset,the DL-based and ML-based models attain accuracies of 91.4%and 87.3%,respectively,aligning with prior research findings.In conclusion,the proposed framework yields HAR models with commendable performance metrics,exhibiting its suitability for integration into the healthcare services system through AI-driven applications.展开更多
In this present time,Human Activity Recognition(HAR)has been of considerable aid in the case of health monitoring and recovery.The exploitation of machine learning with an intelligent agent in the area of health infor...In this present time,Human Activity Recognition(HAR)has been of considerable aid in the case of health monitoring and recovery.The exploitation of machine learning with an intelligent agent in the area of health informatics gathered using HAR augments the decision-making quality and significance.Although many research works conducted on Smart Healthcare Monitoring,there remain a certain number of pitfalls such as time,overhead,and falsification involved during analysis.Therefore,this paper proposes a Statistical Partial Regression and Support Vector Intelligent Agent Learning(SPR-SVIAL)for Smart Healthcare Monitoring.At first,the Statistical Partial Regression Feature Extraction model is used for data preprocessing along with the dimensionality-reduced features extraction process.Here,the input dataset the continuous beat-to-beat heart data,triaxial accelerometer data,and psychological characteristics were acquired from IoT wearable devices.To attain highly accurate Smart Healthcare Monitoring with less time,Partial Least Square helps extract the dimensionality-reduced features.After that,with these resulting features,SVIAL is proposed for Smart Healthcare Monitoring with the help of Machine Learning and Intelligent Agents to minimize both analysis falsification and overhead.Experimental evaluation is carried out for factors such as time,overhead,and false positive rate accuracy concerning several instances.The quantitatively analyzed results indicate the better performance of our proposed SPR-SVIAL method when compared with two state-of-the-art methods.展开更多
RFID-based human activity recognition(HAR)attracts attention due to its convenience,noninvasiveness,and privacy protection.Existing RFID-based HAR methods use modeling,CNN,or LSTM to extract features effectively.Still...RFID-based human activity recognition(HAR)attracts attention due to its convenience,noninvasiveness,and privacy protection.Existing RFID-based HAR methods use modeling,CNN,or LSTM to extract features effectively.Still,they have shortcomings:1)requiring complex hand-crafted data cleaning processes and 2)only addressing single-person activity recognition based on specific RF signals.To solve these problems,this paper proposes a novel device-free method based on Time-streaming Multiscale Transformer called TransTM.This model leverages the Transformer's powerful data fitting capabilities to take raw RFID RSSI data as input without pre-processing.Concretely,we propose a multiscale convolutional hybrid Transformer to capture behavioral features that recognizes singlehuman activities and human-to-human interactions.Compared with existing CNN-and LSTM-based methods,the Transformer-based method has more data fitting power,generalization,and scalability.Furthermore,using RF signals,our method achieves an excellent classification effect on human behaviorbased classification tasks.Experimental results on the actual RFID datasets show that this model achieves a high average recognition accuracy(99.1%).The dataset we collected for detecting RFID-based indoor human activities will be published.展开更多
The rapidly advancing Convolutional Neural Networks(CNNs)have brought about a paradigm shift in various computer vision tasks,while also garnering increasing interest and application in sensor-based Human Activity Rec...The rapidly advancing Convolutional Neural Networks(CNNs)have brought about a paradigm shift in various computer vision tasks,while also garnering increasing interest and application in sensor-based Human Activity Recognition(HAR)efforts.However,the significant computational demands and memory requirements hinder the practical deployment of deep networks in resource-constrained systems.This paper introduces a novel network pruning method based on the energy spectral density of data in the frequency domain,which reduces the model’s depth and accelerates activity inference.Unlike traditional pruning methods that focus on the spatial domain and the importance of filters,this method converts sensor data,such as HAR data,to the frequency domain for analysis.It emphasizes the low-frequency components by calculating their energy spectral density values.Subsequently,filters that meet the predefined thresholds are retained,and redundant filters are removed,leading to a significant reduction in model size without compromising performance or incurring additional computational costs.Notably,the proposed algorithm’s effectiveness is empirically validated on a standard five-layer CNNs backbone architecture.The computational feasibility and data sensitivity of the proposed scheme are thoroughly examined.Impressively,the classification accuracy on three benchmark HAR datasets UCI-HAR,WISDM,and PAMAP2 reaches 96.20%,98.40%,and 92.38%,respectively.Concurrently,our strategy achieves a reduction in Floating Point Operations(FLOPs)by 90.73%,93.70%,and 90.74%,respectively,along with a corresponding decrease in memory consumption by 90.53%,93.43%,and 90.05%.展开更多
Human Activity Recognition (HAR) is an important way for lower limb exoskeleton robots to implement human-computer collaboration with users. Most of the existing methods in this field focus on a simple scenario recogn...Human Activity Recognition (HAR) is an important way for lower limb exoskeleton robots to implement human-computer collaboration with users. Most of the existing methods in this field focus on a simple scenario recognizing activities for specific users, which does not consider the individual differences among users and cannot adapt to new users. In order to improve the generalization ability of HAR model, this paper proposes a novel method that combines the theories in transfer learning and active learning to mitigate the cross-subject issue, so that it can enable lower limb exoskeleton robots being used in more complex scenarios. First, a neural network based on convolutional neural networks (CNN) is designed, which can extract temporal and spatial features from sensor signals collected from different parts of human body. It can recognize human activities with high accuracy after trained by labeled data. Second, in order to improve the cross-subject adaptation ability of the pre-trained model, we design a cross-subject HAR algorithm based on sparse interrogation and label propagation. Through leave-one-subject-out validation on two widely-used public datasets with existing methods, our method achieves average accuracies of 91.77% on DSAD and 80.97% on PAMAP2, respectively. The experimental results demonstrate the potential of implementing cross-subject HAR for lower limb exoskeleton robots.展开更多
One exciting area within computer vision is classifying human activities, which has diverse applications like medical informatics, human-computer interaction, surveillance, and task monitoring systems. In the healthca...One exciting area within computer vision is classifying human activities, which has diverse applications like medical informatics, human-computer interaction, surveillance, and task monitoring systems. In the healthcare field, understanding and classifying patients’ activities is crucial for providing doctors with essential information for medication reactions and diagnosis. While some research methods already exist, utilizing machine learning and soft computational algorithms to recognize human activity from videos and images, there’s ongoing exploration of more advanced computer vision techniques. This paper introduces a straightforward and effective automated approach that involves five key steps: preprocessing, feature extraction technique, feature selection, feature fusion, and finally classification. To evaluate the proposed approach, two commonly used benchmark datasets KTH and Weizmann are employed for training, validation, and testing of ML classifiers. The study’s findings show that the first and second datasets had remarkable accuracy rates of 99.94% and 99.80%, respectively. When compared to existing methods, our approach stands out in terms of sensitivity, accuracy, precision, and specificity evaluation metrics. In essence, this paper demonstrates a practical method for automatically classifying human activities using an optimal feature fusion and deep learning approach, promising a great result that could benefit various fields, particularly in healthcare.展开更多
Human activity recognition is a significant area of research in artificial intelligence for surveillance,healthcare,sports,and human-computer interaction applications.The article benchmarks the performance of You Only...Human activity recognition is a significant area of research in artificial intelligence for surveillance,healthcare,sports,and human-computer interaction applications.The article benchmarks the performance of You Only Look Once version 11-based(YOLOv11-based)architecture for multi-class human activity recognition.The article benchmarks the performance of You Only Look Once version 11-based(YOLOv11-based)architecture for multi-class human activity recognition.The dataset consists of 14,186 images across 19 activity classes,from dynamic activities such as running and swimming to static activities such as sitting and sleeping.Preprocessing included resizing all images to 512512 pixels,annotating them in YOLO’s bounding box format,and applying data augmentation methods such as flipping,rotation,and cropping to enhance model generalization.The proposed model was trained for 100 epochs with adaptive learning rate methods and hyperparameter optimization for performance improvement,with a mAP@0.5 of 74.93%and a mAP@0.5-0.95 of 64.11%,outperforming previous versions of YOLO(v10,v9,and v8)and general-purpose architectures like ResNet50 and EfficientNet.It exhibited improved precision and recall for all activity classes with high precision values of 0.76 for running,0.79 for swimming,0.80 for sitting,and 0.81 for sleeping,and was tested for real-time deployment with an inference time of 8.9 ms per image,being computationally light.Proposed YOLOv11’s improvements are attributed to architectural advancements like a more complex feature extraction process,better attention modules,and an anchor-free detection mechanism.While YOLOv10 was extremely stable in static activity recognition,YOLOv9 performed well in dynamic environments but suffered from overfitting,and YOLOv8,while being a decent baseline,failed to differentiate between overlapping static activities.The experimental results determine proposed YOLOv11 to be the most appropriate model,providing an ideal balance between accuracy,computational efficiency,and robustness for real-world deployment.Nevertheless,there exist certain issues to be addressed,particularly in discriminating against visually similar activities and the use of publicly available datasets.Future research will entail the inclusion of 3D data and multimodal sensor inputs,such as depth and motion information,for enhancing recognition accuracy and generalizability to challenging real-world environments.展开更多
The effectiveness of facial expression recognition(FER)algorithms hinges on the model’s quality and the availability of a substantial amount of labeled expression data.However,labeling large datasets demands signific...The effectiveness of facial expression recognition(FER)algorithms hinges on the model’s quality and the availability of a substantial amount of labeled expression data.However,labeling large datasets demands significant human,time,and financial resources.Although active learning methods have mitigated the dependency on extensive labeled data,a cold-start problem persists in small to medium-sized expression recognition datasets.This issue arises because the initial labeled data often fails to represent the full spectrum of facial expression characteristics.This paper introduces an active learning approach that integrates uncertainty estimation,aiming to improve the precision of facial expression recognition regardless of dataset scale variations.The method is divided into two primary phases.First,the model undergoes self-supervised pre-training using contrastive learning and uncertainty estimation to bolster its feature extraction capabilities.Second,the model is fine-tuned using the prior knowledge obtained from the pre-training phase to significantly improve recognition accuracy.In the pretraining phase,the model employs contrastive learning to extract fundamental feature representations from the complete unlabeled dataset.These features are then weighted through a self-attention mechanism with rank regularization.Subsequently,data from the low-weighted set is relabeled to further refine the model’s feature extraction ability.The pre-trained model is then utilized in active learning to select and label information-rich samples more efficiently.Experimental results demonstrate that the proposed method significantly outperforms existing approaches,achieving an improvement in recognition accuracy of 5.09%and 3.82%over the best existing active learning methods,Margin,and Least Confidence methods,respectively,and a 1.61%improvement compared to the conventional segmented active learning method.展开更多
Human group activity recognition(GAR)has attracted significant attention from computer vision researchers due to its wide practical applications in security surveillance,social role understanding and sports video anal...Human group activity recognition(GAR)has attracted significant attention from computer vision researchers due to its wide practical applications in security surveillance,social role understanding and sports video analysis.In this paper,we give a comprehensive overview of the advances in group activity recognition in videos during the past 20 years.First,we provide a summary and comparison of 11 GAR video datasets in this field.Second,we survey the group activity recognition methods,including those based on handcrafted features and those based on deep learning networks.For better understanding of the pros and cons of these methods,we compare various models from the past to the present.Finally,we outline several challenging issues and possible directions for future research.From this comprehensive literature review,readers can obtain an overview of progress in group activity recognition for future studies.展开更多
Human activity recognition is commonly used in several Internet of Things applications to recognize different contexts and respond to them.Deep learning has gained momentum for identifying activities through sensors,s...Human activity recognition is commonly used in several Internet of Things applications to recognize different contexts and respond to them.Deep learning has gained momentum for identifying activities through sensors,smartphones or even surveillance cameras.However,it is often difficult to train deep learning models on constrained IoT devices.The focus of this paper is to propose an alternative model by constructing a Deep Learning-based Human Activity Recognition framework for edge computing,which we call DL-HAR.The goal of this framework is to exploit the capabilities of cloud computing to train a deep learning model and deploy it on less-powerful edge devices for recognition.The idea is to conduct the training of the model in the Cloud and distribute it to the edge nodes.We demonstrate how the DL-HAR can perform human activity recognition at the edge while improving efficiency and accuracy.In order to evaluate the proposed framework,we conducted a comprehensive set of experiments to validate the applicability of DL-HAR.Experimental results on the benchmark dataset show a significant increase in performance compared with the state-of-the-art models.展开更多
Human activity tracking plays a vital role in human–computer interaction.Traditional human activity recognition(HAR)methods adopt special devices,such as cameras and sensors,to track both macro-and micro-activities.R...Human activity tracking plays a vital role in human–computer interaction.Traditional human activity recognition(HAR)methods adopt special devices,such as cameras and sensors,to track both macro-and micro-activities.Recently,wireless signals have been exploited to track human motion and activities in indoor environments without additional equipment.This study proposes a device-free WiFi-based micro-activity recognition method that leverages the channel state information(CSI)of wireless signals.Different from existed CSI-based microactivity recognition methods,the proposed method extracts both amplitude and phase information from CSI,thereby providing more information and increasing detection accuracy.The proposed method harnesses an effective signal processing technique to reveal the unique patterns of each activity.We applied a machine learning algorithm to recognize the proposed micro-activities.The proposed method has been evaluated in both line of sight(LOS)and none line of sight(NLOS)scenarios,and the empirical results demonstrate the effectiveness of the proposed method with several users.展开更多
This paper proposes a hybrid approach for recognizing human activities from trajectories. First, an improved hidden Markov model (HMM) parameter learning algorithm, HMM-PSO, is proposed, which achieves a better bala...This paper proposes a hybrid approach for recognizing human activities from trajectories. First, an improved hidden Markov model (HMM) parameter learning algorithm, HMM-PSO, is proposed, which achieves a better balance between the global and local exploitation by the nonlinear update strategy and repulsion operation. Then, the event probability sequence (EPS) which consists of a series of events is computed to describe the unique characteristic of human activities. The anatysis on EPS indicates that it is robust to the changes in viewing direction and contributes to improving the recognition rate. Finally, the effectiveness of the proposed approach is evaluated by data experiments on current popular datasets.展开更多
A new method for complex activity recognition in videos by key frames was presented. The progressive bisection strategy(PBS) was employed to divide the complex activity into a series of simple activities and the key f...A new method for complex activity recognition in videos by key frames was presented. The progressive bisection strategy(PBS) was employed to divide the complex activity into a series of simple activities and the key frames representing the simple activities were extracted by the self-splitting competitive learning(SSCL) algorithm. A new similarity criterion of complex activities was defined. Besides the regular visual factor, the order factor and the interference factor measuring the timing matching relationship of the simple activities and the discontinuous matching relationship of the simple activities respectively were considered. On these bases, the complex human activity recognition could be achieved by calculating their similarities. The recognition error was reduced compared with other methods when ignoring the recognition of simple activities. The proposed method was tested and evaluated on the self-built broadcast gymnastic database and the dancing database. The experimental results prove the superior efficiency.展开更多
We study the problem of humanactivity recognition from RGB-Depth(RGBD)sensors when the skeletons are not available.The skeleton tracking in Kinect SDK workswell when the human subject is facing thecamera and there are...We study the problem of humanactivity recognition from RGB-Depth(RGBD)sensors when the skeletons are not available.The skeleton tracking in Kinect SDK workswell when the human subject is facing thecamera and there are no occlusions.In surveillance or nursing home monitoring scenarios,however,the camera is usually mounted higher than human subjects,and there may beocclusions.The interest-point based approachis widely used in RGB based activity recognition,it can be used in both RGB and depthchannels.Whether we should extract interestpoints independently of each channel or extract interest points from only one of thechannels is discussed in this paper.The goal ofthis paper is to compare the performances ofdifferent methods of extracting interest points.In addition,we have developed a depth mapbased descriptor and built an RGBD dataset,called RGBD-SAR,for senior activity recognition.We show that the best performance isachieved when we extract interest points solely from RGB channels,and combine the RGBbased descriptors with the depth map-baseddescriptors.We also present a baseline performance of the RGBD-SAR dataset.展开更多
基金funded by the Ongoing Research Funding Program(ORF-2025-890),King Saud University,Riyadh,Saudi Arabia.
文摘Activity recognition is a challenging topic in the field of computer vision that has various applications,including surveillance systems,industrial automation,and human-computer interaction.Today,the demand for automation has greatly increased across industries worldwide.Real-time detection requires edge devices with limited computational time.This study proposes a novel hybrid deep learning system for human activity recognition(HAR),aiming to enhance the recognition accuracy and reduce the computational time.The proposed system combines a pretrained image classification model with a sequence analysis model.First,the dataset was divided into a training set(70%),validation set(10%),and test set(20%).Second,all the videos were converted into frames and deep-based features were extracted from each frame using convolutional neural networks(CNNs)with a vision transformer.Following that,bidirectional long short-term memory(BiLSTM)-and temporal convolutional network(TCN)-based models were trained using the training set,and their performances were evaluated using the validation set and test set.Four benchmark datasets(UCF11,UCF50,UCF101,and JHMDB)were used to evaluate the performance of the proposed HAR-based system.The experimental results showed that the combination of ConvNeXt and the TCN-based model achieved a recognition accuracy of 97.73%for UCF11,98.81%for UCF50,98.46%for UCF101,and 83.38%for JHMDB,respectively.This represents improvements in the recognition accuracy of 4%,2.67%,3.67%,and 7.08%for the UCF11,UCF50,UCF101,and JHMDB datasets,respectively,over existing models.Moreover,the proposed HAR-based system obtained superior recognition accuracy,shorter computational times,and minimal memory usage compared to the existing models.
基金supported by the BK21 FOUR program of the National Research Foundation of Korea funded by the Ministry of Education(NRF5199991014091)Seok-Won Lee’s work was supported by Institute of Information&Communications Technology Planning&Evaluation(IITP)under the Artificial Intelligence Convergence Innovation Human Resources Development(IITP-2024-RS-2023-00255968)grant funded by the Korea government(MSIT).
文摘This research addresses the performance challenges of ontology-based context-aware and activity recognition techniques in complex environments and abnormal activities,and proposes an optimized ontology framework to improve recognition accuracy and computational efficiency.The method in this paper adopts the event sequence segmentation technique,combines location awareness with time interval reasoning,and improves human activity recognition through ontology reasoning.Compared with the existing methods,the framework performs better when dealing with uncertain data and complex scenes,and the experimental results show that its recognition accuracy is improved by 15.6%and processing time is reduced by 22.4%.In addition,it is found that with the increase of context complexity,the traditional ontology inferencemodel has limitations in abnormal behavior recognition,especially in the case of high data redundancy,which tends to lead to a decrease in recognition accuracy.This study effectively mitigates this problem by optimizing the ontology matching algorithm and combining parallel computing and deep learning techniques to enhance the activity recognition capability in complex environments.
基金supported by the Royal Golden Jubilee(RGJ)Ph.D.Programme(Grant No.PHD/0079/2561)through the National Research Council of Thailand(NRCT)and Thailand Research Fund(TRF).
文摘This research investigates the application of multisource data fusion using a Multi-Layer Perceptron (MLP) for Human Activity Recognition (HAR). The study integrates four distinct open-source datasets—WISDM, DaLiAc, MotionSense, and PAMAP2—to develop a generalized MLP model for classifying six human activities. Performance analysis of the fused model for each dataset reveals accuracy rates of 95.83 for WISDM, 97 for DaLiAc, 94.65 for MotionSense, and 98.54 for PAMAP2. A comparative evaluation was conducted between the fused MLP model and the individual dataset models, with the latter tested on separate validation sets. The results indicate that the MLP model, trained on the fused dataset, exhibits superior performance relative to the models trained on individual datasets. This finding suggests that multisource data fusion significantly enhances the generalization and accuracy of HAR systems. The improved performance underscores the potential of integrating diverse data sources to create more robust and comprehensive models for activity recognition.
文摘Molecular recognition of bioreceptors and enzymes relies on orthogonal interactions with small molecules within their cavity. To date, Chinese scientists have developed three types of strategies for introducing active sites inside the cavity of macrocyclic arenes to better mimic molecular recognition of bioreceptors and enzymes.The editorial aims to enlighten scientists in this field when they develop novel macrocycles for molecular recognition, supramolecular assembly, and applications.
基金supported by the National Natural Science Foundation of China(22008033)the Major Program of Qingyuan Innovation Laboratory.
文摘Efficient recognition and selective capture of NH_(3)is not only beneficial for increasing the productivity of the synthetic NH_(3)industry but also for reducing air pollution.For this purpose,a group of deep eutectic solvents(DESs)consisting of glycolic acid(GA)and phenol(PhOH)with low viscosities and multiple active sites was rationally designed in this work.Experimental results show that the GA^(+)PhOH DESs display extremely fast NH_(3)absorption rates(within 51 s for equilibrium)and high NH_(3)solubility.At 313.2 K,the NH_(3)absorption capacities of GA^(+)PhOH(1:1)reach 6.75 mol/kg(at 10.7 kPa)and 14.72 mol/kg(at 201.0 kPa).The NH_(3)solubility of GA^(+)PhOH DESs at low pressures were minimally changed after more than 100 days of air exposure.In addition,the NH_(3)solubility of GA^(+)PhOH DESs remain highly stable in 10 consecutive absorption-desorption cycles.More importantly,NH_(3)can be selectively captured by GA^(+)PhOH DESs from NH_(3)/CO_(2)/N_(2)and NH_(3)/N_(2)/H_(2)mixtures.1H-NMR,Fourier transform infrared and theoretical calculations were performed to reveal the intrinsic mechanism for the efficient recognition of NH_(3)by GA^(+)PhOH DESs.
基金This work was supported by Foshan Science and Technology Innovation Special Fund Project(No.BK22BF004 and No.BK20AF004),Guangdong Province,China.
文摘Activity and motion recognition using Wi-Fi signals,mainly channel state information(CSI),has captured the interest of many researchers in recent years.Many research studies have achieved splendid results with the help of machine learning models from different applications such as healthcare services,sign language translation,security,context awareness,and the internet of things.Nevertheless,most of these adopted studies have some shortcomings in the machine learning algorithms as they rely on recurrence and convolutions and,thus,precluding smooth sequential computation.Therefore,in this paper,we propose a deep-learning approach based solely on attention,i.e.,the sole Self-Attention Mechanism model(Sole-SAM),for activity and motion recognition using Wi-Fi signals.The Sole-SAM was deployed to learn the features representing different activities and motions from the raw CSI data.Experiments were carried out to evaluate the performance of the proposed Sole-SAM architecture.The experimental results indicated that our proposed system took significantly less time to train than models that rely on recurrence and convolutions like Long Short-Term Memory(LSTM)and Recurrent Neural Network(RNN).Sole-SAM archived a 0.94%accuracy level,which is 0.04%better than RNN and 0.02%better than LSTM.
基金funded by the National Science and Technology Council,Taiwan(Grant No.NSTC 112-2121-M-039-001)by China Medical University(Grant No.CMU112-MF-79).
文摘Artificial intelligence(AI)technology has become integral in the realm of medicine and healthcare,particularly in human activity recognition(HAR)applications such as fitness and rehabilitation tracking.This study introduces a robust coupling analysis framework that integrates four AI-enabled models,combining both machine learning(ML)and deep learning(DL)approaches to evaluate their effectiveness in HAR.The analytical dataset comprises 561 features sourced from the UCI-HAR database,forming the foundation for training the models.Additionally,the MHEALTH database is employed to replicate the modeling process for comparative purposes,while inclusion of the WISDM database,renowned for its challenging features,supports the framework’s resilience and adaptability.The ML-based models employ the methodologies including adaptive neuro-fuzzy inference system(ANFIS),support vector machine(SVM),and random forest(RF),for data training.In contrast,a DL-based model utilizes one-dimensional convolution neural network(1dCNN)to automate feature extraction.Furthermore,the recursive feature elimination(RFE)algorithm,which drives an ML-based estimator to eliminate low-participation features,helps identify the optimal features for enhancing model performance.The best accuracies of the ANFIS,SVM,RF,and 1dCNN models with meticulous featuring process achieve around 90%,96%,91%,and 93%,respectively.Comparative analysis using the MHEALTH dataset showcases the 1dCNN model’s remarkable perfect accuracy(100%),while the RF,SVM,and ANFIS models equipped with selected features achieve accuracies of 99.8%,99.7%,and 96.5%,respectively.Finally,when applied to the WISDM dataset,the DL-based and ML-based models attain accuracies of 91.4%and 87.3%,respectively,aligning with prior research findings.In conclusion,the proposed framework yields HAR models with commendable performance metrics,exhibiting its suitability for integration into the healthcare services system through AI-driven applications.
基金supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2022R194)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘In this present time,Human Activity Recognition(HAR)has been of considerable aid in the case of health monitoring and recovery.The exploitation of machine learning with an intelligent agent in the area of health informatics gathered using HAR augments the decision-making quality and significance.Although many research works conducted on Smart Healthcare Monitoring,there remain a certain number of pitfalls such as time,overhead,and falsification involved during analysis.Therefore,this paper proposes a Statistical Partial Regression and Support Vector Intelligent Agent Learning(SPR-SVIAL)for Smart Healthcare Monitoring.At first,the Statistical Partial Regression Feature Extraction model is used for data preprocessing along with the dimensionality-reduced features extraction process.Here,the input dataset the continuous beat-to-beat heart data,triaxial accelerometer data,and psychological characteristics were acquired from IoT wearable devices.To attain highly accurate Smart Healthcare Monitoring with less time,Partial Least Square helps extract the dimensionality-reduced features.After that,with these resulting features,SVIAL is proposed for Smart Healthcare Monitoring with the help of Machine Learning and Intelligent Agents to minimize both analysis falsification and overhead.Experimental evaluation is carried out for factors such as time,overhead,and false positive rate accuracy concerning several instances.The quantitatively analyzed results indicate the better performance of our proposed SPR-SVIAL method when compared with two state-of-the-art methods.
基金the Strategic Priority Research Program of Chinese Academy of Sciences(Grant No.XDC02040300)for this study.
文摘RFID-based human activity recognition(HAR)attracts attention due to its convenience,noninvasiveness,and privacy protection.Existing RFID-based HAR methods use modeling,CNN,or LSTM to extract features effectively.Still,they have shortcomings:1)requiring complex hand-crafted data cleaning processes and 2)only addressing single-person activity recognition based on specific RF signals.To solve these problems,this paper proposes a novel device-free method based on Time-streaming Multiscale Transformer called TransTM.This model leverages the Transformer's powerful data fitting capabilities to take raw RFID RSSI data as input without pre-processing.Concretely,we propose a multiscale convolutional hybrid Transformer to capture behavioral features that recognizes singlehuman activities and human-to-human interactions.Compared with existing CNN-and LSTM-based methods,the Transformer-based method has more data fitting power,generalization,and scalability.Furthermore,using RF signals,our method achieves an excellent classification effect on human behaviorbased classification tasks.Experimental results on the actual RFID datasets show that this model achieves a high average recognition accuracy(99.1%).The dataset we collected for detecting RFID-based indoor human activities will be published.
基金supported by National Natural Science Foundation of China(Nos.61902158 and 62202210).
文摘The rapidly advancing Convolutional Neural Networks(CNNs)have brought about a paradigm shift in various computer vision tasks,while also garnering increasing interest and application in sensor-based Human Activity Recognition(HAR)efforts.However,the significant computational demands and memory requirements hinder the practical deployment of deep networks in resource-constrained systems.This paper introduces a novel network pruning method based on the energy spectral density of data in the frequency domain,which reduces the model’s depth and accelerates activity inference.Unlike traditional pruning methods that focus on the spatial domain and the importance of filters,this method converts sensor data,such as HAR data,to the frequency domain for analysis.It emphasizes the low-frequency components by calculating their energy spectral density values.Subsequently,filters that meet the predefined thresholds are retained,and redundant filters are removed,leading to a significant reduction in model size without compromising performance or incurring additional computational costs.Notably,the proposed algorithm’s effectiveness is empirically validated on a standard five-layer CNNs backbone architecture.The computational feasibility and data sensitivity of the proposed scheme are thoroughly examined.Impressively,the classification accuracy on three benchmark HAR datasets UCI-HAR,WISDM,and PAMAP2 reaches 96.20%,98.40%,and 92.38%,respectively.Concurrently,our strategy achieves a reduction in Floating Point Operations(FLOPs)by 90.73%,93.70%,and 90.74%,respectively,along with a corresponding decrease in memory consumption by 90.53%,93.43%,and 90.05%.
文摘Human Activity Recognition (HAR) is an important way for lower limb exoskeleton robots to implement human-computer collaboration with users. Most of the existing methods in this field focus on a simple scenario recognizing activities for specific users, which does not consider the individual differences among users and cannot adapt to new users. In order to improve the generalization ability of HAR model, this paper proposes a novel method that combines the theories in transfer learning and active learning to mitigate the cross-subject issue, so that it can enable lower limb exoskeleton robots being used in more complex scenarios. First, a neural network based on convolutional neural networks (CNN) is designed, which can extract temporal and spatial features from sensor signals collected from different parts of human body. It can recognize human activities with high accuracy after trained by labeled data. Second, in order to improve the cross-subject adaptation ability of the pre-trained model, we design a cross-subject HAR algorithm based on sparse interrogation and label propagation. Through leave-one-subject-out validation on two widely-used public datasets with existing methods, our method achieves average accuracies of 91.77% on DSAD and 80.97% on PAMAP2, respectively. The experimental results demonstrate the potential of implementing cross-subject HAR for lower limb exoskeleton robots.
文摘One exciting area within computer vision is classifying human activities, which has diverse applications like medical informatics, human-computer interaction, surveillance, and task monitoring systems. In the healthcare field, understanding and classifying patients’ activities is crucial for providing doctors with essential information for medication reactions and diagnosis. While some research methods already exist, utilizing machine learning and soft computational algorithms to recognize human activity from videos and images, there’s ongoing exploration of more advanced computer vision techniques. This paper introduces a straightforward and effective automated approach that involves five key steps: preprocessing, feature extraction technique, feature selection, feature fusion, and finally classification. To evaluate the proposed approach, two commonly used benchmark datasets KTH and Weizmann are employed for training, validation, and testing of ML classifiers. The study’s findings show that the first and second datasets had remarkable accuracy rates of 99.94% and 99.80%, respectively. When compared to existing methods, our approach stands out in terms of sensitivity, accuracy, precision, and specificity evaluation metrics. In essence, this paper demonstrates a practical method for automatically classifying human activities using an optimal feature fusion and deep learning approach, promising a great result that could benefit various fields, particularly in healthcare.
基金supported by King Saud University,Riyadh,Saudi Arabia,under Ongoing Research Funding Program(ORF-2025-951).
文摘Human activity recognition is a significant area of research in artificial intelligence for surveillance,healthcare,sports,and human-computer interaction applications.The article benchmarks the performance of You Only Look Once version 11-based(YOLOv11-based)architecture for multi-class human activity recognition.The article benchmarks the performance of You Only Look Once version 11-based(YOLOv11-based)architecture for multi-class human activity recognition.The dataset consists of 14,186 images across 19 activity classes,from dynamic activities such as running and swimming to static activities such as sitting and sleeping.Preprocessing included resizing all images to 512512 pixels,annotating them in YOLO’s bounding box format,and applying data augmentation methods such as flipping,rotation,and cropping to enhance model generalization.The proposed model was trained for 100 epochs with adaptive learning rate methods and hyperparameter optimization for performance improvement,with a mAP@0.5 of 74.93%and a mAP@0.5-0.95 of 64.11%,outperforming previous versions of YOLO(v10,v9,and v8)and general-purpose architectures like ResNet50 and EfficientNet.It exhibited improved precision and recall for all activity classes with high precision values of 0.76 for running,0.79 for swimming,0.80 for sitting,and 0.81 for sleeping,and was tested for real-time deployment with an inference time of 8.9 ms per image,being computationally light.Proposed YOLOv11’s improvements are attributed to architectural advancements like a more complex feature extraction process,better attention modules,and an anchor-free detection mechanism.While YOLOv10 was extremely stable in static activity recognition,YOLOv9 performed well in dynamic environments but suffered from overfitting,and YOLOv8,while being a decent baseline,failed to differentiate between overlapping static activities.The experimental results determine proposed YOLOv11 to be the most appropriate model,providing an ideal balance between accuracy,computational efficiency,and robustness for real-world deployment.Nevertheless,there exist certain issues to be addressed,particularly in discriminating against visually similar activities and the use of publicly available datasets.Future research will entail the inclusion of 3D data and multimodal sensor inputs,such as depth and motion information,for enhancing recognition accuracy and generalizability to challenging real-world environments.
基金supported by National Science Foundation of China(61971078)Chongqing Municipal Education Commission Science and Technology Major Project(KJZDM202301901).
文摘The effectiveness of facial expression recognition(FER)algorithms hinges on the model’s quality and the availability of a substantial amount of labeled expression data.However,labeling large datasets demands significant human,time,and financial resources.Although active learning methods have mitigated the dependency on extensive labeled data,a cold-start problem persists in small to medium-sized expression recognition datasets.This issue arises because the initial labeled data often fails to represent the full spectrum of facial expression characteristics.This paper introduces an active learning approach that integrates uncertainty estimation,aiming to improve the precision of facial expression recognition regardless of dataset scale variations.The method is divided into two primary phases.First,the model undergoes self-supervised pre-training using contrastive learning and uncertainty estimation to bolster its feature extraction capabilities.Second,the model is fine-tuned using the prior knowledge obtained from the pre-training phase to significantly improve recognition accuracy.In the pretraining phase,the model employs contrastive learning to extract fundamental feature representations from the complete unlabeled dataset.These features are then weighted through a self-attention mechanism with rank regularization.Subsequently,data from the low-weighted set is relabeled to further refine the model’s feature extraction ability.The pre-trained model is then utilized in active learning to select and label information-rich samples more efficiently.Experimental results demonstrate that the proposed method significantly outperforms existing approaches,achieving an improvement in recognition accuracy of 5.09%and 3.82%over the best existing active learning methods,Margin,and Least Confidence methods,respectively,and a 1.61%improvement compared to the conventional segmented active learning method.
基金supported by National Natural Science Foundation of China(Nos.61976010,61802011)Beijing Postdoctoral Research Foundation(No.ZZ2019-63)+1 种基金Beijing excellent young talent cultivation project(No.2017000020124G075)“Ri xin”Training Programme Foundation for the Talents by Beijing University of Technology。
文摘Human group activity recognition(GAR)has attracted significant attention from computer vision researchers due to its wide practical applications in security surveillance,social role understanding and sports video analysis.In this paper,we give a comprehensive overview of the advances in group activity recognition in videos during the past 20 years.First,we provide a summary and comparison of 11 GAR video datasets in this field.Second,we survey the group activity recognition methods,including those based on handcrafted features and those based on deep learning networks.For better understanding of the pros and cons of these methods,we compare various models from the past to the present.Finally,we outline several challenging issues and possible directions for future research.From this comprehensive literature review,readers can obtain an overview of progress in group activity recognition for future studies.
文摘Human activity recognition is commonly used in several Internet of Things applications to recognize different contexts and respond to them.Deep learning has gained momentum for identifying activities through sensors,smartphones or even surveillance cameras.However,it is often difficult to train deep learning models on constrained IoT devices.The focus of this paper is to propose an alternative model by constructing a Deep Learning-based Human Activity Recognition framework for edge computing,which we call DL-HAR.The goal of this framework is to exploit the capabilities of cloud computing to train a deep learning model and deploy it on less-powerful edge devices for recognition.The idea is to conduct the training of the model in the Cloud and distribute it to the edge nodes.We demonstrate how the DL-HAR can perform human activity recognition at the edge while improving efficiency and accuracy.In order to evaluate the proposed framework,we conducted a comprehensive set of experiments to validate the applicability of DL-HAR.Experimental results on the benchmark dataset show a significant increase in performance compared with the state-of-the-art models.
文摘Human activity tracking plays a vital role in human–computer interaction.Traditional human activity recognition(HAR)methods adopt special devices,such as cameras and sensors,to track both macro-and micro-activities.Recently,wireless signals have been exploited to track human motion and activities in indoor environments without additional equipment.This study proposes a device-free WiFi-based micro-activity recognition method that leverages the channel state information(CSI)of wireless signals.Different from existed CSI-based microactivity recognition methods,the proposed method extracts both amplitude and phase information from CSI,thereby providing more information and increasing detection accuracy.The proposed method harnesses an effective signal processing technique to reveal the unique patterns of each activity.We applied a machine learning algorithm to recognize the proposed micro-activities.The proposed method has been evaluated in both line of sight(LOS)and none line of sight(NLOS)scenarios,and the empirical results demonstrate the effectiveness of the proposed method with several users.
基金supported by the National Natural Science Foundation of China(60573159)the Guangdong High Technique Project(201100000514)
文摘This paper proposes a hybrid approach for recognizing human activities from trajectories. First, an improved hidden Markov model (HMM) parameter learning algorithm, HMM-PSO, is proposed, which achieves a better balance between the global and local exploitation by the nonlinear update strategy and repulsion operation. Then, the event probability sequence (EPS) which consists of a series of events is computed to describe the unique characteristic of human activities. The anatysis on EPS indicates that it is robust to the changes in viewing direction and contributes to improving the recognition rate. Finally, the effectiveness of the proposed approach is evaluated by data experiments on current popular datasets.
基金Project(50808025) supported by the National Natural Science Foundation of ChinaProject(20090162110057) supported by the Doctoral Fund of Ministry of Education,China
文摘A new method for complex activity recognition in videos by key frames was presented. The progressive bisection strategy(PBS) was employed to divide the complex activity into a series of simple activities and the key frames representing the simple activities were extracted by the self-splitting competitive learning(SSCL) algorithm. A new similarity criterion of complex activities was defined. Besides the regular visual factor, the order factor and the interference factor measuring the timing matching relationship of the simple activities and the discontinuous matching relationship of the simple activities respectively were considered. On these bases, the complex human activity recognition could be achieved by calculating their similarities. The recognition error was reduced compared with other methods when ignoring the recognition of simple activities. The proposed method was tested and evaluated on the self-built broadcast gymnastic database and the dancing database. The experimental results prove the superior efficiency.
基金supported by the National Natural Science Foundation of China under Grants No.61075045,No.61273256the Program for New Century Excellent Talents in University under Grant No.NECT-10-0292+1 种基金the National Key Basic Research Program of China(973Program)under Grant No.2011-CB707000the Fundamental Research Funds for the Central Universities
文摘We study the problem of humanactivity recognition from RGB-Depth(RGBD)sensors when the skeletons are not available.The skeleton tracking in Kinect SDK workswell when the human subject is facing thecamera and there are no occlusions.In surveillance or nursing home monitoring scenarios,however,the camera is usually mounted higher than human subjects,and there may beocclusions.The interest-point based approachis widely used in RGB based activity recognition,it can be used in both RGB and depthchannels.Whether we should extract interestpoints independently of each channel or extract interest points from only one of thechannels is discussed in this paper.The goal ofthis paper is to compare the performances ofdifferent methods of extracting interest points.In addition,we have developed a depth mapbased descriptor and built an RGBD dataset,called RGBD-SAR,for senior activity recognition.We show that the best performance isachieved when we extract interest points solely from RGB channels,and combine the RGBbased descriptors with the depth map-baseddescriptors.We also present a baseline performance of the RGBD-SAR dataset.