Although earlier work has contributed greatly to the semantic segmentation of 3D indoor scenes, debris recognition in terrain data remains challenging. An indoor point cloud contains hundreds of thousands of points, whereas a terrain point cloud can contain millions. Moreover, terrain point cloud data obtained from remote sensing is measured in meters, while indoor scenes are measured in centimeters. As a result, terrain debris captured by remote sensing mapping comprises only dozens of points, so convolution over points alone cannot extract sufficient training information. In this paper, we build multi-attribute descriptors containing geometric and color information to better describe low-precision terrain debris; our processing therefore operates on the multi-attribute descriptor of each point rather than on the point itself. On this basis, an unsupervised classification algorithm is proposed that divides the point cloud into several terrain areas and treats each area as a graph vertex, named a super point, to form a graph structure, effectively reducing the size of the terrain point cloud from millions of points to hundreds of super points. We then propose a graph convolutional network that employs PointNet for graph embedding and a recurrent gated graph convolutional network for classification. Our experiments show that the super point graph based on multi-attribute descriptors reduces the amount of terrain data from millions to hundreds, and that our method reaches an accuracy of 91.74% and an IoU of 94.08%, both significantly better than current methods such as SEGCloud (Acc: 88.63%, IoU: 89.29%) and PointCNN (Acc: 86.35%, IoU: 87.26%).
This article develops a method for recognizing the race and ethnicity of an individual from portrait photographs. A reference image is formed from selected geometric points of the face and a special algorithm that calculates the characteristic parameters of the images available in the database. The original image is then compared with the reference images of ethnic groups to determine which ethnic group it belongs to.
In this work, we consider a homotopic principle for solving large-scale, dense l1 underdetermined problems and its applications in image processing and classification. We solve the face recognition problem in which the input image contains corrupted and/or lost pixels. The approach involves two steps: first, the incomplete or corrupted image is subjected to an inpainting process; second, the restored image is used to carry out the classification or recognition task. Both steps require solving large-scale l1 minimization problems. To that end, we propose to solve a sequence of linear-equality-constrained multiquadric problems that depend on a regularization parameter converging to zero. The procedure generates a central path that converges to a point in the solution set of the l1 underdetermined problem. A conjugate gradient algorithm is formulated to solve each subproblem. When noise is present in the model, inexact directions are taken so that an approximate solution is computed faster. This avoids the ill-conditioning that arises when the conjugate gradient method is required to iterate until a zero residual is attained.
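The abstract does not state the subproblems explicitly. As a hedged illustration only, one common way to write a multiquadric smoothing of the l1 problem is shown below; the symbols A, b, x and the smoothing parameter \mu are assumptions introduced here, and the paper's exact formulation may differ.

```latex
\begin{aligned}
&\text{(l1 problem)} && \min_{x \in \mathbb{R}^n} \ \|x\|_1
  \quad \text{s.t.} \quad Ax = b,
  \qquad A \in \mathbb{R}^{m \times n},\ m < n, \\
&\text{(smoothed subproblem)} && \min_{x \in \mathbb{R}^n} \ \sum_{i=1}^{n} \sqrt{x_i^2 + \mu^2}
  \quad \text{s.t.} \quad Ax = b,
  \qquad \mu \downarrow 0, \\
&\text{(approximation bound)} && |x_i| \;\le\; \sqrt{x_i^2 + \mu^2} \;\le\; |x_i| + \mu .
\end{aligned}
```

Under this assumed form, each subproblem is smooth and convex, and the bound shows that its objective differs from the l1 norm by at most n\mu, which is why driving \mu to zero can trace a path toward an l1 solution.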
Human Interaction Recognition (HIR) is one of the challenging problems in computer vision research because it involves multiple individuals and their mutual interactions within video frames generated from their movements. HIR requires more sophisticated analysis than Human Action Recognition (HAR), since HAR focuses solely on individual activities such as walking or running, while HIR involves the interactions between people. This research aims to develop a robust system for recognizing five interaction classes (hugging, kicking, pushing, pointing, and no interaction) from video sequences captured with multiple cameras. In this study, a hybrid Deep Learning (DL) and Machine Learning (ML) model was employed to improve classification accuracy and generalizability. The dataset was collected in an indoor environment with four-channel cameras capturing the five types of interactions among 13 participants. The data was processed using a DL model with a fine-tuned ResNet (Residual Networks) architecture based on 2D Convolutional Neural Network (CNN) layers for feature extraction. Subsequently, six commonly used ML algorithms (SVM, KNN, RF, DT, NB, and XGBoost) were trained and used for interaction classification. The results demonstrate a high accuracy of 95.45% in classifying human interactions. The hybrid approach enabled effective learning, resulting in highly accurate performance across different interaction types. Future work will apply this architecture to more complex scenarios involving multiple individuals.
Understanding the conformational characteristics of polymers is key to elucidating their physical properties. Cyclic polymers, defined by their closed-loop structures, inherently differ from linear polymers, which possess distinct chain ends. Despite these structural differences, both types of polymers exhibit locally random-walk-like conformations, making it challenging to detect subtle spatial variations using conventional methods. In this study, we address this challenge by integrating molecular dynamics simulations with point cloud neural networks to analyze the spatial conformations of cyclic and linear polymers. Using the Dynamic Graph CNN (DGCNN) model, we classify polymer conformations based on the 3D coordinates of monomers, capturing local and global topological differences without using the sequential connectivity of the chain. Our findings reveal that the optimal size of the local structural feature unit scales linearly with molecular weight, in line with theoretical predictions. Additionally, interpretability techniques such as Grad-CAM and SHAP identify significant conformational differences: cyclic polymers tend to form prolate ellipsoid shapes with pronounced elongation along the major axis, while linear polymers show elongated ends with more spherical centers. These findings reveal subtle yet critical differences in local conformations between cyclic and linear polymers that were previously difficult to discern, providing deeper insights into polymer structure-property relationships and offering guidance for future advances in polymer science.
Funding: the National Key Research and Development Program of China (Grant No. 2023YFC3009400).
Abstract: Discontinuities in rock masses critically impact the stability and safety of underground engineering. Mainstream discontinuity identification methods, which rely on normal vector estimation and clustering algorithms, suffer from accuracy degradation, omit critical discontinuities when orientation density is unevenly distributed, and require manual intervention. To overcome these limitations, this paper introduces a novel discontinuity identification method based on geometric feature analysis of the rock mass. By analyzing the spatial distribution variability of the point cloud and integrating an adaptive region growing algorithm, the method accurately detects independent discontinuities under complex geological conditions. Given that rock mass orientations typically follow a Fisher distribution, an adaptive hierarchical clustering algorithm based on statistical analysis is employed to automatically determine the optimal number of structural sets, eliminating the need for the preset cluster counts or thresholds inherent in traditional methods. The proposed approach effectively handles diverse rock mass shapes and sizes, leveraging both local and global geometric features to minimize noise interference. Experimental validation on three real-world rock mass models, alongside comparisons with three conventional directional clustering algorithms, demonstrates superior accuracy and robustness in identifying optimal discontinuity sets. The proposed method offers a reliable and efficient tool for discontinuity detection and grouping in underground engineering, significantly enhancing design and construction outcomes.
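The Fisher-distribution assumption mentioned above can be made concrete with a small sketch. The code below is not the paper's clustering algorithm; it only computes the mean direction of a set of discontinuity normals and the standard large-concentration estimate kappa ≈ (N - 1) / (N - R), where R is the length of the resultant vector. The function name `fisher_stats` and the sample poles are hypothetical.

```python
import numpy as np

def fisher_stats(normals):
    """Mean direction and approximate Fisher concentration of unit normals.

    Illustrative only: kappa = (N - 1) / (N - R) is the usual approximation
    for tightly clustered directions, not the paper's procedure.
    """
    v = np.asarray(normals, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)  # normalize each row

    # Plane normals are axial (n and -n describe the same plane), so flip
    # every vector into the hemisphere of the first one. This is a
    # simplification that assumes a single, reasonably tight set.
    signs = np.where(v @ v[0] < 0.0, -1.0, 1.0)
    v = v * signs[:, None]

    resultant = v.sum(axis=0)
    R = np.linalg.norm(resultant)   # resultant length, 0 < R <= N
    n = len(v)
    mean_dir = resultant / R
    kappa = np.inf if np.isclose(n, R) else (n - 1) / (n - R)
    return mean_dir, kappa

# Hypothetical poles scattered around the z-axis; the last one points down.
poles = [[0, 0, 1], [0.1, 0, 1], [-0.1, 0, 1], [0, 0.1, 1], [0, 0.1, -1]]
mean_dir, kappa = fisher_stats(poles)
print(mean_dir, kappa)
```

A larger kappa indicates a tighter set; a statistical clustering scheme could, for instance, compare such per-set statistics when deciding whether two candidate sets should be merged.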
Funding: funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R410), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Human object detection and recognition is essential for elderly monitoring and assisted living; however, models relying solely on pose or scene context often struggle in cluttered or visually ambiguous settings. To address this, we present SCENET-3D, a transformer-driven multimodal framework that unifies human-centric skeleton features with scene-object semantics for intelligent robotic vision through a three-stage pipeline. In the first stage, scene analysis, rich geometric and texture descriptors are extracted from RGB frames, including surface-normal histograms, angles between neighboring normals, Zernike moments, directional standard deviation, and Gabor-filter responses. In the second stage, scene-object analysis, non-human objects are segmented and represented using local feature descriptors and complementary surface-normal information. In the third stage, human-pose estimation, silhouettes are processed through an enhanced MoveNet to obtain 2D anatomical keypoints, which are fused with depth information and converted into RGB-based point clouds to construct pseudo-3D skeletons. Features from all three stages are fused and fed into a transformer encoder with multi-head attention to resolve visually similar activities. Experiments on UCLA (95.8%), ETRI-Activity3D (89.4%), and CAD-120 (91.2%) demonstrate that combining pseudo-3D skeletons with rich scene-object fusion significantly improves generalizable activity recognition, enabling safer elderly care, natural human-robot interaction, and robust context-aware robotic perception in real-world environments.
Abstract: This paper presents a method for hand gesture recognition based on 3D point clouds, using digital image processing techniques. Starting from the 3D points provided by a depth camera, the system first extracts raw data of the hand. After data segmentation and preprocessing, three kinds of appearance features are extracted: the number of stretched fingers, the angles between fingers, and the area distribution feature of the gesture region. Based on these features, the system identifies gestures using a decision tree. The experimental results demonstrate that the proposed method recognizes common gestures efficiently and with high accuracy.
Funding: National Natural Science Foundation of China (No. 61074140, No. 60974094); Young Teacher Development Support Project of Shandong University of Technology, China
Abstract: A method for recognizing traffic flow change points is put forward based on traffic flow theory and statistical change point analysis of multiple linear regression. The method was calibrated and tested with field data from Liantong Road in Zibo city to verify the validity and feasibility of the theory. The results show that the change point method for multiple linear regression identifies the pattern of quantitative changes in traffic flow more accurately than ordinary methods. The change point method can therefore be applied effectively in traffic information management systems.
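The abstract does not give the statistic used to locate the change point. A minimal sketch of one standard approach, assuming a single change point and ordinary least squares, is to try every admissible split and keep the one that minimizes the combined residual sum of squares of the two fitted regressions. The names `rss` and `find_change_point` and the synthetic data are hypothetical.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def find_change_point(X, y, min_seg=3):
    """Return (k, cost): observations [0, k) and [k, n) get separate fits."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    best_k, best_cost = None, np.inf
    for k in range(min_seg, len(y) - min_seg + 1):
        cost = rss(X[:k], y[:k]) + rss(X[k:], y[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost

# Synthetic series whose linear trend changes at index 10 (made-up data).
x = np.arange(20, dtype=float)
y = np.where(x < 10, 2.0 * x + 1.0, -3.0 * x + 60.0)
k, cost = find_change_point(x, y)
print(k, cost)
```

With several explanatory variables, `X` simply has more columns; `min_seg` then has to be large enough for each segment to support its own fit.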
Abstract: Point cloud based place recognition plays an important role in mobile robotics. In this paper, we propose a method for point cloud place recognition that adaptively performs weighted aggregation of structural information. First, to preserve the prior distributions and local geometric structures, we fuse learned hidden features with handcrafted features at the input stage. Second, from these fused features we extract and aggregate adaptively weighted features that account for density and relative spatial information, in a module named Weighted Aggregation with Density Estimation (WADE). We then apply the WADE block iteratively to group the latent manifold structures. Finally, comparison results on two public datasets, Oxford RobotCar and KITTI, show that the proposed approach exceeds the compared approaches in recall rate by 7% to 8% on average.
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 52304139, 52325403) and the CCTEG Coal Mining Research Institute funding (Grant No. KCYJY-2024-MS-10).
Abstract: 3D laser scanning technology is widely used in underground openings for high-precision, rapid, and nondestructive structural evaluations. Segmenting large 3D point cloud datasets, particularly in coal mine roadways with multi-scale targets, remains challenging. This paper proposes an enhanced segmentation method integrating an improved PointNet++ with a coverage-voted strategy. The coverage-voted strategy reduces data while preserving multi-scale target topology. Segmentation is achieved using an enhanced PointNet++ algorithm with a normalization preprocessing head, resulting in 94% accuracy for common supporting components. Ablation experiments show that the preprocessing head and the coverage strategy increase segmentation accuracy by 20% and 2%, respectively, and improve Intersection over Union (IoU) for bearing plate segmentation by 58% and 20%. The accuracy of the current pretrained segmentation model may be affected by variations in surface support components, but it can be readily enhanced through re-optimization with additional labeled point cloud data. The proposed method, combined with a previously developed machine learning model that links rock bolt load and the deformation field of its bearing plate, provides a robust technique for simultaneously measuring the load of multiple rock bolts in a single laser scan.
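The two metrics quoted above, overall accuracy and per-class Intersection over Union, can be computed from per-point labels as in the following sketch. The function names and the toy labels are hypothetical, and the paper's own evaluation code is not shown in the abstract.

```python
import numpy as np

def per_class_iou(pred, true, num_classes):
    """IoU for each class: |pred == c AND true == c| / |pred == c OR true == c|."""
    pred = np.asarray(pred)
    true = np.asarray(true)
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (true == c))
        union = np.sum((pred == c) | (true == c))
        # A class absent from both prediction and ground truth has no defined IoU.
        ious.append(inter / union if union else float("nan"))
    return ious

def overall_accuracy(pred, true):
    """Fraction of points whose predicted label matches the ground truth."""
    return float(np.mean(np.asarray(pred) == np.asarray(true)))

# Toy per-point labels for three classes (made-up values).
true = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(per_class_iou(pred, true, 3), overall_accuracy(pred, true))
```

Accuracy can stay high while the IoU of a small class such as a bearing plate stays low, which is one reason segmentation studies usually report both.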
Abstract: The quality of satellite laser ranging (SLR) data from COMPASS was analyzed, and the difference between curve recognition in computer vision and the preprocessing of SLR data was discussed. Finally, a new algorithm for SLR data based on curve recognition from point clouds is proposed. The results obtained by the new algorithm are 85% (or even more) consistent with those of the screen displaying method. Furthermore, the new method can process SLR data automatically, which makes it suitable for use in the development of the COMPASS navigation system.
Funding: supported by National Natural Science Foundation of China (No. 61103123); Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry
Abstract: Most existing action recognition methods rely mainly on spatio-temporal descriptors of individual interest points while ignoring their collective information, such as their spatial distribution. This paper proposes a novel motion descriptor that combines local spatio-temporal features with the global positional distribution information (PDI) of interest points. The proposed method detects interest points using an improved interest point detection method. Then, 3-dimensional scale-invariant feature transform (3D SIFT) descriptors are extracted for every interest point. To obtain a compact description and efficient computation, principal component analysis (PCA) is applied twice to the 3D SIFT descriptors, first for single frames and then for multiple frames. In parallel, the PDI of the interest points is computed and combined with the above features. The combined features are quantized and selected, and finally tested using a support vector machine (SVM) classifier on the public KTH dataset. The test results show that the recognition rate is significantly improved and that the proposed features describe human motion more accurately, with high adaptability to different scenarios.
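The dimensionality-reduction step can be sketched as follows. This is a generic PCA via singular value decomposition, not the paper's exact two-pass procedure; the function name `pca_reduce`, the descriptor dimension, and the random stand-in data are assumptions made for illustration.

```python
import numpy as np

def pca_reduce(descriptors, n_components):
    """Project descriptors onto their top principal components.

    Returns (projected, components, mean) so that new descriptors can be
    mapped with (new - mean) @ components.T.
    """
    X = np.asarray(descriptors, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                               # center the data
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, components, mean

# Random stand-ins for high-dimensional descriptors (not real 3D SIFT output).
rng = np.random.default_rng(0)
fake_descriptors = rng.normal(size=(200, 64))
reduced, components, mean = pca_reduce(fake_descriptors, n_components=8)
print(reduced.shape)
```

Applying PCA twice, as the abstract describes, would amount to calling such a routine once on per-frame descriptors and again on the features stacked over multiple frames.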
Abstract: A new method for iris recognition is proposed that uses a multi-matching system based on a simplified deformable model of the human iris. The method defines iris feature points and forms the feature space based on a wavelet transform. The matching stage begins with a coarse match, which is then refined under the guidance of the simplified deformable iris model. Iris recognition is accomplished by means of this multi-matching system. The process preserves the elastic deformation between an input iris image and a template and improves the precision of iris recognition. The experimental results indicate the validity of this method.
基金This research was supported by National Natural Science Foundation of China(No.61271353,61871389)Major Funding Projects of National University of Defense Technology(No.ZK18-01-02)Foundation of State Key Laboratory of Pulsed Power Laser Technology(No.SKL2018ZR09).
Abstract: Airborne LIDAR can flexibly obtain point cloud data with three-dimensional structural information, which can improve the effectiveness of automatic target recognition in complex environments. Compared with 2D information, 3D information performs better in separating objects from the background. However, an aircraft platform can degrade the data obtained by LIDAR because of varying flight attitudes, flight heights, and atmospheric disturbances. A global-feature-based 3D automatic target recognition method for airborne LIDAR is proposed, composed of an offline phase and an online phase. The performance of four global feature descriptors is compared. Considering the summed volume region (SVR) discrepancy in real objects, SVR selection is added to the pre-processing operations to eliminate clusters that do not match the target of interest. Highly reliable simulated data are obtained under various sensor altitudes, detection distances, and atmospheric disturbances. The final experimental results show that the added step increases the recognition rate by more than 2.4% and decreases the execution time by about 33%.
Abstract: Human Action Recognition (HAR) and pose estimation from videos have gained significant attention in the research community due to their application in areas such as intelligent surveillance, human-robot interaction, and robot vision. Though considerable improvements have been made recently, designing an effective and accurate action recognition model remains difficult owing to obstacles such as variations in camera angle, occlusion, background, and movement speed. The literature shows that the temporal dimension is hard to handle in the action recognition process. Convolutional neural network (CNN) models can be used to address this. With this motivation, this study designs a novel key point extraction with deep convolutional neural networks based pose estimation (KPE-DCNN) model for activity recognition. The KPE-DCNN technique first converts the input video into a sequence of frames, followed by a three-stage process: key point extraction, hyperparameter tuning, and pose estimation. In the key point extraction stage, an OpenPose model is employed to compute accurate key points of the human pose. Then, an optimal DCNN model is developed to classify the human activity labels based on the extracted key points. To improve the training process of the DCNN technique, the RMSProp optimizer is used to optimally adjust hyperparameters such as the learning rate, batch size, and epoch count. Experimental results on a benchmark dataset, the UCF sports dataset, show that the KPE-DCNN technique achieves good results compared with benchmark algorithms such as CNN, DBN, SVM, STAL, and T-CNN.
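Illustrative note: the abstract above does not describe how RMSProp is applied in KPE-DCNN. The sketch below only shows the standard RMSProp update rule on a toy two-variable objective; the function names `rmsprop_minimize` and `toy_grad`, the toy objective, and all parameter values are assumptions made for illustration and are not taken from the paper.

```python
import math

def rmsprop_minimize(grad, x0, lr=0.01, decay=0.9, eps=1e-8, steps=2000):
    """Minimize a function with plain RMSProp, given a gradient callback."""
    x = list(x0)
    avg_sq = [0.0] * len(x)  # running average of squared gradients, per coordinate
    for _ in range(steps):
        g = grad(x)
        for i in range(len(x)):
            # Exponential moving average of the squared gradient.
            avg_sq[i] = decay * avg_sq[i] + (1.0 - decay) * g[i] ** 2
            # Step size is normalized by the root of that average.
            x[i] -= lr * g[i] / (math.sqrt(avg_sq[i]) + eps)
    return x

# Hypothetical toy objective: f(x, y) = (x - 3)^2 + 10 * (y + 1)^2
def toy_grad(p):
    return [2.0 * (p[0] - 3.0), 20.0 * (p[1] + 1.0)]

solution = rmsprop_minimize(toy_grad, [0.0, 0.0])
print(solution)  # ends close to the minimizer (3, -1)
```

Because each step is normalized per coordinate, both variables move at a similar pace even though the second is ten times more steeply curved. With a constant learning rate the iterate hovers near the minimizer rather than converging exactly.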
Funding: This research was funded by grants from the Key Research and Development Program of Shaanxi Province (2018NY-127, 2019ZDLNY07-02-01, 2020NY-205) and the National Undergraduate Training Program for Innovation and Entrepreneurship (S201910712240, X201910712080).
Abstract: Although prior work has made great contributions to the semantic segmentation of 3D indoor scenes, challenges remain in recognizing debris in terrain data. Whereas indoor point clouds contain hundreds of thousands of points, terrain point clouds contain millions. In addition, terrain point cloud data obtained from remote sensing is measured in meters, whereas indoor scenes are measured in centimeters. As a result, terrain debris obtained from remote sensing mapping consists of only dozens of points, so sufficient training information cannot be obtained through point convolution alone. In this paper, we build multi-attribute descriptors containing geometric and color information to better describe low-precision terrain debris. Our processing therefore operates on the multi-attribute descriptor of each point rather than on the point itself. On this basis, an unsupervised classification algorithm is proposed to divide the point cloud into several terrain areas, and each area is treated as a graph vertex, called a super point, to form the graph structure, effectively reducing the size of the terrain point cloud from millions to hundreds. We then propose a graph convolution network that employs PointNet for graph embedding and a recurrent gated graph convolutional network for classification. Our experiments show that the super point graph based on multi-attribute descriptors reduces the terrain point cloud from millions of points to hundreds of super points, and that our accuracy reached 91.74% and the IoU reached 94.08%, both significantly better than current methods such as SEGCloud (Acc: 88.63%, IoU: 89.29%) and PointCNN (Acc: 86.35%, IoU: 87.26%).
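Illustrative note: the abstract above reports accuracy (Acc) and intersection over union (IoU) without stating how they are computed or averaged. The sketch below uses the common per-point definitions; the function names, the two class labels, and the toy label lists are assumptions for illustration only, not data from the paper.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of points whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def per_class_iou(y_true, y_pred):
    """IoU per class: |true AND pred| / |true OR pred| over points of that class."""
    classes = sorted(set(y_true) | set(y_pred))
    ious = {}
    for c in classes:
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        ious[c] = inter / union  # union > 0 because c occurs in at least one list
    return ious

def mean_iou(y_true, y_pred):
    """Unweighted mean of the per-class IoU values."""
    ious = per_class_iou(y_true, y_pred)
    return sum(ious.values()) / len(ious)

# Hypothetical per-point labels for six points.
labels_true = ["ground", "ground", "debris", "debris", "ground", "debris"]
labels_pred = ["ground", "debris", "debris", "debris", "ground", "ground"]

acc = overall_accuracy(labels_true, labels_pred)
ious = per_class_iou(labels_true, labels_pred)
miou = mean_iou(labels_true, labels_pred)
print(acc, ious, miou)
```

In this toy case four of six points are labeled correctly, and each class has two points in the intersection and four in the union. Accuracy and IoU generally differ because IoU also penalizes false positives for each class.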
Abstract: This article is devoted to developing a method for recognizing the race and ethnicity of an individual based on portrait photographs. The reference image is formed from selected geometric points of the face and a special algorithm for calculating the characteristic parameters of the images available in the database. Next, the original image is compared with the reference images of ethnic groups to determine the ethnic group to which the original image belongs.
Abstract: In this work, we consider a homotopic principle for solving large-scale, dense l1 underdetermined problems and its applications in image processing and classification. We solve the face recognition problem in which the input image contains corrupted and/or lost pixels. The approach involves two steps: first, the incomplete or corrupted image undergoes an inpainting process; second, the restored image is used to carry out the classification or recognition task. Both steps involve solving large-scale l1 minimization problems. To that end, we propose to solve a sequence of linear equality constrained multiquadric problems that depend on a regularization parameter converging to zero. The procedure generates a central path that converges to a point in the solution set of the l1 underdetermined problem. To solve each subproblem, a conjugate gradient algorithm is formulated. When noise is present in the model, inexact directions are taken so that an approximate solution is computed faster. This prevents the ill-conditioning produced when the conjugate gradient method is required to iterate until a zero residual is attained.
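Illustrative note: one possible reading of the homotopy idea above is to minimize the multiquadric surrogate sum_i sqrt(x_i^2 + mu^2) subject to Ax = b for a decreasing sequence of mu. The sketch below does this with a simple reweighting fixed-point iteration and a direct Gaussian-elimination solve. Both are assumptions swapped in for brevity: the abstract formulates a conjugate gradient algorithm for each subproblem and does not give its update. All function names, parameter values, and the tiny 2x3 system are hypothetical.

```python
import math

def solve_linear(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting
    (assumes M is square and nonsingular)."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z

def weighted_step(A, b, d):
    """Return x = D A^T (A D A^T)^{-1} b with D = diag(d), which satisfies A x = b."""
    m, n = len(A), len(A[0])
    ADAt = [[sum(A[i][k] * d[k] * A[j][k] for k in range(n)) for j in range(m)]
            for i in range(m)]
    lam = solve_linear(ADAt, b)
    return [d[k] * sum(A[i][k] * lam[i] for i in range(m)) for k in range(n)]

def multiquadric_path(A, b, mu0=1.0, shrink=0.5, mu_min=1e-6, inner=5):
    """Follow the path of multiquadric subproblems as mu decreases toward zero."""
    n = len(A[0])
    x = weighted_step(A, b, [1.0] * n)  # start from the minimum 2-norm solution
    mu = mu0
    while mu > mu_min:
        for _ in range(inner):
            # Weights come from the stationarity condition x_i / sqrt(x_i^2 + mu^2) = (A^T lam)_i.
            d = [math.sqrt(xi * xi + mu * mu) for xi in x]
            x = weighted_step(A, b, d)
        mu *= shrink
    return x

# Tiny system whose minimum l1-norm solution is (0, 0, 1).
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
b = [1.0, 1.0]
x_hat = multiquadric_path(A, b)
residual = [sum(A[i][k] * x_hat[k] for k in range(3)) - b[i] for i in range(2)]
print(x_hat, residual)
```

Every iterate satisfies the equality constraint by construction, so only the objective changes along the path. The matrix A D A^T becomes increasingly ill-conditioned as mu shrinks and components of x approach zero, which is consistent with the conditioning concern the abstract raises.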
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2023-00218176) and the Soonchunhyang University Research Fund.
Abstract: Human Interaction Recognition (HIR) is one of the challenging problems in computer vision research because it involves multiple individuals and their mutual interactions within video frames. HIR requires more sophisticated analysis than Human Action Recognition (HAR), since HAR focuses solely on individual activities such as walking or running, while HIR involves interactions between people. This research aims to develop a robust system for recognizing five interaction classes (hugging, kicking, pushing, pointing, and no interaction) from video sequences captured by multiple cameras. In this study, a hybrid Deep Learning (DL) and Machine Learning (ML) model was employed to improve classification accuracy and generalizability. The dataset was collected in an indoor environment with four-channel cameras capturing the five types of interactions among 13 participants. The data was processed using a DL model with a fine-tuned ResNet (Residual Networks) architecture based on 2D Convolutional Neural Network (CNN) layers for feature extraction. Subsequently, six commonly used ML algorithms, namely SVM, KNN, RF, DT, NB, and XGBoost, were trained and used for interaction classification. The results demonstrate a high accuracy of 95.45% in classifying human interactions. The hybrid approach enabled effective learning, resulting in highly accurate performance across different interaction types. Future work will explore more complex scenarios involving multiple individuals based on this architecture.
Funding: Supported by the National Key R&D Program of China (No. 2022YFB3707303) and the National Natural Science Foundation of China (No. 52293471).
Abstract: Understanding the conformational characteristics of polymers is key to elucidating their physical properties. Cyclic polymers, defined by their closed-loop structures, inherently differ from linear polymers, which possess distinct chain ends. Despite these structural differences, both types of polymers exhibit locally random-walk-like conformations, making it challenging to detect subtle spatial variations using conventional methods. In this study, we address this challenge by integrating molecular dynamics simulations with point cloud neural networks to analyze the spatial conformations of cyclic and linear polymers. Using the Dynamic Graph CNN (DGCNN) model, we classify polymer conformations based on the 3D coordinates of monomers, capturing local and global topological differences without considering the sequential connectivity of the chain. Our findings reveal that the optimal local structural feature unit size scales linearly with molecular weight, aligning with theoretical predictions. Additionally, interpretability techniques such as Grad-CAM and SHAP identify significant conformational differences: cyclic polymers tend to form prolate ellipsoid shapes with pronounced elongation along the major axis, while linear polymers show elongated ends with more spherical centers. These findings reveal subtle yet critical differences in local conformations between cyclic and linear polymers that were previously difficult to discern, providing deeper insights into polymer structure-property relationships and offering guidance for future polymer science advancements.