Gestures are one of the most natural and intuitive approaches to human-computer interaction. Compared with traditional camera-based or wearable-sensor-based solutions, gesture recognition using millimeter-wave radar has attracted growing attention for being contact-free, privacy-preserving, and less environment-dependent. Although there have been many recent studies on hand gesture recognition, existing methods still have shortcomings in recognition accuracy and generalization ability in short-range applications. In this paper, we present a hand gesture recognition method named multi-scale feature fusion (MSFF) to accurately identify micro hand gestures. MSFF accounts not only for the overall motion of the palm but also for the subtle movements of the fingers. Specifically, we fuse multi-angle Doppler-time features with gesture-trajectory range-angle maps to comprehensively extract hand gesture features, and fuse them in high-level deep neural networks so that the model pays more attention to subtle finger movements. We evaluate the proposed method on data collected from 10 users, and our solution achieves an average recognition accuracy of 99.7%. Extensive experiments on a public mmWave gesture dataset further demonstrate the effectiveness of the proposed system.
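The multi-feature fusion idea above can be sketched as late fusion of per-map feature vectors. This is a minimal illustration, not the MSFF architecture: the pooled statistics stand in for a learned CNN extractor, and all names and map sizes are hypothetical.

```python
import numpy as np

def extract_map_features(radar_map: np.ndarray) -> np.ndarray:
    """Summarize one radar map (e.g. Doppler-time or range-angle) with
    simple pooled statistics, standing in for a CNN feature extractor."""
    return np.array([
        radar_map.mean(),
        radar_map.std(),
        radar_map.max(),
        radar_map.min(),
    ])

def fuse_features(doppler_time: np.ndarray, range_angle: np.ndarray) -> np.ndarray:
    """Late fusion: concatenate the per-map feature vectors into one
    vector that a downstream gesture classifier would consume."""
    return np.concatenate([
        extract_map_features(doppler_time),
        extract_map_features(range_angle),
    ])

# Toy maps standing in for real radar measurements.
dt_map = np.random.default_rng(0).random((64, 32))   # Doppler-time
ra_map = np.random.default_rng(1).random((32, 32))   # range-angle
fused = fuse_features(dt_map, ra_map)
print(fused.shape)  # (8,)
```

In the paper's setting the two branches would be learned jointly, so the fusion happens on high-level network features rather than hand-crafted statistics.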
In multi-target tracking, Multiple Hypothesis Tracking (MHT) can effectively solve the data association problem. However, traditional MHT cannot make full use of motion information. In this work, we combine MHT with an Interactive Multiple Model (IMM) estimator and feature fusion. The new algorithm greatly improves tracking performance because the IMM estimator provides better state estimates and the feature information enhances the accuracy of data association. The algorithm is tested by tracking tropical fish in a fish container. Experimental results show that it significantly reduces the track-loss rate and suppresses noise with higher computational efficiency compared with traditional MHT.
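The IMM estimator's output step can be made concrete: the fused state is the model-probability-weighted mixture of the per-model estimates, and the fused covariance adds a spread-of-means term. The sketch below shows only this combination step (the per-model Kalman filters and the mixing of priors are omitted); the numbers are toy values.

```python
import numpy as np

def imm_combine(estimates, covariances, model_probs):
    """IMM output combination: weight each model's state estimate by its
    model probability; the fused covariance includes the spread of the
    per-model means around the fused mean."""
    estimates = np.asarray(estimates, dtype=float)
    model_probs = np.asarray(model_probs, dtype=float)
    x_fused = np.einsum("m,md->d", model_probs, estimates)
    p_fused = np.zeros_like(np.asarray(covariances[0], dtype=float))
    for mu, x_m, P_m in zip(model_probs, estimates, covariances):
        d = (x_m - x_fused).reshape(-1, 1)
        p_fused += mu * (P_m + d @ d.T)
    return x_fused, p_fused

# Two motion models, e.g. constant velocity vs. maneuvering.
x_cv = np.array([1.0, 2.0])
x_mnv = np.array([1.2, 1.8])
P = np.eye(2) * 0.1
x, Pf = imm_combine([x_cv, x_mnv], [P, P], [0.7, 0.3])
print(x)  # ≈ [1.06, 1.94]
```

The spread-of-means term is what keeps the fused covariance honest when the models disagree, which is exactly the regime where a single-model filter underestimates its uncertainty.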
Identifying drug-drug interactions (DDIs) is essential to prevent adverse effects from polypharmacy. Although deep learning has advanced DDI identification, the gap between powerful models and their lack of clinical application and evaluation has limited clinical benefit. Here, we developed a Multi-Dimensional Feature Fusion model named MDFF, which integrates one-dimensional simplified molecular-input line-entry system (SMILES) sequence features, two-dimensional molecular graph features, and three-dimensional geometric features to enhance drug representations for predicting DDIs. MDFF was trained and validated on two DDI datasets, evaluated across three distinct scenarios, and compared with advanced DDI prediction models using accuracy, precision, recall, area under the curve, and F1 score. MDFF achieved state-of-the-art performance on all metrics, and ablation experiments showed that integrating multi-dimensional drug features yielded the best results. More importantly, we obtained adverse drug reaction reports uploaded by Xiangya Hospital of Central South University from 2021 to 2023 and used MDFF to identify potential adverse DDIs. Among 12 real-world adverse drug reaction reports, the predictions for 9 were supported by relevant evidence. Additionally, MDFF demonstrated the ability to explain adverse DDI mechanisms, providing insights into the mechanism behind one specific report and highlighting its potential to assist practitioners in improving medical practice.
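A minimal sketch of the multi-dimensional fusion idea: concatenate the 1-D sequence, 2-D graph, and 3-D geometry feature vectors into one drug representation, then build a symmetric pair encoding for DDI prediction. All function names, feature sizes, and the sum/absolute-difference pair encoding are illustrative assumptions, not MDFF's actual design; the real features would come from a SMILES encoder, a graph network, and a geometric network.

```python
import numpy as np

def fuse_drug_features(seq_feat, graph_feat, geom_feat):
    """Concatenate 1-D sequence, 2-D graph, and 3-D geometry features
    into a single drug representation vector."""
    return np.concatenate([seq_feat, graph_feat, geom_feat])

def pair_representation(drug_a, drug_b):
    """A simple order-invariant pair encoding for DDI prediction:
    element-wise sum and absolute difference of the two drug vectors."""
    return np.concatenate([drug_a + drug_b, np.abs(drug_a - drug_b)])

# Toy per-dimension features for one drug.
seq, graph, geom = np.ones(4), np.zeros(3), np.full(2, 2.0)
drug_a = fuse_drug_features(seq, graph, geom)
drug_b = fuse_drug_features(seq, graph, geom)
pair = pair_representation(drug_a, drug_b)
print(pair.shape)  # (18,)
```

The sum/difference encoding makes the pair vector identical regardless of which drug is listed first, a property any DDI classifier input should have.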
With the growing application of intelligent robots in the service, manufacturing, and medical fields, efficient and natural interaction between humans and robots has become key to improving collaboration efficiency and user experience. Gesture recognition, as an intuitive and contactless interaction method, can overcome the limitations of traditional interfaces and enable real-time control of, and feedback on, robot movements and behaviors. This study first reviews mainstream gesture recognition algorithms and their application on different sensing platforms (RGB cameras, depth cameras, and inertial measurement units). It then proposes a gesture recognition method based on multimodal feature fusion and a lightweight deep neural network that balances recognition accuracy with computational efficiency. At the system level, a modular human-robot interaction architecture comprising perception, decision, and execution layers is constructed, and gesture commands are transmitted and mapped to robot actions in real time via the ROS communication protocol. Comparative experiments on public gesture datasets and a self-collected dataset validate the proposed method's superiority in accuracy, response latency, and system robustness, while user-experience tests assess the interface's usability. The results provide a reliable technical foundation for robot collaboration and service in complex scenarios, with broad prospects for practical deployment.
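The decision layer that maps recognized gestures to robot actions can be sketched as a small command table with a confidence gate. Everything here is hypothetical: the gesture names, commands, and threshold are invented for illustration, and a real system would publish the resulting command on a ROS topic rather than return a string.

```python
# Hypothetical command table mapping recognized gestures to robot actions.
GESTURE_TO_COMMAND = {
    "palm_open": "stop",
    "swipe_left": "turn_left",
    "swipe_right": "turn_right",
    "fist": "move_forward",
}

def decide(gesture: str, confidence: float, threshold: float = 0.8) -> str:
    """Decision layer: map a recognized gesture to a robot command,
    falling back to 'idle' on low confidence or unknown gestures."""
    if confidence < threshold:
        return "idle"
    return GESTURE_TO_COMMAND.get(gesture, "idle")

print(decide("swipe_left", 0.95))  # turn_left
print(decide("fist", 0.5))         # idle
```

Gating on confidence keeps a misrecognized gesture from triggering an unintended motion, which matters more than raw accuracy once the robot is physically moving.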
Scene perception and trajectory forecasting are two fundamental challenges crucial to a safe and reliable autonomous driving (AD) system. However, most proposed methods address only one of the two challenges with a single model. To tackle this dilemma, this paper proposes spatio-temporal semantics and interaction graph aggregation for multi-agent perception and trajectory forecasting (ST-SIGMA), an efficient end-to-end method that jointly and accurately perceives the AD environment and forecasts the trajectories of surrounding traffic agents within a unified framework. ST-SIGMA adopts a trident encoder-decoder architecture to learn scene semantics and agent interaction information on bird's-eye-view (BEV) maps simultaneously. Specifically, an iterative aggregation network is first employed as the scene semantic encoder (SSE) to learn diverse scene information. To preserve the dynamic interactions of traffic agents, ST-SIGMA further exploits a spatio-temporal graph network as the graph interaction encoder. Meanwhile, a simple yet efficient feature fusion method is designed to fuse semantic and interaction features into a unified feature space as the input to a novel hierarchical aggregation decoder for downstream prediction tasks. Extensive experiments on the nuScenes dataset demonstrate that ST-SIGMA achieves significant improvements over state-of-the-art (SOTA) methods in both scene perception and trajectory forecasting. The proposed approach thus outperforms SOTA in model generalisation and robustness, and is more feasible for deployment in real-world AD scenarios.
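The "fuse into a unified feature space" step can be illustrated as two linear projections to a shared width followed by a sum. This is only a stand-in under stated assumptions: the projections here are fixed random matrices, whereas ST-SIGMA's fusion is learned, and the feature widths are invented.

```python
import numpy as np

rng = np.random.default_rng(42)

def fuse_to_unified(semantic: np.ndarray, interaction: np.ndarray,
                    dim: int = 16) -> np.ndarray:
    """Project semantic and interaction features to a shared width with
    (fixed, random) linear maps, then sum them -- a minimal stand-in for
    a learned fusion into one feature space."""
    w_sem = rng.standard_normal((semantic.shape[-1], dim))
    w_int = rng.standard_normal((interaction.shape[-1], dim))
    return semantic @ w_sem + interaction @ w_int

sem = rng.standard_normal((10, 32))    # 10 BEV cells, 32-dim semantics
inter = rng.standard_normal((10, 8))   # same cells, 8-dim interactions
unified = fuse_to_unified(sem, inter)
print(unified.shape)  # (10, 16)
```

Summing in a shared space (rather than concatenating) keeps the decoder input width independent of how many encoders feed it, which is convenient for a trident architecture.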
Funding: supported by the National Natural Science Foundation of China under Grant No. 62272242.
Funding: supported by the National Natural Science Foundation of China (No. 60772154) and the President Foundation of the Graduate University of the Chinese Academy of Sciences (No. 085102GN00).
Funding: supported by the National Key R&D Program of China (Grant No. 2023YFC2604400) and the National Natural Science Foundation of China (Grant No. 62103436).
Funding: Basic and Advanced Research Projects of CSTC (Grant No. cstc2019jcyj-zdxmX0008); Science and Technology Research Program of Chongqing Municipal Education Commission (Grant Nos. KJQN202100634, KJZDK201900605); National Natural Science Foundation of China (Grant No. 62006065).