Funding: Supported by the National Natural Science Foundation of China under Grant No. 62272242.
Abstract: Gestures are one of the most natural and intuitive approaches to human-computer interaction. Compared with traditional camera-based or wearable-sensor-based solutions, gesture recognition using millimeter-wave radar has attracted growing attention for being contact-free, privacy-preserving, and less environment-dependent. Although there have been many recent studies on hand gesture recognition, existing methods still fall short in recognition accuracy and generalization ability in short-range applications. In this paper, we present a hand gesture recognition method named multi-scale feature fusion (MSFF) to accurately identify micro hand gestures. MSFF takes into account not only the overall motion of the palm but also the subtle movements of the fingers. Specifically, we fuse multi-angle Doppler-time features of the hand gesture with range-angle maps of the gesture trajectory to comprehensively extract gesture features, and we fuse them in the higher layers of the deep neural network so that it pays more attention to subtle finger movements. We evaluate the proposed method on data collected from 10 users, and our solution achieves an average recognition accuracy of 99.7%. Extensive experiments on a public mmWave gesture dataset further demonstrate the effectiveness of the proposed system.
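As a rough sketch of the multi-feature fusion idea described above, the following simplified code stacks several multi-angle Doppler-time maps and a range-angle trajectory map as channels of a single input tensor. This is an illustrative assumption about the input preparation only; the paper's actual MSFF network architecture and fusion layers are not reproduced here.

```python
import numpy as np

def fuse_gesture_features(doppler_time_maps, range_angle_map):
    """Stack multi-angle Doppler-time maps with a range-angle trajectory
    map as channels of one tensor (hypothetical sketch, not the paper's
    learned fusion network)."""
    # doppler_time_maps: list of (H, W) arrays, one per receive angle
    # range_angle_map:   single (H, W) array of the gesture trajectory
    channels = list(doppler_time_maps) + [range_angle_map]
    # Normalize each channel independently so no modality dominates
    norm = [(c - c.mean()) / (c.std() + 1e-8) for c in channels]
    return np.stack(norm, axis=0)  # shape: (num_angles + 1, H, W)

# Example: three receive angles plus one trajectory map, 32x64 each
fused = fuse_gesture_features([np.random.rand(32, 64) for _ in range(3)],
                              np.random.rand(32, 64))
```

The stacked tensor would then be consumed by a convolutional network, which is where the paper's attention to subtle finger movements is learned.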
Funding: Supported by the National Natural Science Foundation of China (No. 60772154) and the President Foundation of the Graduate University of the Chinese Academy of Sciences (No. 085102GN00).
Abstract: In multi-target tracking, Multiple Hypothesis Tracking (MHT) can effectively solve the data association problem. However, traditional MHT cannot make full use of motion information. In this work, we combine MHT with an Interacting Multiple Model (IMM) estimator and feature fusion. The new algorithm greatly improves tracking performance because the IMM estimator provides better state estimates and the feature information enhances the accuracy of data association. The new algorithm is tested by tracking tropical fish in a fish container. Experimental results show that, compared with traditional MHT, it significantly reduces the track-loss rate and suppresses noise with higher computational efficiency.
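A minimal sketch of one IMM output-combination step, the kind of improved estimate the abstract attributes to the IMM estimator. This is illustrative only, assuming Gaussian per-model estimates; the paper's full MHT+IMM pipeline and feature fusion are not shown.

```python
import numpy as np

def imm_combine(estimates, covariances, likelihoods, priors):
    """One IMM combination step: update model probabilities from
    measurement likelihoods, then form the probability-weighted state
    estimate and its covariance (illustrative sketch)."""
    mu = priors * likelihoods
    mu = mu / mu.sum()                                  # posterior model probabilities
    x = sum(m * xi for m, xi in zip(mu, estimates))     # mixed state estimate
    # Mixed covariance includes the spread-of-means term
    P = sum(m * (Pi + np.outer(xi - x, xi - x))
            for m, xi, Pi in zip(mu, estimates, covariances))
    return x, P, mu
```

With two motion models (e.g. constant velocity vs. maneuvering), the model whose prediction better matches the measurement receives a higher posterior weight, which is what makes the fused estimate track maneuvering fish more closely.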
Funding: Supported by the National Key R&D Program of China (Grant No. 2023YFC2604400) and the National Natural Science Foundation of China (Grant No. 62103436).
Abstract: Identifying drug-drug interactions (DDIs) is essential to prevent adverse effects from polypharmacy. Although deep learning has advanced DDI identification, the gap between powerful models and their lack of clinical application and evaluation has hindered clinical benefits. Here, we developed a Multi-Dimensional Feature Fusion model named MDFF, which integrates one-dimensional simplified molecular-input line-entry system (SMILES) sequence features, two-dimensional molecular graph features, and three-dimensional geometric features to enhance drug representations for predicting DDIs. MDFF was trained and validated on two DDI datasets, evaluated across three distinct scenarios, and compared with advanced DDI prediction models using accuracy, precision, recall, area under the curve, and F1 score as metrics. MDFF achieved state-of-the-art performance across all metrics. Ablation experiments showed that integrating multi-dimensional drug features yielded the best results. More importantly, we obtained adverse drug reaction reports uploaded by Xiangya Hospital of Central South University from 2021 to 2023 and used MDFF to identify potential adverse DDIs. Among 12 real-world adverse drug reaction reports, the predictions for 9 were supported by relevant evidence. Additionally, MDFF demonstrated the ability to explain adverse DDI mechanisms, providing insights into the mechanism behind one specific report and highlighting its potential to help practitioners improve medical practice.
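The multi-dimensional fusion idea above can be sketched as concatenating a drug's 1-D sequence, 2-D graph, and 3-D geometry embeddings and then scoring a drug pair. Both functions here are hypothetical toys: MDFF's encoders and its prediction head are learned networks, and the pairwise score below is only a placeholder for illustration.

```python
import numpy as np

def mdff_drug_representation(seq_feat, graph_feat, geom_feat):
    """Concatenate 1-D sequence, 2-D graph, and 3-D geometry embeddings
    into one drug representation (minimal sketch, not the real encoders)."""
    return np.concatenate([seq_feat, graph_feat, geom_feat])

def ddi_score(drug_a, drug_b, w):
    """Toy DDI probability for a drug pair: sigmoid of the element-wise
    product of the two representations dotted with a (hypothetical)
    learned weight vector w."""
    return float(1.0 / (1.0 + np.exp(-(drug_a * drug_b) @ w)))
```

An ablation in this framing corresponds to dropping one of the three feature vectors before concatenation, which is the comparison the abstract reports.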
Abstract: With the growing application of intelligent robots in the service, manufacturing, and medical fields, efficient and natural interaction between humans and robots has become key to improving collaboration efficiency and user experience. Gesture recognition, as an intuitive and contactless interaction method, can overcome the limitations of traditional interfaces and enable real-time control of, and feedback on, robot movements and behaviors. This study first reviews mainstream gesture recognition algorithms and their application on different sensing platforms (RGB cameras, depth cameras, and inertial measurement units). It then proposes a gesture recognition method based on multimodal feature fusion and a lightweight deep neural network that balances recognition accuracy with computational efficiency. At the system level, a modular human-robot interaction architecture is constructed, comprising perception, decision, and execution layers, and gesture commands are transmitted and mapped to robot actions in real time via the ROS communication protocol. Through multiple comparative experiments on public gesture datasets and a self-collected dataset, the proposed method's superiority is validated in terms of accuracy, response latency, and system robustness, while user-experience tests assess the interface's usability. The results provide a reliable technical foundation for robot collaboration and service in complex scenarios, offering broad prospects for practical application and deployment.
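The perception-decision-execution layering described above can be sketched as a lookup from recognized gesture labels to robot commands, gated by recognition confidence. The gesture names, command names, and threshold below are illustrative assumptions, and a real system would publish the command over a ROS topic rather than return it.

```python
# Hypothetical gesture vocabulary and command mapping (decision layer).
GESTURE_TO_COMMAND = {
    "swipe_left":  "turn_left",
    "swipe_right": "turn_right",
    "palm_open":   "stop",
    "fist":        "move_forward",
}

def decide(gesture_label, confidence, threshold=0.8):
    """Forward a command to the execution layer only for confident,
    known gestures; otherwise return None so the robot holds state."""
    if confidence < threshold or gesture_label not in GESTURE_TO_COMMAND:
        return None  # ignore uncertain or unknown gestures
    return GESTURE_TO_COMMAND[gesture_label]
```

Gating on confidence is one simple way to trade response latency against robustness, the two axes the study evaluates.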
Funding: Basic and Advanced Research Projects of CSTC (Grant No. cstc2019jcyj-zdxmX0008); Science and Technology Research Program of Chongqing Municipal Education Commission (Grant Nos. KJQN202100634, KJZDK201900605); National Natural Science Foundation of China (Grant No. 62006065).
Abstract: Scene perception and trajectory forecasting are two fundamental challenges crucial to a safe and reliable autonomous driving (AD) system. However, most proposed methods address only one of these two challenges with a single model. To tackle this dilemma, this paper proposes spatio-temporal semantics and interaction graph aggregation for multi-agent perception and trajectory forecasting (ST-SIGMA), an efficient end-to-end method to jointly and accurately perceive the AD environment and forecast the trajectories of the surrounding traffic agents within a unified framework. ST-SIGMA adopts a trident encoder-decoder architecture to learn scene semantics and agent interaction information on bird's-eye-view (BEV) maps simultaneously. Specifically, an iterative aggregation network is first employed as the scene semantic encoder (SSE) to learn diverse scene information. To preserve the dynamic interactions of traffic agents, ST-SIGMA further exploits a spatio-temporal graph network as the graph interaction encoder. Meanwhile, a simple yet efficient feature fusion method is designed to fuse semantic and interaction features into a unified feature space, which serves as the input to a novel hierarchical aggregation decoder for downstream prediction tasks. Extensive experiments on the nuScenes dataset demonstrate that ST-SIGMA achieves significant improvements over state-of-the-art (SOTA) methods in both scene perception and trajectory forecasting. The proposed approach thus outperforms SOTA methods in model generalisation and robustness and is therefore more feasible for deployment in real-world AD scenarios.
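The "simple yet efficient feature fusion" step can be sketched as concatenating the semantic and interaction BEV feature maps along channels and projecting them back with a 1x1 linear map. This is a generic concatenate-and-project sketch under that assumption; ST-SIGMA's actual fusion module is learned, and the projection weights here are random placeholders.

```python
import numpy as np

def fuse_bev_features(semantic_feat, interaction_feat):
    """Fuse (C, H, W) scene-semantic and agent-interaction BEV features
    into one (C, H, W) feature space via channel concatenation and a
    1x1 projection (sketch with random, untrained weights)."""
    stacked = np.concatenate([semantic_feat, interaction_feat], axis=0)
    c_in, h, w = stacked.shape
    c_out = semantic_feat.shape[0]
    rng = np.random.default_rng(0)
    proj = rng.standard_normal((c_out, c_in)) / np.sqrt(c_in)  # 1x1 conv
    # Apply the per-pixel linear projection across channels
    return np.einsum('oc,chw->ohw', proj, stacked)
```

Keeping the fused map in the same (C, H, W) layout lets a single decoder consume it for both perception and forecasting heads, matching the unified-framework design above.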
Abstract: Underground coal mines are filled with dust and mist, and most areas are long, narrow roadways; relying solely on cap lamps for illumination causes video surveillance images to suffer from blurred detail, local overexposure, and large variation in target size. These factors make underground safety-helmet detection difficult, and existing object detection algorithms applied directly to underground coal mine scenes usually lack sufficient accuracy. To address these problems, this study proposes a safety-helmet detection algorithm for underground coal mines based on YOLOv8n (You Only Look Once version 8n). First, a space-to-depth mechanism is used to rebuild the Conv modules in the YOLOv8n backbone into space-to-depth convolution (SPDConv) modules, so that shallow detail information is fully extracted from the feature maps, improving detection accuracy for small helmet targets in detail-blurred images. Second, an attention-based intra-scale feature interaction module is introduced to reduce the interference of local overexposure on helmet feature extraction and strengthen the model's focus on target regions. Finally, drawing on a high-level screening feature-fusion pyramid, the YOLOv8n neck network is redesigned to improve detection of helmets at different scales and further raise accuracy. Experimental results show that the algorithm achieves a mean average precision (mAP) of 91.7% on the CUMT-HelmeT dataset, 3.2 percentage points higher than YOLOv8n, while reducing the number of model parameters by 1.9×10^(5). Compared with current mainstream detectors such as the Single Shot MultiBox Detector (SSD), the Faster Region-based Convolutional Neural Network (Faster R-CNN), YOLOv5s, YOLOv6n, YOLOv7, and YOLOv7-tiny, the algorithm attains the highest mAP with a lower parameter count and fewer floating-point operations, achieving high detection accuracy while remaining relatively lightweight.
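The space-to-depth rearrangement underlying SPDConv can be illustrated as follows: each 2x2 spatial neighbourhood is folded into the channel dimension, so resolution is reduced without the information loss of strided convolution. This sketch shows only the rearrangement; the non-strided convolution that SPDConv applies afterwards is omitted.

```python
import numpy as np

def space_to_depth(x, block=2):
    """Fold each (block x block) spatial neighbourhood of a (C, H, W)
    feature map into channels, giving (C*block*block, H/block, W/block).
    Fine shallow detail is preserved rather than discarded by striding."""
    c, h, w = x.shape
    assert h % block == 0 and w % block == 0
    x = x.reshape(c, h // block, block, w // block, block)
    x = x.transpose(0, 2, 4, 1, 3)          # bring block offsets to channel axis
    return x.reshape(c * block * block, h // block, w // block)
```

Every pixel of the input survives in the output, just redistributed across channels, which is why this substitution helps small-target detection in blurred images.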