Abstract: Grasping is an important general-purpose ability for robots working in aerospace and other fields, and accurate grasp detection is the prerequisite and key step for completing a grasp. To reduce the parameter count and computational complexity of the grasp detection model while improving detection accuracy and real-time performance, this paper proposes a grasp detection algorithm based on key point estimation. First, the model focuses on finding the center point of the grasp rectangle and then obtains the best grasp from that point's position on the feature heat map. Second, for RGB-D multi-modal input, an improved residual block combined with a squeeze-and-excitation block is used as the feature extraction layer to explicitly learn multi-channel weight information. Unlike anchor-based detection algorithms, which enumerate the possible positions of the target and must score grasp candidates after classification and regression, the proposed model obtains the best grasp by directly predicting the center point position, the angle, and the opening width of the gripper. The model has only about 482 k parameters, less than one-third of the number in a typical general-purpose model. Results on the Cornell Grasp Dataset show that the model achieves an accuracy of 97.75% and runs at 24.7 frames per second.
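The key point idea can be illustrated with a minimal sketch of the decoding step: take the peak of a center-point heat map and read the angle and opening width at that location. The abstract does not describe the layout of the output head, so the three same-sized maps, the function name decode_best_grasp, and the toy values below are assumptions made for illustration only, not the paper's implementation.

```python
import numpy as np


def decode_best_grasp(center_heat, angle_map, width_map):
    """Pick the highest-scoring center point and read angle/width there.

    Assumption: all three maps share the same (H, W) shape. This layout
    is hypothetical; the abstract does not specify the output head.
    """
    # Index of the global maximum of the center-point heat map.
    row, col = np.unravel_index(np.argmax(center_heat), center_heat.shape)
    return {
        "center": (int(col), int(row)),        # (x, y) in heat-map pixels
        "angle": float(angle_map[row, col]),   # gripper rotation at the peak
        "width": float(width_map[row, col]),   # gripper opening at the peak
        "score": float(center_heat[row, col]),
    }


# Toy 4x5 maps with a single clear peak at row 2, column 3.
center_heat = np.zeros((4, 5))
center_heat[2, 3] = 0.9
angle_map = np.full((4, 5), 0.25)
width_map = np.full((4, 5), 30.0)

best = decode_best_grasp(center_heat, angle_map, width_map)
print(best)
```

Because the best grasp is read off at a single peak, no candidate enumeration or separate scoring pass is needed, which is the contrast with anchor-based detectors drawn in the abstract.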