Funding: This work was supported in part by the National Natural Science Foundation of China under Grant No. 61977018; by the Deanship of Scientific Research at King Saud University, Riyadh, Saudi Arabia, through Research Group No. RG-1438-070; and in part by the Research Foundation of the Education Bureau of Hunan Province of China under Grant 16B006.
Abstract: Image captioning aims to generate a natural language description of an image. In recent years, neural encoder-decoder models have been the dominant approach: a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network together translate an image into a description. Among these approaches, visual attention mechanisms are widely used to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning. However, most conventional visual attention mechanisms rely only on high-level image features, ignore the contribution of features at other levels, and give insufficient consideration to the relative positions between image features. To address these problems, we propose a Position-Aware Transformer model with image-feature attention and position-aware attention mechanisms. The image-feature attention first extracts multi-level features using a Feature Pyramid Network (FPN) and then fuses them with scaled dot-product attention, which enables the model to detect objects of different scales more effectively without increasing the number of parameters. The position-aware attention mechanism first obtains the relative positions between image features and then incorporates them into the original image features to generate captions more accurately. Experiments on the MSCOCO dataset show that our approach achieves competitive BLEU-4, METEOR, ROUGE-L, and CIDEr scores compared with some state-of-the-art approaches, demonstrating its effectiveness.
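The parameter-free fusion step mentioned in the abstract can be illustrated with plain scaled dot-product attention. The following is only a minimal sketch, not the authors' model: it assumes the features are already extracted, omits any learned projection matrices, and uses hypothetical toy vectors (`high`, `low`) in place of FPN outputs.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return one output per query: softmax(q . k / sqrt(d_k)) weighted sum of values.

    No learned weights are involved, so the operation itself adds no parameters.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        dim = len(values[0])
        outputs.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)])
    return outputs


# Hypothetical toy data: two high-level features attend over three lower-level ones.
high = [[1.0, 0.0], [0.0, 1.0]]
low = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = scaled_dot_product_attention(high, low, low)
```

Each row of `fused` is a convex combination of the lower-level features, weighted by their similarity to the corresponding high-level feature.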
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52188102).
Abstract: Industrial robot dynamics lay the foundation for high-precision and high-speed control, and accurate identification of dynamic parameters is essential for precise dynamic calculations. The choice of friction model is a critical component in the identification of industrial robot dynamics. Traditional static friction models struggle to capture the hysteresis effects caused by robot joint elasticity and clearances, leading to large torque prediction errors when the joint velocity crosses zero. Because of hysteresis, the friction pattern differs depending on whether the joint velocity crosses zero in the forward or the reverse direction. Although the hysteresis effects can be modeled as an ordinary differential equation (ODE), it is difficult to determine an ODE structure that achieves both generalization and accuracy in describing them. To address this issue, we propose neural hysteresis friction (NHF), which uses a neural ODE to model the hysteresis effects in a data-driven manner, thereby mitigating the current inadequacies in the study of dynamic friction characteristics. Experiments on a real 6-axis industrial robot demonstrate that the proposed method accurately models the friction dynamics during directional switching and outperforms other modeling methods. Velocity tracking control experiments show that NHF effectively reduces tracking errors when the velocity crosses zero.
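To make the idea of friction hysteresis as an ODE concrete, the sketch below integrates the classical Dahl friction model with explicit Euler steps. This is a hand-designed ODE used purely for illustration and is not the NHF method, which according to the abstract learns the dynamics with a neural ODE instead. The parameter values (`sigma`, `f_c`, `dt`) and the sinusoidal velocity profile are assumptions chosen for the example.

```python
import math


def simulate_dahl(velocities, dt, sigma=100.0, f_c=1.0):
    """Euler-integrate the Dahl model dF/dt = sigma * v * (1 - (F / f_c) * sign(v)).

    sigma: contact stiffness (assumed value); f_c: Coulomb friction level (assumed value).
    Returns the friction force after each time step.
    """
    force = 0.0
    history = []
    for v in velocities:
        sign = (v > 0) - (v < 0)
        force += dt * sigma * v * (1.0 - (force / f_c) * sign)
        history.append(force)
    return history


# Assumed test profile: sinusoidal joint velocity over two periods.
dt = 1e-3
velocities = [math.sin(2.0 * math.pi * i * dt) for i in range(2000)]
forces = simulate_dahl(velocities, dt)

# Friction just before the velocity crosses zero going forward-to-reverse (t = 0.5 s)
# and reverse-to-forward (t = 1.0 s).
force_forward_to_reverse = forces[499]
force_reverse_to_forward = forces[999]
```

In this sketch the two zero crossings see friction forces of opposite sign even though the velocity is nearly zero in both cases, which a static map from velocity to friction cannot reproduce.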