In PSP (Pressure-Sensitive Paint) measurements, image deblurring is essential because factors such as prolonged camera exposure times and high model velocities can lead to significant image blurring. Conventional deblurring methods applied to PSP images often suffer from limited accuracy and require extensive computational resources. To address these issues, this study proposes a deep learning-based approach tailored for PSP image deblurring. Because PSP applications primarily involve accurate pressure measurements of complex geometries, images captured under such conditions exhibit distinctive non-uniform motion blur, which challenges standard deep learning models built on convolutional or attention-based techniques. In this paper, we introduce a novel deblurring architecture featuring multiple DAAM (Deformable Ack Attention Module) blocks. These modules provide enhanced flexibility for end-to-end deblurring, leveraging irregular (deformable) convolution operations for efficient feature extraction while employing attention mechanisms interpreted as multiple 1×1 convolutions that are subsequently reassembled to enhance performance. Furthermore, we incorporate an RSC (Residual Shortcut Convolution) module for initial feature processing, aimed at reducing redundant computation and improving the learning capacity for representative shallow features. To preserve critical spatial information during downsampling and upsampling, we replace conventional strided convolutions with WT (Haar wavelet downsampling) and DySample (upsampling by dynamic sampling). This modification significantly enhances high-precision image reconstruction. By integrating these modules within an encoder-decoder framework, we present DFDNet (Deformable Fusion Deblurring Network) for image blur removal, providing robust technical support for subsequent PSP data analysis. Experimental evaluations on the FY dataset demonstrate the superior performance of our model, which also achieves competitive results on the GOPRO and HIDE datasets.
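The Haar wavelet downsampling mentioned above can be illustrated with a minimal sketch. This is a generic single-level 2D Haar decomposition, not the authors' implementation: each 2×2 block is mapped to one low-frequency (LL) coefficient and three detail coefficients (LH, HL, HH), which are stacked along the channel axis. Halving the resolution this way discards no spatial information, since the transform is invertible, unlike a strided convolution.

```python
import numpy as np

def haar_wavelet_downsample(x):
    """Single-level 2D Haar downsampling (illustrative sketch).

    x: feature map of shape (C, H, W) with even H and W.
    Returns shape (4*C, H//2, W//2): LL, LH, HL, HH subbands
    concatenated along the channel axis.
    """
    a = x[:, 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency average
    lh = (a - b + c - d) / 2.0  # horizontal detail
    hl = (a + b - c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return np.concatenate([ll, lh, hl, hh], axis=0)
```

Because the four subbands jointly retain every input value, the original pixels can be recovered exactly (e.g. the top-left pixel of each block is `(LL + LH + HL + HH) / 2`), which is the property that motivates using a wavelet transform instead of a lossy strided convolution at the downsampling stages.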
Multimodal action recognition, especially the fusion of image and skeleton data, has emerged as the prevailing approach in the field of action recognition. However, existing models often lack the spatial-temporal discriminative ability required for fine-grained recognition tasks. To address this issue, we introduce a flexible attention block called Variable-Channel Spatial-Temporal Attention (VCSTA) to enhance the discriminative capacity of spatial-temporal connections. Based on VCSTA, we propose a novel multimodal variable-channel spatial-temporal semantic action recognition network (MMARN). MMARN utilizes a combination of spatial-temporal embedding loss and global loss to improve the model's understanding of action changes and semantic information in videos, resulting in more precise prediction and analysis. Experimental results show that MMARN achieves 98.3% and 89.9% accuracy in classifying actions on the large-scale human activity datasets NTU-RGB+D 60 and NTU-RGB+D 120, respectively.
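The core idea of weighting spatial-temporal positions can be sketched as follows. This is a hypothetical, simplified illustration, not the published VCSTA block: a configurable subset of channels (the "variable-channel" knob assumed here) scores each frame-joint position, a softmax turns the scores into attention weights over all spatial-temporal positions, and the features are reweighted accordingly.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def spatial_temporal_attention(x, c_sel):
    """Simplified spatial-temporal attention sketch (hypothetical).

    x: skeleton features of shape (T, V, C) -- T frames, V joints,
       C channels.
    c_sel: number of leading channels used to score positions; varying
       this is the assumed 'variable-channel' mechanism.
    Returns features of the same shape, reweighted per position.
    """
    score = x[:, :, :c_sel].mean(axis=-1)                 # (T, V) scores
    attn = softmax(score.reshape(-1)).reshape(score.shape)  # weights sum to 1
    return x * attn[:, :, None]                           # reweight all channels
```

The softmax couples every frame-joint pair into a single distribution, so discriminative positions suppress uninformative ones across both space and time, which is the kind of spatial-temporal discrimination the abstract describes.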
Funding: supported by the National Natural Science Foundation of China (No. 12202476).