Feature extraction of signals plays an important role in classification problems because of data dimension reduction property and potential improvement of a classification accuracy rate. Principal component analysis (...Feature extraction of signals plays an important role in classification problems because of data dimension reduction property and potential improvement of a classification accuracy rate. Principal component analysis (PCA), wavelets transform or Fourier transform methods are often used for feature extraction. In this paper, we propose a multi-scale PCA, which combines discrete wavelet transform, and PCA for feature extraction of signals in both the spatial and temporal domains. Our study shows that the multi-scale PCA combined with the proposed new classification methods leads to high classification accuracy for the considered signals.展开更多
Aiming at the difficulty of fault identification caused by manual extraction of fault features of rotating machinery,a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed,based on ...Aiming at the difficulty of fault identification caused by manual extraction of fault features of rotating machinery,a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed,based on the standard convolutional auto-encoder.In this model,the parallel convolutional and deconvolutional kernels of different scales are used to extract the features from the input signal and reconstruct the input signal;then the feature map extracted by multi-scale convolutional kernels is used as the input of the classifier;and finally the parameters of the whole model are fine-tuned using labeled data.Experiments on one set of simulation fault data and two sets of rolling bearing fault data are conducted to validate the proposed method.The results show that the model can achieve 99.75%,99.3%and 100%diagnostic accuracy,respectively.In addition,the diagnostic accuracy and reconstruction error of the one-dimensional multi-scale convolutional auto-encoder are compared with traditional machine learning,convolutional neural networks and a traditional convolutional auto-encoder.The final results show that the proposed model has a better recognition effect for rolling bearing fault data.展开更多
Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea.Traditional tea-picking machines may compromise the quality of the tea leaves.High-quality teas are often...Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea.Traditional tea-picking machines may compromise the quality of the tea leaves.High-quality teas are often handpicked and need more delicate operations in intelligent picking machines.Compared with traditional image processing techniques,deep learning models have stronger feature extraction capabilities,and better generalization and are more suitable for practical tea shoot harvesting.However,current research mostly focuses on shoot detection and cannot directly accomplish end-to-end shoot segmentation tasks.We propose a tea shoot instance segmentation model based on multi-scale mixed attention(Mask2FusionNet)using a dataset from the tea garden in Hangzhou.We further analyzed the characteristics of the tea shoot dataset,where the proportion of small to medium-sized targets is 89.9%.Our algorithm is compared with several mainstream object segmentation algorithms,and the results demonstrate that our model achieves an accuracy of 82%in recognizing the tea shoots,showing a better performance compared to other models.Through ablation experiments,we found that ResNet50,PointRend strategy,and the Feature Pyramid Network(FPN)architecture can improve performance by 1.6%,1.4%,and 2.4%,respectively.These experiments demonstrated that our proposed multi-scale and point selection strategy optimizes the feature extraction capability for overlapping small targets.The results indicate that the proposed Mask2FusionNet model can perform the shoot segmentation in unstructured environments,realizing the individual distinction of tea shoots,and complete extraction of the shoot edge contours with a segmentation accuracy of 82.0%.The research results can provide algorithmic support for the segmentation and intelligent harvesting of premium tea shoots at different scales.展开更多
Micro-expression recognition has attracted growing research interests in the field of compute vision.However,micro-expression usually lasts a few seconds,thus it is difficult to detect.This paper presents a new framew...Micro-expression recognition has attracted growing research interests in the field of compute vision.However,micro-expression usually lasts a few seconds,thus it is difficult to detect.This paper presents a new framework to recognize micro-expression using pyramid histogram of Centralized Gabor Binary Pattern from Three Orthogonal Panels(CGBP-TOP)which is an extension of Local Gabor Binary Pattern from Three Orthogonal Panels feature.CGBP-TOP performs spatial and temporal analysis to capture the local facial characteristics of micro-expression image sequences.In order to keep more local information of the face,CGBP-TOP is extracted based on pyramid subregions of the micro-expression video frame.The combination of CGBP-TOP and spatial pyramid can represent well and truly the facial movements of the micro-expression image sequences.However,the dimension of our pyramid CGBP-TOP tends to be very high,which may lead to high data redundancy problem.In addition,it is clear that people of different genders usually have different ways of micro-expression.Therefore,in this paper,in order to select the relevant features of micro-expression,the gender-specific sparse multi-task learning method with adaptive regularization term is adopted to learn a compact subset of pyramid CGBP-TOP feature for micro-expression classification of different sexes.Finally,extensive experiments on widely used CASME II and SMIC databases demonstrate that our method can efficiently extract micro-expression motion features in the micro-expression video clip.Moreover,our proposed approach achieves comparable results with the state-of-the-art methods.展开更多
In order to extract the richer feature information of ship targets from sea clutter, and address the high dimensional data problem, a method termed as multi-scale fusion kernel sparse preserving projection(MSFKSPP) ba...In order to extract the richer feature information of ship targets from sea clutter, and address the high dimensional data problem, a method termed as multi-scale fusion kernel sparse preserving projection(MSFKSPP) based on the maximum margin criterion(MMC) is proposed for recognizing the class of ship targets utilizing the high-resolution range profile(HRRP). Multi-scale fusion is introduced to capture the local and detailed information in small-scale features, and the global and contour information in large-scale features, offering help to extract the edge information from sea clutter and further improving the target recognition accuracy. The proposed method can maximally preserve the multi-scale fusion sparse of data and maximize the class separability in the reduced dimensionality by reproducing kernel Hilbert space. Experimental results on the measured radar data show that the proposed method can effectively extract the features of ship target from sea clutter, further reduce the feature dimensionality, and improve target recognition performance.展开更多
Background Recurrent recovery is a common method for video super-resolution(VSR)that models the correlation between frames via hidden states.However,the application of this structure in real-world scenarios can lead t...Background Recurrent recovery is a common method for video super-resolution(VSR)that models the correlation between frames via hidden states.However,the application of this structure in real-world scenarios can lead to unsatisfactory artifacts.We found that in real-world VSR training,the use of unknown and complex degradation can better simulate the degradation process in the real world.Methods Based on this,we propose the RealFuVSR model,which simulates real-world degradation and mitigates artifacts caused by the VSR.Specifically,we propose a multiscale feature extraction module(MSF)module that extracts and fuses features from multiple scales,thereby facilitating the elimination of hidden state artifacts.To improve the accuracy of the hidden state alignment information,RealFuVSR uses an advanced optical flow-guided deformable convolution.Moreover,a cascaded residual upsampling module was used to eliminate noise caused by the upsampling process.Results The experiment demonstrates that RealFuVSR model can not only recover high-quality videos but also outperforms the state-of-the-art RealBasicVSR and RealESRGAN models.展开更多
In this study,an underwater image enhancement method based on multi-scale adversarial network was proposed to solve the problem of detail blur and color distortion in underwater images.Firstly,the local features of ea...In this study,an underwater image enhancement method based on multi-scale adversarial network was proposed to solve the problem of detail blur and color distortion in underwater images.Firstly,the local features of each layer were enhanced into the global features by the proposed residual dense block,which ensured that the generated images retain more details.Secondly,a multi-scale structure was adopted to extract multi-scale semantic features of the original images.Finally,the features obtained from the dual channels were fused by an adaptive fusion module to further optimize the features.The discriminant network adopted the structure of the Markov discriminator.In addition,by constructing mean square error,structural similarity,and perceived color loss function,the generated image is consistent with the reference image in structure,color,and content.The experimental results showed that the enhanced underwater image deblurring effect of the proposed algorithm was good and the problem of underwater image color bias was effectively improved.In both subjective and objective evaluation indexes,the experimental results of the proposed algorithm are better than those of the comparison algorithm.展开更多
Target detection is always an important application in hyperspectral image processing field. In this paper, a spectral-spatial target detection algorithm for hyperspectral data is proposed.The spatial feature and spec...Target detection is always an important application in hyperspectral image processing field. In this paper, a spectral-spatial target detection algorithm for hyperspectral data is proposed.The spatial feature and spectral feature were unified based on the data filed theory and extracted by weighted manifold embedding. The novelties of the proposed method lie in two aspects. One is the way in which the spatial features and spectral features were fused as a new feature based on the data field theory, and the other is that local information was introduced to describe the decision boundary and explore the discriminative features for target detection. The extracted features based on data field modeling and manifold embedding techniques were considered for a target detection task.Three standard hyperspectral datasets were considered in the analysis. The effectiveness of the proposed target detection algorithm based on data field theory was proved by the higher detection rates with lower False Alarm Rates(FARs) with respect to those achieved by conventional hyperspectral target detectors.展开更多
针对视觉算法在检测航拍图像中密集小目标时容易受到目标重叠、遮挡等情况干扰的现象,提出了一种基于高阶空间特征(目标形状、位置等信息的高级表示)提取的Transformer检测头HSF-TPH(Transformer prediction head with high-order spati...针对视觉算法在检测航拍图像中密集小目标时容易受到目标重叠、遮挡等情况干扰的现象,提出了一种基于高阶空间特征(目标形状、位置等信息的高级表示)提取的Transformer检测头HSF-TPH(Transformer prediction head with high-order spatial feature extraction)。所提检测头中将自注意力机制中的二阶交互扩展到三阶以生成高阶空间特征,提取更有区分度的空间关系,突出每一个小目标在空间上的语义信息。同时,为了缓解骨干网络过度下采样对小目标信息的压缩,设计了一种高分辨率特征图生成机制,增加头部网络的输入特征分辨率,以提升HSFTPH检测密集小目标的效果。设计了新的损失函数USIoU,降低算法位置偏差敏感性。在VisDrone2019数据集上开展实验证明,所提算法在面积最小、密度最高的人类目标的检测任务中实现了mAP50指标10个百分点以上的性能提升。展开更多
文摘Feature extraction of signals plays an important role in classification problems because of data dimension reduction property and potential improvement of a classification accuracy rate. Principal component analysis (PCA), wavelets transform or Fourier transform methods are often used for feature extraction. In this paper, we propose a multi-scale PCA, which combines discrete wavelet transform, and PCA for feature extraction of signals in both the spatial and temporal domains. Our study shows that the multi-scale PCA combined with the proposed new classification methods leads to high classification accuracy for the considered signals.
基金The National Natural Science Foundation of China(No.51675098)
文摘Aiming at the difficulty of fault identification caused by manual extraction of fault features of rotating machinery,a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed,based on the standard convolutional auto-encoder.In this model,the parallel convolutional and deconvolutional kernels of different scales are used to extract the features from the input signal and reconstruct the input signal;then the feature map extracted by multi-scale convolutional kernels is used as the input of the classifier;and finally the parameters of the whole model are fine-tuned using labeled data.Experiments on one set of simulation fault data and two sets of rolling bearing fault data are conducted to validate the proposed method.The results show that the model can achieve 99.75%,99.3%and 100%diagnostic accuracy,respectively.In addition,the diagnostic accuracy and reconstruction error of the one-dimensional multi-scale convolutional auto-encoder are compared with traditional machine learning,convolutional neural networks and a traditional convolutional auto-encoder.The final results show that the proposed model has a better recognition effect for rolling bearing fault data.
基金This research was supported by the National Natural Science Foundation of China No.62276086the National Key R&D Program of China No.2022YFD2000100Zhejiang Provincial Natural Science Foundation of China under Grant No.LTGN23D010002.
文摘Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea.Traditional tea-picking machines may compromise the quality of the tea leaves.High-quality teas are often handpicked and need more delicate operations in intelligent picking machines.Compared with traditional image processing techniques,deep learning models have stronger feature extraction capabilities,and better generalization and are more suitable for practical tea shoot harvesting.However,current research mostly focuses on shoot detection and cannot directly accomplish end-to-end shoot segmentation tasks.We propose a tea shoot instance segmentation model based on multi-scale mixed attention(Mask2FusionNet)using a dataset from the tea garden in Hangzhou.We further analyzed the characteristics of the tea shoot dataset,where the proportion of small to medium-sized targets is 89.9%.Our algorithm is compared with several mainstream object segmentation algorithms,and the results demonstrate that our model achieves an accuracy of 82%in recognizing the tea shoots,showing a better performance compared to other models.Through ablation experiments,we found that ResNet50,PointRend strategy,and the Feature Pyramid Network(FPN)architecture can improve performance by 1.6%,1.4%,and 2.4%,respectively.These experiments demonstrated that our proposed multi-scale and point selection strategy optimizes the feature extraction capability for overlapping small targets.The results indicate that the proposed Mask2FusionNet model can perform the shoot segmentation in unstructured environments,realizing the individual distinction of tea shoots,and complete extraction of the shoot edge contours with a segmentation accuracy of 82.0%.The research results can provide algorithmic support for the segmentation and intelligent harvesting of premium tea shoots at different scales.
基金This work is funded by the natural science foundation of Jiangsu Province(No.BK20150471)the natural science foundation of the higher education institutions of Jiangsu Province(No.17KJB520007)+2 种基金the Key Research and Development Program of Zhenjiang-Social Development(No.SH2018005)the scientific researching fund of Jiangsu University of Science and Technology(No.1132921402,No.1132931803)the basic science and frontier technology research program of Chongqing Municipal Science and Technology Commission(cstc2016jcyjA0407).
文摘Micro-expression recognition has attracted growing research interests in the field of compute vision.However,micro-expression usually lasts a few seconds,thus it is difficult to detect.This paper presents a new framework to recognize micro-expression using pyramid histogram of Centralized Gabor Binary Pattern from Three Orthogonal Panels(CGBP-TOP)which is an extension of Local Gabor Binary Pattern from Three Orthogonal Panels feature.CGBP-TOP performs spatial and temporal analysis to capture the local facial characteristics of micro-expression image sequences.In order to keep more local information of the face,CGBP-TOP is extracted based on pyramid subregions of the micro-expression video frame.The combination of CGBP-TOP and spatial pyramid can represent well and truly the facial movements of the micro-expression image sequences.However,the dimension of our pyramid CGBP-TOP tends to be very high,which may lead to high data redundancy problem.In addition,it is clear that people of different genders usually have different ways of micro-expression.Therefore,in this paper,in order to select the relevant features of micro-expression,the gender-specific sparse multi-task learning method with adaptive regularization term is adopted to learn a compact subset of pyramid CGBP-TOP feature for micro-expression classification of different sexes.Finally,extensive experiments on widely used CASME II and SMIC databases demonstrate that our method can efficiently extract micro-expression motion features in the micro-expression video clip.Moreover,our proposed approach achieves comparable results with the state-of-the-art methods.
基金supported by the National Natural Science Foundation of China (62271255,61871218)the Fundamental Research Funds for the Central University (3082019NC2019002)+1 种基金the Aeronautical Science Foundation (ASFC-201920007002)the Program of Remote Sensing Intelligent Monitoring and Emergency Services for Regional Security Elements。
文摘In order to extract the richer feature information of ship targets from sea clutter, and address the high dimensional data problem, a method termed as multi-scale fusion kernel sparse preserving projection(MSFKSPP) based on the maximum margin criterion(MMC) is proposed for recognizing the class of ship targets utilizing the high-resolution range profile(HRRP). Multi-scale fusion is introduced to capture the local and detailed information in small-scale features, and the global and contour information in large-scale features, offering help to extract the edge information from sea clutter and further improving the target recognition accuracy. The proposed method can maximally preserve the multi-scale fusion sparse of data and maximize the class separability in the reduced dimensionality by reproducing kernel Hilbert space. Experimental results on the measured radar data show that the proposed method can effectively extract the features of ship target from sea clutter, further reduce the feature dimensionality, and improve target recognition performance.
基金Supported by Open Project of the Ministry of Industry and Information Technology Key Laboratory of Performance and Reliability Testing and Evaluation for Basic Software and Hardware。
文摘Background Recurrent recovery is a common method for video super-resolution(VSR)that models the correlation between frames via hidden states.However,the application of this structure in real-world scenarios can lead to unsatisfactory artifacts.We found that in real-world VSR training,the use of unknown and complex degradation can better simulate the degradation process in the real world.Methods Based on this,we propose the RealFuVSR model,which simulates real-world degradation and mitigates artifacts caused by the VSR.Specifically,we propose a multiscale feature extraction module(MSF)module that extracts and fuses features from multiple scales,thereby facilitating the elimination of hidden state artifacts.To improve the accuracy of the hidden state alignment information,RealFuVSR uses an advanced optical flow-guided deformable convolution.Moreover,a cascaded residual upsampling module was used to eliminate noise caused by the upsampling process.Results The experiment demonstrates that RealFuVSR model can not only recover high-quality videos but also outperforms the state-of-the-art RealBasicVSR and RealESRGAN models.
文摘In this study,an underwater image enhancement method based on multi-scale adversarial network was proposed to solve the problem of detail blur and color distortion in underwater images.Firstly,the local features of each layer were enhanced into the global features by the proposed residual dense block,which ensured that the generated images retain more details.Secondly,a multi-scale structure was adopted to extract multi-scale semantic features of the original images.Finally,the features obtained from the dual channels were fused by an adaptive fusion module to further optimize the features.The discriminant network adopted the structure of the Markov discriminator.In addition,by constructing mean square error,structural similarity,and perceived color loss function,the generated image is consistent with the reference image in structure,color,and content.The experimental results showed that the enhanced underwater image deblurring effect of the proposed algorithm was good and the problem of underwater image color bias was effectively improved.In both subjective and objective evaluation indexes,the experimental results of the proposed algorithm are better than those of the comparison algorithm.
文摘Target detection is always an important application in hyperspectral image processing field. In this paper, a spectral-spatial target detection algorithm for hyperspectral data is proposed.The spatial feature and spectral feature were unified based on the data filed theory and extracted by weighted manifold embedding. The novelties of the proposed method lie in two aspects. One is the way in which the spatial features and spectral features were fused as a new feature based on the data field theory, and the other is that local information was introduced to describe the decision boundary and explore the discriminative features for target detection. The extracted features based on data field modeling and manifold embedding techniques were considered for a target detection task.Three standard hyperspectral datasets were considered in the analysis. The effectiveness of the proposed target detection algorithm based on data field theory was proved by the higher detection rates with lower False Alarm Rates(FARs) with respect to those achieved by conventional hyperspectral target detectors.
文摘针对视觉算法在检测航拍图像中密集小目标时容易受到目标重叠、遮挡等情况干扰的现象,提出了一种基于高阶空间特征(目标形状、位置等信息的高级表示)提取的Transformer检测头HSF-TPH(Transformer prediction head with high-order spatial feature extraction)。所提检测头中将自注意力机制中的二阶交互扩展到三阶以生成高阶空间特征,提取更有区分度的空间关系,突出每一个小目标在空间上的语义信息。同时,为了缓解骨干网络过度下采样对小目标信息的压缩,设计了一种高分辨率特征图生成机制,增加头部网络的输入特征分辨率,以提升HSFTPH检测密集小目标的效果。设计了新的损失函数USIoU,降低算法位置偏差敏感性。在VisDrone2019数据集上开展实验证明,所提算法在面积最小、密度最高的人类目标的检测任务中实现了mAP50指标10个百分点以上的性能提升。