Funding: Supported by the National Natural Science Foundation of China (No. 52277055).
Abstract: Traditional data-driven fault diagnosis methods depend on expert experience to manually extract effective fault features from signals, which has certain limitations. In contrast, deep learning techniques have become a central focus of fault diagnosis research owing to their strong fault feature extraction ability and end-to-end diagnosis efficiency. Recently, research on combining convolutional neural networks (CNNs) and Transformers, which exploits their respective advantages in local and global feature extraction, has shown promise in fault diagnosis. However, the cross-channel convolution mechanism in CNNs and the self-attention calculations in Transformers make the combined model excessively complex. This complexity results in high computational costs and limited industrial applicability. To tackle these challenges, this paper proposes a lightweight CNN-Transformer named SEFormer for rotating machinery fault diagnosis. First, a separable multiscale depthwise convolution block is designed to extract and integrate multiscale feature information from different channel dimensions of vibration signals. Then, an efficient self-attention block is developed to capture critical fine-grained features of the signal from a global perspective. Finally, experimental results on a planetary gearbox dataset and a motor roller bearing dataset demonstrate that the proposed framework balances robustness, generalization, and lightweight design compared with recent state-of-the-art CNN- and Transformer-based fault diagnosis models. This study presents a feasible strategy for developing a lightweight rotating machinery fault diagnosis framework aimed at economical deployment.
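The idea of a multiscale depthwise convolution over a vibration signal can be illustrated with a small pure-Python sketch. It is not the SEFormer block itself: the function names, the kernel sizes (3, 5, 7), and the fixed averaging kernels (standing in for learned weights) are all assumptions made for illustration.

```python
def depthwise_conv1d(x, kernels):
    """Filter each channel with its own odd-length kernel.

    x: list of channels, each a list of floats.
    kernels: one kernel per channel.
    Uses zero 'same' padding and stride 1, so output length equals input length.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        pad = len(kernel) // 2
        padded = [0.0] * pad + list(channel) + [0.0] * pad
        out.append([
            sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
            for i in range(len(channel))
        ])
    return out


def multiscale_depthwise(x, kernel_sizes=(3, 5, 7)):
    """Apply depthwise filters at several scales and stack the results.

    Averaging kernels are placeholders for weights a real model would learn.
    """
    features = []
    for k in kernel_sizes:
        kernels = [[1.0 / k] * k for _ in x]
        features.extend(depthwise_conv1d(x, kernels))
    return features


if __name__ == "__main__":
    signal = [[0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]]  # one toy channel
    for row in multiscale_depthwise(signal):
        print([round(v, 3) for v in row])
```

Because each channel is filtered independently, the cost grows with the number of channels rather than with the product of input and output channels, which is the property that makes depthwise blocks attractive for lightweight models.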
Funding: This work was supported by the Liaoning Provincial Science Public Welfare Research Fund Project (No. 2016002006) and the Liaoning Provincial Department of Education Scientific Research Service Local Project (No. L201708).
Abstract: Recently, video-based fire detection technology has become an important research topic in the field of machine vision. This paper proposes a fire detection method that combines a classification model and a target detection model from deep learning. First, depthwise separable convolution is used to classify fire images, which saves considerable detection time while preserving detection accuracy. Second, the You Only Look Once version 3 (YOLOv3) target regression function is used to output fire position information only for images classified as fire, which avoids the accuracy problems that arise when YOLOv3 is used for both target classification and position regression. At the same time, the time spent on target regression for images without fire is greatly reduced. The experiments were conducted on a public online database. The detection accuracy reached 98% and the detection rate reached 38 fps. The method removes the need to extract flame characteristics manually, reduces the computational cost and the number of parameters, and improves both detection accuracy and detection rate.
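The two-stage flow described in this abstract can be sketched as a simple cascade. This is a minimal illustration rather than the authors' code: `classify_fire` and `locate_fire` are hypothetical callables standing in for the depthwise separable classifier and the YOLOv3 regression stage, and the 0.5 threshold and the box format are assumed.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), an assumed box format


def detect_fire(
    frame: object,
    classify_fire: Callable[[object], float],
    locate_fire: Callable[[object], List[Box]],
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> Optional[List[Box]]:
    """Run the cheap classifier first; call the detector only on fire frames."""
    if classify_fire(frame) < threshold:
        return None  # no fire: skip the costly localization stage
    return locate_fire(frame)


if __name__ == "__main__":
    # Stub models for illustration only.
    calls = {"detector": 0}

    def fake_classifier(frame):
        return 0.9 if frame == "fire" else 0.1

    def fake_detector(frame):
        calls["detector"] += 1
        return [(10, 20, 30, 40)]

    for f in ["fire", "clear", "clear"]:
        print(f, detect_fire(f, fake_classifier, fake_detector))
    print("detector calls:", calls["detector"])
```

In the stub run the detector is invoked for only one of the three frames, which is where the time saving on fire-free images comes from.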
Funding: Supported by the National Natural Science Foundation of China (Nos. 61525306, 61633021, 61721004, 61806194, U1803261 and 61976132), the Major Project for New Generation of AI (No. 2018AAA0100400), the Beijing Nova Program (No. Z201100006820079), the Shandong Provincial Key Research and Development Program (No. 2019JZZY010119), and CAS-AIR.
Abstract: Pointwise convolution is commonly used to expand or squeeze features in modern lightweight deep models. However, it accounts for most of the overall computational cost (usually more than 90%). This paper proposes a novel Poker module that expands features by taking advantage of cheap depthwise convolution. As a result, the Poker module greatly reduces the computational cost while still generating a large number of effective features to preserve performance. The proposed module is standardized and can be employed wherever feature expansion is needed. By varying the stride and the number of channels, different kinds of bottlenecks are designed to plug the Poker module into a network, so a lightweight model can be assembled easily. Experiments on benchmarks demonstrate the effectiveness of the Poker module: PokerNet models reduce the computational cost by 7.1%-15.6% and achieve comparable or even higher recognition accuracy than previous state-of-the-art (SOTA) models on the ImageNet ILSVRC2012 classification dataset. Code is available at https://github.com/diaomin/pokernet.
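A quick count of multiply-accumulate operations (MACs) shows why pointwise convolution dominates the cost of a depthwise separable layer. The sketch below covers a single layer under assumed settings (stride 1, 'same' padding, no bias); the function names and the example layer sizes are illustrative and not taken from the paper.

```python
def separable_conv_macs(h, w, c_in, c_out, k=3):
    """Return (depthwise_macs, pointwise_macs) for one depthwise separable layer.

    Assumes stride 1 and 'same' padding, so the output is h x w.
    """
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise, pointwise


def pointwise_share(h, w, c_in, c_out, k=3):
    """Fraction of the layer's MACs spent in the pointwise convolution."""
    dw, pw = separable_conv_macs(h, w, c_in, c_out, k)
    return pw / (dw + pw)


if __name__ == "__main__":
    for channels in (64, 128, 256, 512):
        share = pointwise_share(14, 14, channels, channels)
        print(f"{channels:4d} channels: pointwise share = {share:.1%}")
```

Under these assumptions the share simplifies to C_out / (k^2 + C_out), so with 3 x 3 depthwise kernels it passes 90% once the layer has more than 81 output channels.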
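Abstract: To address the fixed feature fusion and high computational complexity of traditional power quality disturbance signal recognition models, this paper proposes an Adaptive Gramian Time Frequency Enhancement Network (AGTFENet). First, a Gramian-matrix-based denoising strategy is introduced to process the one-dimensional input signal, and a three-branch parallel architecture separately processes the raw signal, the Gramian-denoised signal, and the spectrum. Second, multiple feature learning modules are stacked, and depthwise separable convolution is used to extract the features of each branch. Finally, adaptive average pooling and an adaptive weighting mechanism are introduced to dynamically adjust the contribution of each branch's features, achieving weighted feature fusion and classification of the disturbance signals. Simulation experiments show that AGTFENet achieves recognition accuracies of 98.9%, 98.7%, 98.5%, and 97.8% under different noise levels (noise-free, 40 dB, 30 dB, and 20 dB), respectively, outperforming other classification models; owing to its lightweight design, it also performs well in terms of computational complexity.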
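One common way to realize adaptive weighted fusion of several branches is to keep one learnable score per branch and normalize the scores with a softmax. The abstract does not say how AGTFENet computes its weights, so the softmax normalization, the function names, and the toy feature vectors below are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(features, scores):
    """Weighted sum of equally sized branch feature vectors.

    features: one pooled feature vector per branch.
    scores: one score per branch (learned in a real model, fixed here).
    """
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all branch features must have the same length")
    weights = softmax(scores)
    return [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(length)
    ]


if __name__ == "__main__":
    raw = [1.0, 2.0]        # toy features from the raw-signal branch
    denoised = [3.0, 4.0]   # toy features from the Gramian-denoised branch
    spectrum = [5.0, 6.0]   # toy features from the spectrum branch
    print(fuse_branches([raw, denoised, spectrum], [0.0, 0.0, 0.0]))
    print(fuse_branches([raw, denoised, spectrum], [2.0, 0.0, 0.0]))
```

With equal scores the fusion reduces to a plain average; raising one score shifts the fused vector toward that branch, which is the sense in which the contribution of each branch is adjusted.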
Abstract: Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 uses a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on MobileNetV1 is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
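Two pieces of arithmetic sit behind the DDSC claims: a depthwise separable layer needs far fewer weights than a standard convolution, and dilation enlarges the span of a filter without adding weights. The sketch below uses the standard formulas with biases ignored; the function names and the example sizes (64 input channels, 128 output channels) are illustrative assumptions.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def separable_conv_params(c_in, c_out, k):
    """Weights in a depthwise k x k plus pointwise 1 x 1 convolution (bias ignored)."""
    return c_in * k * k + c_in * c_out


def effective_kernel_size(k, dilation):
    """Span covered by a k-tap filter whose taps are spaced `dilation` apart."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    std = standard_conv_params(64, 128, 3)
    sep = separable_conv_params(64, 128, 3)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
    for d in (1, 2, 3):
        print(f"dilation {d}: 3-tap filter spans {effective_kernel_size(3, d)} positions")
```

The weight count of the separable layer does not depend on the dilation rate, so a larger field of view comes at no extra parameter cost.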
Funding: This work is supported by the National Natural Science Foundation of China (61806013, 61876010, 61906005, 62166002), the General Project of the Science and Technology Plan of the Beijing Municipal Education Commission (KM202110005028), the Project of the Interdisciplinary Research Institute of Beijing University of Technology (2021020101), and the International Research Cooperation Seed Fund of Beijing University of Technology (2021A01).
Abstract: Deep kernel mapping support vector machines have achieved good results in numerous tasks by mapping features from a low-dimensional space to a high-dimensional space and then using support vector machines for classification. However, the deep kernel mapping support vector machine does not take into account the connections between spaces of different dimensions, and it increases the number of model parameters. To further improve the recognition capability of deep kernel mapping support vector machines while reducing the number of model parameters, this paper proposes a framework called Lightweight Deep Convolutional Cross-Connected Kernel Mapping Support Vector Machines (LC-CKMSVM). The framework consists of a feature extraction module and a classification module. The feature extraction module first maps the data from a low-dimensional to a high-dimensional space by fusing the representations of different dimensional spaces through cross-connections; it then uses depthwise separable convolution to replace part of the original convolution to reduce the number of parameters in the module. The classification module uses a soft-margin support vector machine for classification. The results on six different visual datasets show that LC-CKMSVM obtains better classification accuracy in most cases than the other five models.
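For reference, the textbook soft-margin SVM objective used by such a classification module is 0.5 * ||w||^2 + C * sum(max(0, 1 - y_i * (w . x_i + b))). The sketch below evaluates that primal objective for a linear classifier on already extracted features; the function name and the toy data are assumptions, and the paper's own training procedure is not described in the abstract.

```python
def soft_margin_objective(w, b, samples, labels, c=1.0):
    """Primal soft-margin SVM objective with hinge loss.

    w: weight vector, b: bias, samples: list of feature vectors,
    labels: +1 or -1 per sample, c: penalty on margin violations.
    """
    margin_term = 0.5 * sum(wi * wi for wi in w)
    hinge = 0.0
    for x, y in zip(samples, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hinge += max(0.0, 1.0 - y * score)  # zero when the margin is satisfied
    return margin_term + c * hinge


if __name__ == "__main__":
    features = [[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]]  # toy extracted features
    labels = [1, 1, -1]
    print(soft_margin_objective([1.0, 0.0], 0.0, features, labels, c=1.0))
```

Only the second sample lies inside the margin here, so it alone contributes a hinge penalty; a larger `c` penalizes such violations more heavily relative to the margin width.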
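Abstract: In-situ leaching of uranium, a green mining technology for uranium ore, generates massive amounts of data during production. Using these data for big data analysis and trend prediction can improve the reliability of the production plans drawn up by technicians. The encoder-decoder time series forecasting models currently in use are computationally complex and memory-intensive because of their attention mechanisms. This study proposes a depthwise separable convolution mixing model: a dynamic sequence segmentation module reduces the semantic damage caused by fixed segmentation, and a depthwise separable convolution mixing module reduces model running time while capturing local and global features. The results show that the mean square error (MSE) and mean absolute error (MAE) of the depthwise separable convolution mixing network are 1.04% and 4.13% lower, respectively, than those of the Patch Time Series Transformer (PatchTST), and that the proposed dynamic sequence segmentation module lowers the MSE and MAE by 7.32% and 5.03%, respectively, compared with the original model. In the performance comparison, the training speed of the depthwise separable convolution mixing model is 59.91% higher than that of the Decomposition Linear (DLinear) model. The established model accurately predicts the trend of the sulfuric acid injection volume during production in the mining area, mitigates the long running time, large memory footprint, and poor data fitting of existing forecasting models on in-situ leaching uranium datasets, and can provide a theoretical and practical reference for production decisions in in-situ leaching uranium mining.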
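The error metrics and percentage reductions quoted in this abstract follow standard definitions, sketched below. The function names and the toy numbers are illustrative only and are not data from the study.

```python
def mse(y_true, y_pred):
    """Mean square error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0


if __name__ == "__main__":
    actual = [1.0, 2.0, 3.0]          # toy target values
    baseline_pred = [1.0, 2.0, 5.0]   # toy baseline forecast
    new_pred = [1.0, 2.0, 4.0]        # toy improved forecast
    base_mse, new_mse = mse(actual, baseline_pred), mse(actual, new_pred)
    print(f"MSE {base_mse:.3f} -> {new_mse:.3f}, "
          f"reduction {relative_reduction(base_mse, new_mse):.1f}%")
    print(f"MAE {mae(actual, baseline_pred):.3f} -> {mae(actual, new_pred):.3f}")
```

Because MSE squares each error, a model can reduce MSE and MAE by different percentages, as in the figures reported above.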