Abstract: Invoice document digitization is crucial for efficient management in industry. Scanned invoice images are often noisy for various reasons, which degrades optical character recognition (OCR) accuracy. In this paper, letter data obtained from invoice images are denoised using a modified autoencoder-based deep learning method. A stacked denoising autoencoder (SDAE) is implemented with two hidden layers each in the encoder network and the decoder network. To capture the most salient features of the training samples, an undercomplete autoencoder is designed with nonlinear encoder and decoder functions. The autoencoder is regularized for denoising using a combined loss function that considers both mean squared error and binary cross-entropy. A dataset of 59,119 letter images, containing English letters (upper and lower case) and digits (0 to 9), is prepared from many scanned invoice images and Windows TrueType (.ttf) files and is used to train the neural network. Performance is analyzed in terms of signal-to-noise ratio (SNR), peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and universal image quality index (UQI), and is compared with other filtering techniques such as the nonlocal means filter, anisotropic diffusion filter, Gaussian filter, and mean filter. The denoising performance of the proposed SDAE is also compared, in terms of SNR and PSNR, with an existing SDAE that uses a single loss function. The results show the superior performance of the proposed SDAE method.
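The abstract above names a loss that combines mean squared error and binary cross-entropy but does not say how the two terms are weighted. The following is a minimal sketch, assuming a simple convex combination with a hypothetical weight `alpha` and pixel values scaled to [0, 1]. The function names and the `eps` clipping constant are illustrative assumptions, not the authors' implementation. A PSNR helper is included because PSNR is one of the reported metrics.

```python
import math


def mse(target, output):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((t - o) ** 2 for t, o in zip(target, output)) / len(target)


def bce(target, output, eps=1e-7):
    """Mean binary cross-entropy; outputs are clipped to avoid log(0)."""
    total = 0.0
    for t, o in zip(target, output):
        o = min(max(o, eps), 1.0 - eps)
        total -= t * math.log(o) + (1.0 - t) * math.log(1.0 - o)
    return total / len(target)


def combined_loss(target, output, alpha=0.5):
    """Assumed weighting: alpha * MSE + (1 - alpha) * BCE (alpha is hypothetical)."""
    return alpha * mse(target, output) + (1.0 - alpha) * bce(target, output)


def psnr(target, output, peak=1.0):
    """Peak signal-to-noise ratio in dB for signals with maximum value `peak`."""
    err = mse(target, output)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    clean = [0.0, 1.0, 1.0, 0.0]      # flattened clean letter patch (toy data)
    denoised = [0.1, 0.9, 0.8, 0.2]   # hypothetical autoencoder output
    print(combined_loss(clean, denoised), psnr(clean, denoised))
```

In a real training setup the same combination would be expressed with the loss functions of the chosen deep learning framework. The pure-Python form here only makes the arithmetic explicit.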
Funding: This work is supported by the National Natural Science Foundation of China (Grant No. 61672282) and the Basic Research Program of Jiangsu Province (Grant No. BK20161491).
Abstract: Wireless sensor networks are increasingly used in sensitive event monitoring. However, the various kinds of abnormal data generated by sensors greatly decrease the accuracy of event detection. Although many methods have been proposed to deal with abnormal data, they generally detect and/or repair all abnormal data without further differentiation. Besides the abnormal data caused by events, it is well known that sensor nodes are prone to generating abnormal data due to factors such as sensor hardware defects and random effects of external sources. Handling all abnormal data without differentiation results in false detections or missed detections of events. In this paper, we propose a data cleaning approach based on stacked denoising autoencoders (SDAE) and multi-sensor collaboration. We detect all abnormal data with the SDAE and then differentiate the abnormal data through multi-sensor collaboration. Abnormal data caused by events are left unchanged, while abnormal data caused by other factors are repaired. Simulations based on real data show the efficiency of the proposed approach.
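The abstract above describes a two-stage flow (flag abnormal readings, then decide per reading whether it reflects an event or a sensor fault) without giving the collaboration rule. The sketch below is one possible reading, assuming a neighbor-voting rule: a flagged reading is treated as event-caused when at least a hypothetical fraction `vote_ratio` of its neighbors are flagged too, and otherwise it is repaired with the median of its unflagged neighbors. The set `flagged` stands in for the SDAE detector's output. The voting rule, the median repair, and all names are assumptions for illustration, not the paper's method.

```python
from statistics import median


def clean_readings(readings, flagged, neighbors, vote_ratio=0.5):
    """Separate event-caused anomalies from faults at one time step.

    readings:  dict sensor_id -> measured value
    flagged:   set of sensor ids marked abnormal by some detector (e.g. an SDAE)
    neighbors: dict sensor_id -> list of nearby sensor ids
    Returns (cleaned readings, labels for the flagged sensors).
    """
    cleaned = dict(readings)
    labels = {}
    for sensor in flagged:
        nearby = neighbors.get(sensor, [])
        agreeing = [n for n in nearby if n in flagged]
        if nearby and len(agreeing) / len(nearby) >= vote_ratio:
            # Enough neighbors see the same anomaly: keep the value as an event.
            labels[sensor] = "event"
        else:
            # Isolated anomaly: treat as a fault and repair from normal neighbors.
            labels[sensor] = "fault"
            normal = [readings[n] for n in nearby if n not in flagged]
            if normal:
                cleaned[sensor] = median(normal)
    return cleaned, labels


if __name__ == "__main__":
    readings = {"a": 20.0, "b": 21.0, "c": 95.0, "d": 20.5}
    neighbors = {"c": ["a", "b", "d"]}
    print(clean_readings(readings, {"c"}, neighbors))
```

Readings are compared against the original values rather than the partially cleaned ones, so the result does not depend on the order in which flagged sensors are visited.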
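Abstract: To address the low computational efficiency and low estimation accuracy of existing battery state estimation methods for electric vehicles, a model is proposed for estimating the state of charge (SOC) and state of health (SOH) of electric vehicle batteries. A stacked denoising autoencoder (SDAE) is used to clean abnormal and missing values in the voltage, current, and temperature data, reducing their impact on estimation accuracy. Dynamic channel pruning (DCP) is introduced to sparsify the Informer model, improving the performance and stability of the pruned model. The cleaned data are then fed into the DCPInformer model to estimate SOC and SOH accurately. Experimental results show that the proposed SDAE-DCPInformer model achieves a mean absolute error and a root mean square error of 0.25% and 0.38%, respectively, for SOC estimation, and of 0.51% and 0.64%, respectively, for SOH estimation. Compared with models such as the conventional Transformer, the proposed model predicts SOC and SOH faster, with improved estimation accuracy and better stability and generalization.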
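Abstract: To improve the accuracy of intrusion detection, and in view of the strengths of autoencoders in feature learning and the mature use of residual networks in building deep models, an improved residual network intrusion detection model based on feature dimensionality reduction (IRFD) is proposed to alleviate the low accuracy of traditional machine learning intrusion detection models. IRFD uses a stacked denoising sparse autoencoder strategy to reduce the dimensionality of the data and thereby extract effective features. A convolutional attention mechanism is used to improve the residual network, yielding a classification network that can extract key features, and two typical intrusion detection datasets are used to verify the detection performance of IRFD. Experimental results show that IRFD achieves an accuracy above 99% on both the UNSW-NB15 and CICIDS 2017 datasets, with F1-scores of 99.5% and 99.7%, respectively. Compared with the baseline models, the proposed IRFD shows considerable improvements in accuracy, precision, and F1-score.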
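Abstract: Non-intrusive load monitoring (NILM) estimates the power waveform of individual loads by analyzing aggregate data from the power bus, and is one of the key technologies for energy consumption management in power systems. As users' demand for appliance-level energy management grows, the accuracy of NILM has become a research focus, but it is easily affected by power type, power level, and load variation. The accuracy of a single NILM model varies widely across load types, so a single method can hardly achieve ideal results on all of them. Therefore, a high-precision non-intrusive load identification method based on stacked ensemble learning, AMEL (Aggregation Method based on Ensemble Learning), is proposed. First, several methods that perform best on various types of loads are selected to build a NILM model library. Second, a NILM model preference framework based on a multilayer perceptron (MLP) is established to achieve high-precision monitoring of different loads. Experimental results on the UK-DALE dataset show that, compared with typical NILM methods, the proposed method reduces the mean absolute error (MAE) by 35.6% on average, and improves F1, recall, and the Matthews correlation coefficient (MCC) by 33.5%, 30.6%, and 32.1% on average, respectively. In addition, comparisons with existing stacked ensemble methods and of the identified waveforms for various appliances verify the effectiveness of the proposed method.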
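The four metrics reported in the abstract above (MAE, F1, recall, MCC) have standard definitions. The sketch below computes them for one appliance, assuming that MAE is taken over the estimated power waveform and that the classification metrics are taken over binary on/off states. How the paper derives on/off states from power values is not stated in the abstract, so the inputs here are toy data and the function names are illustrative.

```python
import math


def mean_absolute_error(true_power, est_power):
    """MAE between the true and estimated power waveforms (same units as input)."""
    return sum(abs(t - e) for t, e in zip(true_power, est_power)) / len(true_power)


def confusion_counts(true_on, pred_on):
    """Return (tp, fp, tn, fn) for binary on/off sequences of 0s and 1s."""
    tp = sum(1 for t, p in zip(true_on, pred_on) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(true_on, pred_on) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(true_on, pred_on) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(true_on, pred_on) if t == 1 and p == 0)
    return tp, fp, tn, fn


def recall(tp, fp, tn, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, tn, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0 when a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    counts = confusion_counts([1, 1, 0, 0], [1, 0, 0, 0])
    print(recall(*counts), f1_score(*counts), mcc(*counts))
    print(mean_absolute_error([100.0, 0.0, 50.0], [90.0, 5.0, 50.0]))
```

MCC uses all four cells of the confusion matrix, which makes it more informative than F1 for appliances that are off most of the time.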
Abstract: The increasingly complex and interconnected train control information network is vulnerable to a variety of malicious traffic attacks. Existing malicious traffic detection methods rely mainly on machine learning and suffer from problems such as poor robustness, weak generalization, and an inability to learn common features. This paper therefore proposes a malicious traffic identification method based on stacked sparse denoising autoencoders combined with a regularized extreme learning machine optimized by particle swarm optimization. First, a simulation environment of the Chinese Train Control System level 3 (CTCS-3) was constructed for data acquisition. The Pearson coefficient and other methods are then used for preprocessing, a stacked sparse denoising autoencoder is used for nonlinear dimensionality reduction of the features, and finally a regularized extreme learning machine optimized by particle swarm optimization performs the classification. Experimental data show that the proposed method has good training performance, with an average accuracy of 97.57% and a false negative rate of 2.43%, which is better than the alternative methods. In addition, ablation experiments were performed to evaluate the contribution of each component, and the results showed that the combination of methods was superior to the individual methods. To further evaluate the generalization ability of the model in different scenarios, publicly available datasets of industrial control system networks were used. The results show that the model has robust detection capability across various types of network attacks.
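The abstract above mentions the Pearson coefficient as a preprocessing tool but does not say how it is applied. One common use, assumed here, is to drop features that are almost perfectly correlated with a feature already kept. The sketch below shows that use with a hypothetical threshold of 0.95 and made-up feature names. It illustrates the coefficient itself and is not the paper's preprocessing pipeline.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column has no defined correlation; treat as 0
    return cov / (spread_x * spread_y)


def drop_redundant(features, threshold=0.95):
    """Keep a feature only if |r| with every already-kept feature is below threshold.

    features: dict feature_name -> list of values (insertion order decides priority).
    Returns the list of kept feature names.
    """
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Hypothetical traffic features; the names are illustrative only.
    features = {
        "pkt_len": [1, 2, 3, 4],
        "byte_count": [2, 4, 6, 8],   # exact multiple of pkt_len, so redundant
        "duration": [4, 1, 3, 2],
    }
    print(drop_redundant(features))
```

The surviving features would then be passed to the dimensionality reduction stage, which in the abstract is the stacked sparse denoising autoencoder.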
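Abstract: To address the difficulty of extracting nonlinear feature signals of combine harvester operating faults, this study proposes a fault monitoring and diagnosis method for combine harvesters (SDAE-BP) that fuses a stacked denoising autoencoder (SDAE) with a back propagation (BP) neural network. Speed sensors collect the rotational speeds of the threshing drum, grain auger, feeding auger, tailings auger, fan, and conveyor chain rake, together with the cutter frequency and the straw walker vibration frequency, and the collected dataset is used as the system input. The SDAE extracts deep features from the input signals, and the BP neural network identifies the operating state of the harvester, realizing fault monitoring. During training of the SDAE-BP model, the denoising autoencoders (DAE) are trained in turn on the raw data corrupted with noise having different distribution centers and are then stacked, and the model parameters are optimized by the error back propagation algorithm to improve fault recognition performance and generalization ability. Test results show that, on the 2018 combine harvester field test data, the fault diagnosis accuracy of the model reaches 99.00%, which is 1.5 and 4.5 percentage points higher than that of the SDAE and the BP neural network, respectively. The SDAE-BP fault diagnosis model was then updated with the 2019 test data and tested on both the 2018 and 2019 test data. The updated model achieves a fault recognition accuracy of 99.25% on the 2018 test data and 98.74% on the 2019 test data, and its accuracy on the 2019 test dataset is 6.52 percentage points higher than that of the model before updating. The model built in this paper can accurately identify combine harvester fault types and is robust, providing a reference for fault monitoring and early warning of rotating machinery.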