Funding: Supported by the National Natural Science Foundation of China (No. 50475117) and the Tianjin Natural Science Foundation (No. 06YFJMJC03700).
Abstract: Integrating heterogeneous data sources is a precondition for enterprises to share data. Highly efficient data updating can both reduce system overhead and provide real-time data. Modifying data rapidly in the pre-processing area of the data warehouse is an active research topic. An extract-transform-load (ETL) design is proposed based on a new algorithm called Diff-Match, which is developed using mode matching and data-filtering technology. It can accelerate data refresh, filter heterogeneous data, and identify the sets of data that differ. Its efficiency has been demonstrated by its successful application in an electric apparatus enterprise group.
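The abstract does not describe how Diff-Match works internally, so the following is only a generic sketch of the kind of task it names: finding the sets of rows that differ between two data snapshots so that only changes need to be loaded. The keyed-dictionary representation, the function name `diff_snapshots`, and the sample rows are all assumptions made for illustration, not the authors' algorithm.

```python
from typing import Dict, Hashable, Tuple

# Hypothetical row representation: a tuple of column values, keyed by primary key.
Row = Tuple


def diff_snapshots(old: Dict[Hashable, Row], new: Dict[Hashable, Row]):
    """Compare two keyed snapshots and return (inserted, updated, deleted)."""
    # Keys present only in the new snapshot are inserts.
    inserted = {k: new[k] for k in new.keys() - old.keys()}
    # Keys present only in the old snapshot are deletes.
    deleted = {k: old[k] for k in old.keys() - new.keys()}
    # Keys present in both whose values changed are updates.
    updated = {k: new[k] for k in new.keys() & old.keys() if new[k] != old[k]}
    return inserted, updated, deleted


# Illustrative data only.
old = {1: ("motor", 10), 2: ("relay", 5), 3: ("fuse", 7)}
new = {1: ("motor", 10), 2: ("relay", 6), 4: ("switch", 2)}
inserted, updated, deleted = diff_snapshots(old, new)
```

Loading only the three change sets, rather than the full snapshot, is what makes an incremental refresh cheaper than a full reload.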
Funding: Supported by the National Grand Fundamental Research 973 Program of China (G1998030414).
Abstract: To use the data available on the Internet, it is necessary to extract data from web pages. An HTT tree model representing HTML pages is presented. Based on the HTT model, a wrapper generation algorithm, AGW, is proposed. AGW uses a compare-and-correct technique to generate the wrapper from the native characteristics of the HTT tree structure. AGW not only generates the wrapper automatically, but also makes the data schema easy to rebuild and reduces computational complexity.
Abstract: Reversible data hiding is an information hiding technique that requires the cover image to be recovered without error after the secret image is extracted. This research proposes a technique that uses recursive embedding to increase capacity substantially, based on the integer wavelet transform and the Arnold transform. The integer wavelet transform ensures that all coefficients of the cover image are used during embedding, which increases the payload. By scrambling the cover image, the Arnold transform adds security to the embedded information and also allows more information to be embedded in each iteration. The hybrid combination of the two transforms yields a more efficient and secure system. The proposed method employs a set of keys to ensure that the information cannot be decoded by an attacker. Experimental results show that it aids in the development of a more secure storage system and withstands a few tampering attacks. The technique is tested on many image formats, including medical images. Various performance metrics show that the retrieved cover image and hidden image are both intact. The system is also shown to withstand rotation attacks.
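The two building blocks named in the abstract are both exactly invertible on integers, which is what makes error-free recovery possible. The sketch below shows that property in isolation, using the standard Arnold cat map on a square image and the integer Haar (S-transform) step on a pixel pair. It is not the paper's embedding scheme: the function names, the choice of the S-transform as the integer wavelet, and the 5x5 test image are assumptions made for illustration.

```python
def arnold(img, iterations=1):
    """Scramble a square image with the Arnold cat map (x, y) -> (x+y, x+2y) mod n."""
    n = len(img)
    for _ in range(iterations):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = out
    return img


def arnold_inverse(img, iterations=1):
    """Undo arnold(): the inverse map is (x, y) -> (2x-y, y-x) mod n."""
    n = len(img)
    for _ in range(iterations):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(2 * x - y) % n][(y - x) % n] = img[x][y]
        img = out
    return img


def s_forward(a, b):
    """Integer Haar step: returns (low, high) with no rounding loss."""
    high = a - b
    low = b + high // 2  # equals floor((a + b) / 2)
    return low, high


def s_inverse(low, high):
    """Exact inverse of s_forward()."""
    b = low - high // 2
    a = b + high
    return a, b


# Illustrative 5x5 image with distinct pixel values; the iteration count acts like a key.
image = [[5 * r + c for c in range(5)] for r in range(5)]
scrambled = arnold(image, iterations=3)
restored = arnold_inverse(scrambled, iterations=3)
```

The Arnold map is periodic, so the iteration count must be chosen with the image size in mind. For a 4x4 image, for example, three iterations return the original image.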
Funding: Supported by the National Key Research and Development Program Young Scientist Program under Grant No. 2022YFB3402400, the National Natural Science Foundation of China under Grant No. 52375407, the Chongqing Graduate Research Innovation Project under Grant No. CYS23141, and the Chongqing Talent Program under Grant No. CQYC202105002.
Abstract: Machining quality prediction based on cutting big data is the core focus of current developments in intelligent manufacturing. Presently, predictions of machining quality rely primarily on process and signal analyses. Process-based predictions are generally limited to rudimentary regression models. Signal-based predictions often require large amounts of data and multiple processing steps (such as noise reduction, principal component analysis, and modulation), and have low prediction efficiency. In addition, model accuracy depends on tedious manual parameter tuning. This paper proposes a convolutional neural network model for intelligent quality prediction based on automatic feature extraction and adaptive data fusion (CNN-AFEADF). First, signals from multiple directions are processed to obtain feature-rich time-frequency domain images, which significantly benefit neural network learning. Second, the corresponding images from the three directions are fused into one image using different fusion weight parameters. The optimal fusion weights and window length are determined by the Particle Swarm Optimization (PSO) algorithm. This data fusion method reduces training time by a factor of 16.74. Finally, the proposed method is verified by various experiments. The method automatically identifies sensitive data features through neural network fitting experiments and optimization, eliminating the need for expert experience in judging the significance of data features. With this approach, the model achieves an average relative error of 2.95%, a lower prediction error than traditional models. The method also raises the level of intelligent machining.
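The abstract gives neither the fusion formula nor the exact error definition, so the sketch below assumes the simplest reading of both: a pixel-wise weighted average of three equally sized images, and the mean of |prediction - truth| / |truth| as the average relative error. The function names and all numbers are hypothetical. In the paper the weights are chosen by PSO, which is not reproduced here.

```python
def fuse(images, weights):
    """Pixel-wise weighted average of equally sized 2-D images (assumed fusion rule)."""
    total = sum(weights)
    norm = [w / total for w in weights]  # normalise so the weights sum to 1
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [sum(w * img[r][c] for w, img in zip(norm, images)) for c in range(cols)]
        for r in range(rows)
    ]


def mean_relative_error(y_true, y_pred):
    """Average of |pred - true| / |true| over all samples (assumed metric definition)."""
    return sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative 1x2 "images" for the three signal directions, and made-up weights.
img_x, img_y, img_z = [[0, 4]], [[4, 4]], [[2, 0]]
fused = fuse([img_x, img_y, img_z], weights=[1, 1, 2])

# Made-up measurements and predictions, giving a 3% average relative error.
error = mean_relative_error([100, 200], [97, 206])
```

Fusing the three directional images into one means the network processes a single input per sample rather than three, which is one plausible reason the abstract reports shorter training time.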
Funding: Funded by a grant from the Center of Excellence in Information Assurance (CoEIA), King Saud University (KSU).
Abstract: Software defect prediction is a critical component in maintaining software quality, enabling early identification and resolution of issues that could lead to system failures and significant financial losses. With the increasing reliance on user-generated content, social media reviews have emerged as a valuable source of real-time feedback, offering insights into potential software defects that traditional testing methods may overlook. However, existing models face challenges such as handling imbalanced data, high computational complexity, and insufficient integration of contextual information from these reviews. To overcome these limitations, this paper introduces the SESDP (Sentiment Analysis-Based Early Software Defect Prediction) model. SESDP employs a Transformer-based multi-task learning approach using the Robustly Optimized Bidirectional Encoder Representations from Transformers Approach (RoBERTa) to perform sentiment analysis and defect prediction simultaneously. By integrating text embedding extraction, sentiment score computation, and feature fusion, the model captures both the contextual nuances and the sentiment expressed in user reviews. Experimental results show that SESDP achieves an accuracy of 96.37%, a precision of 94.7%, and a recall of 95.4%, and that it handles imbalanced datasets particularly well compared to baseline models. This approach offers a scalable and efficient solution for early software defect detection, enhancing proactive software quality assurance.
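Accuracy, precision, and recall are reported together because accuracy alone can be misleading on imbalanced data. The sketch below computes the three metrics from confusion-matrix counts using their standard definitions. The counts are made up for illustration and are not taken from the paper, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when nothing is predicted or labelled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical imbalanced test set: 50 defect-related reviews out of 1000.
model = classification_metrics(tp=40, fp=10, fn=10, tn=940)

# A trivial baseline that never predicts "defect" on the same data.
baseline = classification_metrics(tp=0, fp=0, fn=50, tn=950)
```

In this made-up case the trivial baseline still reaches 95% accuracy while finding no defects at all, which is why precision and recall are the more informative figures on imbalanced datasets.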
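Abstract: Non-intrusive load monitoring (NILM) is a technique that requires no access to the internal system of each appliance; monitoring equipment is installed only at the user's main power entry point. NILM research often requires collecting large-scale user load data to demonstrate the generality of a proposed method, which inevitably brings a heavy burden of data collection and curation. To overcome this challenge, a hybrid model is designed that combines the principle of frequency invariant transformation for periodic signals (FIT-PS) with time series generative adversarial networks (TimeGAN), denoted FIT-PS-TimeGAN. For five appliances in the worldwide household and industry transient energy dataset (WHITED), namely the air conditioner, microwave oven, vacuum cleaner, refrigerator, and kettle, FIT-PS is used to cut and splice the load data, constructing TimeGAN training and test sets for different states. Evaluation on the test sets shows that the generated waveform data are highly consistent with the real data. FIT-PS is then applied to truncate and splice the generated data obtained from training, producing complete single-load and multi-load waveforms that meet testing needs. Comparing these generated waveforms with real data in the same states shows close agreement. Compared with an autoregressive model and a generative adversarial network (GAN) model, the FIT-PS-TimeGAN model performs better at data generation. The results indicate that the FIT-PS-TimeGAN hybrid model can effectively generate waveform and scenario data that conform to the operating patterns of standard appliances.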
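Abstract: To address the low utilization of process data and fragmented document management in the digital transformation of manufacturing, this paper proposes a method for intelligent extraction and structured management of process elements based on multi-source data fusion. By integrating OCR, PDF parsing, and the NX Open API (the secondary development toolkit for UG), a prototype process element data management system is built that enables discretized storage and dynamic retrieval of process specifications. Using valve matching-part products as the experimental pilot, a system prototype with intelligent process element extraction and structured management functions is developed. In validation, the prototype achieved an extraction accuracy of up to 95.2% on standard text and a parsing speed of 200 pages per minute on standard-format PDF documents. The study establishes a linked push mechanism between document structure and process elements, providing data support for enterprises' production preparation planning and automatic quality data comparison, and forming a reusable process knowledge management system.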