Heterogeneous nodes in the Internet of Things (IoT) have relatively weak computing power and storage capacity, so traditional network security algorithms are not suitable for the IoT. When such nodes alternate between normal and anomalous behavior, the network system has difficulty identifying and isolating them quickly, which degrades data transmission accuracy and the integrity of network functions. Based on the characteristics of the IoT, a lightweight local outlier factor (LOF) detection method is used for node detection. To further determine whether a node is anomalous, the study considers how node behavior varies over time and uses a time series method so that the system can respond effectively, within a short period, to the randomness and selectiveness of anomalous nodes. Simulation results show that the proposed method improves the accuracy of the data transmitted by the network and achieves better performance.
As data services rapidly penetrate daily life, mobile networks are becoming more complicated and the volume of transmitted data keeps growing. Traditional statistical methods for anomalous cell detection cannot adapt to this evolution, and data mining has become the mainstream approach. In this paper, we propose a novel kernel density-based local outlier factor (KLOF) that assigns each object a degree of being an outlier. First, the notion of KLOF is introduced, which captures the relative degree of isolation. Then, by analyzing its properties, including the tightness of its upper and lower bounds and its sensitivity to density perturbation, we find that KLOF is much greater than 1 for outliers. Finally, KLOF is applied to a real-world dataset to detect anomalous cells with abnormal key performance indicators (KPIs) and so verify its reliability. The experiment shows that KLOF finds outliers efficiently. It can guide operators toward faster and more efficient troubleshooting.
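The observation that outlier scores sit well above 1 also holds for the classic LOF that KLOF builds on. The following is a minimal sketch of classic LOF only, not the kernel density-based variant from the abstract; the function name `lof_scores` and the toy data are assumptions made for illustration.

```python
import math

def lof_scores(points, k):
    """Classic local outlier factor for every point.

    Simplified sketch: ties at the k-distance are cut off at exactly k
    neighbours, and duplicate points (zero distances) are not handled.
    """
    n = len(points)
    # k nearest neighbours of each point as (distance, index) pairs
    neigh = []
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neigh.append(dists[:k])
    # distance from each point to its k-th nearest neighbour
    k_dist = [nb[-1][0] for nb in neigh]
    # local reachability density: inverse of the mean reachability distance
    lrd = [len(nb) / sum(max(k_dist[j], d) for d, j in nb) for nb in neigh]
    # LOF: mean density of the neighbours divided by the point's own density
    return [
        sum(lrd[j] for _, j in nb) / (len(nb) * lrd[i])
        for i, nb in enumerate(neigh)
    ]

# Toy data (assumed): a 3x3 grid of inliers plus one distant point.
points = [(float(x), float(y)) for x in range(3) for y in range(3)]
points.append((10.0, 10.0))
scores = lof_scores(points, k=3)
```

On this toy data the grid points score close to 1, while the distant point scores several times higher, which is the kind of separation a threshold on the score relies on.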
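A data stream is a type of data with a high generation rate and a dynamically changing distribution. Data stream anomaly detection aims to find streams that deviate from expected behavior, supporting decisions in fields such as healthcare, industrial production, and finance. Existing methods commonly suffer from high parameter sensitivity, large time and space overhead, and difficult threshold selection. To address these problems, an adaptive data stream anomaly detection method based on variable density is proposed. First, the Variable Local Outlier Factor (VLOF) is defined. VLOF compares how a data point's local reachability density and local outlier factor change across parallel neighborhood windows with different k values, which measures the point's density distribution and reduces the inaccuracy caused by density measurement with a single k-nearest-neighbor setting. Second, the relative growth rate and absolute mean rate of VLOF with respect to k are computed to reflect the dynamic trend of the data stream; data points that fit this trend are defined as core points, which speed up the judgment of subsequent normal points. Finally, the relative growth rate and absolute mean rate serve as metrics of the theoretical distribution of data points, and the difference between the theoretical distribution and the actual distribution of new data points is computed, so that points deviating from the theoretical distribution are adaptively identified as anomalies. To validate the algorithm, comparative experiments against 8 algorithms were run on multiple UCI datasets and real-world datasets. The results show that, compared with the baseline models, the proposed method performs well on precision, recall, and F1, with corresponding improvements in time and space efficiency.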
Purpose: The main aim of this study is to build a robust novel approach that detects outliers in datasets accurately. To this end, a novel approach is introduced to determine the likelihood that an object differs extremely from the general behavior of the entire dataset. Design/methodology/approach: This paper proposes a novel two-level approach based on the integration of bagging and voting techniques for anomaly detection problems. The proposed approach, named Bagged and Voted Local Outlier Detection (BV-LOF), uses the Local Outlier Factor (LOF) as the base algorithm and improves its detection rate with ensemble methods. Findings: Several experiments were performed on ten benchmark outlier detection datasets to demonstrate the effectiveness of the BV-LOF method. According to the results, on average BV-LOF significantly outperformed LOF on 9 of the 10 datasets. Research limitations: In the BV-LOF approach, the base algorithm is applied to each data subset multiple times, with a different neighborhood size (k) in each case and with different ensemble sizes (T). In our study, we chose the value ranges of k and T as [1-100]; however, these ranges can be changed according to the dataset and the problem addressed. Practical implications: The proposed method can be applied to datasets from different domains (e.g., health, finance, manufacturing) without requiring any prior information. Since BV-LOF includes two-level ensemble operations, it may take more computational time than single-level ensemble methods; however, this drawback can be overcome by parallelization and by using a suitable data structure such as an R*-tree or KD-tree. Originality/value: The proposed approach (BV-LOF) investigates multiple neighborhood sizes (k), which reveals instances with different local densities and thereby raises the likelihood of detecting outliers that LOF may miss. It also brings benefits such as easy implementation, improved capability, higher applicability, and interpretability.
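The two-level idea described above (run LOF on several data subsets, with several neighborhood sizes per subset, then vote) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the BV-LOF procedure itself: the abstract does not say how subsets are drawn or how votes are counted, so random row subsets, a fixed score threshold of 1.5, and a simple majority vote are all assumptions, as are the names `bagged_voted_lof`, `k_values`, `n_rounds`, and `sample_size`.

```python
import math
import random

def lof_scores(points, k):
    """Classic LOF; simplified (no tie handling, no duplicate points)."""
    n = len(points)
    neigh = []
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neigh.append(dists[:k])
    k_dist = [nb[-1][0] for nb in neigh]
    lrd = [len(nb) / sum(max(k_dist[j], d) for d, j in nb) for nb in neigh]
    return [
        sum(lrd[j] for _, j in nb) / (len(nb) * lrd[i])
        for i, nb in enumerate(neigh)
    ]

def bagged_voted_lof(points, k_values, n_rounds, sample_size,
                     threshold=1.5, seed=0):
    """Flag a point when most LOF runs that included it scored it above threshold."""
    rng = random.Random(seed)
    votes = [0] * len(points)   # runs that voted "outlier" for each point
    seen = [0] * len(points)    # runs in which each point took part
    for _ in range(n_rounds):
        # level 1 (assumed): a random subset of rows, drawn without replacement
        idx = rng.sample(range(len(points)), sample_size)
        subset = [points[i] for i in idx]
        # level 2: several neighbourhood sizes on the same subset
        for k in k_values:
            for i, score in zip(idx, lof_scores(subset, k)):
                seen[i] += 1
                votes[i] += score > threshold
    return [s > 0 and 2 * v > s for v, s in zip(votes, seen)]

# Toy data (assumed): a 4x4 grid of inliers plus one distant point.
points = [(float(x), float(y)) for x in range(4) for y in range(4)]
points.append((10.0, 10.0))
flags = bagged_voted_lof(points, k_values=(2, 3), n_rounds=15, sample_size=13)
```

Voting over several k values makes the decision less dependent on any single neighborhood size, at the cost of one LOF run per subset and k, which is the extra computation the abstract mentions.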
To control the press-assembly quality of high-precision servo mechanisms, an intelligent early warning method based on outlier detection and linear regression is proposed. Linear regression models the relationship between assembly quality and the press-assembly process; a displacement-force mathematical model of the process is then established, and a qualified press-assembly force range is defined for assembly quality control. To preprocess the raw displacement-force dataset, an improved local outlier factor based on area density and P weight (LAOPW) is designed to eliminate outliers that would make the mathematical model inaccurate. Distance is measured with a weighted distance based on information entropy, and the reachability distance is replaced with the P weight. Experiments show a 5.6 ms improvement in detection efficiency over the traditional local outlier factor (LOF) algorithm and an improvement of about 2% in detection accuracy over the local outlier factor based on area density (LAOF) algorithm. Applying the LAOPW algorithm together with the linear regression model shows that the method can effectively provide intelligent early warning of press-assembly quality for high-precision servo mechanisms.
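One common way to build an entropy-based weighted distance is the entropy weight method, sketched below. The abstract does not give the weighting formula used in LAOPW, so this choice is an assumption, and the function names and sample values are hypothetical.

```python
import math

def entropy_weights(rows):
    """Entropy weight method: one weight per feature column.

    Assumes non-negative values, a positive sum in every column, and at
    least one column that is not uniform (otherwise the weights are undefined).
    """
    n = len(rows)
    raw = []
    for col in zip(*rows):
        total = sum(col)
        probs = [v / total for v in col]
        # normalised Shannon entropy in [0, 1]; 1 means a uniform column
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        # clamp to guard against tiny negative values from rounding
        raw.append(max(0.0, 1.0 - entropy))
    total_raw = sum(raw)
    return [r / total_raw for r in raw]

def weighted_distance(a, b, weights):
    """Euclidean distance with per-feature weights."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

# Hypothetical samples: the first feature never changes, the second does.
rows = [(5.0, 1.0), (5.0, 2.0), (5.0, 3.0), (5.0, 10.0)]
weights = entropy_weights(rows)
d = weighted_distance((5.0, 1.0), (9.0, 3.0), weights)
```

A column with identical values has maximal entropy and therefore gets a weight near zero, so differences in that feature barely affect the distance; here `d` is driven almost entirely by the second feature.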
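Monitoring of abnormal aircraft lift-off attitude currently relies on exceedance detection for single parameters and lacks anomaly detection over combinations of parameters. To address this, a lift-off attitude anomaly detection method is proposed based on the Isolation-based Data Extracting Local Outlier Factor (IDELOF), a local outlier factor algorithm with nearest-neighbor search space extraction. First, airspeed, pitch angle, and roll angle are selected as the characteristic parameters of lift-off attitude, and an isolation-based nearest-neighbor search space extraction method is applied to reduce and extract the data, lowering computational complexity. Second, the local outlier factor algorithm is applied to the extracted data to detect anomalies and identify combined multi-parameter anomalies. Then, using Quick Access Recorder (QAR) data from 297 flights of a domestic airline's A319 fleet, the model's detection results for both single-parameter and combined multi-parameter anomalies are validated. Finally, the distribution characteristics of normal and abnormal results and the interpretability of the model output are analyzed, and the main causes of eight types of anomalies are explained, providing in-depth data support for flight safety risk prevention and control.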
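Complex industrial processes are high-dimensional, multi-mode, and nonlinear, and diffusion maps have difficulty projecting new samples. To address these problems, this paper proposes an industrial process fault detection method based on expandable diffusion maps and local outlier factors (EDM-LOF). Diffusion maps extract the low-dimensional manifold structure of the training samples, a local projection matrix is constructed to project new samples into the manifold space, and the local outlier factor method performs fault detection in that space. EDM-LOF is applied to fault detection in a penicillin fermentation process and compared with PCA, FD-kNN, and LOF. The results show that EDM-LOF achieves higher fault detection performance, verifying the effectiveness of the method.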
Accurate prediction of solubility data in the Sodium Chloride-Sodium Sulfate-Water system is essential, because it provides theoretical support for salt lake resource development and wastewater treatment technologies. This study proposes a solubility prediction approach that addresses the limitations of traditional thermodynamic models, which matters particularly when experimental data from different sources are inconsistent. The approach combines the Weighted Local Outlier Factor technique for anomaly detection with a Deep Ensemble Neural Network architecture. It removes local outliers while preserving the integrity of the data distribution, and it integrates multiple neural network sub-models to capture system features comprehensively while minimizing the bias of individual models. Experimental validation demonstrates strong prediction performance across temperatures from −20 ℃ to 150 ℃, achieving a coefficient of determination of 0.989 after Bayesian hyperparameter optimization. This data-driven approach provides more accurate and more widely applicable solubility predictions than conventional thermodynamic models, offering theoretical guidance for industrial applications in salt lake resource utilization, separation process optimization, and environmental salt management systems.
Funding (IoT node anomaly detection study): This work is partially supported by the Ministry of Education of China (www.moe.gov.cn) under grant Nos. 201802123091 (received by F.W.) and 201802123068 (received by Z.W.); the Scientific Project of CAFUC (www.cafuc.edu.cn) under grant Nos. F2017KF02 and J2018-3 (both received by Z.W.); and the Teaching Reform Project of CAFUC (www.cafuc.edu.cn) under grant No. E2020044 (received by Z.W.).
Funding (KLOF anomalous cell detection study): Supported by the National Basic Research Program of China (973 Program: 2013CB329004).
Funding (Sodium Chloride-Sodium Sulfate-Water solubility prediction study): The support of the Natural Science Foundation of Qinghai Province of China (2024-ZJ-940) and the Qinghai University Research Ability Enhancement Project (2025KTST02) is greatly appreciated.