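Harmonic metering with electronic current transformers involves complex harmonic content, so real measurements are disturbed by noise and outliers and the metering results lose accuracy. To address this, the paper introduces DBSCAN (density-based spatial clustering of applications with noise) and proposes a DBSCAN-based harmonic metering algorithm for electronic current transformers that identifies and removes noise points and outliers from the data set, improving the accuracy of harmonic component detection. First, the current signals of the electronic current transformer are acquired, and the peak frequency of each current signal is obtained from its time-frequency energy peak. Second, DBSCAN computes the distances between the peak frequencies of the current signals and, based on those distances, separates noise signals, non-harmonic signals, and the different types of harmonic signals, excluding noise points and outliers from the data set. Finally, the least-squares method measures the amplitude and phase of each class of harmonic signal, which gives the harmonic metering result for the current transformer. Experimental results show that at a time of 2 s the actual harmonic phase is 18° and the phase obtained by the algorithm is also 18°, consistently matching the actual value. The algorithm measures both harmonic amplitude and phase with high accuracy, indicating that it improves harmonic metering precision and avoids interference from noise and outliers.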
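The final least-squares step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single harmonic of known frequency modeled as x(t) = A·sin(2πft + φ), and the 150 Hz frequency, amplitude of 2.0, and 5 kHz sampling rate are hypothetical values chosen for the check. Only the 18° phase comes from the abstract.

```python
import math

def fit_harmonic(samples, times, freq_hz):
    """Least-squares fit of x(t) = a*sin(wt) + b*cos(wt) at a known frequency.

    Since A*sin(wt + phi) = A*cos(phi)*sin(wt) + A*sin(phi)*cos(wt),
    the amplitude is hypot(a, b) and the phase is atan2(b, a).
    """
    w = 2.0 * math.pi * freq_hz
    s = [math.sin(w * t) for t in times]
    c = [math.cos(w * t) for t in times]
    # Entries of the 2x2 normal equations for the unknowns a and b.
    ss = sum(v * v for v in s)
    cc = sum(v * v for v in c)
    sc = sum(u * v for u, v in zip(s, c))
    xs = sum(x * v for x, v in zip(samples, s))
    xc = sum(x * v for x, v in zip(samples, c))
    det = ss * cc - sc * sc
    a = (xs * cc - xc * sc) / det
    b = (xc * ss - xs * sc) / det
    return math.hypot(a, b), math.degrees(math.atan2(b, a))

# Synthetic check: every value here is hypothetical except the 18 degree phase.
FS = 5000.0                                  # sampling rate in Hz (assumption)
times = [n / FS for n in range(1000)]        # 0.2 s of samples
true_amp, true_phase_deg = 2.0, 18.0
samples = [true_amp * math.sin(2 * math.pi * 150.0 * t + math.radians(true_phase_deg))
           for t in times]

amplitude, phase_deg = fit_harmonic(samples, times, 150.0)
print(f"amplitude = {amplitude:.3f}, phase = {phase_deg:.2f} deg")
```

On noise-free data the fit recovers the amplitude and phase to floating-point precision. With measured data, the residual noise left after the clustering step would set the achievable accuracy.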
Finding clusters based on density represents a significant class of clustering algorithms. These methods can discover clusters of various shapes and sizes. The most studied algorithm in this class is Density-Based Spatial Clustering of Applications with Noise (DBSCAN). It identifies clusters by grouping densely connected objects into one group and discarding the noise objects. It requires two input parameters: epsilon (a fixed neighborhood radius) and MinPts (the minimum number of objects within epsilon). However, it cannot handle clusters of varying density because it uses a single global value for epsilon. This article proposes an adaptation of DBSCAN that discovers clusters of varied densities and reduces the required input parameters to one. The only user input in the proposed method is MinPts; epsilon is computed automatically from statistical information about the dataset. The proposed method finds the core distance of each object in the dataset, takes the average of these distances as the first value of epsilon, and finds the clusters that satisfy this density level. The remaining unclustered objects are then clustered using a new value of epsilon equal to the average core distance of the unclustered objects. This process continues until all objects have been clustered or the remaining unclustered objects amount to less than 0.006 of the dataset's size. Benchmark datasets were used to evaluate the effectiveness of the proposed method, with promising results. Practical experiments demonstrate the method's ability to detect clusters of different densities even when there is no separation between them. The accuracy of the method ranges from 92% to 100% for the datasets tested.
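The iterative epsilon selection described in this abstract can be sketched as follows. This is one reading of the abstract, not the authors' code. It assumes that the core distance is the distance to the MinPts-th nearest object counting the object itself, that core distances are recomputed among the unclustered objects only, and that the loop also stops when a pass finds no cluster. The function names and the toy dataset are hypothetical.

```python
import math

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    n = len(points)
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1                   # noise for now; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster          # border point: claimed, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])   # core point: keep growing the cluster
    return labels

def core_distance(points, i, min_pts):
    """Distance from point i to its min_pts-th nearest point, itself included."""
    return sorted(math.dist(points[i], q) for q in points)[min_pts - 1]

def adaptive_dbscan(points, min_pts, stop_fraction=0.006):
    """Re-run DBSCAN on the unclustered points with eps = mean core distance."""
    labels = [-1] * len(points)
    remaining = list(range(len(points)))
    next_label = 0
    while len(remaining) >= min_pts and len(remaining) >= stop_fraction * len(points):
        subset = [points[i] for i in remaining]
        eps = sum(core_distance(subset, k, min_pts)
                  for k in range(len(subset))) / len(subset)
        sub_labels = dbscan(subset, eps, min_pts)
        if max(sub_labels) < 0:
            break                            # extra guard: no progress in this pass
        for k, lab in enumerate(sub_labels):
            if lab >= 0:
                labels[remaining[k]] = next_label + lab
        next_label += max(sub_labels) + 1
        remaining = [remaining[k] for k, lab in enumerate(sub_labels) if lab < 0]
    return labels

# Toy data (hypothetical): a dense 3x3 grid and a sparse 3x3 grid far away.
dense = [(0.1 * x, 0.1 * y) for x in range(3) for y in range(3)]
sparse = [(10.0 + x, 10.0 + y) for x in range(3) for y in range(3)]
labels = adaptive_dbscan(dense + sparse, min_pts=3)
print(labels)
```

In this toy case the first pass uses an epsilon between the two grid spacings and recovers only the dense grid. The second pass recomputes epsilon from the sparse points alone and recovers the second cluster, which a single global epsilon would either miss or merge.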
In recent years, there has been a concerted effort to improve anomaly detection techniques, particularly in the context of high-dimensional, distributed clinical data. Analysing patient data within clinical settings reveals a pronounced focus on refining diagnostic accuracy, personalising treatment plans, and optimising resource allocation to enhance clinical outcomes. Nonetheless, this domain faces unique challenges, such as irregular data collection, inconsistent data quality, and patient-specific structural variations. This paper proposes a novel hybrid approach that integrates heuristic and stochastic methods for anomaly detection in patient clinical data to address these challenges. The strategy uses HPO-based optimal Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to cluster patient exercise data, facilitating efficient anomaly identification. Subsequently, a stochastic method based on the Interquartile Range filters unreliable data points, ensuring that medical tools and professionals receive only the most pertinent and accurate information. The primary objective of this study is to equip healthcare professionals and researchers with a robust tool for managing extensive, high-dimensional clinical datasets, enabling effective isolation and removal of aberrant data points. Furthermore, a regression model has been developed using Automated Machine Learning (AutoML) to assess the impact of the ensemble abnormal pattern detection approach. Various statistical error estimation techniques validate the efficacy of the hybrid approach alongside AutoML. Experimental results show that applying the hybrid model to patient rehabilitation data notably improves AutoML performance, with an average improvement of 0.041 in the R² score, surpassing traditional regression models.
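The Interquartile Range stage can be illustrated with a minimal sketch. The abstract does not state the fence width or the quartile convention, so the usual factor of 1.5 and the inclusive quartile method are assumptions, and the readings below are hypothetical values rather than patient data from the study.

```python
import statistics

def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is an assumption."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical readings with one obviously unreliable value.
readings = [10, 11, 12, 12, 13, 14, 95]
kept = iqr_filter(readings)
print(kept)
```

Here the quartiles are 11.5 and 13.5, so the fences sit at 8.5 and 16.5 and only the value 95 is dropped.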
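Analyzing urban ride-hailing demand helps characterize the spatiotemporal distribution of residents' trips and guides the efficient deployment of urban passenger transport capacity. To relieve urban travel pressure, optimize ride-hailing dispatch, and improve public travel satisfaction, this paper uses ride-hailing order data from Nanjing and applies the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) spatial clustering algorithm to the pick-up areas of morning-peak ride-hailing passengers in Nanjing. The clustering analysis finds that a cluster radius Eps of 0.010 and a minimum sample count M of 400 form the best parameter combination, which reflects that busy commercial districts, major passenger transport hubs, and public transit stations are the hotspots of urban ride-hailing travel. Deployment recommendations for ride-hailing vehicles in Nanjing are proposed for the typical pick-up hotspots.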
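The abstract does not give the unit of Eps. If the clustering was run directly on latitude and longitude in decimal degrees, which is an assumption here, the short calculation below shows roughly what ground distance a radius of 0.010 would correspond to. The latitude used for Nanjing is approximate, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0     # mean Earth radius
NANJING_LAT_DEG = 32.06      # approximate latitude of central Nanjing (assumption)

def degrees_to_km(delta_deg, lat_deg):
    """Return (north-south km, east-west km) spanned by delta_deg at lat_deg."""
    arc = math.radians(delta_deg) * EARTH_RADIUS_KM
    # Meridians converge toward the poles, so east-west distance shrinks by cos(lat).
    return arc, arc * math.cos(math.radians(lat_deg))

ns_km, ew_km = degrees_to_km(0.010, NANJING_LAT_DEG)
print(f"0.010 deg is about {ns_km:.2f} km north-south and {ew_km:.2f} km east-west")
```

Under that assumption the neighborhood is about 1.1 km north to south and a little under 1 km east to west, so a cluster would need at least 400 pick-ups within roughly a kilometer of a core point.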
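Alignment monitoring of steel arch bridges is an important part of a bridge health monitoring system. Using 3D laser scanning, this work improves the traditional density-based spatial clustering of applications with noise (DBSCAN) algorithm by fusing it with the random sample consensus (RANSAC) algorithm, and uses the result to extract the arch rib alignment of a steel arch bridge. 3D laser point cloud data are comprehensive and detailed and can fully represent the shape and deformation of the bridge structure. The RANSAC-fused improved DBSCAN algorithm constrains the clustering result according to the structural features of the steel arch bridge, and it effectively removes scattered points as well as the points belonging to the deck, lateral bracing, cross ties, and web members. Key points are fitted from the point cloud extracted by the improved algorithm and compared with manual extraction: the arch rib key-point errors are all at the millimeter level, with a maximum error of 9.2 mm and a minimum error of 0.1 mm. The method extracts steel arch bridge alignment more accurately and effectively, reaches millimeter-level extraction precision, greatly reduces labor and time costs, is more robust to the complex structure of steel arch bridges, and adapts well to practical production needs.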
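The abstract does not describe how RANSAC is fused with DBSCAN, so the sketch below shows only the generic RANSAC idea on its own: repeatedly fit a model to a random minimal sample and keep the model with the most inliers. It fits a 2-D line to a hypothetical elevation profile in which a flat deck is the dominant structure. The function name, threshold, and data are illustrative assumptions, not values from the paper.

```python
import random

def ransac_line(points, threshold, iterations=200, seed=0):
    """Return indices of the largest inlier set found for a 2-D line model."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        # Line through the two samples written as a*x + b*y + c = 0.
        a, b = y2 - y1, x1 - x2
        norm = (a * a + b * b) ** 0.5
        if norm == 0.0:
            continue                         # identical sample points, no line
        c = -(a * x1 + b * y1)
        inliers = [i for i, (x, y) in enumerate(points)
                   if abs(a * x + b * y + c) / norm <= threshold]
        if len(inliers) > len(best):
            best = inliers
    return best

# Hypothetical profile: 20 deck points on y = 5 plus 5 points from other members.
deck = [(float(x), 5.0) for x in range(20)]
other = [(3.0, 9.0), (7.0, 12.5), (11.0, 12.0), (15.0, 8.0), (18.0, 1.0)]
cloud = deck + other

deck_idx = ransac_line(cloud, threshold=0.05)
deck_set = set(deck_idx)
remaining = [p for i, p in enumerate(cloud) if i not in deck_set]
print(len(deck_idx), "deck points removed,", len(remaining), "points kept")
```

In a pipeline of this kind, the points left after removing a fitted structure could then be passed to a density-based clustering step. Real point clouds are three-dimensional and would need a plane or curve model in place of the line used here.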
Funding (adaptive-epsilon DBSCAN article above): The author extends his appreciation to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, for funding this research work through project number IFPSAU-2021/01/17758.