Abstract: Cluster-based models have numerous application scenarios in vehicular ad-hoc networks (VANETs) and can greatly improve the communication performance of VANETs. However, the frequent movement of vehicles often changes the network topology, thereby reducing cluster stability in urban scenarios. To address this issue, we propose a clustering model based on the density peak clustering (DPC) method and the sparrow search algorithm (SSA), named SDPC. First, the model constructs a fitness function from the parameters obtained by the DPC method and deploys the SSA for iterative optimization to select cluster heads (CHs). Then, the vehicles that have not been selected as CHs are assigned to appropriate clusters by jointly considering a distance parameter and a link-reliability parameter. Finally, cluster maintenance strategies are considered to handle changes in the clusters' organizational structure. To verify the performance of the model, we conducted a simulation on a real-world scenario for multiple metrics related to cluster stability. The results show that, compared with APROVE and GAPC, SDPC showed clear performance advantages, indicating that SDPC can effectively ensure VANET cluster stability in urban scenarios.
Funding: National Natural Science Foundation of China (Nos. 61962054 and 62372353).
Abstract: Traditional clustering algorithms often struggle to produce satisfactory results when dealing with datasets with uneven density. Additionally, they incur substantial computational costs when applied to high-dimensional data because they must calculate similarity matrices. To alleviate these issues, we employ the KD-Tree to partition the dataset and compute the K-nearest neighbors (KNN) density for each point, thereby avoiding the computation of similarity matrices. Moreover, we apply the rules of voting elections, treating each data point as a voter that casts a vote for the point with the highest density among its KNN. Using the vote counts of each point, we develop a strategy for classifying noise points and potential cluster centers, allowing the algorithm to identify clusters with uneven density and complex shapes. Additionally, we define the concept of “adhesive points” between two clusters to merge adjacent clusters that have similar densities. This process helps us identify the optimal number of clusters automatically. Experimental results indicate that our algorithm not only improves the efficiency of clustering but also increases its accuracy.
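The voting step described in this abstract can be illustrated with a minimal sketch. Several details here are assumptions, not taken from the paper: the density is taken as the inverse of the mean distance to the k nearest neighbors, a point is allowed to vote for itself, neighbors are found by brute force rather than with a KD-Tree, and all points are assumed distinct. The function name `knn_vote` is hypothetical.

```python
import math
from collections import Counter

def knn_vote(points, k):
    """Return (density, votes) for each point using KNN density and voting."""
    n = len(points)
    neighbors, density = [], []
    for i, p in enumerate(points):
        # Brute-force k nearest neighbors (the abstract uses a KD-Tree instead).
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)[:k]
        neighbors.append([j for _, j in ranked])
        # Assumed density: inverse of the mean distance to the k nearest neighbors.
        density.append(k / sum(d for d, _ in ranked))
    votes = Counter()
    for i in range(n):
        # Assumption: a point may vote for itself if it is the densest candidate.
        candidates = neighbors[i] + [i]
        winner = max(candidates, key=lambda j: density[j])
        votes[winner] += 1
    return density, [votes[i] for i in range(n)]

# Two small groups of points; each group has one central point.
points = [(0.0, 0.0), (0.3, 0.0), (-0.3, 0.0), (0.0, 0.3),
          (5.0, 5.0), (5.3, 5.0), (4.7, 5.0)]
density, votes = knn_vote(points, k=2)
print(votes)
```

Points that collect many votes are candidates for cluster centers, while points that collect none and have low density are candidates for noise; the thresholds for that classification are not given in the abstract and are left out here.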
Funding: Supported by the Fundamental Research Funds for the Central Universities (22120240094) and the Humanities and Social Science Fund of the Ministry of Education of China (22YJA630082).
Abstract: Failure mode and effect analysis (FMEA) is a preventative risk evaluation method used to evaluate and eliminate failure modes within a system. However, the traditional FMEA method exhibits many deficiencies that pose challenges in practical applications. To improve the conventional FMEA, many modified FMEA models have been suggested. However, the majority of them inadequately address consensus issues and focus on achieving a complete ranking of failure modes. In this research, we propose a new FMEA approach that integrates a two-stage consensus reaching model and a density peak clustering algorithm for the assessment and clustering of failure modes. First, we employ interval 2-tuple linguistic variables (I2TLVs) to express the uncertain risk evaluations provided by FMEA experts. Then, a two-stage consensus reaching model is adopted to enable FMEA experts to reach a consensus. Next, failure modes are categorized into several risk clusters using a density peak clustering algorithm. Finally, the proposed FMEA is illustrated by a case study of load-bearing guidance devices of subway systems. The results show that the proposed FMEA model can more easily describe the uncertain risk information of failure modes by using the I2TLVs; that the introduction of an endogenous feedback mechanism and an exogenous feedback mechanism can accelerate the consensus reaching process; and that the density peak clustering of failure modes improves the practical applicability of FMEA.
Funding: Supported by the National Natural Science Foundation of China (61401475).
Abstract: The key challenge of the extended target probability hypothesis density (ET-PHD) filter is to reduce the computational complexity by using a subset to approximate the full set of partitions. In this paper, the influence of different partitions on the tracking results is analyzed, and the form of the most informative partition is obtained. Then, a fast density peak-based clustering (FDPC) partitioning algorithm is applied to the measurement set partitioning. Since only one partition of the measurement set is used, the ET-PHD filter based on FDPC partitioning has lower computational complexity than the other ET-PHD filters. As FDPC partitioning is able to remove the spatially close clutter-generated measurements, the ET-PHD filter based on FDPC partitioning has good tracking performance in scenarios with more clutter-generated measurements. The simulation results show that the proposed algorithm can obtain the most informative partition and markedly reduce the computational burden without losing tracking performance. As the number of clutter-generated measurements increases, the ET-PHD filter based on FDPC partitioning has better tracking performance than other ET-PHD filters. The FDPC algorithm will play an important role in the engineering realization of the multiple extended target tracking filter.
Funding: Professor Hong Yu of the Intelligent Fishery Innovative Team (No. C202109), School of Information Engineering, Dalian Ocean University, is acknowledged for her support of this work, which was funded by the National Natural Science Foundation of China (Nos. 31800615 and 21933010).
Abstract: Performing cluster analysis on molecular conformations is an important way to find the representative conformations in molecular dynamics trajectories. It is usually a critical step in interpreting complex conformational changes or interaction mechanisms. As one of the density-based clustering algorithms, find density peaks (FDP) is an accurate and reasonable candidate for molecular conformation clustering. However, as simulation lengths grow rapidly with increasing computing power, the low computing efficiency of FDP limits its application potential. Here we propose a marginal extension to FDP named K-means find density peaks (KFDP) to solve the problem of heavy resource consumption. In KFDP, the points are initially clustered by a high-efficiency clustering algorithm, such as K-means. Cluster centers are defined as typical points with a weight that represents the cluster size. Then, the weighted typical points are clustered again by FDP and refined as core, boundary, and redefined halo points. In this way, KFDP has accuracy comparable to FDP, but its computational complexity is reduced from O(n^2) to O(n). We apply and test our KFDP method on the trajectory data of multiple small proteins in terms of torsion angle, secondary structure, or contact map. Comparisons with K-means and density-based spatial clustering of applications with noise (DBSCAN) demonstrate the validity of the proposed KFDP.
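One plausible reading of the stated reduction from O(n^2) to O(n) is the following cost sketch. The symbols k (number of typical points produced by K-means) and t (number of K-means iterations) are assumptions introduced here for illustration and do not appear in the abstract.

```latex
\begin{align*}
T_{\mathrm{FDP}}(n)  &= O(n^{2})
  && \text{(all pairwise distances between $n$ points)} \\
T_{\mathrm{KFDP}}(n) &= \underbrace{O(n\,k\,t)}_{\text{K-means}}
                      + \underbrace{O(k^{2})}_{\text{FDP on $k$ typical points}} \\
                     &= O(n)
  && \text{(when $k$ and $t$ are fixed and $k \ll n$)}
\end{align*}
```

Under these assumptions the quadratic term only involves the k typical points, so the total cost grows linearly with the number of conformations n.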
Abstract: We present a novel unsupervised integrated score framework that generates generic extractive multi-document summaries by ranking sentences with a dynamic programming (DP) strategy. Considering that cluster-based methods proposed by other researchers tend to ignore the informativeness of words when they generate summaries, our proposed framework takes the relevance, diversity, informativeness, and length constraint of sentences into consideration comprehensively. We apply Density Peaks Clustering (DPC) to obtain relevance scores and diversity scores of sentences simultaneously. Our framework achieves its best performance on DUC2004, with a ROUGE-1 score of 0.396, a ROUGE-2 score of 0.094, and a ROUGE-SU4 score of 0.143, which outperforms a series of popular baselines, such as DUC Best, FGB [7], and BSTM [10].
Funding: Supported by the National Science Foundation for Distinguished Young Scholars of China (61225016) and the State Key Program of the National Natural Science Foundation of China (61533002).
Abstract: Modeling energy consumption (EC) and effluent quality (EQ) is an essential problem that must be solved for multiobjective optimal control in the wastewater treatment process (WWTP). To address this issue, a density peaks-based adaptive fuzzy neural network (DP-AFNN) is proposed in this study. To obtain suitable fuzzy rules, a DP-based clustering method is applied to fit the cluster centers to the process nonlinearity. The parameters of the extracted fuzzy rules are fine-tuned with an improved Levenberg-Marquardt algorithm during the training process. Furthermore, a convergence analysis is performed to guarantee the successful application of the DP-AFNN. Finally, the proposed DP-AFNN is used to develop the models of EC and EQ in the WWTP. The experimental results show that the proposed DP-AFNN achieves fast convergence and high prediction accuracy in comparison with some existing methods.
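Abstract: To address the problems that spectral clustering requires the neighbor parameter to be set manually when computing the scale parameter and that its clustering results are unstable, this paper treats the initial class center values and the scale parameter as decision variables and focuses on adaptively optimizing and improving the spectral clustering algorithm. First, the reciprocal of the standard deviation of a sample's neighborhood is used as the measure of the sample's local density and, combined with the density peak idea, an initial class center decision value algorithm based on density peak (DP_KD) is designed, which solves the instability of clustering results in density-adjusted spectral clustering. Second, the average distance between samples is used to compute the corresponding neighborhood radius, the scale parameter of each sample is solved adaptively from the sample standard deviation, and the similarity matrix between samples is constructed; this sets the neighbor parameter adaptively and removes the need to set the scale parameter manually. Then, based on the optimized initial class center decision values and the neighbor parameter method, the Gaussian kernel function is further adjusted and a density adjusted spectral clustering algorithm based on neighborhood standard deviation (DSSD) is proposed, which performs density spectral clustering by constructing the eigenvector space. Finally, the proposed algorithm is compared with other clustering algorithms on several datasets. The results show that, compared with other spectral clustering algorithms, the proposed DSSD algorithm not only clusters better but also gives more stable results; in particular, on the DIM512 dataset, whose classes are internally dense and clearly separated at their boundaries, DSSD partitions the clusters correctly. Its accuracy, Rand index, and F-measure are higher than those of the other algorithms by at least 0.0268, 0.0136, and 0.0247, respectively, indicating that DSSD not only clusters well but is also better suited to cluster analysis of large-scale datasets.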
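Abstract: A series of problems in the clustering by fast search and find of density peaks (CFSFDP) algorithm are addressed by incorporating the natural neighbor search algorithm, and a density peak clustering algorithm optimized by natural neighbor search (NaN-CFSFDP) is proposed. An outlier detection method is proposed based on the natural neighbor search algorithm. Because it is difficult to choose the cutoff distance d_c of CFSFDP accurately by hand, the computation of d_c is improved using natural neighbor search so that d_c is set automatically. The sample density measure of CFSFDP is redesigned and unified so that it pays more attention to the local information of each sample. Because large density differences between clusters cause the density peaks to concentrate in dense clusters, so that some clusters are lost, the concepts of sample-shared natural neighbors and cluster-shared natural neighbors are introduced and a new cluster merging algorithm is constructed. Experiments on synthetic and real datasets show that, in most cases, NaN-CFSFDP performs better than, or at least as well as, the compared methods, and that it has fewer parameters than CFSFDP and its improved variants.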
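For readers unfamiliar with the role of the cutoff distance d_c in CFSFDP, the two basic quantities of density peak clustering can be sketched as follows. This is a minimal illustration of the baseline algorithm with a hand-picked d_c, not of NaN-CFSFDP: it uses a hard cutoff for the local density, does not break ties between equal densities, and ranks candidate centers by the product of density and distance, which is a common heuristic. The function name `density_peak_scores` and the sample data are hypothetical.

```python
import math

def density_peak_scores(points, d_c):
    """Return (rho, delta): local density and distance to nearest denser point."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho[i]: number of other points closer than the cutoff distance d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # For the densest point, use the largest distance to any point by convention.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Two small groups of points; d_c is chosen by hand for this toy example.
points = [(0.0, 0.0), (0.3, 0.0), (-0.3, 0.0), (0.0, 0.3),
          (5.0, 5.0), (5.3, 5.0), (4.7, 5.0)]
rho, delta = density_peak_scores(points, d_c=0.4)

# Candidate centers: points with both high density and large delta.
gamma = [r * d for r, d in zip(rho, delta)]
centers = sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)[:2]
print(rho, centers)
```

Changing d_c changes every rho value and therefore the ranking of candidate centers, which is why an automatic choice of d_c matters.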