This paper proposes an equivalent modeling method for photovoltaic (PV) power stations via a particle swarm optimization (PSO) K-means clustering (KMC) algorithm with passive filter parameter clustering, to address the complexity, simulation time cost, and convergence problems of detailed PV power station models. First, the amplitude–frequency curves of different filter parameters are analyzed. Based on the results, a grouping parameter set characterizing the external filter characteristics is established, and these parameters are defined as the clustering parameters. A single PV inverter model is then established as a foundation. The proposed equivalent method combines the global search capability of PSO with the rapid convergence of KMC, effectively overcoming the tendency of KMC to become trapped in local optima. This approach enhances both clustering accuracy and numerical stability when determining equivalence for PV inverter units. Using the proposed clustering method, both a detailed PV power station model and an equivalent model are developed and compared. Simulation and hardware-in-the-loop (HIL) results based on the equivalent model verify that the equivalent method accurately represents the dynamic characteristics of PV power stations and adapts well to different operating conditions. The proposed equivalent modeling method provides an effective analysis tool for future renewable energy integration research.
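The combination of a global PSO search with a K-means refinement described in this abstract can be sketched as follows. This is only an illustrative sketch on one-dimensional data, not the authors' implementation: the function names (`sse`, `pso_centers`, `kmeans_refine`), the PSO coefficients (`w`, `c1`, `c2`), the swarm size, and the sample data are all assumptions made for the example.

```python
import random

def sse(data, centers):
    """Sum of squared distances from each value to its nearest center."""
    return sum(min((x - c) ** 2 for c in centers) for x in data)

def pso_centers(data, k, n_particles=10, n_iter=30, seed=0):
    """Global search: each particle encodes k candidate centers (assumed coefficients)."""
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    pos = [[rng.uniform(lo, hi) for _ in range(k)] for _ in range(n_particles)]
    vel = [[0.0] * k for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [sse(data, p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (hypothetical values)
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(k):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            cost = sse(data, pos[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
    return gbest

def kmeans_refine(data, centers, max_iter=100):
    """Local refinement: standard assignment/update iterations from the PSO result."""
    centers = list(centers)
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for x in data:
            j = min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)
            groups[j].append(x)
        # An empty cluster keeps its previous center.
        new = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new == centers:
            break
        centers = new
    return centers

# Hypothetical one-dimensional "filter parameter" values forming two groups.
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
start = pso_centers(data, k=2)
final = kmeans_refine(data, start)
print(sse(data, start), sse(data, final))
```

The K-means stage never increases the clustering cost, so the PSO stage only needs to land in a good region of the search space rather than converge exactly.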
Flight plans between Chinese city pairs typically rely on a single route, lacking alternative paths and making it difficult to respond to emergencies. To address this issue, this study employs the “quantile-inflection point method” to analyze specific deviation trajectories, determine deviation thresholds, and identify commonly used deviation paths. By combining multiple similarity metrics, including Euclidean distance, Hausdorff distance, and sector edit distance, with the density-based spatial clustering of applications with noise (DBSCAN) algorithm, the study clusters deviation trajectories to construct a multi-option trajectory set for city pairs. A case study of 23578 flight trajectories between the Guangzhou airport cluster and the Shanghai airport cluster demonstrates the effectiveness of the proposed framework. Experimental results show that sector edit distance achieves superior clustering performance compared to Euclidean and Hausdorff distances, with higher silhouette coefficients and lower Davies-Bouldin indices, ensuring better intra-cluster compactness and inter-cluster separation. Based on the clustering results, 19 representative trajectory options are identified, covering both nominal and deviation paths, which significantly enhance route diversity and reflect actual flight practices. This provides a practical basis for optimizing flight paths and scheduling, and enhances the flexibility of route selection for flights between city pairs.
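Of the similarity metrics named in this abstract, the Hausdorff distance is the easiest to illustrate. The sketch below uses the standard textbook definition on two small point sequences. The trajectories and function names are hypothetical, and the paper's sector edit distance is not reproduced here because the abstract does not define it.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point of a to its nearest point of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sequences."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical trajectories: a straight nominal path and one with a lateral deviation.
nominal = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
deviated = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
print(hausdorff(nominal, deviated))  # 1.0
```

A pairwise matrix of such distances is the kind of input a density-based algorithm like DBSCAN can cluster without assuming a fixed number of groups.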
This paper focuses on the unsupervised detection of the Higgs boson particle using the most informative features and variables that characterize the “Higgs machine learning challenge 2014” data set. The analysis proceeds through four steps: (1) selection of the most informative features from the considered data; (2) definition of the number of clusters based on the elbow criterion, for which the experimental results showed that the optimal number of clusters grouping the considered data in an unsupervised manner is 2; (3) proposition of a new approach for the hybridization of both hard and fuzzy clustering tuned with Ant Lion Optimization (ALO); (4) comparison with existing metaheuristic optimizations such as the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). Through a multi-angle analysis based on the cluster validation indices, the confusion matrix, the efficiency and purity rates, the average cost variation, the computational time, and the Sammon mapping visualization, the results highlight the effectiveness of the improved Gustafson-Kessel algorithm optimized with ALO (ALOGK) and validate the proposed approach. Although the paper gives a complete clustering analysis, its novel contribution concerns only Steps (1) and (3) above. The first contribution lies in the method used in Step (1) to select the most informative features and variables. The t-statistic technique is used to rank them. Afterwards, a feature mapping is applied using a Self-Organizing Map (SOM) to identify the level of correlation between them. Then, PSO, a metaheuristic optimization technique, is used to reduce the dimension of the data set. The second contribution of this work concerns the third step, where each of the clustering algorithms, namely K-means (KM), Global K-means (GlobalKM), Partitioning Around Medoids (PAM), Fuzzy C-means (FCM), Gustafson-Kessel (GK), and Gath-Geva (GG), is optimized and tuned with ALO.
At present, the proportion of new energy in the power grid is increasing, and the random fluctuations in power output increase the risk of cascading failures in the power grid. In this paper, we propose a method for identifying high-risk scenarios of interlocking faults in new energy power grids based on a deep embedding clustering (DEC) algorithm and apply it in a risk assessment of cascading failures in different operating scenarios for new energy power grids. First, considering the real-time operation status and system structure of new energy power grids, the scenario cascading failure risk indicator is established. Based on this indicator, the risk of cascading failure is calculated for the scenario set, the scenarios are clustered based on the DEC algorithm, and the scenarios with the highest indicators are selected as the significant risk scenario set. The results of simulations with an example power grid show that our method can effectively identify scenarios with a high risk of cascading failures from a large number of scenarios.
Classification systems such as Slope Mass Rating (SMR) are currently used to undertake slope stability analysis. In the SMR classification system, data are allocated to classes based on linguistic and experience-based criteria. To eliminate linguistic criteria resulting from experience-based judgments and to account for uncertainties in determining the class boundaries developed by the SMR system, the classification results were corrected using two clustering algorithms, namely K-means and fuzzy c-means (FCM), for the ratings obtained via continuous and discrete functions. When the clustering algorithms were applied to the SMR classification system, no advance experience-based judgment was made on the number of extracted classes; only after all steps of the clustering algorithms were completed was a new classification scheme proposed for the SMR system under different failure modes, based on the ratings obtained via continuous and discrete functions. The results of this study showed that engineers can achieve more reliable and objective evaluations of slope stability by using the SMR system based on the ratings calculated via continuous and discrete functions.
Most clustering algorithms need to describe the similarity of objects by a predefined distance function. Three distance functions that are widely used in two traditional clustering algorithms, k-means and hierarchical clustering, were investigated. Both theoretical analysis and detailed experimental results were given. It is shown that the distance function greatly affects the clustering results, and that comparing the results obtained with different distance functions can be used to detect the outliers of a cluster and to obtain information on the shape of clusters. In practice, it is suggested to use different distance functions separately, compare the clustering results, and pick out the “swing points”. Such points may reveal more information to data analysts.
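The effect described here, where a point changes cluster when the distance function changes, can be reproduced in a few lines. The abstract does not name its three distance functions, so the Euclidean, Manhattan, and Chebyshev distances below are an assumed choice for illustration, and the centers and query point are made up.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def chebyshev(p, q):
    return max(abs(a - b) for a, b in zip(p, q))

def nearest(point, centers, dist):
    """Name of the center closest to point under the given distance function."""
    return min(centers, key=lambda name: dist(point, centers[name]))

# Hypothetical cluster centers and a point lying between them.
centers = {"A": (3.0, 3.0), "B": (0.0, 4.0)}
point = (0.0, 0.0)

for dist in (euclidean, manhattan, chebyshev):
    print(dist.__name__, nearest(point, centers, dist))
```

Here the point is assigned to center B under the Euclidean and Manhattan distances but to center A under the Chebyshev distance, which is the behaviour of a "swing point" in the sense of the abstract.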
Mobile commerce (m-commerce) contributes to increasing the popularity of electronic commerce (e-commerce), allowing anybody to sell or buy goods using a mobile device or tablet anywhere and at any time. As demand for e-commerce increases tremendously, the pressure on delivery companies to organise their transportation plans so as to achieve profits and customer satisfaction also increases. One important planning problem in this domain is the multi-vehicle profitable pickup and delivery problem (MVPPDP), where a selected set of pickup and delivery customers needs to be served within a certain allowed trip time. In this paper, we propose hybrid clustering algorithms with the greedy randomised adaptive search procedure (GRASP) to construct an initial solution for the MVPPDP. Our approaches first cluster the search space in order to reduce its dimensionality, then use GRASP to build routes for each cluster. We compare our results with state-of-the-art construction heuristics that have been used to construct initial solutions to this problem. Experimental results show that our proposed algorithms achieve excellent performance in terms of both solution quality and processing time.
Molecular dynamics (MD) simulation has become a powerful tool to investigate the structure-function relationship of proteins and other biological macromolecules at atomic resolution and biologically relevant timescales. MD simulations often produce massive datasets containing millions of snapshots describing proteins in motion. Therefore, clustering algorithms are in high demand to classify these MD snapshots and gain biological insights. There are two main categories of clustering algorithms, which group protein conformations into clusters based on the similarity of their shape (geometric clustering) or of their kinetics (kinetic clustering). In this paper, we review a series of frequently used clustering algorithms applied in MD simulations, including divisive algorithms, agglomerative algorithms (single-linkage, complete-linkage, average-linkage, centroid-linkage, and Ward-linkage), center-based algorithms (K-Means, K-Medoids, K-Centers, and APM), density-based algorithms (neighbor-based, DBSCAN, density-peaks, and Robust-DB), and spectral-based algorithms (PCCA and PCCA+). In particular, differences between geometric and kinetic clustering metrics are discussed along with the performance of different clustering algorithms. We note that there is no one-size-fits-all algorithm for the classification of MD datasets. For a specific application, the right choice of clustering algorithm should be based on the purpose of clustering and the intrinsic properties of the MD conformational ensembles. Therefore, a main focus of our review is to describe the merits and limitations of each clustering algorithm. We expect that this review will help guide researchers in choosing appropriate clustering algorithms for their own MD datasets.
Active semi-supervised fuzzy clustering integrates fuzzy clustering techniques with limited labeled data, guided by active learning, to enhance classification accuracy, particularly in complex and ambiguous datasets. Although several active semi-supervised fuzzy clustering methods have been developed previously, they typically face significant limitations, including high computational complexity, sensitivity to initial cluster centroids, and difficulties in accurately managing boundary clusters where data points often overlap among multiple clusters. This study introduces a novel Active Semi-Supervised Fuzzy Clustering algorithm specifically designed to identify, analyze, and correct misclassified boundary elements. By strategically utilizing labeled data through active learning, our method improves the robustness and precision of cluster boundary assignments. Extensive experimental evaluations conducted on three types of datasets (benchmark UCI datasets, synthetic data with controlled boundary overlap, and satellite imagery) demonstrate that our proposed approach achieves superior performance in terms of clustering accuracy and robustness compared to existing active semi-supervised fuzzy clustering methods. The results confirm the effectiveness and practicality of our method in handling real-world scenarios where precise cluster boundaries are critical.
Transient stability assessment (TSA) based on artificial intelligence typically uses one of two distinct model management approaches: a unified management approach for all faulted lines or a separate management approach for each faulted line. To address the shortcomings of these approaches in accuracy, training time, and model management complexity, a multi-model management approach for power system TSA based on multi-moment feature clustering is proposed. First, the steady-state and transient features present under fault conditions were obtained through a transient simulation of line faults. The input sample set was then constructed using these multi-moment electrical features and the embedded faulty line numbers. Subsequently, K-means clustering was conducted on the lines based on the similarity of their electrical features, employing t-SNE dimensionality reduction. The PSO-CNN model was trained separately for each cluster to generate several independent TSA models. Finally, a model effectiveness evaluation system consisting of five metrics was established, and the effect of the sample imbalance ratio on model effectiveness was investigated. The model effectiveness was evaluated using the IEEE 39-bus system. The results showed that the multi-model management strategy based on multi-moment feature clustering can effectively combine superior evaluation performance with streamlined model management by fully extracting system features. Moreover, this approach allows for more flexible adjustments to line topology changes.
The characterization and clustering of rock discontinuity sets are a crucial and challenging task in rock mechanics and geotechnical engineering. Over the past few decades, the clustering of discontinuity sets has undergone rapid and remarkable development. However, there is no relevant literature summarizing these achievements, and this paper attempts to elaborate on the current status and prospects in this field. Specifically, this review aims to discuss the development process of clustering methods for discontinuity sets and the state-of-the-art relevant algorithms. First, we introduce the importance of discontinuity clustering analysis and follow the comprehensive characterization approaches of discontinuity data. A bibliometric analysis is subsequently conducted to clarify the current status and development characteristics of the clustering of discontinuity sets. The methods for the clustering analysis of rock discontinuities are reviewed in terms of single- and multi-parameter clustering methods. Single-parameter methods can be classified into empirical judgment methods, dynamic clustering methods, relative static clustering methods, and static clustering methods, reflecting the continuous optimization and improvement of clustering algorithms. Moreover, this paper compares the current mainstream single-parameter clustering methods with multi-parameter clustering methods. It is emphasized that the current single-parameter clustering methods have reached their performance limits, with little room for improvement, and that there is a need to extend the study of multi-parameter clustering methods. Finally, several suggestions are offered for future research on the clustering of discontinuity sets.
Wireless Sensor Networks (WSNs), as a crucial component of the Internet of Things (IoT), are widely used in environmental monitoring, industrial control, and security surveillance. However, WSNs still face challenges such as inaccurate node clustering, low energy efficiency, and shortened network lifespan in practical deployments, which significantly limit their large-scale application. To address these issues, this paper proposes an Adaptive Chaotic Ant Colony Optimization algorithm (AC-ACO), aiming to optimize the energy utilization and system lifespan of WSNs. AC-ACO combines the path-planning capability of Ant Colony Optimization (ACO) with the dynamic characteristics of chaotic mapping and introduces an adaptive mechanism to enhance the algorithm’s flexibility and adaptability. By dynamically adjusting the pheromone evaporation factor and heuristic weights, efficient node clustering is achieved. Additionally, a chaotic mapping initialization strategy is employed to enhance population diversity and avoid premature convergence. To validate the algorithm’s performance, this paper compares AC-ACO with clustering methods such as Low-Energy Adaptive Clustering Hierarchy (LEACH), ACO, Particle Swarm Optimization (PSO), and Genetic Algorithm (GA). Simulation results demonstrate that AC-ACO outperforms the compared algorithms in key metrics such as energy consumption optimization, network lifetime extension, and communication delay reduction, providing an efficient solution for improving energy efficiency and ensuring long-term stable operation of wireless sensor networks.
Recognizing discontinuities within rock masses is a critical aspect of rock engineering. The development of remote sensing technologies has significantly enhanced the quality and quantity of the point clouds collected from rock outcrops. In response, we propose a workflow that balances accuracy and efficiency to extract discontinuities from massive point clouds. The proposed method employs voxel filtering to downsample point clouds, constructs a point cloud topology using K-d trees, utilizes principal component analysis to calculate the point cloud normals, and employs the pointwise clustering (PWC) algorithm to extract discontinuities from rock outcrop point clouds. This method provides information on the location and orientation (dip direction and dip angle) of the discontinuities, and the modified whale optimization algorithm (MWOA) is utilized to identify major discontinuity sets and their average orientations. Performance evaluations based on three real cases demonstrate that the proposed method significantly reduces computational time costs without sacrificing accuracy. In particular, the method yields more reasonable extraction results for discontinuities with certain undulations. The presented approach offers a novel tool for efficiently extracting discontinuities from large-scale point clouds.
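The first stage of this workflow, voxel filtering, can be sketched in a few lines. The version below is a generic voxel-grid downsampler that replaces all points falling in the same cubic cell by their centroid. It is an assumption about what "voxel filtering" means here rather than the authors' code, and the function name, voxel size, and sample points are hypothetical.

```python
import math
from collections import defaultdict

def voxel_downsample(points, voxel_size):
    """Replace all points that share a cubic voxel by their centroid."""
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel index along each axis.
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    return [
        tuple(sum(axis) / len(group) for axis in zip(*group))
        for group in buckets.values()
    ]

# Hypothetical point cloud: two points share a voxel, one lies in a neighbouring voxel.
cloud = [(0.25, 0.25, 0.25), (0.75, 0.75, 0.75), (1.5, 0.5, 0.5)]
print(voxel_downsample(cloud, voxel_size=1.0))
```

Larger voxel sizes reduce the point count, and hence the cost of the later K-d tree and normal-estimation steps, at the price of spatial detail.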
In order to mine production and security information from security supervising data and to ensure security and safety in production and decision-making, a clustering analysis algorithm for security supervising data based on a semantic description in coal mines is studied. First, the hybrid semantic and numerical description method of security supervising data in coal mines is described. Second, the similarity measurement methods for semantic and numerical data are given separately, and a weight-based hybrid similarity measurement method for security supervising data based on a semantic description in coal mines is presented. Third, taking the hybrid similarity measurement method as the distance criterion and using a grid methodology for reference, an improved grid-based CURE clustering algorithm is presented. Finally, simulation results on a security supervising data set from coal mines validate the efficiency of the algorithm.
Safe driving and operation are necessary conditions for ensuring the safe running of trains. Heavy-haul trains in particular are difficult to drive and operate. Considering the uncertainties in train driving and operation, this paper analyzes the relationship between the safety of trains hauled by heavy-haul electric locomotives and their driving and operation, and studies auxiliary intelligent control methods for safe driving. K-means is used to identify the characteristics of drivers' driving manipulation, a hidden Markov model adaptively adjusts the train driving and operation sequence, and auxiliary driving reconstruction is conducted for heavy-haul locomotive driving and operation. Based on the train running curve and the locomotive traction/braking characteristics, the method smoothly controls the application of the traction/braking force of heavy-haul locomotives, thereby optimizing the driving safety control of heavy-haul trains in the vehicle-environment-track system. Finally, train operation simulation and optimized driving verification are carried out by simulating some track sections. The results show that the proposed method can correct and pre-optimize driving operations, improving the smoothness of heavy-haul trains by approximately 10%. This verifies the effectiveness of the proposed train assisted driving control reconstruction method, which facilitates the smooth and safe operation of heavy-haul trains.
In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call the enhanced k-means algorithm. This algorithm is easy to implement, requiring a simple data structure to keep some information in each iteration to be used in the next iteration. Our experimental results demonstrated that our scheme can improve the computational speed of the k-means algorithm by the magnitude in the total number of distance calculations and the overall time of computation.
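The objective stated in this abstract, the mean squared distance from each point to its nearest center, and the baseline iteration that minimizes it can be written down directly. The sketch below is the plain (Lloyd-style) k-means loop, not the enhanced algorithm of the paper, whose per-iteration bookkeeping is not specified in the abstract. Function names and sample data are illustrative.

```python
import math

def mean_squared_distance(points, centers):
    """The k-means objective: mean squared distance to the nearest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points) / len(points)

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def update(points, labels, centers):
    """Move each center to the mean of its members (empty clusters stay put)."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            new_centers.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers

def kmeans(points, centers, max_iter=100):
    """Alternate assignment and update steps until the centers stop moving."""
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
initial = [(0.0, 0.0), (10.0, 0.0)]
centers, labels = kmeans(points, initial)
print(centers, labels, mean_squared_distance(points, centers))
```

Every iteration recomputes all n × k point-to-center distances, which is the cost that an approach reusing information from the previous iteration would aim to cut.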
A convective and stratiform cloud classification method for weather radar is proposed based on the density-based spatial clustering of applications with noise (DBSCAN) algorithm. To identify convective and stratiform clouds in different developmental phases, two-dimensional (2D) and three-dimensional (3D) models are proposed by applying reflectivity factors at the 0.5° elevation angle and at the 0.5°, 1.5°, and 2.4° elevation angles, respectively. According to the thresholds of the algorithm, which include echo intensity, the echo top height of 35 dBZ (ET), the density threshold, and the ε-neighborhood, cloud clusters can be marked as one of four types: deep-convective cloud (DCC), shallow-convective cloud (SCC), hybrid convective-stratiform cloud (HCS), and stratiform cloud (SFC). Each cloud cluster type is further divided into a core area and a boundary area, which provides more detailed cloud structure information. The algorithm is verified using the volume scan data observed with new-generation S-band weather radars in Nanjing, Xuzhou, and Qingdao. The results show that the improved DBSCAN algorithm can intuitively identify cloud clusters as core and boundary points, whose areas change continuously during convective evolution. Therefore, the occurrence and disappearance of convective weather can be estimated in advance by observing the changes in the classification. Because the density thresholds are different and multiple elevations are utilized in the 3D model, the identified echo types and areas differ between the 2D and 3D models. The 3D model identifies larger convective and stratiform clouds than the 2D model. However, developing convective clouds of small area at lower heights cannot be identified with the 3D model because they are covered by thick stratiform clouds. In addition, the 3D model can avoid the influence of the melting layer and better indicate convective clouds in the developmental stage.
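The core/boundary distinction used in this abstract comes from the standard DBSCAN definitions, which depend on a neighborhood radius ε and a density threshold. The sketch below applies only those textbook definitions to plain 2D points. It does not include the echo-intensity or echo-top thresholds of the improved algorithm, and the parameter names `eps` and `min_pts` and the sample points are assumptions.

```python
import math

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'boundary', or 'noise' using DBSCAN's definitions.

    A point is a core point if its eps-neighborhood (itself included) holds at
    least min_pts points; a boundary point is a non-core point lying within eps
    of some core point; everything else is noise.
    """
    neighbors = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, near in enumerate(neighbors) if len(near) >= min_pts}
    labels = []
    for i, near in enumerate(neighbors):
        if i in core:
            labels.append("core")
        elif any(j in core for j in near):
            labels.append("boundary")
        else:
            labels.append("noise")
    return labels

# Hypothetical echo locations: a short chain of points plus one isolated point.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
print(classify_points(points, eps=1.0, min_pts=3))
```

With these values the two interior points of the chain are core points, its two ends are boundary points, and the isolated point is noise.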
To improve the recognition rate of clustering-based signal modulation recognition methods at low SNR, a modulation recognition method is proposed. The characteristic parameters of the signal are extracted using a clustering algorithm, and the neural network is trained using the variable gradient correction (Polak-Ribiere) algorithm so as to speed up convergence, improve recognition performance at low SNR, and realize modulation recognition of signals whose modulation schemes are characterized by constellation diagrams. Simulation results show that, at low SNR, the recognition rate of this algorithm is more than 30% higher than that of methods that use a clustering algorithm alone or a neural network based on the back propagation algorithm alone. The recognition rate can reach 90% when the SNR is 4 dB, and the method is easy to implement, so it has broad application prospects in modulation recognition.
Water quality assessment of lakes is important for determining functional zones of water use. Considering the fuzziness in the partitioning process for lake water quality in an arid area, a multiplex model of fuzzy clustering with pattern recognition was developed by integrating the transitive closure method, the ISODATA algorithm in fuzzy clustering, and fuzzy pattern recognition. The model was applied to partition Ulansuhai Lake, a typical shallow lake in the arid climate zone in the western part of Inner Mongolia, China, and to grade the water quality of the resulting divisions. The results showed that the partition matched the real conditions of the lake well, and the method proved accurate in this application.
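The transitive closure method mentioned in this abstract is a standard fuzzy clustering step: a fuzzy similarity matrix is repeatedly composed with itself under max-min composition until it stops changing, and a threshold (λ-cut) then yields crisp classes. The sketch below shows that generic procedure only. The similarity values and function names are made up, and the paper's ISODATA and pattern recognition stages are not covered.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def transitive_closure(r):
    """Square the relation until it is stable (assumes r is reflexive and symmetric)."""
    while True:
        squared = max_min_compose(r, r)
        if squared == r:
            return r
        r = squared

def lambda_cut_classes(t, lam):
    """Group indices whose closure similarity is at least lam."""
    classes, seen = [], set()
    for i in range(len(t)):
        if i in seen:
            continue
        group = [j for j in range(len(t)) if t[i][j] >= lam]
        seen.update(group)
        classes.append(group)
    return classes

# Hypothetical similarity matrix for three sampling sites.
similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.6],
    [0.2, 0.6, 1.0],
]
closure = transitive_closure(similarity)
print(closure)
print(lambda_cut_classes(closure, 0.7))
print(lambda_cut_classes(closure, 0.6))
```

Lowering λ merges classes, so sweeping it from 1 downward produces a hierarchy of partitions from which a suitable zoning can be chosen.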
Funding: Supported by the Research Project of China Southern Power Grid (No. 056200KK52222031).
Funding: Supported in part by Boeing Company and Nanjing University of Aeronautics and Astronautics (NUAA) through the Research on Decision Support Technology of Air Traffic Operation Management in Convective Weather under Project 2022-GT-129, and in part by the Postgraduate Research and Practice Innovation Program of NUAA (No. xcxjh20240709).
Abstract: This paper focuses on the unsupervised detection of the Higgs boson particle using the most informative features and variables that characterize the "Higgs machine learning challenge 2014" data set. The analysis proceeds in four steps: (1) selection of the most informative features from the considered data; (2) definition of the number of clusters based on the elbow criterion, for which the experimental results showed that the optimal number of clusters grouping the considered data in an unsupervised manner is 2; (3) proposal of a new approach for hybridizing hard and fuzzy clustering tuned with Ant Lion Optimization (ALO); (4) comparison with existing metaheuristic optimizations such as the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). By employing a multi-angle analysis based on cluster validation indices, the confusion matrix, efficiency and purity rates, the average cost variation, the computational time, and the Sammon mapping visualization, the results highlight the effectiveness of the improved Gustafson-Kessel algorithm optimized with ALO (ALOGK) in validating the proposed approach. Although the paper gives a complete clustering analysis, its novel contribution concerns only Steps (1) and (3) above. The first contribution lies in the method used in Step (1) to select the most informative features and variables. The t-Statistic technique is used to rank them. A feature mapping is then applied using a Self-Organizing Map (SOM) to identify the level of correlation between them. Then, Particle Swarm Optimization (PSO), a metaheuristic optimization technique, is used to reduce the dimension of the data set. The second contribution of this work concerns the third step, where each of the clustering algorithms, namely K-means (KM), Global K-means (GlobalKM), Partitioning Around Medoids (PAM), Fuzzy C-means (FCM), Gustafson-Kessel (GK), and Gath-Geva (GG), is optimized and tuned with ALO.
Funding: Funded by the State Grid Limited Science and Technology Project of China, Grant Number SGSXDK00DJJS2200144.
Abstract: At present, the proportion of new energy in the power grid is increasing, and the random fluctuations in power output increase the risk of cascading failures in the power grid. In this paper, we propose a method for identifying high-risk scenarios of cascading failures in new energy power grids based on a deep embedding clustering (DEC) algorithm and apply it in a risk assessment of cascading failures in different operating scenarios for new energy power grids. First, considering the real-time operation status and system structure of new energy power grids, a scenario cascading failure risk indicator is established. Based on this indicator, the risk of cascading failure is calculated for the scenario set, the scenarios are clustered based on the DEC algorithm, and the scenarios with the highest indicators are selected as the significant risk scenario set. The results of simulations with an example power grid show that our method can effectively identify scenarios with a high risk of cascading failures from a large number of scenarios.
Abstract: Classification systems such as Slope Mass Rating (SMR) are currently used to undertake slope stability analysis. In the SMR classification system, data are allocated to classes based on linguistic and experience-based criteria. To eliminate linguistic criteria resulting from experience-based judgments and to account for uncertainties in determining the class boundaries developed by the SMR system, the classification results were corrected using two clustering algorithms, namely K-means and fuzzy c-means (FCM), for the ratings obtained via continuous and discrete functions. When the clustering algorithms were applied to the SMR classification system, no experience-based judgment was made in advance on the number of extracted classes; only after all steps of the clustering algorithms were completed was a new classification scheme proposed for the SMR system under different failure modes, based on the ratings obtained via continuous and discrete functions. The results of this study showed that engineers can achieve more reliable and objective evaluations of slope stability by using the SMR system based on the ratings calculated via continuous and discrete functions.
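The difference between the hard class boundaries of K-means and the graded boundaries of FCM can be seen from the standard FCM membership formula. The sketch below applies it to a single scalar rating; the class centers (40 and 60) and the fuzzifier `m = 2` are assumed values for illustration, not figures from the study.

```python
def fcm_memberships(x, centers, m=2.0):
    """Standard fuzzy c-means membership of a scalar rating x in each class."""
    d = [abs(x - c) for c in centers]
    if 0.0 in d:
        # A rating that coincides with a center belongs fully to that class.
        return [1.0 if di == 0.0 else 0.0 for di in d]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** exp for dk in d) for di in d]

def hard_class(x, centers):
    """K-means style assignment: index of the nearest center."""
    return min(range(len(centers)), key=lambda j: abs(x - centers[j]))

centers = [40.0, 60.0]                  # assumed class centers
print(fcm_memberships(45.0, centers))   # roughly [0.9, 0.1]
print(hard_class(45.0, centers))        # 0
```

A rating of 50, midway between the two assumed centers, receives a membership of 0.5 in each class, which is the kind of boundary uncertainty a hard assignment cannot express.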
Abstract: Most clustering algorithms need to describe the similarity of objects by a predefined distance function. Three distance functions that are widely used in two traditional clustering algorithms, k-means and hierarchical clustering, were investigated. Both theoretical analysis and detailed experimental results are given. It is shown that the distance function greatly affects clustering results, and that comparing the different results can be used to detect the outliers of a cluster and to obtain information about cluster shape. In practical situations, it is suggested to use the different distance functions separately, compare the clustering results, and pick out the "swing points", since such points may reveal more information to data analysts.
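A small example makes the idea of a "swing point" concrete. The abstract does not name its three distance functions, so Euclidean, Manhattan, and Chebyshev distances are assumed here, and the point and centers are made-up values. A point whose nearest center changes with the metric is a candidate swing point.

```python
def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def chebyshev(p, q):
    return max(abs(a - b) for a, b in zip(p, q))

def nearest(point, centers, dist):
    """Index of the center closest to point under the given distance."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))

def is_swing_point(point, centers, metrics=(euclidean, manhattan, chebyshev)):
    """True when the metrics disagree on the point's nearest center."""
    return len({nearest(point, centers, d) for d in metrics}) > 1

centers = [(3.0, 0.0), (2.0, 2.0)]       # assumed cluster centers
print(is_swing_point((0.0, 0.0), centers))  # True
```

Here the origin is assigned to the second center under the Euclidean and Chebyshev distances but to the first under the Manhattan distance.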
Funding: Deanship of Scientific Research, for funding and supporting this research through the initiative of DSR Graduate Students Research Support (GSR).
Abstract: Mobile commerce (m-commerce) contributes to the growing popularity of electronic commerce (e-commerce), allowing anybody to sell or buy goods using a mobile device or tablet anywhere and at any time. As demand for e-commerce increases tremendously, the pressure on delivery companies to organise their transportation plans to achieve profits and customer satisfaction also increases. One important planning problem in this domain is the multi-vehicle profitable pickup and delivery problem (MVPPDP), where a selected set of pickup and delivery customers needs to be served within a certain allowed trip time. In this paper, we propose hybrid clustering algorithms with the greedy randomised adaptive search procedure (GRASP) to construct an initial solution for the MVPPDP. Our approaches first cluster the search space in order to reduce its dimensionality, then use GRASP to build routes for each cluster. We compare our results with state-of-the-art construction heuristics that have been used to construct initial solutions to this problem. Experimental results show that our proposed algorithms achieve excellent performance in terms of both solution quality and processing time.
Funding: Supported by Shenzhen Science and Technology Innovation Committee (JCYJ20170413173837121); the Hong Kong Research Grant Council (HKUST C6009-15G, 14203915, 16302214, 16304215, 16318816, and AoE/P-705/16); King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) (OSR-2016-CRG5-3007); Guangzhou Science Technology and Innovation Commission (201704030116); and Innovation and Technology Commission (ITCPD/17-9 and ITC-CNERC14SC01).
Abstract: Molecular dynamics (MD) simulation has become a powerful tool to investigate the structure-function relationship of proteins and other biological macromolecules at atomic resolution and biologically relevant timescales. MD simulations often produce massive datasets containing millions of snapshots describing proteins in motion. Therefore, clustering algorithms have been in high demand to be developed and applied to classify these MD snapshots and gain biological insights. There mainly exist two categories of clustering algorithms that aim to group protein conformations into clusters based on the similarity of their shape (geometric clustering) and kinetics (kinetic clustering). In this paper, we review a series of frequently used clustering algorithms applied in MD simulations, including divisive algorithms, agglomerative algorithms (single-linkage, complete-linkage, average-linkage, centroid-linkage and Ward-linkage), center-based algorithms (K-Means, K-Medoids, K-Centers, and APM), density-based algorithms (neighbor-based, DBSCAN, density-peaks, and Robust-DB), and spectral-based algorithms (PCCA and PCCA+). In particular, differences between geometric and kinetic clustering metrics will be discussed along with the performances of different clustering algorithms. We note that there does not exist a one-size-fits-all algorithm in the classification of MD datasets. For a specific application, the right choice of clustering algorithm should be based on the purpose of clustering and the intrinsic properties of the MD conformational ensembles. Therefore, a main focus of our review is to describe the merits and limitations of each clustering algorithm. We expect that this review would be helpful to guide researchers to choose appropriate clustering algorithms for their own MD datasets.
Abstract: Active semi-supervised fuzzy clustering integrates fuzzy clustering techniques with limited labeled data, guided by active learning, to enhance classification accuracy, particularly in complex and ambiguous datasets. Although several active semi-supervised fuzzy clustering methods have been developed previously, they typically face significant limitations, including high computational complexity, sensitivity to initial cluster centroids, and difficulties in accurately managing boundary clusters where data points often overlap among multiple clusters. This study introduces a novel Active Semi-Supervised Fuzzy Clustering algorithm specifically designed to identify, analyze, and correct misclassified boundary elements. By strategically utilizing labeled data through active learning, our method improves the robustness and precision of cluster boundary assignments. Extensive experimental evaluations conducted on three types of datasets (benchmark UCI datasets, synthetic data with controlled boundary overlap, and satellite imagery) demonstrate that our proposed approach achieves superior performance in terms of clustering accuracy and robustness compared to existing active semi-supervised fuzzy clustering methods. The results confirm the effectiveness and practicality of our method in handling real-world scenarios where precise cluster boundaries are critical.
Funding: Supported by the Science and Technology Project of SGCC (5100-202199558A-0-5-ZN).
Abstract: Transient stability assessment (TSA) based on artificial intelligence typically has two distinct model management approaches: a unified management approach for all faulted lines and a separate management approach for each faulted line. To address the shortcomings of these approaches in accuracy, training time, and model management complexity, a multi-model management approach for power system TSA based on multi-moment feature clustering is proposed. First, the steady-state and transient features present under fault conditions were obtained through a transient simulation of line faults. The input sample set was then constructed using these multi-moment electrical features and the embedded faulty line numbers. Subsequently, the lines were grouped by K-means clustering according to the similarity of their electrical features, with t-SNE used for dimensionality reduction. The PSO-CNN model was trained separately for each cluster to generate several independent TSA models. Finally, a model effectiveness evaluation system consisting of five metrics was established, and the effect of the sample imbalance ratio on model effectiveness was investigated. The model effectiveness was evaluated using the IEEE 39-bus system. The results showed that the multi-model management strategy based on multi-moment feature clustering can effectively combine superior evaluation performance with streamlined model management by fully extracting system features. Moreover, this approach allows for more flexible adjustments to line topology changes.
Funding: Funding support from the National Natural Science Foundation of China (Grant No. 42007269); the Young Talent Fund of Xi'an Association for Science and Technology (Grant No. 959202313094); and the Fundamental Research Funds for the Central Universities, CHD (Grant No. 300102263401).
Abstract: The characterization and clustering of rock discontinuity sets is a crucial and challenging task in rock mechanics and geotechnical engineering. Over the past few decades, the clustering of discontinuity sets has undergone rapid and remarkable development. However, there is no relevant literature summarizing these achievements, and this paper attempts to elaborate on the current status and prospects in this field. Specifically, this review aims to discuss the development process of clustering methods for discontinuity sets and the state-of-the-art relevant algorithms. First, we introduce the importance of discontinuity clustering analysis, followed by the comprehensive characterization approaches for discontinuity data. A bibliometric analysis is subsequently conducted to clarify the current status and development characteristics of the clustering of discontinuity sets. The methods for the clustering analysis of rock discontinuities are reviewed in terms of single- and multi-parameter clustering methods. Single-parameter methods can be classified into empirical judgment methods, dynamic clustering methods, relative static clustering methods, and static clustering methods, reflecting the continuous optimization and improvement of clustering algorithms. Moreover, this paper compares the current mainstream single-parameter clustering methods with multi-parameter clustering methods. It is emphasized that the current single-parameter clustering methods have reached their performance limits, with little room for improvement, and that there is a need to extend the study of multi-parameter clustering methods. Finally, several suggestions are offered for future research on the clustering of discontinuity sets.
Funding: Funded by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (No. 22D01B148); Bidding Topics for the Center for Integration of Education and Production and Development of New Business in 2024 (No. 2024-KYJD05); Basic Scientific Research Business Fee Project of Colleges and Universities in Autonomous Region (No. XJEDU2025P126); and Xinjiang College of Science & Technology School-level Scientific Research Fund Project (No. 2024-KYTD01).
Abstract: Wireless Sensor Networks (WSNs), as a crucial component of the Internet of Things (IoT), are widely used in environmental monitoring, industrial control, and security surveillance. However, WSNs still face challenges such as inaccurate node clustering, low energy efficiency, and shortened network lifespan in practical deployments, which significantly limit their large-scale application. To address these issues, this paper proposes an Adaptive Chaotic Ant Colony Optimization algorithm (AC-ACO), aiming to optimize the energy utilization and system lifespan of WSNs. AC-ACO combines the path-planning capability of Ant Colony Optimization (ACO) with the dynamic characteristics of chaotic mapping and introduces an adaptive mechanism to enhance the algorithm’s flexibility and adaptability. By dynamically adjusting the pheromone evaporation factor and heuristic weights, efficient node clustering is achieved. Additionally, a chaotic mapping initialization strategy is employed to enhance population diversity and avoid premature convergence. To validate the algorithm’s performance, this paper compares AC-ACO with clustering methods such as Low-Energy Adaptive Clustering Hierarchy (LEACH), ACO, Particle Swarm Optimization (PSO), and Genetic Algorithm (GA). Simulation results demonstrate that AC-ACO outperforms the compared algorithms in key metrics such as energy consumption optimization, network lifetime extension, and communication delay reduction, providing an efficient solution for improving energy efficiency and ensuring long-term stable operation of wireless sensor networks.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 42407232) and the Sichuan Science and Technology Program (Grant No. 2024NSFSC0826).
Abstract: Recognizing discontinuities within rock masses is a critical aspect of rock engineering. The development of remote sensing technologies has significantly enhanced the quality and quantity of the point clouds collected from rock outcrops. In response, we propose a workflow that balances accuracy and efficiency to extract discontinuities from massive point clouds. The proposed method employs voxel filtering to downsample point clouds, constructs a point cloud topology using K-d trees, utilizes principal component analysis to calculate the point cloud normals, and employs the pointwise clustering (PWC) algorithm to extract discontinuities from rock outcrop point clouds. This method provides information on the location and orientation (dip direction and dip angle) of the discontinuities, and the modified whale optimization algorithm (MWOA) is utilized to identify major discontinuity sets and their average orientations. Performance evaluations based on three real cases demonstrate that the proposed method significantly reduces computational time costs without sacrificing accuracy. In particular, the method yields more reasonable extraction results for discontinuities with certain undulations. The presented approach offers a novel tool for efficiently extracting discontinuities from large-scale point clouds.
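The first stage of the workflow in this abstract, voxel filtering, can be sketched in a few lines. The version below is a common centroid-per-voxel variant and is an assumption: the abstract does not say which representative point the authors keep, and the function name and voxel size are illustrative. The later stages (K-d tree topology, PCA normals, PWC, MWOA) are not covered.

```python
import math
from collections import defaultdict

def voxel_downsample(points, voxel):
    """Replace all points falling in the same cubic voxel by their centroid."""
    cells = defaultdict(list)
    for p in points:
        # Integer voxel index along each axis.
        key = tuple(math.floor(c / voxel) for c in p)
        cells[key].append(p)
    return [tuple(sum(axis) / len(ps) for axis in zip(*ps))
            for ps in cells.values()]

# Illustrative points (assumed data): two fall in one voxel, one in another.
cloud = [(0.25, 0.25, 0.25), (0.75, 0.75, 0.75), (1.5, 0.5, 0.5)]
print(voxel_downsample(cloud, voxel=1.0))
# [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5)]
```

A larger voxel size removes more points and speeds up the later stages, at the cost of smoothing small surface undulations.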
Funding: The National Natural Science Foundation of China (No. 50674086); Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060290508); and the Postdoctoral Scientific Program of Jiangsu Province (No. 0701045B).
Abstract: In order to mine production and security information from security supervising data and to ensure the security and safety of production and decision-making, a clustering analysis algorithm for security supervising data based on a semantic description in coal mines is studied. First, a hybrid semantic and numerical description method for security supervising data in coal mines is described. Second, similarity measurement methods for semantic and numerical data are given separately, and a weight-based hybrid similarity measurement method for security supervising data based on a semantic description in coal mines is presented. Third, taking the hybrid similarity measurement method as the distance criterion and drawing on a grid methodology, an improved grid-based CURE clustering algorithm is presented. Finally, simulation results on a security supervising data set from coal mines validate the efficiency of the algorithm.
Funding: Project (U2034211) supported by the National Natural Science Foundation of China; Project (20232ACE01013) supported by the Major Scientific and Technological Research and Development Special Project of Jiangxi Province, China.
Abstract: Safe driving and handling by the driver is a necessary condition for the safe operation of trains. Heavy-haul trains in particular are difficult to drive and operate. Considering the uncertainties in train driving and operation, this paper analyzes the relationship between driving operations and the safety of trains hauled by heavy-haul electric locomotives, and studies control methods for intelligent driving assistance and safe operation. K-means is used to identify the characteristics of drivers' driving manipulation, a hidden Markov model adaptively adjusts the train driving and operation sequence, and an assisted-driving reconstruction is carried out for heavy-haul locomotive driving and operation. Based on the train running curve and the locomotive traction/braking characteristics, the method smoothly controls the application of the traction/braking force of heavy-haul locomotives, thereby optimizing the driving safety control of heavy-haul trains in the vehicle-environment-track system. Finally, train operation simulation and optimized driving verification are carried out by simulating selected track sections. The results show that the proposed method can correct and pre-optimize driving operations, improving the smoothness of heavy-haul trains by approximately 10%. This verifies the effectiveness of the proposed train assisted-driving control reconstruction method, which facilitates the smooth and safe operation of heavy-haul trains.
Abstract: In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call the enhanced k-means algorithm. This algorithm is easy to implement, requiring only a simple data structure that keeps some information from each iteration for use in the next iteration. Our experimental results demonstrate that our scheme can improve the computational speed of the k-means algorithm by an order of magnitude in the total number of distance calculations and the overall computation time.
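The abstract does not say what information is carried between iterations, so the following sketch rests on an assumption: each point remembers its cluster label and its distance to that cluster's center. If, after the centers move, the point is no farther from its own center than before, it keeps its label and the distances to the other centers are skipped. This is a heuristic shortcut rather than an exact reproduction of standard k-means, and all names are hypothetical.

```python
import math

def enhanced_kmeans(points, centers, iters=10):
    """K-means that reuses each point's previous label and distance.

    Returns (labels, centers, number of distance computations).
    """
    centers = [tuple(c) for c in centers]
    labels = [None] * len(points)
    dists = [math.inf] * len(points)
    n_dist = 0
    for _ in range(iters):
        for i, p in enumerate(points):
            if labels[i] is not None:
                d = math.dist(p, centers[labels[i]])
                n_dist += 1
                if d <= dists[i]:
                    # Not farther from its own center: keep the label.
                    dists[i] = d
                    continue
            # Otherwise fall back to a full search over all centers.
            all_d = [math.dist(p, c) for c in centers]
            n_dist += len(centers)
            labels[i] = min(range(len(centers)), key=all_d.__getitem__)
            dists[i] = all_d[labels[i]]
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers, n_dist
```

On well-separated data most points take the one-distance shortcut after the first few iterations, which is where the saving over the `iters * n * k` distance computations of the plain algorithm comes from.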
Funding: Supported by National Natural Science Foundation of China (61304256); Zhejiang Provincial Natural Science Foundation of China (LQ13F030013); Project of the Education Department of Zhejiang Province (Y201327006); Young Researchers Foundation of Zhejiang Provincial Top Key Academic Discipline of Mechanical Engineering and Zhejiang Sci-Tech University Key Laboratory (ZSTUME01B15); New Century 151 Talent Project of Zhejiang Province; 521 Talent Project of Zhejiang Sci-Tech University; and Young and Middle-aged Talents Foundation of Zhejiang Provincial Top Key Academic Discipline of Mechanical Engineering.
Funding: Funded by the Key-Area Research and Development Program of Guangdong Province (Grant No. 2020B1111200001); the Key project of monitoring, early warning and prevention of major natural disasters of China (Grant No. 2019YFC1510304); the S&T Program of Hebei (Grant No. 19275408D); and the Scientific Research Projects of Weather Modification in Northwest China (Grant No. RYSY201905).
Abstract: A convective and stratiform cloud classification method for weather radar is proposed based on the density-based spatial clustering of applications with noise (DBSCAN) algorithm. To identify convective and stratiform clouds in different developmental phases, two-dimensional (2D) and three-dimensional (3D) models are proposed by applying reflectivity factors at the 0.5° elevation angle and at the 0.5°, 1.5°, and 2.4° elevation angles, respectively. According to the thresholds of the algorithm, which include echo intensity, the echo top height of 35 dBZ (ET), the density threshold, and the ε-neighborhood, cloud clusters can be marked as one of four types: deep-convective cloud (DCC), shallow-convective cloud (SCC), hybrid convective-stratiform cloud (HCS), and stratiform cloud (SFC). Each cloud cluster type is further divided into a core area and a boundary area, which provides more detailed cloud structure information. The algorithm is verified using volume scan data observed with new-generation S-band weather radars in Nanjing, Xuzhou, and Qingdao. The results show that the improved DBSCAN algorithm can intuitively identify cloud clusters as core and boundary points, whose areas change continuously during convective evolution. Therefore, the occurrence and disappearance of convective weather can be estimated in advance by observing the changes in the classification. Because the density thresholds differ and multiple elevations are used in the 3D model, the identified echo types and areas differ between the 2D and 3D models. The 3D model identifies larger convective and stratiform clouds than the 2D model. However, developing convective clouds of small area at lower heights cannot be identified with the 3D model because they are covered by thick stratiform clouds. In addition, the 3D model can avoid the influence of the melting layer and better indicate convective clouds in the developmental stage.
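The distinction between core and boundary points that this abstract relies on comes directly from the standard DBSCAN definitions, which the sketch below implements in plain Python. It is the textbook algorithm, not the improved variant of the paper: the radar-specific thresholds (echo intensity, echo top height) are not modeled, and `eps` and `min_pts` are illustrative values. A point is a core point if its ε-neighborhood (including itself) holds at least `min_pts` points; a boundary point lies within ε of a core point without being one.

```python
import math

def dbscan(points, eps, min_pts):
    """Return (cluster labels, point kinds); label -1 marks noise."""
    n = len(points)
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    core = [len(nb) >= min_pts for nb in neighbors]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if not core[i] or labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]                      # only core points are expanded
        while stack:
            cur = stack.pop()
            for j in neighbors[cur]:
                if labels[j] == -1:
                    labels[j] = cluster
                    if core[j]:
                        stack.append(j)
        cluster += 1
    kinds = ["core" if core[i] else ("border" if labels[i] != -1 else "noise")
             for i in range(n)]
    return labels, kinds

# Illustrative grid cells (assumed data): a short line of echoes and one outlier.
cells = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 10)]
print(dbscan(cells, eps=1.0, min_pts=3))
# ([0, 0, 0, 0, -1], ['border', 'core', 'core', 'border', 'noise'])
```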
Funding: Supported by the National Natural Science Foundation of China (61072070, 61301179) and the National Science and Technology Major Project (2010ZX03006-002-04).
Abstract: To improve the recognition rate of clustering-based signal modulation recognition methods at low SNR, a new modulation recognition method is proposed. The characteristic parameters of the signal are extracted using a clustering algorithm, and the neural network is trained using the variable gradient correction (Polak-Ribiere) algorithm to speed up convergence, improve recognition performance at low SNR, and realize modulation recognition of signals whose modulation schemes are based on the constellation diagram. Simulation results show that, at low SNR, the recognition rate of this algorithm is more than 30% higher than that of methods that use the clustering algorithm alone or a neural network trained with the back propagation algorithm alone. The recognition rate can reach 90% when the SNR is 4 dB, and the method is easy to implement, so it has broad application prospects in modulation recognition.
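For reference, the Polak-Ribiere update named in this abstract is the standard nonlinear conjugate gradient rule shown below. The notation is an assumption, since the abstract gives no formulas: $w_k$ denotes the network weights at iteration $k$, $g_k$ the gradient of the training error at $w_k$, $d_k$ the search direction, and $\alpha_k$ a step size found by line search.

```latex
\begin{aligned}
d_0 &= -g_0, \\
w_{k+1} &= w_k + \alpha_k d_k, \\
\beta_k &= \frac{g_{k+1}^{\top}\,(g_{k+1} - g_k)}{g_k^{\top} g_k}, \\
d_{k+1} &= -g_{k+1} + \beta_k d_k .
\end{aligned}
```

Compared with plain back propagation, which always steps along $-g_k$, the $\beta_k d_k$ term reuses the previous direction, which is the usual explanation for the faster convergence the abstract reports.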
Funding: Supported by the National Natural Science Foundation of China (No. 50269001, 50569002, 50669004); the Natural Science Foundation of Inner Mongolia (No. 200208020512, 200711020604); and the Key Scientific and Technologic Project of the 10th Five-Year Plan of Inner Mongolia (No. 20010103).
Abstract: Water quality assessment of lakes is important for determining functional zones of water use. Considering the fuzziness in partitioning lake water quality in an arid area, a multiplex model of fuzzy clustering with pattern recognition was developed by integrating the transitive closure method, the ISODATA algorithm in fuzzy clustering, and fuzzy pattern recognition. The model was applied to partition Ulansuhai Lake, a typical shallow lake in the arid climate zone of western Inner Mongolia, China, and to grade the water quality of the resulting divisions. The results showed that the partition matched the real conditions of the lake well, and the method proved accurate in this application.
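The transitive closure method mentioned in this abstract can be sketched as follows. The sketch assumes a reflexive, symmetric fuzzy similarity matrix between sampling sites and uses the usual max-min composition, squaring the matrix until it stops changing; a λ-cut of the closure then yields a partition. The matrix values and function names are made up for illustration, and the ISODATA and pattern recognition stages of the model are not shown.

```python
def maxmin_compose(r, s):
    """Max-min composition of two square fuzzy relations."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]

def transitive_closure(r):
    """Square a reflexive fuzzy relation until it becomes transitive."""
    while True:
        r2 = maxmin_compose(r, r)
        if r2 == r:
            return r
        r = r2

def lambda_cut(closure, lam):
    """Group indices whose closure similarity is at least lam."""
    groups = []
    for i in range(len(closure)):
        for g in groups:
            if closure[i][g[0]] >= lam:
                g.append(i)
                break
        else:
            groups.append([i])
    return groups

# Assumed similarity between three sampling sites.
sim = [[1.0, 0.8, 0.3],
       [0.8, 1.0, 0.6],
       [0.3, 0.6, 1.0]]
t = transitive_closure(sim)
print(lambda_cut(t, 0.7))  # [[0, 1], [2]]
print(lambda_cut(t, 0.6))  # [[0, 1, 2]]
```

Lowering λ merges zones, so the choice of λ controls how many water quality divisions the partition produces.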