This paper proposes an equivalent modeling method for photovoltaic (PV) power stations via a particle swarm optimization (PSO) K-means clustering (KMC) algorithm with passive filter parameter clustering to address the complexity, simulation time cost, and convergence problems of detailed PV power station models. First, the amplitude–frequency curves of different filter parameters are analyzed. Based on the results, a grouping parameter set for characterizing the external filter characteristics is established. These parameters are further defined as clustering parameters. A single PV inverter model is then established as a foundation. The proposed equivalent method combines the global search capability of PSO with the rapid convergence of KMC, effectively overcoming the tendency of KMC to become trapped in local optima. This approach enhances both clustering accuracy and numerical stability when determining equivalence for PV inverter units. Using the proposed clustering method, both a detailed PV power station model and an equivalent model are developed and compared. Simulation and hardware-in-the-loop (HIL) results based on the equivalent model verify that the equivalent method accurately represents the dynamic characteristics of PV power stations and adapts well to different operating conditions. The proposed equivalent modeling method provides an effective analysis tool for future renewable energy integration research.
The Intrusion Detection System (IDS) is a security mechanism developed to observe network traffic and recognize suspicious or malicious activities. Clustering algorithms are often incorporated into IDS; however, conventional clustering-based methods face notable drawbacks, including poor scalability in handling high-dimensional datasets and a strong dependence of outcomes on initial conditions. To overcome the performance limitations of existing methods, this study proposes a novel quantum-inspired clustering algorithm that relies on a similarity coefficient-based quantum genetic algorithm (SC-QGA) and an improved quantum artificial bee colony algorithm hybrid K-means (IQABC-K). First, the SC-QGA algorithm is constructed based on quantum computing and integrates similarity coefficient theory to strengthen genetic diversity and feature extraction capabilities. For the subsequent clustering phase, the IQABC-K algorithm is enhanced with adaptive rotation gate and movement exploitation strategies to balance the exploration capability of global search and the exploitation capability of local search. In addition, a global optimum bootstrap strategy and a linear population reduction strategy accelerate convergence toward the global optimum and reduce computational complexity. In an experimental evaluation against multiple algorithms and with diverse performance metrics, the proposed algorithm achieves accuracies of 98.57%, 98.81%, and 98.32% on three datasets, KDD CUP99, NSL_KDD, and UNSW_NB15, respectively. These results affirm its potential as an effective solution for practical clustering applications.
AIM: To evaluate long-term visual field (VF) prediction using K-means clustering in patients with primary open angle glaucoma (POAG). METHODS: Patients who had undergone at least 10 24-2 VF tests were included in this study. Using 52 total deviation values (TDVs) from the first 10 VF tests of the training dataset, VF points were clustered into several regions using the hierarchical ordered partitioning and collapsing hybrid (HOPACH) and K-means clustering. Based on the clustering results, a linear regression analysis was applied to each clustered region of the testing dataset to predict the TDVs of the 10th VF test. Three to nine VF tests were used to predict the 10th VF test, and the prediction errors (root mean square error, RMSE) of each clustering method and pointwise linear regression (PLR) were compared. RESULTS: The training group consisted of 228 patients (mean age, 54.20±14.38 y; 123 males and 105 females), and the testing group included 81 patients (mean age, 54.88±15.22 y; 43 males and 38 females). All subjects were diagnosed with POAG. Fifty-two VF points were clustered into 11 and nine regions using HOPACH and K-means clustering, respectively. K-means clustering had a lower prediction error than PLR when n=1:3 and 1:4 (both P≤0.003). The prediction errors of K-means clustering were lower than those of HOPACH in all sections (n=1:4 to 1:9; all P≤0.011), except for n=1:3 (P=0.680). PLR outperformed K-means clustering only when n=1:8 and 1:9 (both P≤0.020). CONCLUSION: K-means clustering can predict long-term VF test results more accurately in patients with POAG with limited VF data.
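The region-wise regression step in the methods above can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's implementation: it assumes that each region's mean TDV is regressed on the test index and that the fitted line is extrapolated to the 10th test. The names `fit_line`, `predict_regions`, and `rmse`, the region assignment, and the toy values are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_regions(series, regions, n_used, target=10):
    """Predict each point's value at test `target` from its region's trend.

    series  : per-point lists of values, one value per test (index 0 = test 1)
    regions : lists of point indices (hypothetical clustering output)
    n_used  : number of initial tests used for the fit (assumed rule)
    """
    xs = list(range(1, n_used + 1))
    predictions = [None] * len(series)
    for members in regions:
        # Average the member points test by test, then fit one line per region.
        region_mean = [sum(series[p][t] for p in members) / len(members)
                       for t in range(n_used)]
        slope, intercept = fit_line(xs, region_mean)
        for p in members:
            predictions[p] = slope * target + intercept
    return predictions


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    return (sum((p - o) ** 2 for p, o in zip(predicted, observed))
            / len(observed)) ** 0.5


if __name__ == "__main__":
    # Toy data: four points in two regions, each declining linearly over 10 tests.
    toy = [[-1.0 * t for t in range(1, 11)] for _ in range(2)]
    toy += [[-0.5 * t for t in range(1, 11)] for _ in range(2)]
    predicted = predict_regions(toy, [[0, 1], [2, 3]], n_used=5)
    print(predicted, rmse(predicted, [s[9] for s in toy]))
```

Pooling points within a region averages out pointwise noise, which is one plausible reason a clustered fit could do better than PLR when only a few tests are available.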
Various factors, including weak tie-lines into the electric power system (EPS) networks, can lead to low-frequency oscillations (LFOs), which are considered not an immediate threat but a slow-acting and damaging one. To address this challenge, this article proposes a clustering-based machine learning (ML) framework to enhance the stability of EPS networks by suppressing LFOs through real-time tuning of key power system stabilizer (PSS) parameters. To validate the proposed strategy, two distinct EPS networks are selected: the single-machine infinite-bus (SMIB) with a single-stage PSS and the unified power flow controller (UPFC) coordinated SMIB with a double-stage PSS. To generate data under various loading conditions for both networks, an efficient but offline meta-heuristic algorithm, namely the grey wolf optimizer (GWO), is used, with the loading conditions as inputs and the key PSS parameters as outputs. The generated loading conditions are then clustered using the fuzzy k-means (FKM) clustering method. Finally, the group method of data handling (GMDH) and long short-term memory (LSTM) ML models are developed for the clustered data to predict key PSS parameters in real time for any loading condition. Several well-known statistical performance indices (SPI) are used to validate the training and testing procedures of the developed FKM-GMDH and FKM-LSTM models and to assess their robustness in predicting PSS parameters. The performance of the ML models is also evaluated using three stability indices (i.e., minimum damping ratio, eigenvalues, and time-domain simulations) after the PSS is optimally tuned with real-time estimated parameters under changing operating conditions. In addition, the outputs of the offline (GWO-based) metaheuristic model, the proposed real-time (FKM-GMDH and FKM-LSTM) machine learning models, and previously reported models from the literature are compared. According to the results, the proposed methodology outperforms the others in enhancing the stability of the selected EPS networks by damping out the observed unwanted LFOs under various loading conditions.
To address the need to know the number of clusters of a data set in advance and the sensitivity to the choice of initial cluster centers in the k-means algorithm, an improved k-means clustering algorithm is proposed. First, the concept of a silhouette coefficient is introduced, and the optimal clustering number Kopt of a data set with unknown class information is determined by calculating the silhouette coefficient of objects in clusters under different K values. Then the distribution of the data set is obtained through hierarchical clustering and the initial cluster centers are determined. Finally, the clustering is completed by traditional k-means clustering. Theoretical analysis shows that the improved k-means clustering algorithm has reasonable computational complexity. Experimental results on the IRIS test data set show that the algorithm can distinguish different clusters reasonably and recognize outliers efficiently, and that the entropy generated by the algorithm is lower.
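The idea of choosing Kopt by the silhouette coefficient can be illustrated with a small, dependency-free sketch. It is not the paper's algorithm: the abstract seeds k-means from hierarchical clustering, whereas this sketch swaps in a simple deterministic farthest-point seeding to stay short. The function names and the toy data are hypothetical.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the point
    farthest from the centers chosen so far (a stand-in for hierarchical seeding)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, centers, iters=50):
    """Plain Lloyd iterations from the given initial centers; returns labels."""
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            new_centers.append(tuple(sum(v) / len(members) for v in zip(*members))
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def mean_silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, k_values):
    """Return the K with the highest mean silhouette, plus all scores."""
    scores = {k: mean_silhouette(points, kmeans(points, farthest_point_init(points, k)))
              for k in k_values}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    blobs = [(x + dx, y + dy) for x, y in [(0, 0), (10, 10), (20, 0)]
             for dx in (0, 1) for dy in (0, 1)]
    print(choose_k(blobs, [2, 3, 4, 5]))
```

Because the seeding is deterministic, repeated runs give the same labels, which mirrors the abstract's goal of removing the dependence on randomly chosen initial centers.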
To address the issue of abnormal energy consumption fluctuations in the converter steelmaking process, an integrated diagnostic method combining the gray wolf optimization (GWO) algorithm, support vector machine (SVM), and K-means clustering was proposed. Eight input parameters, derived from molten iron conditions and external factors, were selected as feature variables. A GWO-SVM model was developed to accurately predict the energy consumption of individual heats. Based on the prediction results, the mean absolute percentage error and maximum relative error of the test set were employed as criteria to identify heats with abnormal energy usage. For these heats, the K-means clustering algorithm was used to determine benchmark values of influencing factors from similar steel grades, enabling root-cause diagnosis of excessive energy consumption. The proposed method was applied to real production data from a converter in a steel plant. The analysis reveals that heat sample No. 44 exhibits abnormal energy consumption, due to gas recovery being 1430.28 kg of standard coal below the benchmark level. A secondary contributing factor is a steam recovery shortfall of 237.99 kg of standard coal. This integrated approach offers a scientifically grounded tool for energy management in converter operations and provides valuable guidance for optimizing process parameters and enhancing energy efficiency.
In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call the enhanced k-means algorithm. This algorithm is easy to implement, requiring only a simple data structure that keeps some information from each iteration for use in the next iteration. Our experimental results demonstrate that our scheme can improve the computational speed of the k-means algorithm, both in the total number of distance calculations and in the overall computation time.
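The abstract does not say which information is carried between iterations. One plausible reading, sketched below purely as an assumption, is to cache each point's current label and its distance to that center: if the point is no farther from its (moved) center than before, it keeps its label without a full scan over all k centers. The function name `enhanced_kmeans`, the caching rule, and the toy data are hypothetical, and the shortcut is a heuristic that may occasionally keep a point in a cluster that is no longer its nearest.

```python
import math


def enhanced_kmeans(points, centers, iters=100):
    """Lloyd-style k-means that caches each point's label and distance.

    Returns (labels, centers, distance_evaluations).
    """
    k = len(centers)
    labels = [-1] * len(points)
    cached = [math.inf] * len(points)
    evaluations = 0
    for _ in range(iters):
        for i, p in enumerate(points):
            if labels[i] >= 0:
                # One evaluation against the point's current (moved) center.
                d = math.dist(p, centers[labels[i]])
                evaluations += 1
                if d <= cached[i]:
                    cached[i] = d  # no farther than before: keep the label
                    continue
            # Otherwise fall back to a full scan over all k centers.
            dists = [math.dist(p, c) for c in centers]
            evaluations += k
            best = min(range(k), key=dists.__getitem__)
            labels[i], cached[i] = best, dists[best]
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            new_centers.append(tuple(sum(v) / len(members) for v in zip(*members))
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers, evaluations


if __name__ == "__main__":
    blobs = [(x + dx, y + dy) for x, y in [(0, 0), (10, 10), (20, 0)]
             for dx in (0, 1) for dy in (0, 1)]
    print(enhanced_kmeans(blobs, [(0, 0), (10, 10), (20, 0)]))
```

On well-separated data most points pass the cached-distance check after the first pass, so the count of distance evaluations drops below the n·k per iteration that plain Lloyd iterations need.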
Existing support vector machine methods suffer from high computational complexity and a low recognition rate at low SNR when handling multi-class recognition problems. In this paper, the characteristic parameters of the signal are extracted and optimized using a clustering algorithm, and the support vector machine is trained with a grading algorithm to speed up convergence, improve recognition performance at low SNR, and achieve modulation recognition of signals based on constellation-diagram modulation schemes. Simulation results show that, at low SNR, the average recognition rate of this algorithm is more than 30% higher than that of methods using the clustering algorithm or the support vector machine alone. The average recognition rate reaches 90% at an SNR of 5 dB, and the method is easy to implement, so it has broad application prospects in modulation recognition.
Cluster-based models have numerous application scenarios in vehicular ad-hoc networks (VANETs) and can greatly help improve the communication performance of VANETs. However, the frequent movement of vehicles can often lead to changes in the network topology, thereby reducing cluster stability in urban scenarios. To address this issue, we propose a clustering model based on the density peak clustering (DPC) method and sparrow search algorithm (SSA), named SDPC. First, the model constructs a fitness function based on the parameters obtained from the DPC method and deploys the SSA for iterative optimization to select cluster heads (CHs). Then, the vehicles that have not been selected as CHs are assigned to appropriate clusters by comprehensively considering the distance parameter and link-reliability parameter. Finally, cluster maintenance strategies are considered to tackle the changes in the clusters' organizational structure. To verify the performance of the model, we conducted a simulation on a real-world scenario for multiple metrics related to clusters' stability. The results show that compared with the APROVE and the GAPC, SDPC showed clear performance advantages, indicating that SDPC can effectively ensure VANETs' cluster stability in urban scenarios.
Reliable cluster head (CH) selection-based routing protocols are necessary for increasing packet transmission efficiency through optimal path discovery without degrading transmission reliability. In this paper, a Hybrid Golden Jackal and Improved Whale Optimization Algorithm (HGJIWOA) is proposed as an effective and optimal routing protocol that guarantees efficient routing of data packets over the routes established between the CHs and the movable sink. HGJIWOA includes a Dynamic Lens-Imaging Learning Strategy and Novel Update Rules for determining the reliable route needed for data packet broadcasting, attained through CH selection based on fitness measure estimation. CH selection, performed with the Golden Jackal Optimization Algorithm (GJOA), depends on the factors of maintainability, consistency, trust, delay, and energy. The adopted GJOA plays a dominant role in determining the optimal routing path based on reduced delay and minimal distance. The protocol further uses the Improved Whale Optimisation Algorithm (IWOA) to forward data from the chosen CHs to the BS via an optimized route based on energy and distance. It also includes a reliable route maintenance process that helps decide the route through which data need to be transmitted or re-routed. Simulation results for the proposed HGJIWOA mechanism with different sensor nodes confirm an improvement in mean throughput of 18.21%, sustained residual energy of 19.64%, and a reduction in end-to-end delay of 21.82% relative to competing CH selection approaches.
Data clustering is an essential technique for analyzing complex datasets and continues to be a central research topic in data analysis. Traditional clustering algorithms, such as K-means, are widely used due to their simplicity and efficiency. This paper proposes a novel Spiral Mechanism-Optimized Phasmatodea Population Evolution Algorithm (SPPE) to improve clustering performance. The SPPE algorithm introduces several enhancements to the standard Phasmatodea Population Evolution (PPE) algorithm. Firstly, a Variable Neighborhood Search (VNS) factor is incorporated to strengthen the local search capability and foster population diversity. Secondly, a position update model, incorporating a spiral mechanism, is designed to improve the algorithm's global exploration and convergence speed. Finally, a dynamic balancing factor, guided by fitness values, adjusts the search process to balance exploration and exploitation effectively. The performance of SPPE is first validated on CEC2013 benchmark functions, where it demonstrates excellent convergence speed and superior optimization results compared to several state-of-the-art metaheuristic algorithms. To further verify its practical applicability, SPPE is combined with the K-means algorithm for data clustering and tested on seven datasets. Experimental results show that SPPE-K-means improves clustering accuracy, reduces dependency on initialization, and outperforms other clustering approaches. This study highlights SPPE's robustness and efficiency in solving both optimization and clustering challenges, making it a promising tool for complex data analysis tasks.
As vehicular networks grow increasingly complex due to high node mobility and dynamic traffic conditions, efficient clustering mechanisms are vital to ensure stable and scalable communication. Recent studies have emphasized the need for adaptive clustering strategies to improve performance in Intelligent Transportation Systems (ITS). This paper presents the Grasshopper Optimization Algorithm for Vehicular Network Clustering (GOA-VNET) algorithm, an innovative approach to optimal vehicular clustering in Vehicular Ad-Hoc Networks (VANETs), leveraging the Grasshopper Optimization Algorithm (GOA) to address the critical challenges of traffic congestion and communication inefficiencies in ITS. The proposed GOA-VNET employs an iterative and interactive optimization mechanism to dynamically adjust node positions and cluster configurations, ensuring robust adaptability to varying vehicular densities and transmission ranges. Key features of GOA-VNET include the utilization of attraction zone, repulsion zone, and comfort zone parameters, which collectively enhance clustering efficiency and minimize congestion within Regions of Interest (ROI). By managing cluster configurations and node densities effectively, GOA-VNET ensures balanced load distribution and seamless data transmission, even in scenarios with high vehicular densities and varying transmission ranges. Comparative evaluations against the Whale Optimization Algorithm (WOA) and Grey Wolf Optimization (GWO) demonstrate that GOA-VNET consistently outperforms these methods by achieving superior clustering efficiency, reducing the number of clusters by up to 10% in high-density scenarios, and improving data transmission reliability. Simulation results reveal that under a 100-600 m transmission range, GOA-VNET achieves an average reduction of 8%-15% in the number of clusters and maintains a 5%-10% improvement in packet delivery ratio (PDR) compared to baseline algorithms. Additionally, the algorithm incorporates a heat transfer-inspired load-balancing mechanism, ensuring equitable distribution of nodes among cluster leaders (CLs) and maintaining a stable network environment. These results validate GOA-VNET as a reliable and scalable solution for VANETs, with significant potential to support next-generation ITS. Future research could further enhance the algorithm by integrating multi-objective optimization techniques and exploring broader applications in complex traffic scenarios.
We propose a robust earthquake clustering method: the Bayesian Gaussian mixture model with nearest-neighbor distance (BGMM-NND) algorithm. Unlike the conventional nearest-neighbor distance method, the BGMM-NND algorithm eliminates the need for hyperparameter tuning or reliance on fixed thresholds, offering enhanced flexibility for clustering across varied seismic scales. By integrating cumulative probability and BGMM with principal component analysis (PCA), the BGMM-NND algorithm effectively distinguishes between background and triggered earthquakes while maintaining the magnitude component and resolving the issue of excessively large spatial cluster domains. We apply the BGMM-NND algorithm to the Sichuan–Yunnan seismic catalog from 1971 to 2024, revealing notable variations in earthquake frequency, triggering characteristics, and recurrence patterns across different fault zones. Distinct clustering and triggering behaviors are identified along different segments of the Longmenshan Fault. Multiple seismic modes, namely, the short-distance mode, the medium-distance mode, the repeating-like mode, the uniform background mode, and the Wenchuan mode, are uncovered. The algorithm's flexibility and robust performance in earthquake clustering make it a valuable tool for exploring seismicity characteristics, offering new insights into earthquake clustering and the spatiotemporal patterns of seismic activity.
Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set usually has a few different values, a case in which conventional fuzzy sets are inadequate. Hesitant fuzzy sets are a powerful tool to treat this case. The present paper investigates a clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm, which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.
Wireless Sensor Networks (WSNs), as a crucial component of the Internet of Things (IoT), are widely used in environmental monitoring, industrial control, and security surveillance. However, WSNs still face challenges such as inaccurate node clustering, low energy efficiency, and shortened network lifespan in practical deployments, which significantly limit their large-scale application. To address these issues, this paper proposes an Adaptive Chaotic Ant Colony Optimization algorithm (AC-ACO), aiming to optimize the energy utilization and system lifespan of WSNs. AC-ACO combines the path-planning capability of Ant Colony Optimization (ACO) with the dynamic characteristics of chaotic mapping and introduces an adaptive mechanism to enhance the algorithm's flexibility and adaptability. By dynamically adjusting the pheromone evaporation factor and heuristic weights, efficient node clustering is achieved. Additionally, a chaotic mapping initialization strategy is employed to enhance population diversity and avoid premature convergence. To validate the algorithm's performance, this paper compares AC-ACO with clustering methods such as Low-Energy Adaptive Clustering Hierarchy (LEACH), ACO, Particle Swarm Optimization (PSO), and Genetic Algorithm (GA). Simulation results demonstrate that AC-ACO outperforms the compared algorithms in key metrics such as energy consumption optimization, network lifetime extension, and communication delay reduction, providing an efficient solution for improving energy efficiency and ensuring long-term stable operation of wireless sensor networks.
Recognizing discontinuities within rock masses is a critical aspect of rock engineering. The development of remote sensing technologies has significantly enhanced the quality and quantity of the point clouds collected from rock outcrops. In response, we propose a workflow that balances accuracy and efficiency to extract discontinuities from massive point clouds. The proposed method employs voxel filtering to downsample point clouds, constructs a point cloud topology using K-d trees, utilizes principal component analysis to calculate the point cloud normals, and employs the pointwise clustering (PWC) algorithm to extract discontinuities from rock outcrop point clouds. This method provides information on the location and orientation (dip direction and dip angle) of the discontinuities, and the modified whale optimization algorithm (MWOA) is utilized to identify major discontinuity sets and their average orientations. Performance evaluations based on three real cases demonstrate that the proposed method significantly reduces computational time costs without sacrificing accuracy. In particular, the method yields more reasonable extraction results for discontinuities with certain undulations. The presented approach offers a novel tool for efficiently extracting discontinuities from large-scale point clouds.
Flight plans between Chinese city pairs typically rely on a single route, which leaves no alternative paths and makes it difficult to respond to emergencies. To address this issue, this study employs the "quantile-inflection point method" to analyze specific deviation trajectories, determine deviation thresholds, and identify commonly used deviation paths. By combining multiple similarity metrics, including Euclidean distance, Hausdorff distance, and sector edit distance, with the density-based spatial clustering of applications with noise (DBSCAN) algorithm, the study clusters deviation trajectories to construct a multi-option trajectory set for city pairs. A case study of 23,578 flight trajectories between the Guangzhou airport cluster and the Shanghai airport cluster demonstrates the effectiveness of the proposed framework. Experimental results show that sector edit distance achieves superior clustering performance compared to Euclidean and Hausdorff distances, with higher silhouette coefficients and lower Davies-Bouldin indices, ensuring better intra-cluster compactness and inter-cluster separation. Based on the clustering results, 19 representative trajectory options are identified, covering both nominal and deviation paths, which significantly enhance route diversity and reflect actual flight practices. This provides a practical basis for optimizing flight paths and scheduling, enhancing the flexibility of route selection for flights between city pairs.
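Clustering trajectories with DBSCAN over a pairwise distance matrix can be sketched as follows. This is a simplified illustration, not the study's pipeline: it uses the Hausdorff distance (one of the three metrics named above, not the sector edit distance the study found best), treats trajectories as planar point sequences rather than geographic tracks, and uses made-up `eps` and `min_pts` values. All function names and data are hypothetical.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point of `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sequences."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def distance_matrix(trajectories):
    """Pairwise Hausdorff distances between all trajectories."""
    n = len(trajectories)
    return [[hausdorff(trajectories[i], trajectories[j]) for j in range(n)]
            for i in range(n)]


def dbscan(dist, eps, min_pts):
    """DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [m for m in range(n) if dist[j][m] <= eps]
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


if __name__ == "__main__":
    tracks = [
        [(0, 0), (1, 0), (2, 0), (3, 0)],        # nominal route
        [(0, 0.1), (1, 0.1), (2, 0.1), (3, 0.1)],
        [(0, -0.1), (1, -0.1), (2, -0.1), (3, -0.1)],
        [(0, 0), (1, 1), (2, 1), (3, 0)],        # recurring deviation
        [(0, 0), (1, 1.1), (2, 1.1), (3, 0)],
        [(0, 0), (1, -3), (2, -3), (3, 0)],      # one-off deviation
    ]
    print(dbscan(distance_matrix(tracks), eps=0.3, min_pts=2))
```

Swapping in another trajectory metric only requires replacing `hausdorff` inside `distance_matrix`, since the clustering step sees nothing but the matrix.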
The distillation process is an important chemical process, and a data-driven modelling approach has the potential to reduce model complexity compared to mechanistic modelling, thus improving the efficiency of process optimization or monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals, which poses challenges for accurate data-driven modelling of distillation processes. This paper proposes a systematic data-driven modelling framework to solve these problems. Firstly, data segment variance was introduced into the K-means algorithm to form K-means data interval (KMDI) clustering in order to cluster the data into perturbed and steady-state intervals for steady-state data extraction. Secondly, the maximal information coefficient (MIC) was employed to calculate the nonlinear correlation between variables for removing redundant features. Finally, extreme gradient boosting (XGBoost) was integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, to construct a new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
In the era of big data, personalised recommendation systems are essential for enhancing user engagement and driving business growth. However, traditional recommendation algorithms, such as collaborative filtering, face significant challenges due to data sparsity, algorithm scalability, and the difficulty of adapting to dynamic user preferences. These limitations hinder the ability of systems to provide highly accurate and personalised recommendations. To address these challenges, this paper proposes a clustering-based recommendation method that integrates an enhanced Grasshopper Optimisation Algorithm (GOA), termed LCGOA, to improve the accuracy and efficiency of recommendation systems by optimising cluster centroids in a dynamic environment. By combining the K-means algorithm with the enhanced GOA, which incorporates a Lévy flight mechanism and multi-strategy co-evolution, our method overcomes the centroid sensitivity issue, a key limitation in traditional clustering techniques. Experimental results across multiple datasets show that the proposed LCGOA-based method significantly outperforms conventional recommendation algorithms in terms of recommendation accuracy, offering more relevant content to users and driving greater customer satisfaction and business growth.
The K-means algorithm is one of the most widely used algorithms in clustering analysis. To deal with the problem caused by the random selection of initial center points in the traditional algorithm, this paper proposes an improved K-means algorithm based on the similarity matrix. The improved algorithm effectively avoids the random selection of initial center points; it therefore provides effective initial points for the clustering process and reduces the fluctuation of clustering results caused by the selection of initial points, so that better clustering quality can be obtained. The experimental results also show that the F-measure of the improved K-means algorithm is greatly improved and the clustering results are more stable.
Funding for the PV power station equivalent modeling study: supported by the Research Project of China Southern Power Grid (No. 056200KK52222031).
Abstract: This paper proposes an equivalent modeling method for photovoltaic (PV) power stations via a particle swarm optimization (PSO) K-means clustering (KMC) algorithm with passive filter parameter clustering to address the complexity, simulation time cost, and convergence problems of detailed PV power station models. First, the amplitude–frequency curves of different filter parameters are analyzed. Based on the results, a grouping parameter set for characterizing the external filter characteristics is established. These parameters are further defined as clustering parameters. A single PV inverter model is then established as a foundation. The proposed equivalent method combines the global search capability of PSO with the rapid convergence of KMC, effectively overcoming the tendency of KMC to become trapped in local optima. This approach enhances both clustering accuracy and numerical stability when determining equivalence for PV inverter units. Using the proposed clustering method, both a detailed PV power station model and an equivalent model are developed and compared. Simulation and hardware-in-the-loop (HIL) results based on the equivalent model verify that the equivalent method accurately represents the dynamic characteristics of PV power stations and adapts well to different operating conditions. The proposed equivalent modeling method provides an effective analysis tool for future renewable energy integration research.
Funding: Supported by the NSFC (Grant Nos. 62176273, 62271070, 62441212); the Open Foundation of State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications) under Grant SKLNST-2024-1-06; and the 2025 Major Project of the Natural Science Foundation of Inner Mongolia (2025ZD008).
Abstract: The Intrusion Detection System (IDS) is a security mechanism developed to observe network traffic and recognize suspicious or malicious activities. Clustering algorithms are often incorporated into IDS; however, conventional clustering-based methods face notable drawbacks, including poor scalability in handling high-dimensional datasets and a strong dependence of outcomes on initial conditions. To overcome the performance limitations of existing methods, this study proposes a novel quantum-inspired clustering algorithm that relies on a similarity coefficient-based quantum genetic algorithm (SC-QGA) and an improved quantum artificial bee colony algorithm hybrid K-means (IQABC-K). First, the SC-QGA algorithm is constructed based on quantum computing and integrates similarity coefficient theory to strengthen genetic diversity and feature extraction capabilities. For the subsequent clustering phase, the IQABC-K algorithm is enhanced with an adaptive rotation gate and movement exploitation strategies to balance the exploration capability of global search and the exploitation capability of local search. In addition, a global optimum bootstrap strategy and a linear population reduction strategy accelerate convergence toward the global optimum and reduce computational complexity. In an experimental evaluation against multiple algorithms and with diverse performance metrics, the proposed algorithm achieves reliable accuracy on three datasets, KDD CUP99, NSL_KDD, and UNSW_NB15, with accuracies of 98.57%, 98.81%, and 98.32%, respectively. These results affirm its potential as an effective solution for practical clustering applications.
Funding: Supported by the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), the Ministry of Health & Welfare, Republic of Korea (No. RS-2020-KH088726); the Patient-Centered Clinical Research Coordinating Center (PACEN), the Ministry of Health and Welfare, Republic of Korea (No. HC19C0276); and the National Research Foundation of Korea (NRF), the Korea Government (MSIT) (No. RS-2023-00247504).
Abstract: AIM: To evaluate long-term visual field (VF) prediction using K-means clustering in patients with primary open angle glaucoma (POAG). METHODS: Patients who had undergone at least ten 24-2 VF tests were included in this study. Using 52 total deviation values (TDVs) from the first 10 VF tests of the training dataset, VF points were clustered into several regions using the hierarchical ordered partitioning and collapsing hybrid (HOPACH) and K-means clustering. Based on the clustering results, a linear regression analysis was applied to each clustered region of the testing dataset to predict the TDVs of the 10th VF test. Three to nine VF tests were used to predict the 10th VF test, and the prediction errors (root mean square error, RMSE) of each clustering method and pointwise linear regression (PLR) were compared. RESULTS: The training group consisted of 228 patients (mean age, 54.20 ± 14.38 years; 123 males and 105 females), and the testing group included 81 patients (mean age, 54.88 ± 15.22 years; 43 males and 38 females). All subjects were diagnosed with POAG. Fifty-two VF points were clustered into 11 and nine regions using HOPACH and K-means clustering, respectively. K-means clustering had a lower prediction error than PLR when n=1:3 and 1:4 (both P≤0.003). The prediction errors of K-means clustering were lower than those of HOPACH in all sections (n=1:4 to 1:9; all P≤0.011), except for n=1:3 (P=0.680). PLR outperformed K-means clustering only when n=1:8 and 1:9 (both P≤0.020). CONCLUSION: K-means clustering can predict long-term VF test results more accurately in patients with POAG when only limited VF data are available.
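The pointwise linear regression (PLR) baseline and the RMSE comparison mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one total deviation value per test, equally spaced tests numbered from 1, and an ordinary least-squares fit; the function names `fit_line`, `predict_tdv`, and `rmse` are hypothetical and are not taken from the study.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_tdv(history, target_test=10):
    """Extrapolate the TDV series of one VF point to a later test.

    Assumption: history[i] is the value measured at test number i + 1,
    and at least two tests are available.
    """
    xs = list(range(1, len(history) + 1))
    slope, intercept = fit_line(xs, history)
    return slope * target_test + intercept

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

A region-based variant would apply the same fit to a value summarising each cluster of VF points; how the study aggregates points within a region is not stated in the abstract, so that step is left out here.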
Funding: Supported by the Deanship of Research at the King Fahd University of Petroleum & Minerals, Dhahran, 31261, Saudi Arabia, under Project No. EC241001.
Abstract: Various factors, including weak tie-lines in electric power system (EPS) networks, can lead to low-frequency oscillations (LFOs), which do not pose an immediate threat but are slow-acting and damaging. To address this challenge, this article proposes a clustering-based machine learning (ML) framework that enhances the stability of EPS networks by suppressing LFOs through real-time tuning of key power system stabilizer (PSS) parameters. To validate the proposed strategy, two distinct EPS networks are selected: the single-machine infinite-bus (SMIB) with a single-stage PSS and the unified power flow controller (UPFC) coordinated SMIB with a double-stage PSS. To generate data under various loading conditions for both networks, an efficient but offline meta-heuristic algorithm, namely the grey wolf optimizer (GWO), is used, with the loading conditions as inputs and the key PSS parameters as outputs. The generated loading conditions are then clustered using the fuzzy k-means (FKM) clustering method. Finally, the group method of data handling (GMDH) and long short-term memory (LSTM) ML models are developed for the clustered data to predict key PSS parameters in real time for any loading condition. Several well-known statistical performance indices (SPI) are used to validate the training and testing of the developed FKM-GMDH and FKM-LSTM models and to assess their robustness in predicting PSS parameters. The performance of the ML models is also evaluated using three stability indices (i.e., minimum damping ratio, eigenvalues, and time-domain simulations) after the PSS is optimally tuned with the real-time estimated parameters under changing operating conditions. In addition, the outputs of the offline (GWO-based) metaheuristic model, the proposed real-time (FKM-GMDH and FKM-LSTM) machine learning models, and previously reported models from the literature are compared. According to the results, the proposed methodology outperforms the others in enhancing the stability of the selected EPS networks by damping out the observed unwanted LFOs under various loading conditions.
Funding: The National Natural Science Foundation of China (No. 50674086); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060290508); the Youth Scientific Research Foundation of China University of Mining and Technology (No. 2006A047).
Abstract: To address two drawbacks of the k-means algorithm, namely that the number of clusters in a data set must be known in advance and that the result is sensitive to the choice of initial cluster centers, an improved k-means clustering algorithm is proposed. First, the concept of a silhouette coefficient is introduced, and the optimal number of clusters K_opt of a data set with unknown class information is determined by calculating the silhouette coefficient of the objects in clusters under different K values. Then the distribution of the data set is obtained through hierarchical clustering, and the initial cluster centers are determined. Finally, the clustering is completed by the traditional k-means algorithm. Theoretical analysis shows that the improved k-means clustering algorithm has acceptable computational complexity. Experimental results on the IRIS test data set show that the algorithm distinguishes different clusters reasonably, recognizes outliers efficiently, and produces lower entropy.
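The silhouette-based choice of K described above can be sketched as follows. This is a minimal illustration using the standard definition s(i) = (b - a) / max(a, b) with Euclidean distance, not the paper's implementation; the names `mean_silhouette` and `pick_k` are hypothetical, and the candidate labelings are assumed to come from any clustering run (one per K).

```python
import math

def mean_silhouette(points, labels):
    """Average silhouette coefficient of a labeling (needs >= 2 clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes s(i) = 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)

def pick_k(points, labelings):
    """Return the K whose labeling has the highest mean silhouette.

    labelings: dict mapping a candidate K to the list of labels it produced.
    """
    return max(labelings, key=lambda k: mean_silhouette(points, labelings[k]))
```

For example, `pick_k(points, {2: labels_k2, 3: labels_k3})` would return whichever candidate K gives the more compact and better separated clusters under this score.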
Funding: Supported by the National Key R&D Program of China (Grant No. 2020YFB1711100).
Abstract: To address the issue of abnormal energy consumption fluctuations in the converter steelmaking process, an integrated diagnostic method combining the gray wolf optimization (GWO) algorithm, support vector machine (SVM), and K-means clustering was proposed. Eight input parameters, derived from molten iron conditions and external factors, were selected as feature variables. A GWO-SVM model was developed to accurately predict the energy consumption of individual heats. Based on the prediction results, the mean absolute percentage error and maximum relative error of the test set were employed as criteria to identify heats with abnormal energy usage. For these heats, the K-means clustering algorithm was used to determine benchmark values of influencing factors from similar steel grades, enabling root-cause diagnosis of excessive energy consumption. The proposed method was applied to real production data from a converter in a steel plant. The analysis reveals that heat sample No. 44 exhibits abnormal energy consumption, mainly because its gas recovery is 1430.28 kg of standard coal below the benchmark level. A secondary contributing factor is a steam recovery shortfall of 237.99 kg of standard coal. This integrated approach offers a scientifically grounded tool for energy management in converter operations and provides valuable guidance for optimizing process parameters and enhancing energy efficiency.
Abstract: In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call the enhanced k-means algorithm. This algorithm is easy to implement, requiring only a simple data structure that keeps some information from each iteration for use in the next. Our experimental results demonstrate that the scheme improves the computational speed of the k-means algorithm, in terms of both the total number of distance calculations and the overall computation time.
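The objective stated above, the mean squared distance from each point to its nearest center, can be made concrete with a minimal sketch of the standard Lloyd iteration. This is the baseline k-means procedure, not the enhanced algorithm of the abstract (whose bookkeeping data structure is not described here); the function names, the random initialisation, and the `seed` parameter are assumptions for illustration.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest_center(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda j: squared_distance(point, centers[j]))

def mean_squared_distance(points, centers):
    """The k-means objective: average squared distance to the nearest center."""
    total = sum(squared_distance(p, centers[nearest_center(p, centers)]) for p in points)
    return total / len(points)

def lloyd_kmeans(points, k, iterations=100, seed=0):
    """Standard Lloyd iteration; points is a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: random initial centers
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_center(p, centers)].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers
```

Every pass of the assignment step computes n × k distances, which is the cost that speed-ups of this kind aim to reduce.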
Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 61871129 and 61301179, and by the Projects of Science and Technology Plan of Guangdong Province under Grant No. 2014A010101284.
Abstract: Existing support vector machine (SVM) approaches suffer from high computational complexity when more recognition problems are involved and from a low recognition rate at low SNR. In this paper, the characteristic parameters of the signal are extracted and optimized using a clustering algorithm, and the SVM is trained with a grading algorithm to speed up convergence and improve recognition performance at low SNR, thereby realizing modulation recognition of the signal based on the constellation diagram of the modulation scheme. Simulation results show that, at low SNR, the average recognition rate of this algorithm is more than 30% higher than that of methods using a clustering algorithm or an SVM alone. The average recognition rate reaches 90% at an SNR of 5 dB, and the method is easy to implement, so it has broad application prospects in modulation recognition.
Abstract: Cluster-based models have numerous application scenarios in vehicular ad-hoc networks (VANETs) and can greatly help improve the communication performance of VANETs. However, the frequent movement of vehicles often changes the network topology, thereby reducing cluster stability in urban scenarios. To address this issue, we propose a clustering model based on the density peak clustering (DPC) method and the sparrow search algorithm (SSA), named SDPC. First, the model constructs a fitness function based on the parameters obtained from the DPC method and deploys the SSA for iterative optimization to select cluster heads (CHs). Then, the vehicles that have not been selected as CHs are assigned to appropriate clusters by jointly considering the distance parameter and the link-reliability parameter. Finally, cluster maintenance strategies are applied to handle changes in the clusters' organizational structure. To verify the performance of the model, we conducted a simulation on a real-world scenario for multiple metrics related to cluster stability. The results show that, compared with APROVE and GAPC, SDPC has clear performance advantages, indicating that SDPC can effectively ensure the cluster stability of VANETs in urban scenarios.
Abstract: Reliable cluster head (CH) selection-based routing protocols are needed to increase packet transmission efficiency through optimal path discovery without degrading transmission reliability. In this paper, a Hybrid Golden Jackal and Improved Whale Optimization Algorithm (HGJIWOA) is proposed as an effective routing protocol that guarantees efficient routing of data packets over the routes established between the CHs and the movable sink. HGJIWOA includes a Dynamic Lens-Imaging Learning Strategy and Novel Update Rules for determining the reliable route needed for data packet broadcasting, which is obtained through CH selection based on fitness estimation. CH selection is performed with the Golden Jackal Optimization Algorithm (GJOA) and depends on the factors of maintainability, consistency, trust, delay, and energy. The adopted GJOA plays a dominant role in determining the optimal routing path based on reduced delay and minimal distance. The protocol further uses the Improved Whale Optimisation Algorithm (IWOA) to forward data from the chosen CHs to the BS via an optimized route that depends on energy and distance. It also includes a reliable route maintenance process that helps decide whether data should be transmitted over the selected route or re-routed. Simulations of the proposed HGJIWOA mechanism with different numbers of sensor nodes confirmed an improvement in mean throughput of 18.21%, an improvement in sustained residual energy of 19.64%, and a reduction in end-to-end delay of 21.82% relative to the competing CH selection approaches.
Abstract: Data clustering is an essential technique for analyzing complex datasets and continues to be a central research topic in data analysis. Traditional clustering algorithms, such as K-means, are widely used due to their simplicity and efficiency. This paper proposes a novel Spiral Mechanism-Optimized Phasmatodea Population Evolution Algorithm (SPPE) to improve clustering performance. The SPPE algorithm introduces several enhancements to the standard Phasmatodea Population Evolution (PPE) algorithm. Firstly, a Variable Neighborhood Search (VNS) factor is incorporated to strengthen the local search capability and foster population diversity. Secondly, a position update model incorporating a spiral mechanism is designed to improve the algorithm's global exploration and convergence speed. Finally, a dynamic balancing factor, guided by fitness values, adjusts the search process to balance exploration and exploitation effectively. The performance of SPPE is first validated on CEC2013 benchmark functions, where it demonstrates excellent convergence speed and superior optimization results compared to several state-of-the-art metaheuristic algorithms. To further verify its practical applicability, SPPE is combined with the K-means algorithm for data clustering and tested on seven datasets. Experimental results show that SPPE-K-means improves clustering accuracy, reduces dependency on initialization, and outperforms other clustering approaches. This study highlights SPPE's robustness and efficiency in solving both optimization and clustering challenges, making it a promising tool for complex data analysis tasks.
Funding: Supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2024-00337489, Development of Data Drift Management Technology to Overcome Performance Degradation of AI Analysis Models).
Abstract: As vehicular networks grow increasingly complex due to high node mobility and dynamic traffic conditions, efficient clustering mechanisms are vital to ensure stable and scalable communication. Recent studies have emphasized the need for adaptive clustering strategies to improve performance in Intelligent Transportation Systems (ITS). This paper presents the Grasshopper Optimization Algorithm for Vehicular Network Clustering (GOA-VNET), an approach to optimal vehicular clustering in Vehicular Ad-Hoc Networks (VANETs) that leverages the Grasshopper Optimization Algorithm (GOA) to address the critical challenges of traffic congestion and communication inefficiency in ITS. The proposed GOA-VNET employs an iterative and interactive optimization mechanism to dynamically adjust node positions and cluster configurations, ensuring robust adaptability to varying vehicular densities and transmission ranges. Key features of GOA-VNET include the use of attraction zone, repulsion zone, and comfort zone parameters, which collectively enhance clustering efficiency and minimize congestion within Regions of Interest (ROI). By managing cluster configurations and node densities effectively, GOA-VNET ensures balanced load distribution and seamless data transmission, even in scenarios with high vehicular densities and varying transmission ranges. Comparative evaluations against the Whale Optimization Algorithm (WOA) and Grey Wolf Optimization (GWO) demonstrate that GOA-VNET consistently outperforms these methods by achieving superior clustering efficiency, reducing the number of clusters by up to 10% in high-density scenarios, and improving data transmission reliability. Simulation results reveal that, under a 100-600 m transmission range, GOA-VNET achieves an average reduction of 8%-15% in the number of clusters and maintains a 5%-10% improvement in packet delivery ratio (PDR) compared to baseline algorithms. Additionally, the algorithm incorporates a heat transfer-inspired load-balancing mechanism, ensuring equitable distribution of nodes among cluster leaders (CLs) and maintaining a stable network environment. These results validate GOA-VNET as a reliable and scalable solution for VANETs, with significant potential to support next-generation ITS. Future research could further enhance the algorithm by integrating multi-objective optimization techniques and exploring broader applications in complex traffic scenarios.
Funding: Supported by the National Key Research and Development Program of China (Grant Nos. 2021YFC3000705 and 2021YFC3000705-05), the National Natural Science Foundation of China (Grant No. 42074049), and the Youth Innovation Promotion Association of the Chinese Academy of Sciences (Grant No. 2023471).
Abstract: We propose a robust earthquake clustering method: the Bayesian Gaussian mixture model with nearest-neighbor distance (BGMM-NND) algorithm. Unlike the conventional nearest-neighbor distance method, the BGMM-NND algorithm eliminates the need for hyperparameter tuning or reliance on fixed thresholds, offering enhanced flexibility for clustering across varied seismic scales. By integrating cumulative probability and BGMM with principal component analysis (PCA), the BGMM-NND algorithm effectively distinguishes between background and triggered earthquakes while maintaining the magnitude component and resolving the issue of excessively large spatial cluster domains. We apply the BGMM-NND algorithm to the Sichuan–Yunnan seismic catalog from 1971 to 2024, revealing notable variations in earthquake frequency, triggering characteristics, and recurrence patterns across different fault zones. Distinct clustering and triggering behaviors are identified along different segments of the Longmenshan Fault. Multiple seismic modes, namely, the short-distance mode, the medium-distance mode, the repeating-like mode, the uniform background mode, and the Wenchuan mode, are uncovered. The algorithm's flexibility and robust performance in earthquake clustering make it a valuable tool for exploring seismicity characteristics, offering new insights into earthquake clustering and the spatiotemporal patterns of seismic activity.
Funding: Supported by the National Natural Science Foundation of China (61273209).
Abstract: Because of the limitations of, and hesitation in, one's knowledge, the membership degree of an element to a given set often takes several different values, a situation that conventional fuzzy sets cannot handle. Hesitant fuzzy sets are a powerful tool for treating this case. The present paper investigates a clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm, which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.
Funding: Funded by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (No. 22D01B148); Bidding Topics for the Center for Integration of Education and Production and Development of New Business in 2024 (No. 2024-KYJD05); the Basic Scientific Research Business Fee Project of Colleges and Universities in Autonomous Region (No. XJEDU2025P126); and the Xinjiang College of Science & Technology School-level Scientific Research Fund Project (No. 2024-KYTD01).
Abstract: Wireless Sensor Networks (WSNs), as a crucial component of the Internet of Things (IoT), are widely used in environmental monitoring, industrial control, and security surveillance. However, WSNs still face challenges such as inaccurate node clustering, low energy efficiency, and shortened network lifespan in practical deployments, which significantly limit their large-scale application. To address these issues, this paper proposes an Adaptive Chaotic Ant Colony Optimization algorithm (AC-ACO), aiming to optimize the energy utilization and system lifespan of WSNs. AC-ACO combines the path-planning capability of Ant Colony Optimization (ACO) with the dynamic characteristics of chaotic mapping and introduces an adaptive mechanism to enhance the algorithm's flexibility and adaptability. By dynamically adjusting the pheromone evaporation factor and heuristic weights, efficient node clustering is achieved. Additionally, a chaotic mapping initialization strategy is employed to enhance population diversity and avoid premature convergence. To validate the algorithm's performance, this paper compares AC-ACO with clustering methods such as Low-Energy Adaptive Clustering Hierarchy (LEACH), ACO, Particle Swarm Optimization (PSO), and Genetic Algorithm (GA). Simulation results demonstrate that AC-ACO outperforms the compared algorithms in key metrics such as energy consumption optimization, network lifetime extension, and communication delay reduction, providing an efficient solution for improving energy efficiency and ensuring long-term stable operation of wireless sensor networks.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 42407232) and the Sichuan Science and Technology Program (Grant No. 2024NSFSC0826).
Abstract: Recognizing discontinuities within rock masses is a critical aspect of rock engineering. The development of remote sensing technologies has significantly enhanced the quality and quantity of the point clouds collected from rock outcrops. In response, we propose a workflow that balances accuracy and efficiency to extract discontinuities from massive point clouds. The proposed method employs voxel filtering to downsample point clouds, constructs a point cloud topology using K-d trees, utilizes principal component analysis to calculate the point cloud normals, and employs the pointwise clustering (PWC) algorithm to extract discontinuities from rock outcrop point clouds. This method provides information on the location and orientation (dip direction and dip angle) of the discontinuities, and the modified whale optimization algorithm (MWOA) is utilized to identify major discontinuity sets and their average orientations. Performance evaluations based on three real cases demonstrate that the proposed method significantly reduces computational time costs without sacrificing accuracy. In particular, the method yields more reasonable extraction results for discontinuities with certain undulations. The presented approach offers a novel tool for efficiently extracting discontinuities from large-scale point clouds.
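The first step of the workflow above, voxel filtering, can be sketched in a few lines. This is a minimal illustration of one common variant (replace all points falling in the same cubic cell by their centroid); the paper's exact filter is not specified in the abstract, and the name `voxel_downsample` and its parameters are hypothetical.

```python
import math
from collections import defaultdict

def voxel_downsample(points, voxel_size):
    """Replace the points inside each cubic voxel by their centroid.

    points: iterable of (x, y, z) tuples; voxel_size: edge length of a cell.
    """
    buckets = defaultdict(list)
    for p in points:
        # Integer cell index along each axis; floor handles negative coordinates.
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    return [
        tuple(sum(c) / len(members) for c in zip(*members))
        for members in buckets.values()
    ]
```

A larger `voxel_size` gives a sparser cloud and faster downstream steps, at the cost of smoothing fine surface detail, which is the accuracy and efficiency trade-off the abstract refers to.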
Funding: Supported in part by Boeing Company and Nanjing University of Aeronautics and Astronautics (NUAA) through the Research on Decision Support Technology of Air Traffic Operation Management in Convective Weather under Project 2022-GT-129, and in part by the Postgraduate Research and Practice Innovation Program of NUAA (No. xcxjh20240709).
Abstract: Flight plans between Chinese city pairs typically rely on a single route and lack alternative paths, which makes it difficult to respond to emergencies. To address this issue, this study employs the "quantile-inflection point method" to analyze specific deviation trajectories, determine deviation thresholds, and identify commonly used deviation paths. By combining multiple similarity metrics, including Euclidean distance, Hausdorff distance, and sector edit distance, with the density-based spatial clustering of applications with noise (DBSCAN) algorithm, the study clusters deviation trajectories to construct a multi-option trajectory set for city pairs. A case study of 23578 flight trajectories between the Guangzhou airport cluster and the Shanghai airport cluster demonstrates the effectiveness of the proposed framework. Experimental results show that sector edit distance achieves superior clustering performance compared to Euclidean and Hausdorff distances, with higher silhouette coefficients and lower Davies-Bouldin indices, ensuring better intra-cluster compactness and inter-cluster separation. Based on the clustering results, 19 representative trajectory options are identified, covering both nominal and deviation paths, which significantly enhance route diversity and reflect actual flight practices. This provides a practical basis for optimizing flight paths and scheduling, enhancing the flexibility of route selection for flights between city pairs.
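Of the similarity metrics listed above, the Hausdorff distance is easy to state in code. The sketch below uses the standard discrete definition on trajectories given as lists of planar points; it is not the study's implementation, the function names are hypothetical, and the sector edit distance is not shown because the abstract does not define it.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point of trajectory a to its nearest point of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sequences."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

def pairwise_distances(trajectories):
    """Symmetric matrix of Hausdorff distances between all trajectory pairs."""
    n = len(trajectories)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = hausdorff(trajectories[i], trajectories[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

A matrix of this kind could then be passed to a density-based clusterer that accepts precomputed distances, so that trajectories within a chosen radius of each other end up in the same route option.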
Funding: Supported by the National Key Research and Development Program of China (2023YFB3307801); the National Natural Science Foundation of China (62394343, 62373155, 62073142); the Major Science and Technology Project of Xinjiang (No. 2022A01006-4); the Programme of Introducing Talents of Discipline to Universities (the 111 Project) under Grant B17017; the Fundamental Research Funds for the Central Universities, Science Foundation of China University of Petroleum, Beijing (No. 2462024YJRC011); and the Open Research Project of the State Key Laboratory of Industrial Control Technology, China (Grant No. ICT2024B70).
Abstract: Distillation is an important chemical process, and data-driven modelling has the potential to reduce model complexity compared to mechanistic modelling, thus improving the efficiency of process optimization and monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals, which makes accurate data-driven modelling challenging. This paper proposes a systematic data-driven modelling framework to solve these problems. Firstly, data segment variance is introduced into the K-means algorithm to form K-means data interval (KMDI) clustering, which clusters the data into perturbed and steady-state intervals for steady-state data extraction. Secondly, the maximal information coefficient (MIC) is employed to calculate the nonlinear correlation between variables in order to remove redundant features. Finally, extreme gradient boosting (XGBoost) is integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, yielding a new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
Funding: Natural Science Research Project of Education Department of Anhui Province of China, Grant/Award Number: 2023AH051020; Key Project of Anhui Province's Science and Technology Innovation Tackle Plan, Grant/Award Number: 202423k09020040; National Key Research and Development Program of China, Grant/Award Number: 2023YFD1802200; Natural Science Foundation of Anhui Province, Grant/Award Number: 2308085MF21; National Natural Science Foundation of China, Grant/Award Numbers: 32472007, 62301006, 62306008; University Synergy Innovation Program of Anhui Province, Grant/Award Number: GXXT-2022-046.
Abstract: In the era of big data, personalised recommendation systems are essential for enhancing user engagement and driving business growth. However, traditional recommendation algorithms, such as collaborative filtering, face significant challenges due to data sparsity, algorithm scalability, and the difficulty of adapting to dynamic user preferences. These limitations hinder the ability of systems to provide highly accurate and personalised recommendations. To address these challenges, this paper proposes a clustering-based recommendation method that integrates an enhanced Grasshopper Optimisation Algorithm (GOA), termed LCGOA, to improve the accuracy and efficiency of recommendation systems by optimising cluster centroids in a dynamic environment. By combining the K-means algorithm with the enhanced GOA, which incorporates a Lévy flight mechanism and multi-strategy co-evolution, our method overcomes the centroid sensitivity issue, a key limitation in traditional clustering techniques. Experimental results across multiple datasets show that the proposed LCGOA-based method significantly outperforms conventional recommendation algorithms in terms of recommendation accuracy, offering more relevant content to users and driving greater customer satisfaction and business growth.
Abstract: The K-means algorithm is one of the most widely used algorithms in clustering analysis. To deal with the problem caused by the random selection of initial center points in the traditional algorithm, this paper proposes an improved K-means algorithm based on the similarity matrix. The improved algorithm avoids the random selection of initial center points, so it provides effective initial points for the clustering process and reduces the fluctuation in clustering results caused by the choice of initial points, and thus a better clustering quality can be obtained. The experimental results also show that the F-measure of the improved K-means algorithm is greatly improved and that the clustering results are more stable.