To address the challenge of identifying the primary causes of energy consumption fluctuations and accurately assessing the influence of various factors in the converter unit of an iron and steel plant,the focus is pla...To address the challenge of identifying the primary causes of energy consumption fluctuations and accurately assessing the influence of various factors in the converter unit of an iron and steel plant,the focus is placed on the critical components of material and heat balance.Through a thorough analysis of the interactions between various components and energy consumptions,six pivotal factors have been identified—raw material composition,steel type,steel temperature,slag temperature,recycling practices,and operational parameters.Utilizing a framework based on an equivalent energy consumption model,an integrated intelligent diagnostic model has been developed that encapsulates these factors,providing a comprehensive assessment tool for converter energy consumption.Employing the K-means clustering algorithm,historical operational data from the converter have been meticulously analyzed to determine baseline values for essential variables such as energy consumption and recovery rates.Building upon this data-driven foundation,an innovative online system for the intelligent diagnosis of converter energy consumption has been crafted and implemented,enhancing the precision and efficiency of energy management.Upon implementation with energy consumption data at a steel plant in 2023,the diagnostic analysis performed by the system exposed significant variations in energy usage across different converter units.The analysis revealed that the most significant factor influencing the variation in energy consumption for both furnaces was the steel grade,with contributions of−0.550 and 0.379.展开更多
In allusion to the disadvantage of having to obtain the number of clusters of data sets in advance and the sensitivity to selecting initial clustering centers in the k-means algorithm, an improved k-means clustering a...In allusion to the disadvantage of having to obtain the number of clusters of data sets in advance and the sensitivity to selecting initial clustering centers in the k-means algorithm, an improved k-means clustering algorithm is proposed. First, the concept of a silhouette coefficient is introduced, and the optimal clustering number Kopt of a data set with unknown class information is confirmed by calculating the silhouette coefficient of objects in clusters under different K values. Then the distribution of the data set is obtained through hierarchical clustering and the initial clustering-centers are confirmed. Finally, the clustering is completed by the traditional k-means clustering. By the theoretical analysis, it is proved that the improved k-means clustering algorithm has proper computational complexity. The experimental results of IRIS testing data set show that the algorithm can distinguish different clusters reasonably and recognize the outliers efficiently, and the entropy generated by the algorithm is lower.展开更多
Cluster-basedmodels have numerous application scenarios in vehicular ad-hoc networks(VANETs)and can greatly help improve the communication performance of VANETs.However,the frequent movement of vehicles can often lead...Cluster-basedmodels have numerous application scenarios in vehicular ad-hoc networks(VANETs)and can greatly help improve the communication performance of VANETs.However,the frequent movement of vehicles can often lead to changes in the network topology,thereby reducing cluster stability in urban scenarios.To address this issue,we propose a clustering model based on the density peak clustering(DPC)method and sparrow search algorithm(SSA),named SDPC.First,the model constructs a fitness function based on the parameters obtained from the DPC method and deploys the SSA for iterative optimization to select cluster heads(CHs).Then,the vehicles that have not been selected as CHs are assigned to appropriate clusters by comprehensively considering the distance parameter and link-reliability parameter.Finally,cluster maintenance strategies are considered to tackle the changes in the clusters’organizational structure.To verify the performance of the model,we conducted a simulation on a real-world scenario for multiple metrics related to clusters’stability.The results show that compared with the APROVE and the GAPC,SDPC showed clear performance advantages,indicating that SDPC can effectively ensure VANETs’cluster stability in urban scenarios.展开更多
Reliable Cluster Head(CH)selectionbased routing protocols are necessary for increasing the packet transmission efficiency with optimal path discovery that never introduces degradation over the transmission reliability...Reliable Cluster Head(CH)selectionbased routing protocols are necessary for increasing the packet transmission efficiency with optimal path discovery that never introduces degradation over the transmission reliability.In this paper,Hybrid Golden Jackal,and Improved Whale Optimization Algorithm(HGJIWOA)is proposed as an effective and optimal routing protocol that guarantees efficient routing of data packets in the established between the CHs and the movable sink.This HGJIWOA included the phases of Dynamic Lens-Imaging Learning Strategy and Novel Update Rules for determining the reliable route essential for data packets broadcasting attained through fitness measure estimation-based CH selection.The process of CH selection achieved using Golden Jackal Optimization Algorithm(GJOA)completely depends on the factors of maintainability,consistency,trust,delay,and energy.The adopted GJOA algorithm play a dominant role in determining the optimal path of routing depending on the parameter of reduced delay and minimal distance.It further utilized Improved Whale Optimisation Algorithm(IWOA)for forwarding the data from chosen CHs to the BS via optimized route depending on the parameters of energy and distance.It also included a reliable route maintenance process that aids in deciding the selected route through which data need to be transmitted or re-routed.The simulation outcomes of the proposed HGJIWOA mechanism with different sensor nodes confirmed an improved mean throughput of 18.21%,sustained residual energy of 19.64%with minimized end-to-end delay of 21.82%,better than the competitive CH selection approaches.展开更多
Data clustering is an essential technique for analyzing complex datasets and continues to be a central research topic in data analysis.Traditional clustering algorithms,such as K-means,are widely used due to their sim...Data clustering is an essential technique for analyzing complex datasets and continues to be a central research topic in data analysis.Traditional clustering algorithms,such as K-means,are widely used due to their simplicity and efficiency.This paper proposes a novel Spiral Mechanism-Optimized Phasmatodea Population Evolution Algorithm(SPPE)to improve clustering performance.The SPPE algorithm introduces several enhancements to the standard Phasmatodea Population Evolution(PPE)algorithm.Firstly,a Variable Neighborhood Search(VNS)factor is incorporated to strengthen the local search capability and foster population diversity.Secondly,a position update model,incorporating a spiral mechanism,is designed to improve the algorithm’s global exploration and convergence speed.Finally,a dynamic balancing factor,guided by fitness values,adjusts the search process to balance exploration and exploitation effectively.The performance of SPPE is first validated on CEC2013 benchmark functions,where it demonstrates excellent convergence speed and superior optimization results compared to several state-of-the-art metaheuristic algorithms.To further verify its practical applicability,SPPE is combined with the K-means algorithm for data clustering and tested on seven datasets.Experimental results show that SPPE-K-means improves clustering accuracy,reduces dependency on initialization,and outperforms other clustering approaches.This study highlights SPPE’s robustness and efficiency in solving both optimization and clustering challenges,making it a promising tool for complex data analysis tasks.展开更多
As vehicular networks grow increasingly complex due to high node mobility and dynamic traffic conditions,efficient clustering mechanisms are vital to ensure stable and scalable communication.Recent studies have emphas...As vehicular networks grow increasingly complex due to high node mobility and dynamic traffic conditions,efficient clustering mechanisms are vital to ensure stable and scalable communication.Recent studies have emphasized the need for adaptive clustering strategies to improve performance in Intelligent Transportation Systems(ITS).This paper presents the Grasshopper Optimization Algorithm for Vehicular Network Clustering(GOAVNET)algorithm,an innovative approach to optimal vehicular clustering in Vehicular Ad-Hoc Networks(VANETs),leveraging the Grasshopper Optimization Algorithm(GOA)to address the critical challenges of traffic congestion and communication inefficiencies in Intelligent Transportation Systems(ITS).The proposed GOA-VNET employs an iterative and interactive optimization mechanism to dynamically adjust node positions and cluster configurations,ensuring robust adaptability to varying vehicular densities and transmission ranges.Key features of GOA-VNET include the utilization of attraction zone,repulsion zone,and comfort zone parameters,which collectively enhance clustering efficiency and minimize congestion within Regions of Interest(ROI).By managing cluster configurations and node densities effectively,GOA-VNET ensures balanced load distribution and seamless data transmission,even in scenarios with high vehicular densities and varying transmission ranges.Comparative evaluations against the Whale Optimization Algorithm(WOA)and Grey Wolf Optimization(GWO)demonstrate that GOA-VNET consistently outperforms these methods by achieving superior clustering efficiency,reducing the number of clusters by up to 10%in high-density scenarios,and improving data transmission reliability.Simulation results reveal that under a 100-600 m transmission range,GOA-VNET achieves an average reduction of 8%-15%in the number of clusters and maintains a 5%-10%improvement in packet delivery ratio(PDR)compared to baseline algorithms.Additionally,the algorithm incorporates a heat transfer-inspired load-balancing mechanism,ensuring equitable distribution of nodes among cluster leaders(CLs)and maintaining a stable network environment.These results validate GOA-VNET as a reliable and scalable solution for VANETs,with significant potential to support next-generation ITS.Future research could further enhance the algorithm by integrating multi-objective optimization techniques and exploring broader applications in complex traffic scenarios.展开更多
We propose a robust earthquake clustering method:the Bayesian Gaussian mixture model with nearest-neighbor distance(BGMM-NND)algorithm.Unlike the conventional nearest neighbor distance method,the BGMM-NND algorithm el...We propose a robust earthquake clustering method:the Bayesian Gaussian mixture model with nearest-neighbor distance(BGMM-NND)algorithm.Unlike the conventional nearest neighbor distance method,the BGMM-NND algorithm eliminates the need for hyperparameter tuning or reliance on fixed thresholds,offering enhanced flexibility for clustering across varied seismic scales.By integrating cumulative probability and BGMM with principal component analysis(PCA),the BGMM-NND algorithm effectively distinguishes between background and triggered earthquakes while maintaining the magnitude component and resolving the issue of excessively large spatial cluster domains.We apply the BGMM-NND algorithm to the Sichuan–Yunnan seismic catalog from 1971 to 2024,revealing notable variations in earthquake frequency,triggering characteristics,and recurrence patterns across different fault zones.Distinct clustering and triggering behaviors are identified along different segments of the Longmenshan Fault.Multiple seismic modes,namely,the short-distance mode,the medium-distance mode,the repeating-like mode,the uniform background mode,and the Wenchuan mode,are uncovered.The algorithm's flexibility and robust performance in earthquake clustering makes it a valuable tool for exploring seismicity characteristics,offering new insights into earthquake clustering and the spatiotemporal patterns of seismic activity.展开更多
Wireless Sensor Networks(WSNs),as a crucial component of the Internet of Things(IoT),are widely used in environmental monitoring,industrial control,and security surveillance.However,WSNs still face challenges such as ...Wireless Sensor Networks(WSNs),as a crucial component of the Internet of Things(IoT),are widely used in environmental monitoring,industrial control,and security surveillance.However,WSNs still face challenges such as inaccurate node clustering,low energy efficiency,and shortened network lifespan in practical deployments,which significantly limit their large-scale application.To address these issues,this paper proposes an Adaptive Chaotic Ant Colony Optimization algorithm(AC-ACO),aiming to optimize the energy utilization and system lifespan of WSNs.AC-ACO combines the path-planning capability of Ant Colony Optimization(ACO)with the dynamic characteristics of chaotic mapping and introduces an adaptive mechanism to enhance the algorithm’s flexibility and adaptability.By dynamically adjusting the pheromone evaporation factor and heuristic weights,efficient node clustering is achieved.Additionally,a chaotic mapping initialization strategy is employed to enhance population diversity and avoid premature convergence.To validate the algorithm’s performance,this paper compares AC-ACO with clustering methods such as Low-Energy Adaptive Clustering Hierarchy(LEACH),ACO,Particle Swarm Optimization(PSO),and Genetic Algorithm(GA).Simulation results demonstrate that AC-ACO outperforms the compared algorithms in key metrics such as energy consumption optimization,network lifetime extension,and communication delay reduction,providing an efficient solution for improving energy efficiency and ensuring long-term stable operation of wireless sensor networks.展开更多
Recognizing discontinuities within rock masses is a critical aspect of rock engineering.The development of remote sensing technologies has significantly enhanced the quality and quantity of the point clouds collected ...Recognizing discontinuities within rock masses is a critical aspect of rock engineering.The development of remote sensing technologies has significantly enhanced the quality and quantity of the point clouds collected from rock outcrops.In response,we propose a workflow that balances accuracy and efficiency to extract discontinuities from massive point clouds.The proposed method employs voxel filtering to downsample point clouds,constructs a point cloud topology using K-d trees,utilizes principal component analysis to calculate the point cloud normals,and employs the pointwise clustering(PWC)algorithm to extract discontinuities from rock outcrop point clouds.This method provides information on the location and orientation(dip direction and dip angle)of the discontinuities,and the modified whale optimization algorithm(MWOA)is utilized to identify major discontinuity sets and their average orientations.Performance evaluations based on three real cases demonstrate that the proposed method significantly reduces computational time costs without sacrificing accuracy.In particular,the method yields more reasonable extraction results for discontinuities with certain undulations.The presented approach offers a novel tool for efficiently extracting discontinuities from large-scale point clouds.展开更多
The distillation process is an important chemical process,and the application of data-driven modelling approach has the potential to reduce model complexity compared to mechanistic modelling,thus improving the efficie...The distillation process is an important chemical process,and the application of data-driven modelling approach has the potential to reduce model complexity compared to mechanistic modelling,thus improving the efficiency of process optimization or monitoring studies.However,the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals,which brings challenges to accurate data-driven modelling of distillation processes.This paper proposes a systematic data-driven modelling framework to solve these problems.Firstly,data segment variance was introduced into the K-means algorithm to form K-means data interval(KMDI)clustering in order to cluster the data into perturbed and steady state intervals for steady-state data extraction.Secondly,maximal information coefficient(MIC)was employed to calculate the nonlinear correlation between variables for removing redundant features.Finally,extreme gradient boosting(XGBoost)was integrated as the basic learner into adaptive boosting(AdaBoost)with the error threshold(ET)set to improve weights update strategy to construct the new integrated learning algorithm,XGBoost-AdaBoost-ET.The superiority of the proposed framework is verified by applying this data-driven modelling framework to a real industrial process of propylene distillation.展开更多
In the era of big data,personalised recommendation systems are essential for enhancing user engagement and driving business growth.However,traditional recommendation algorithms,such as collaborative filtering,face sig...In the era of big data,personalised recommendation systems are essential for enhancing user engagement and driving business growth.However,traditional recommendation algorithms,such as collaborative filtering,face significant challenges due to data sparsity,algorithm scalability,and the difficulty of adapting to dynamic user preferences.These limitations hinder the ability of systems to provide highly accurate and personalised recommendations.To address these challenges,this paper proposes a clustering-based recommendation method that integrates an enhanced Grasshopper Optimisation Algorithm(GOA),termed LCGOA,to improve the accuracy and efficiency of recommendation systems by optimising cluster centroids in a dynamic environment.By combining the K-means algorithm with the enhanced GOA,which incorporates a Lévy flight mechanism and multi-strategy co-evolution,our method overcomes the centroid sensitivity issue,a key limitation in traditional clustering techniques.Experimental results across multiple datasets show that the proposed LCGOA-based method significantly outperforms conventional recommendation algorithms in terms of recommendation accuracy,offering more relevant content to users and driving greater customer satisfaction and business growth.展开更多
The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial s...The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial seeds, particularly in complex datasets or datasets with non-spherical clusters. In this paper, a Comprehensive K-Means Clustering algorithm is presented, in which multiple trials of k-means are performed on a given dataset. The clustering results from each trial are transformed into a five-dimensional data point, containing the scope values of the x and y coordinates of the clusters along with the number of points within that cluster. A graph is then generated displaying the configuration of these points using Principal Component Analysis (PCA), from which we can observe and determine the common clustering patterns in the dataset. The robustness and strength of these patterns are then examined by observing the variance of the results of each trial, wherein a different subset of the data keeping a certain percentage of original data points is clustered. By aggregating information from multiple trials, we can distinguish clusters that consistently emerge across different runs from those that are more sensitive or unlikely, hence deriving more reliable conclusions about the underlying structure of complex datasets. Our experiments show that our algorithm is able to find the most common associations between different dimensions of data over multiple trials, often more accurately than other algorithms, as well as measure stability of these clusters, an ability that other k-means algorithms lack.展开更多
In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared dista...In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call enhanced k-means algorithm. This algorithm is easy to implement, requiring a simple data structure to keep some information in each iteration to be used in the next iteration. Our experimental results demonstrated that our scheme can improve the computational speed of the k-means algorithm by the magnitude in the total number of distance calculations and the overall time of computation.展开更多
For the existing support vector machine, when recognizing more questions, the shortcomings of high computational complexity and low recognition rate under the low SNR are emerged. The characteristic parameter of the s...For the existing support vector machine, when recognizing more questions, the shortcomings of high computational complexity and low recognition rate under the low SNR are emerged. The characteristic parameter of the signal is extracted and optimized by using a clustering algorithm, support vector machine is trained by grading algorithm so as to enhance the rate of convergence, improve the performance of recognition under the low SNR and realize modulation recognition of the signal based on the modulation system of the constellation diagram in this paper. Simulation results show that the average recognition rate based on this algorithm is enhanced over 30% compared with methods that adopting clustering algorithm or support vector machine respectively under the low SNR. The average recognition rate can reach 90% when the SNR is 5 dB, and the method is easy to be achieved so that it has broad application prospect in the modulating recognition.展开更多
Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set usually has a few different values, in which the conventional fuzzy sets are invalid. Hesitant fuzzy sets ar...Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set usually has a few different values, in which the conventional fuzzy sets are invalid. Hesitant fuzzy sets are a powerful tool to treat this case. The present paper focuses on investigating the clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.展开更多
In a large-scale wireless sensor network(WSN),densely distributed sensor nodes process a large amount of data.The aggregation of data in a network can consume a great amount of energy.To balance and reduce the energy ...In a large-scale wireless sensor network(WSN),densely distributed sensor nodes process a large amount of data.The aggregation of data in a network can consume a great amount of energy.To balance and reduce the energy consumption of nodes in a WSN and extend the network life,this paper proposes a nonuniform clustering routing algorithm based on the improved K-means algorithm.The algorithm uses a clustering method to form and optimize clusters,and it selects appropriate cluster heads to balance network energy consumption and extend the life cycle of the WSN.To ensure that the cluster head(CH)selection in the network is fair and that the location of the selected CH is not concentrated within a certain range,we chose the appropriate CH competition radius.Simulation results show that,compared with LEACH,LEACH-C,and the DEEC clustering algorithm,this algorithm can effectively balance the energy consumption of the CH and extend the network life.展开更多
Classification systems such as Slope Mass Rating(SMR) are currently being used to undertake slope stability analysis. In SMR classification system, data is allocated to certain classes based on linguistic and experien...Classification systems such as Slope Mass Rating(SMR) are currently being used to undertake slope stability analysis. In SMR classification system, data is allocated to certain classes based on linguistic and experience-based criteria. In order to eliminate linguistic criteria resulted from experience-based judgments and account for uncertainties in determining class boundaries developed by SMR system,the system classification results were corrected using two clustering algorithms, namely K-means and fuzzy c-means(FCM), for the ratings obtained via continuous and discrete functions. By applying clustering algorithms in SMR classification system, no in-advance experience-based judgment was made on the number of extracted classes in this system, and it was only after all steps of the clustering algorithms were accomplished that new classification scheme was proposed for SMR system under different failure modes based on the ratings obtained via continuous and discrete functions. The results of this study showed that, engineers can achieve more reliable and objective evaluations over slope stability by using SMR system based on the ratings calculated via continuous and discrete functions.展开更多
基金financial support from the National Key R&D Program of China(Grant No.2020YFB1711100).
文摘To address the challenge of identifying the primary causes of energy consumption fluctuations and accurately assessing the influence of various factors in the converter unit of an iron and steel plant,the focus is placed on the critical components of material and heat balance.Through a thorough analysis of the interactions between various components and energy consumptions,six pivotal factors have been identified—raw material composition,steel type,steel temperature,slag temperature,recycling practices,and operational parameters.Utilizing a framework based on an equivalent energy consumption model,an integrated intelligent diagnostic model has been developed that encapsulates these factors,providing a comprehensive assessment tool for converter energy consumption.Employing the K-means clustering algorithm,historical operational data from the converter have been meticulously analyzed to determine baseline values for essential variables such as energy consumption and recovery rates.Building upon this data-driven foundation,an innovative online system for the intelligent diagnosis of converter energy consumption has been crafted and implemented,enhancing the precision and efficiency of energy management.Upon implementation with energy consumption data at a steel plant in 2023,the diagnostic analysis performed by the system exposed significant variations in energy usage across different converter units.The analysis revealed that the most significant factor influencing the variation in energy consumption for both furnaces was the steel grade,with contributions of−0.550 and 0.379.
基金The National Natural Science Foundation of China(No50674086)Specialized Research Fund for the Doctoral Program of Higher Education (No20060290508)the Youth Scientific Research Foundation of China University of Mining and Technology (No2006A047)
文摘In allusion to the disadvantage of having to obtain the number of clusters of data sets in advance and the sensitivity to selecting initial clustering centers in the k-means algorithm, an improved k-means clustering algorithm is proposed. First, the concept of a silhouette coefficient is introduced, and the optimal clustering number Kopt of a data set with unknown class information is confirmed by calculating the silhouette coefficient of objects in clusters under different K values. Then the distribution of the data set is obtained through hierarchical clustering and the initial clustering-centers are confirmed. Finally, the clustering is completed by the traditional k-means clustering. By the theoretical analysis, it is proved that the improved k-means clustering algorithm has proper computational complexity. The experimental results of IRIS testing data set show that the algorithm can distinguish different clusters reasonably and recognize the outliers efficiently, and the entropy generated by the algorithm is lower.
文摘Cluster-basedmodels have numerous application scenarios in vehicular ad-hoc networks(VANETs)and can greatly help improve the communication performance of VANETs.However,the frequent movement of vehicles can often lead to changes in the network topology,thereby reducing cluster stability in urban scenarios.To address this issue,we propose a clustering model based on the density peak clustering(DPC)method and sparrow search algorithm(SSA),named SDPC.First,the model constructs a fitness function based on the parameters obtained from the DPC method and deploys the SSA for iterative optimization to select cluster heads(CHs).Then,the vehicles that have not been selected as CHs are assigned to appropriate clusters by comprehensively considering the distance parameter and link-reliability parameter.Finally,cluster maintenance strategies are considered to tackle the changes in the clusters’organizational structure.To verify the performance of the model,we conducted a simulation on a real-world scenario for multiple metrics related to clusters’stability.The results show that compared with the APROVE and the GAPC,SDPC showed clear performance advantages,indicating that SDPC can effectively ensure VANETs’cluster stability in urban scenarios.
文摘Reliable Cluster Head(CH)selectionbased routing protocols are necessary for increasing the packet transmission efficiency with optimal path discovery that never introduces degradation over the transmission reliability.In this paper,Hybrid Golden Jackal,and Improved Whale Optimization Algorithm(HGJIWOA)is proposed as an effective and optimal routing protocol that guarantees efficient routing of data packets in the established between the CHs and the movable sink.This HGJIWOA included the phases of Dynamic Lens-Imaging Learning Strategy and Novel Update Rules for determining the reliable route essential for data packets broadcasting attained through fitness measure estimation-based CH selection.The process of CH selection achieved using Golden Jackal Optimization Algorithm(GJOA)completely depends on the factors of maintainability,consistency,trust,delay,and energy.The adopted GJOA algorithm play a dominant role in determining the optimal path of routing depending on the parameter of reduced delay and minimal distance.It further utilized Improved Whale Optimisation Algorithm(IWOA)for forwarding the data from chosen CHs to the BS via optimized route depending on the parameters of energy and distance.It also included a reliable route maintenance process that aids in deciding the selected route through which data need to be transmitted or re-routed.The simulation outcomes of the proposed HGJIWOA mechanism with different sensor nodes confirmed an improved mean throughput of 18.21%,sustained residual energy of 19.64%with minimized end-to-end delay of 21.82%,better than the competitive CH selection approaches.
文摘Data clustering is an essential technique for analyzing complex datasets and continues to be a central research topic in data analysis.Traditional clustering algorithms,such as K-means,are widely used due to their simplicity and efficiency.This paper proposes a novel Spiral Mechanism-Optimized Phasmatodea Population Evolution Algorithm(SPPE)to improve clustering performance.The SPPE algorithm introduces several enhancements to the standard Phasmatodea Population Evolution(PPE)algorithm.Firstly,a Variable Neighborhood Search(VNS)factor is incorporated to strengthen the local search capability and foster population diversity.Secondly,a position update model,incorporating a spiral mechanism,is designed to improve the algorithm’s global exploration and convergence speed.Finally,a dynamic balancing factor,guided by fitness values,adjusts the search process to balance exploration and exploitation effectively.The performance of SPPE is first validated on CEC2013 benchmark functions,where it demonstrates excellent convergence speed and superior optimization results compared to several state-of-the-art metaheuristic algorithms.To further verify its practical applicability,SPPE is combined with the K-means algorithm for data clustering and tested on seven datasets.Experimental results show that SPPE-K-means improves clustering accuracy,reduces dependency on initialization,and outperforms other clustering approaches.This study highlights SPPE’s robustness and efficiency in solving both optimization and clustering challenges,making it a promising tool for complex data analysis tasks.
基金supported by Institute of Information&Communications Technology Planning&Evaluation(IITP)grant funded by the Korea government(MSIT)(No.RS-2024-00337489Development of Data Drift Management Technology to Overcome Performance Degradation of AI Analysis Models).
文摘As vehicular networks grow increasingly complex due to high node mobility and dynamic traffic conditions,efficient clustering mechanisms are vital to ensure stable and scalable communication.Recent studies have emphasized the need for adaptive clustering strategies to improve performance in Intelligent Transportation Systems(ITS).This paper presents the Grasshopper Optimization Algorithm for Vehicular Network Clustering(GOAVNET)algorithm,an innovative approach to optimal vehicular clustering in Vehicular Ad-Hoc Networks(VANETs),leveraging the Grasshopper Optimization Algorithm(GOA)to address the critical challenges of traffic congestion and communication inefficiencies in Intelligent Transportation Systems(ITS).The proposed GOA-VNET employs an iterative and interactive optimization mechanism to dynamically adjust node positions and cluster configurations,ensuring robust adaptability to varying vehicular densities and transmission ranges.Key features of GOA-VNET include the utilization of attraction zone,repulsion zone,and comfort zone parameters,which collectively enhance clustering efficiency and minimize congestion within Regions of Interest(ROI).By managing cluster configurations and node densities effectively,GOA-VNET ensures balanced load distribution and seamless data transmission,even in scenarios with high vehicular densities and varying transmission ranges.Comparative evaluations against the Whale Optimization Algorithm(WOA)and Grey Wolf Optimization(GWO)demonstrate that GOA-VNET consistently outperforms these methods by achieving superior clustering efficiency,reducing the number of clusters by up to 10%in high-density scenarios,and improving data transmission reliability.Simulation results reveal that under a 100-600 m transmission range,GOA-VNET achieves an average reduction of 8%-15%in the number of clusters and maintains a 5%-10%improvement in packet delivery ratio(PDR)compared to baseline algorithms.Additionally,the algorithm incorporates a heat transfer-inspired load-balancing mechanism,ensuring equitable distribution of nodes among cluster leaders(CLs)and maintaining a stable network environment.These results validate GOA-VNET as a reliable and scalable solution for VANETs,with significant potential to support next-generation ITS.Future research could further enhance the algorithm by integrating multi-objective optimization techniques and exploring broader applications in complex traffic scenarios.
基金supported by the National Key Research and Development Program of China(Grant Nos.2021YFC3000705 and 2021YFC3000705-05)the National Natural Science Foundation of China(Grant No.42074049)the Youth Innovation Promotion Association of the Chinese Academy of Sciences(Grant No.2023471).
文摘We propose a robust earthquake clustering method:the Bayesian Gaussian mixture model with nearest-neighbor distance(BGMM-NND)algorithm.Unlike the conventional nearest neighbor distance method,the BGMM-NND algorithm eliminates the need for hyperparameter tuning or reliance on fixed thresholds,offering enhanced flexibility for clustering across varied seismic scales.By integrating cumulative probability and BGMM with principal component analysis(PCA),the BGMM-NND algorithm effectively distinguishes between background and triggered earthquakes while maintaining the magnitude component and resolving the issue of excessively large spatial cluster domains.We apply the BGMM-NND algorithm to the Sichuan–Yunnan seismic catalog from 1971 to 2024,revealing notable variations in earthquake frequency,triggering characteristics,and recurrence patterns across different fault zones.Distinct clustering and triggering behaviors are identified along different segments of the Longmenshan Fault.Multiple seismic modes,namely,the short-distance mode,the medium-distance mode,the repeating-like mode,the uniform background mode,and the Wenchuan mode,are uncovered.The algorithm's flexibility and robust performance in earthquake clustering makes it a valuable tool for exploring seismicity characteristics,offering new insights into earthquake clustering and the spatiotemporal patterns of seismic activity.
基金funded by the Natural Science Foundation of Xinjiang Uygur Autonomous Region:No.22D01B148Bidding Topics for the Center for Integration of Education and Production and Development of New Business in 2024:No.2024-KYJD05+1 种基金Basic Scientific Research Business Fee Project of Colleges and Universities in Autonomous Region:No.XJEDU2025P126Xinjiang College of Science&Technology School-level Scientific Research Fund Project:No.2024-KYTD01.
文摘Wireless Sensor Networks(WSNs),as a crucial component of the Internet of Things(IoT),are widely used in environmental monitoring,industrial control,and security surveillance.However,WSNs still face challenges such as inaccurate node clustering,low energy efficiency,and shortened network lifespan in practical deployments,which significantly limit their large-scale application.To address these issues,this paper proposes an Adaptive Chaotic Ant Colony Optimization algorithm(AC-ACO),aiming to optimize the energy utilization and system lifespan of WSNs.AC-ACO combines the path-planning capability of Ant Colony Optimization(ACO)with the dynamic characteristics of chaotic mapping and introduces an adaptive mechanism to enhance the algorithm’s flexibility and adaptability.By dynamically adjusting the pheromone evaporation factor and heuristic weights,efficient node clustering is achieved.Additionally,a chaotic mapping initialization strategy is employed to enhance population diversity and avoid premature convergence.To validate the algorithm’s performance,this paper compares AC-ACO with clustering methods such as Low-Energy Adaptive Clustering Hierarchy(LEACH),ACO,Particle Swarm Optimization(PSO),and Genetic Algorithm(GA).Simulation results demonstrate that AC-ACO outperforms the compared algorithms in key metrics such as energy consumption optimization,network lifetime extension,and communication delay reduction,providing an efficient solution for improving energy efficiency and ensuring long-term stable operation of wireless sensor networks.
基金supported by the National Natural Science Foundation of China(Grant No.42407232)the Sichuan Science and Technology Program(Grant No.2024NSFSC0826).
文摘Recognizing discontinuities within rock masses is a critical aspect of rock engineering.The development of remote sensing technologies has significantly enhanced the quality and quantity of the point clouds collected from rock outcrops.In response,we propose a workflow that balances accuracy and efficiency to extract discontinuities from massive point clouds.The proposed method employs voxel filtering to downsample point clouds,constructs a point cloud topology using K-d trees,utilizes principal component analysis to calculate the point cloud normals,and employs the pointwise clustering(PWC)algorithm to extract discontinuities from rock outcrop point clouds.This method provides information on the location and orientation(dip direction and dip angle)of the discontinuities,and the modified whale optimization algorithm(MWOA)is utilized to identify major discontinuity sets and their average orientations.Performance evaluations based on three real cases demonstrate that the proposed method significantly reduces computational time costs without sacrificing accuracy.In particular,the method yields more reasonable extraction results for discontinuities with certain undulations.The presented approach offers a novel tool for efficiently extracting discontinuities from large-scale point clouds.
基金supported by the National Key Research and Development Program of China(2023YFB3307801)the National Natural Science Foundation of China(62394343,62373155,62073142)+3 种基金Major Science and Technology Project of Xinjiang(No.2022A01006-4)the Programme of Introducing Talents of Discipline to Universities(the 111 Project)under Grant B17017the Fundamental Research Funds for the Central Universities,Science Foundation of China University of Petroleum,Beijing(No.2462024YJRC011)the Open Research Project of the State Key Laboratory of Industrial Control Technology,China(Grant No.ICT2024B70).
文摘The distillation process is an important chemical process,and the application of data-driven modelling approach has the potential to reduce model complexity compared to mechanistic modelling,thus improving the efficiency of process optimization or monitoring studies.However,the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals,which brings challenges to accurate data-driven modelling of distillation processes.This paper proposes a systematic data-driven modelling framework to solve these problems.Firstly,data segment variance was introduced into the K-means algorithm to form K-means data interval(KMDI)clustering in order to cluster the data into perturbed and steady state intervals for steady-state data extraction.Secondly,maximal information coefficient(MIC)was employed to calculate the nonlinear correlation between variables for removing redundant features.Finally,extreme gradient boosting(XGBoost)was integrated as the basic learner into adaptive boosting(AdaBoost)with the error threshold(ET)set to improve weights update strategy to construct the new integrated learning algorithm,XGBoost-AdaBoost-ET.The superiority of the proposed framework is verified by applying this data-driven modelling framework to a real industrial process of propylene distillation.
基金Natural Science Research Project of Education Department of Anhui Province of China,Grant/Award Number:2023AH051020Key Project of Anhui Province's Science and Technology Innovation Tackle Plan,Grant/Award Number:202423k09020040+3 种基金National Key Research and Development Program of China,Grant/Award Number:2023YFD1802200Natural Science Foundation of Anhui Province,Grant/Award Number:2308085MF21National Natural Science Foundation of China,Grant/Award Numbers:32472007,62301006,62306008University Synergy Innovation Program of Anhui Province,Grant/Award Number:GXXT-2022-046。
文摘In the era of big data,personalised recommendation systems are essential for enhancing user engagement and driving business growth.However,traditional recommendation algorithms,such as collaborative filtering,face significant challenges due to data sparsity,algorithm scalability,and the difficulty of adapting to dynamic user preferences.These limitations hinder the ability of systems to provide highly accurate and personalised recommendations.To address these challenges,this paper proposes a clustering-based recommendation method that integrates an enhanced Grasshopper Optimisation Algorithm(GOA),termed LCGOA,to improve the accuracy and efficiency of recommendation systems by optimising cluster centroids in a dynamic environment.By combining the K-means algorithm with the enhanced GOA,which incorporates a Lévy flight mechanism and multi-strategy co-evolution,our method overcomes the centroid sensitivity issue,a key limitation in traditional clustering techniques.Experimental results across multiple datasets show that the proposed LCGOA-based method significantly outperforms conventional recommendation algorithms in terms of recommendation accuracy,offering more relevant content to users and driving greater customer satisfaction and business growth.
文摘在高压并联电抗器声纹信号监测系统中,长时海量无标签声纹的高维非平稳性导致特征提取困难、无监督聚类适应性差。由此提出了一种基于深度自适应K-means++算法(deep adaptive K-means++clustering algorithm,DAKCA)的750 kV电抗器声纹聚类方法。首先通过采用两阶段无监督策略微调的改进堆叠稀疏自编码器(stacked sparse autoencoder,SSAE),对快速傅里叶变换后的归一化频域数据提取电抗器原始声纹32维深度特征。进一步提出了依据最近邻聚类有效性指标(clustering validation index based on nearest neighbors,CVNN)的自适应K-means++聚类算法,构建了能自适应确定最优聚类个数的电抗器声纹聚类模型。最后通过西北地区某750 kV电抗器实测声纹数据集进行了验证。结果表明,DAKCA算法对无标签声纹数据在不同样本均衡程度下能够稳定提取32维深度特征,并实现最优聚类,为直接高效利用电抗器无标签声纹数据提供了参考。
文摘The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial seeds, particularly in complex datasets or datasets with non-spherical clusters. In this paper, a Comprehensive K-Means Clustering algorithm is presented, in which multiple trials of k-means are performed on a given dataset. The clustering results from each trial are transformed into a five-dimensional data point, containing the scope values of the x and y coordinates of the clusters along with the number of points within that cluster. A graph is then generated displaying the configuration of these points using Principal Component Analysis (PCA), from which we can observe and determine the common clustering patterns in the dataset. The robustness and strength of these patterns are then examined by observing the variance of the results of each trial, wherein a different subset of the data keeping a certain percentage of original data points is clustered. By aggregating information from multiple trials, we can distinguish clusters that consistently emerge across different runs from those that are more sensitive or unlikely, hence deriving more reliable conclusions about the underlying structure of complex datasets. Our experiments show that our algorithm is able to find the most common associations between different dimensions of data over multiple trials, often more accurately than other algorithms, as well as measure stability of these clusters, an ability that other k-means algorithms lack.
文摘In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call enhanced k-means algorithm. This algorithm is easy to implement, requiring a simple data structure to keep some information in each iteration to be used in the next iteration. Our experimental results demonstrated that our scheme can improve the computational speed of the k-means algorithm by the magnitude in the total number of distance calculations and the overall time of computation.
基金supported in part by the National Natural Science Foundation of China under Grand No.61871129 and No.61301179Projects of Science and Technology Plan Guangdong Province under Grand No.2014A010101284
文摘For the existing support vector machine, when recognizing more questions, the shortcomings of high computational complexity and low recognition rate under the low SNR are emerged. The characteristic parameter of the signal is extracted and optimized by using a clustering algorithm, support vector machine is trained by grading algorithm so as to enhance the rate of convergence, improve the performance of recognition under the low SNR and realize modulation recognition of the signal based on the modulation system of the constellation diagram in this paper. Simulation results show that the average recognition rate based on this algorithm is enhanced over 30% compared with methods that adopting clustering algorithm or support vector machine respectively under the low SNR. The average recognition rate can reach 90% when the SNR is 5 dB, and the method is easy to be achieved so that it has broad application prospect in the modulating recognition.
基金Supported by the National Natural Science Foundation of China(61273209)
文摘Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set usually has a few different values, in which the conventional fuzzy sets are invalid. Hesitant fuzzy sets are a powerful tool to treat this case. The present paper focuses on investigating the clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.
基金This research was funded by the Science and Technology Support Plan Project of Hebei Province(grant numbers 17210803D and 19273703D)the Science and Technology Spark Project of the Hebei Seismological Bureau(grant number DZ20180402056)+1 种基金the Education Department of Hebei Province(grant number QN2018095)the Polytechnic College of Hebei University of Science and Technology.
文摘In a large-scale wireless sensor network(WSN),densely distributed sensor nodes process a large amount of data.The aggregation of data in a network can consume a great amount of energy.To balance and reduce the energy consumption of nodes in a WSN and extend the network life,this paper proposes a nonuniform clustering routing algorithm based on the improved K-means algorithm.The algorithm uses a clustering method to form and optimize clusters,and it selects appropriate cluster heads to balance network energy consumption and extend the life cycle of the WSN.To ensure that the cluster head(CH)selection in the network is fair and that the location of the selected CH is not concentrated within a certain range,we chose the appropriate CH competition radius.Simulation results show that,compared with LEACH,LEACH-C,and the DEEC clustering algorithm,this algorithm can effectively balance the energy consumption of the CH and extend the network life.
文摘Classification systems such as Slope Mass Rating(SMR) are currently being used to undertake slope stability analysis. In SMR classification system, data is allocated to certain classes based on linguistic and experience-based criteria. In order to eliminate linguistic criteria resulted from experience-based judgments and account for uncertainties in determining class boundaries developed by SMR system,the system classification results were corrected using two clustering algorithms, namely K-means and fuzzy c-means(FCM), for the ratings obtained via continuous and discrete functions. By applying clustering algorithms in SMR classification system, no in-advance experience-based judgment was made on the number of extracted classes in this system, and it was only after all steps of the clustering algorithms were accomplished that new classification scheme was proposed for SMR system under different failure modes based on the ratings obtained via continuous and discrete functions. The results of this study showed that, engineers can achieve more reliable and objective evaluations over slope stability by using SMR system based on the ratings calculated via continuous and discrete functions.