In machine vision, elliptical targets frequently appear within the camera's region of interest (ROI). Ellipse detection is essential for shape detection and geometric measurement. However, existing ellipse detection algorithms often suffer from high computational complexity, strong dependence on initial conditions, sensitivity to noise, and lack of robustness to occlusion. In this paper, we propose a fast and robust ellipse detection method to address these challenges. The method first uses edge gradient and curvature information to segment each curve into circular arcs. Next, based on the convexity of the arcs, it assigns them to different quadrants of the ellipse, then groups and fits the arcs according to multiple geometric constraints at low computational cost. Finally, it reduces the parameter space for hierarchical clustering and segments the complete ellipse into several sectors for verification. We evaluate our method on seven datasets, including five public image datasets and two from industrial camera scenes. Experimental results show that our method achieves a precision of 67.1% to 98.9%, a recall of 48.1% to 92.9%, and an F-measure of 58.0% to 95.8%. The average execution time per image ranges from 25 ms to 192 ms, demonstrating both high accuracy and efficiency.
Low visibility conditions, particularly those caused by fog, significantly affect road safety and reduce drivers' ability to see ahead clearly. Conventional approaches to this problem rely primarily on instrument-based and fixed-threshold-based theoretical frameworks, which adapt poorly and perform worse under varying environmental conditions. To overcome these challenges, we propose a real-time visibility estimation model that uses roadside CCTV cameras to monitor and identify visibility levels under different weather conditions. The proposed method begins by identifying specific regions of interest (ROI) in the CCTV images and extracts features such as the number of lines and contours detected within these regions. These features are then fed to the proposed hierarchical clustering model, which classifies them into different visibility levels without predefined rules or threshold values. We used two distance metrics, dynamic time warping (DTW) and Euclidean distance, with the proposed hierarchical clustering model and recorded its performance on several evaluation measures. The proposed model achieved an average accuracy of 97.81%, precision of 91.31%, recall of 91.25%, and F1-score of 91.27% using the DTW distance metric. We also ran experiments with deep learning (DL)-based models from the literature and compared their performance with the proposed model. The experimental results demonstrate that the proposed model is more adaptable and consistent than the methods used in the literature. The proposed method provides drivers with real-time, accurate visibility information and enhances road safety during low visibility conditions.
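The two distance metrics named in the visibility abstract can be sketched as follows. This is a minimal illustration on plain numeric sequences, assuming the textbook DTW recurrence with an absolute-difference local cost; the function names `dtw_distance` and `euclidean_distance` are hypothetical, and the paper's actual feature sequences and any warping-window settings are not given in the abstract.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    Assumption: the local cost is |a[i] - b[j]| and no warping window is used.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] is matched again with an earlier b element
                cost[i][j - 1],      # b[j-1] is matched again with an earlier a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def euclidean_distance(a, b):
    """Euclidean distance; unlike DTW it requires equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

DTW tolerates sequences that are stretched or shifted in time, which Euclidean distance does not, so the two can rank the same pair of feature sequences quite differently.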
AIM: To evaluate long-term visual field (VF) prediction using K-means clustering in patients with primary open angle glaucoma (POAG). METHODS: Patients who had undergone at least ten 24-2 VF tests were included in this study. Using 52 total deviation values (TDVs) from the first 10 VF tests of the training dataset, VF points were clustered into several regions using hierarchical ordered partitioning and collapsing hybrid (HOPACH) and K-means clustering. Based on the clustering results, a linear regression analysis was applied to each clustered region of the testing dataset to predict the TDVs of the 10th VF test. Three to nine VF tests were used to predict the 10th VF test, and the prediction errors (root mean square error, RMSE) of each clustering method and pointwise linear regression (PLR) were compared. RESULTS: The training group consisted of 228 patients (mean age, 54.20±14.38 y; 123 males and 105 females), and the testing group included 81 patients (mean age, 54.88±15.22 y; 43 males and 38 females). All subjects were diagnosed with POAG. Fifty-two VF points were clustered into 11 and nine regions using HOPACH and K-means clustering, respectively. K-means clustering had a lower prediction error than PLR when n=1:3 and 1:4 (both P≤0.003). The prediction errors of K-means clustering were lower than those of HOPACH in all sections (n=1:4 to 1:9; all P≤0.011), except for n=1:3 (P=0.680). PLR outperformed K-means clustering only when n=1:8 and 1:9 (both P≤0.020). CONCLUSION: K-means clustering can predict long-term VF test results more accurately in patients with POAG with limited VF data.
Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set often takes several different values, a situation that conventional fuzzy sets cannot handle. Hesitant fuzzy sets are a powerful tool for treating this case. The present paper investigates a clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm, which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.
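The idea of seeding K-means with the output of hierarchical clustering can be sketched as follows. This is only an illustration on ordinary numeric points with Euclidean distance and centroid linkage, both of which are assumptions; the paper works on hesitant fuzzy sets with its own distance measure, which the abstract does not specify. The names `agglomerate` and `kmeans_from_hierarchy` are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def agglomerate(points, k):
    """Merge the two clusters with the closest centroids until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def kmeans_from_hierarchy(points, k, max_iter=100):
    """Lloyd's K-means whose initial centers come from agglomerative clustering."""
    centers = [centroid(c) for c in agglomerate(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move either
        labels = new_labels
        # Update step: recompute each center; keep the old one if a cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = centroid(members)
    return labels, centers
```

Compared with random initialization, starting from hierarchical clusters makes the result deterministic and tends to avoid poor local optima, at the cost of the quadratic pairwise search in `agglomerate`.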
Traditional Chinese medicine (TCM) has played a significant role in the prevention and treatment of chronic heart failure (CHF). To study the TCM diagnosis of CHF, a total of 278 Chinese clinical research articles on CHF syndromes published in the last 40 years were retrieved from Web of Science, Scopus, PubMed, Embase, CNKI, Wanfang Data, CQVIP, and SinoMed. Using cumulative frequency analysis, network analysis, and hierarchical cluster analysis, the study found that the CHF syndromes were distributed as follows: syndrome of qi deficiency with blood stasis, syndrome of qi and yin deficiency, syndrome of yang deficiency with water flooding, syndrome of heart blood stasis obstruction, syndrome of turbid phlegm, and syndrome of collapse due to primordial yang deficiency. The syndrome elements for location of illness were heart, kidney, lung, and spleen. The syndrome elements for nature of illness were qi deficiency, blood stasis, yang deficiency, yin deficiency, water retention, and turbid phlegm. These findings can serve as a reference for research on the diagnosis and treatment of CHF, and contribute to the study of syndrome standardization and objective research on TCM diagnosis.
Single-pass clustering is commonly used in topic detection and tracking (TDT) because of its simplicity, high efficiency, and low cost. On large-scale data, however, its time cost rises sharply and its clustering performance degrades considerably. To address this problem, a hierarchical clustering algorithm based on single-pass is proposed, which draws on hierarchical and concurrent ideas to divide the clustering process into three stages. News reports are first classified into different categories. Two single-pass clustering passes are then run within each category, followed by one agglomerative clustering across categories. In addition, to capture semantic similarity between news reports, the topic model is improved using named entities. Experimental results show that the proposed method both accelerates the process and improves clustering performance.
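The basic single-pass step that the TDT abstract builds on can be sketched as follows. This is a generic illustration, assuming documents are already represented as equal-length numeric vectors and compared by cosine similarity against running cluster centroids; the function names and the threshold parameter are hypothetical, and the paper's three-stage pipeline and entity-based topic model are not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def single_pass(vectors, threshold):
    """Assign each vector, in arrival order, to the most similar existing
    cluster if the similarity reaches `threshold`; otherwise open a new cluster.

    Returns a list of clusters, each a list of input indices.
    """
    clusters = []  # each entry: {"centroid": [...], "members": [indices]}
    for idx, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": list(vec), "members": [idx]})
        else:
            n = len(best["members"])
            # Incremental mean keeps the centroid up to date in O(dim) per document.
            best["centroid"] = [
                (m * n + x) / (n + 1) for m, x in zip(best["centroid"], vec)
            ]
            best["members"].append(idx)
    return [c["members"] for c in clusters]
```

Every incoming document is compared with every existing cluster, so the cost grows with the number of clusters. That growth is what motivates splitting the stream into categories before clustering.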
An intuitionistic fuzzy set (IFS) is a set of 2-tuple arguments, each of which is characterized by a membership degree and a nonmembership degree. The generalized form of the IFS is the interval-valued intuitionistic fuzzy set (IVIFS), whose components are intervals rather than exact numbers. IFSs and IVIFSs have been found to be very useful for describing vagueness and uncertainty. However, little attention seems to have been paid to the clustering analysis of IFSs and IVIFSs. An intuitionistic fuzzy hierarchical algorithm is introduced for clustering IFSs, which is based on the traditional hierarchical clustering procedure, the intuitionistic fuzzy aggregation operator, and the basic distance measures between IFSs: the Hamming distance, the normalized Hamming distance, the weighted Hamming distance, the Euclidean distance, the normalized Euclidean distance, and the weighted Euclidean distance. Subsequently, the algorithm is extended to clustering IVIFSs. Finally, the algorithm and its extended form are applied to the classification of building materials and enterprises, respectively.
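Two of the IFS distance measures listed in the abstract can be sketched as follows. The abstract does not give the formulas, so this assumes one commonly used form in which the hesitancy degree π = 1 − μ − ν is included alongside the membership degree μ and the nonmembership degree ν; the function names are hypothetical.

```python
import math


def _hesitancy(mu, nu):
    """Hesitancy degree of one IFS element; requires 0 <= mu, nu and mu + nu <= 1."""
    if mu < 0 or nu < 0 or mu + nu > 1 + 1e-12:
        raise ValueError("need mu >= 0, nu >= 0 and mu + nu <= 1")
    return 1.0 - mu - nu


def normalized_hamming(a, b):
    """Normalized Hamming distance between two IFSs over the same n elements.

    a and b are lists of (mu, nu) pairs. Assumed form:
    d = (1 / 2n) * sum(|d_mu| + |d_nu| + |d_pi|).
    """
    if len(a) != len(b) or not a:
        raise ValueError("IFSs must be non-empty and of equal length")
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        pi_a, pi_b = _hesitancy(mu_a, nu_a), _hesitancy(mu_b, nu_b)
        total += abs(mu_a - mu_b) + abs(nu_a - nu_b) + abs(pi_a - pi_b)
    return total / (2 * len(a))


def normalized_euclidean(a, b):
    """Normalized Euclidean counterpart: sqrt((1 / 2n) * sum of squared differences)."""
    if len(a) != len(b) or not a:
        raise ValueError("IFSs must be non-empty and of equal length")
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        pi_a, pi_b = _hesitancy(mu_a, nu_a), _hesitancy(mu_b, nu_b)
        total += (mu_a - mu_b) ** 2 + (nu_a - nu_b) ** 2 + (pi_a - pi_b) ** 2
    return math.sqrt(total / (2 * len(a)))
```

Both functions return values in [0, 1], so either can be plugged directly into an agglomerative procedure as the dissimilarity between two IFSs.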
Banana is an important crop grown in Oman, and there is a dearth of information on its genetic diversity to assist crop breeding and improvement programs. This study employed amplified fragment length polymorphism (AFLP) to investigate the genetic variation in local banana cultivars from the southern region of Oman. Using 12 primer combinations, a total of 1094 bands were scored, of which 1012 were polymorphic. Eighty-two unique markers were identified, which revealed the distinct separation of the seven cultivars. The results show that AFLP can be used to differentiate the banana cultivars. Further classification by phylogenetic, hierarchical clustering, and principal component analyses showed significant differences between the clusters found with molecular markers and the clusters created by previous studies using morphological analysis. Based on the analytical results, a consensus dendrogram of the banana cultivars is presented.
News feeds are a valuable source of updates on topics across different domains. These updates need to be collected because users interested in a specific domain want the important updates in that domain, organized from various sources. In this paper, a news summarization system is proposed for news data streams from RSS feeds and Google News. Since news stream analysis requires live content, the news data are collected continuously for our experiments. The major contributions of this work are domain-corpus-based news collection, news content extraction, hierarchical clustering of the news, and summarization of the news. Many existing news summarization systems fall short in providing dynamic content with domain-wise representation. Our proposed system addresses this by tagging the news feed with domain corpora and organizing the news streams in a hierarchical structure with topic-wise representation. The news streams are then summarized for users with a novel summarization algorithm. The proposed system generates topic-wise summaries effectively, and no system in the literature has handled news summarization by collecting the data dynamically and organizing the content hierarchically. The proposed system is compared with existing systems and achieves better results in generating news summaries. Online news content editors benefit greatly from this system, as they can instantly obtain news summaries for their domains of interest.
Developing an efficient algorithm for large-scale classification problems is a challenging topic in many applications of machine learning. In this paper, a support vector machine (SVM) algorithm based on hierarchical clustering and fixed-layer local learning (HCFLL) is proposed to deal with this problem. First, HCFLL hierarchically clusters a given dataset into a modified clustering feature tree, based on the ideas of unsupervised and supervised clustering. Then it locally trains an SVM on each labeled subtree at a fixed layer of the tree. The experimental results show that, compared with popular existing algorithms such as the core vector machine and the decision-tree support vector machine, HCFLL can significantly improve training and testing speeds with comparable testing accuracy.
This article proposes a novel stable clustering design method for hierarchical satellite networks, based on a 5-dimensional vector model, in order to increase network stability, reduce storage overhead, and control delay performance effectively. According to the stability measurement function and subject to the constraint of minimal average routing table length, the hierarchical satellite network is grouped into separate stable connected clusters to improve destruction resistance and reconstruction ability in the future integrated network. In each cluster, redundant communication links that contribute little to network stability and have only a slight influence on delay variation are deleted, so that the requirements for stability and connectivity are met with optimal link resources. The idea of logical weight is also introduced to select the optimal satellites for communicating with satellites in neighboring clusters. Finally, the feasibility and effectiveness of the proposed method are verified by comparing its simulated performance with that of two other typical hierarchical satellite networks, double layer satellite constellation (DLSC) and satellite over satellite (SOS).
Graphical representation of hierarchical clustering results is of vital importance in the hierarchical cluster analysis of data. Unfortunately, almost all mathematical and statistical software is weak at showcasing such clustering results. In particular, most clustering results or trees cannot be drawn as a dendrogram in a resizable, rescalable, and free-style fashion. By using "dynamic" rather than "static" drawing, this research works around the weaknesses that restrict arbitrary visualization of clustering results. It introduces an algorithmic solution that adopts seamless pixel rearrangement to resize and rescale dendrograms or tree diagrams. The results showed that the algorithm makes the representation of clustering outcomes a truly free visualization for hierarchical clustering and bioinformatics analysis. In particular, it can selectively visualize and/or save results in a specific size, scale, and style (different views).
As an important branch of machine learning, clustering analysis is widely used in fields such as image pattern recognition, social network analysis, and information security. In this paper, we consider the design of clustering algorithms in the quantum scenario and propose a quantum hierarchical agglomerative clustering algorithm based on the one-dimensional discrete quantum walk with single-point phase defects. The proposed algorithm exploits two nonclassical characteristics of this kind of quantum walk, localization and ballistic effects. First, each data point is viewed as a particle and undergoes this kind of quantum walk with a parameter determined by its neighbors. The particles are then measured in the computational basis. According to the measurement result, every attribute value of the corresponding data point is modified appropriately. In this way, each data point interacts with its neighbors and moves toward a certain center point. This process is repeated several times until similar data points cluster together and form distinct classes. Simulation experiments on synthetic and real-world data demonstrate the effectiveness of the presented algorithm. Compared with some classical algorithms, the proposed algorithm achieves better clustering results. Moreover, when combined with a quantum cluster assignment method, the presented algorithm can speed up the computation.
The complexity of large-scale network systems made of a large number of nonlinearly interconnected components is a restrictive facet for their modeling and analysis. In this paper, we propose a framework for hierarchical modeling of a complex network system, based on a recursive unsupervised spectral clustering method. The hierarchical model serves the purpose of facilitating the management of complexity in the analysis of real-world critical infrastructures. We exemplify this by referring to the reliability analysis of the 380 kV Italian Power Transmission Network (IPTN). In this analysis, the classical component Importance Measures (IMs) of reliability theory have been extended to render them compatible with and applicable to a complex distributed network system. By utilizing these extended IMs, the reliability properties of the IPTN system can be evaluated in the framework of the hierarchical system model, with the aim of providing risk managers with information on the risk/safety significance of system structures and components.
Location selection is a key issue in the construction of electric vehicle charging stations. It involves two problems: determining the location of a charging station, and evaluating that location. To determine the charging station location, a spatial clustering algorithm is proposed and implemented. An example simulation shows the effectiveness of the spatial clustering algorithm. To evaluate the charging station location, a multi-hierarchical fuzzy method is proposed. Based on the location factors of electric vehicle charging stations, a hierarchical evaluation structure is constructed with three levels, 4 first-class factors, and 14 second-class factors. The fuzzy multi-hierarchical evaluation model and algorithm are built. The analysis results show that the multi-hierarchical fuzzy method can reasonably complete the evaluation of electric vehicle charging station locations.
Hierarchical clustering analysis based on statistics is one of the most important mining algorithms, but the traditional hierarchical clustering method relies on global comparison and in practice performs only Q clustering while ignoring R clustering, so it has limitations, especially when the numbers of samples and indexes are very large. Furthermore, because it ignores the associations between different indexes, the clustering result is not accurate. In this paper, we present the model and algorithm of two-level hierarchical clustering, which integrates Q clustering with R clustering. Because two-level hierarchical clustering is based on the separate clustering result for each class of indexes, the classification of the indexes directly affects the accuracy of the final clustering result, so how to classify the indexes appropriately is the chief and most difficult problem to be handled in advance. Although some of the literature has addressed the classification of indexes, those articles classify the indexes only according to their superficial meaning, which is unscientific, for the following reasons. First, the superficial meaning of an index can differ from reader to reader and is easily misunderstood. Moreover, this classification method seldom makes use of historical data, so the classification result is not objective. Second, the superficial meaning of some indexes conveys nothing, so they cannot be assigned to a class on that basis alone. Third, this classification method requires users to have a high level of domain knowledge, which is not always available; otherwise it is difficult for them to understand the meaning of some indexes. To address this question, we first use R clustering to cluster the indexes, dividing p-dimensional indexes into q classes, and then adopt the two-level clustering method to obtain the final result. The classification result is thus more objective and accurate. Moreover, after the first step, we obtain the relations among the different indexes and their interactions, and we can also see which samples cluster together under a given class of indexes. (These intermediate results are sometimes very useful.) The experiments also indicate the effectiveness and accuracy of the algorithm, and the result of R clustering can easily be reused in later practice.
We propose two models in this paper. The concept association model is put forward to obtain the co-occurrence relationships among keywords in the documents, and the hierarchical Hamming clustering model is used to reduce the dimensionality of the category feature vector space, which addresses the extremely high dimensionality of the documents' feature space. The experimental results indicate that the concept association model obtains the co-occurrence relations among keywords in the documents, which effectively improves the recall of the classification system. The hierarchical Hamming clustering model reduces the dimensionality of the category feature vector efficiently: the size of the vector space is only about 10% of the original dimensionality.
Key words: text classification; concept association; hierarchical clustering; Hamming clustering
CLC number: TN 915.08
Foundation item: Supported by the National 863 Project of China (2001AA142160, 2002AA145090)
Biography: Su Gui-yang (1974-), male, Ph.D. candidate; research interests: information filtering and text classification.
Network topology inference is one of the important applications of network tomography. Traditional network topology inference may disrupt normal network operation because it generates a large volume of data traffic. A unicast network topology inference method is proposed that uses time to live (TTL) for layering and classifies nodes layer by layer based on the similarity of node pairs. Finally, the method infers the logical network topology effectively through a self-adaptive combination of the previous results. Simulation results show that the proposed method achieves high topology inference accuracy while reducing measurement traffic, thus improving measurement efficiency.
For the load modeling of a large power grid, the large number of substations it covers must be segregated into several categories, after which a load model is built for each type. To address the problem of the skewed clustering tree in the classical hierarchical clustering method used for categorizing substations, a fair hierarchical clustering method is proposed in this paper. First, the fairness index is defined based on the Gini coefficient. Thereafter, a hierarchical clustering method is proposed based on the fairness index. Finally, the clustering results are evaluated using the contour coefficient and the t-SNE two-dimensional plane map. The substation clustering example of a real large power grid considered in this paper illustrates that the proposed fair hierarchical clustering method can effectively address the problem of the skewed clustering tree with high accuracy.
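How a Gini coefficient can expose a skewed clustering tree can be sketched as follows. The abstract says only that the fairness index is "defined based on the Gini coefficient", so the `fairness` function here (1 minus the Gini coefficient of the cluster sizes) is a hypothetical stand-in, not the paper's definition.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0.0 means all values are equal; values near 1.0 mean that almost all of
    the total is concentrated in a single entry.
    """
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def fairness(cluster_sizes):
    """Hypothetical fairness index: 1 - Gini of the cluster sizes (assumption)."""
    return 1.0 - gini(cluster_sizes)
```

A skewed tree that puts almost every substation in one cluster, for example sizes `[1, 1, 1, 97]`, scores much lower on this index than a balanced split such as `[25, 25, 25, 25]`, which is the kind of signal a merge criterion could use to penalize lopsided merges.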
In clustering algorithms, the selection of neighbors significantly affects the quality of the final clustering results. While various neighbor relationships exist, such as K-nearest neighbors, natural neighbors, and shared neighbors, most can handle only a single structural relationship, and their identification accuracy is low on datasets with multiple structures. In everyday life, people's first instinct when faced with something complex is to divide it into multiple parts. Partitioning the dataset into more sub-graphs is likewise a good approach to identifying complex structures. Taking inspiration from this, we propose a novel neighbor method: Shared Natural Neighbors (SNaN). To demonstrate the superiority of this neighbor method, we propose a shared natural neighbors-based hierarchical clustering algorithm for discovering arbitrary-shaped clusters (HC-SNaN). Our algorithm excels at identifying both spherical clusters and manifold clusters. Tested on synthetic and real-world datasets, HC-SNaN demonstrates significant advantages over existing clustering algorithms, particularly on datasets containing arbitrary shapes.
Funding (ellipse detection paper): Supported by the National Major Scientific Research Instrument Development Project of China (No. 51927804) and the Science Fund for Shaanxi Provincial Department of Education's Youth Innovation Team Research Plan (Grant No. 23JP169).
Funding (visual field prediction paper): Supported by the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), the Ministry of Health & Welfare, Republic of Korea (No. RS-2020-KH088726); the Patient-Centered Clinical Research Coordinating Center (PACEN), the Ministry of Health and Welfare, Republic of Korea (No. HC19C0276); and the National Research Foundation of Korea (NRF), the Korea Government (MSIT) (No. RS-2023-00247504).
Funding: Supported by the National Natural Science Foundation of China (61273209).
Abstract: Because of the limitations of and hesitation in one's knowledge, the membership degree of an element in a given set often takes several different values, a situation that conventional fuzzy sets cannot handle. Hesitant fuzzy sets are a powerful tool for treating this case. The present paper investigates a clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm, which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.
Funding: Financed by grants from the National Natural Science Foundation of China (No. 81803996) and the Shanghai Key Laboratory of Health Identification and Assessment (No. 21DZ2271000).
Abstract: Traditional Chinese medicine (TCM) has played a significant role in the prevention and treatment of chronic heart failure (CHF). To study the TCM diagnosis of CHF, a total of 278 Chinese clinical research articles on CHF syndromes from the last 40 years were retrieved from Web of Science, Scopus, PubMed, Embase, CNKI, Wanfang Data, CqVIP, and SinoMed. Using cumulative frequency analysis, network analysis, and hierarchical cluster analysis, the study found that the CHF syndromes were distributed as follows: syndrome of qi deficiency with blood stasis, syndrome of qi and yin deficiency, syndrome of yang deficiency with water flooding, syndrome of heart blood stasis obstruction, syndrome of turbid phlegm, and syndrome of collapse due to primordial yang deficiency. The syndrome elements for location of illness were heart, kidney, lung, and spleen. The syndrome elements for nature of illness were qi deficiency, blood stasis, yang deficiency, yin deficiency, water retention, and turbid phlegm. These findings can serve as a reference for research on the diagnosis and treatment of CHF and contribute to the study of syndrome standardization and objective research in TCM diagnosis.
Funding: Supported by the National Natural Science Foundation of China (No. 61502312); the Fundamental Research Funds for the Central Universities (No. 2017BQ024); the Natural Science Foundation of Guangdong Province (No. 2017A030310428); and the Science and Technology Program of Guangzhou (No. 201806020075, 20180210025).
Abstract: Single-pass is commonly used in topic detection and tracking (TDT) because of its simplicity, high efficiency and low cost. When dealing with large-scale data, however, its time cost increases sharply and its clustering performance suffers greatly. To address this problem, a hierarchical clustering algorithm based on single-pass is proposed, which draws on hierarchical and concurrent ideas to divide the clustering process into three stages. News reports are first classified into different categories. Two rounds of single-pass clustering are then run within each category, followed by one agglomerative clustering across categories. In addition, to capture semantic similarity in news reports, the topic model is improved based on named entities. Experimental results show that the proposed method can effectively accelerate the process as well as improve performance.
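For readers unfamiliar with the single-pass building block, the following is a minimal sketch of the plain single-pass step only, not the three-stage method proposed in the paper. It assumes bag-of-words term counts, cosine similarity and a fixed threshold of 0.5; the function names and the toy headlines are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def single_pass(docs, threshold=0.5):
    """Assign each document to the most similar existing cluster, or open a new one."""
    centroids, labels = [], []
    for text in docs:
        vec = Counter(text.lower().split())
        scores = [cosine(vec, c) for c in centroids]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            centroids[best].update(vec)  # centroid = summed term counts of its members
            labels.append(best)
        else:
            centroids.append(Counter(vec))
            labels.append(len(centroids) - 1)
    return labels


docs = [
    "quake hits coastal city",
    "coastal city quake damage",
    "team wins cup final",
    "cup final team celebrates",
]
print(single_pass(docs))  # [0, 0, 1, 1]
```

Each document is compared against every existing centroid, so the cost grows with the number of clusters; this is the scaling issue that motivates splitting the work by category first.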
Funding: Supported by the National Natural Science Foundation of China (70571087) and the National Science Fund for Distinguished Young Scholars of China (70625005).
Abstract: An intuitionistic fuzzy set (IFS) is a set of 2-tuple arguments, each of which is characterized by a membership degree and a nonmembership degree. The generalized form of the IFS is the interval-valued intuitionistic fuzzy set (IVIFS), whose components are intervals rather than exact numbers. IFSs and IVIFSs have been found to be very useful for describing vagueness and uncertainty. However, little attention seems to have been paid to the clustering analysis of IFSs and IVIFSs. An intuitionistic fuzzy hierarchical algorithm is introduced for clustering IFSs, which is based on the traditional hierarchical clustering procedure, the intuitionistic fuzzy aggregation operator, and the basic distance measures between IFSs: the Hamming distance, the normalized Hamming distance, the weighted Hamming distance, the Euclidean distance, the normalized Euclidean distance, and the weighted Euclidean distance. Subsequently, the algorithm is extended for clustering IVIFSs. Finally, the algorithm and its extended form are applied to the classification of building materials and of enterprises, respectively.
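As an illustration of one of the distance measures listed above, here is a minimal sketch of a normalized Hamming distance between two IFSs. It assumes the commonly used three-term form that includes the hesitancy degree π = 1 − μ − ν; the abstract does not say whether the paper uses this exact variant, and the function name is hypothetical.

```python
def normalized_hamming(a, b):
    """Normalized Hamming distance between two IFSs over the same universe.

    a, b: lists of (membership, non_membership) pairs, one pair per element.
    The hesitancy degree pi = 1 - mu - nu is included as a third term (assumed form).
    """
    if len(a) != len(b) or not a:
        raise ValueError("IFSs must be non-empty and defined on the same universe")
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        pi_a = 1.0 - mu_a - nu_a
        pi_b = 1.0 - mu_b - nu_b
        total += abs(mu_a - mu_b) + abs(nu_a - nu_b) + abs(pi_a - pi_b)
    return total / (2 * len(a))


full_member = [(1.0, 0.0)]
non_member = [(0.0, 1.0)]
print(normalized_hamming(full_member, non_member))   # 1.0, the maximum
print(normalized_hamming(full_member, full_member))  # 0.0
```

A hierarchical procedure would repeatedly merge the two clusters with the smallest such distance and then aggregate their members; the aggregation operator itself is not sketched here.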
Funding: Project supported by Programs of Sultan Qaboos University (Nos. SR/AGR/BIOR/05/01 and IG/AGR/PLANT/04/01), Sultanate of Oman, and the Research Chair in Postharvest Technology at the University of Stellenbosch, South Africa.
Abstract: Banana is an important crop grown in Oman, and there is a dearth of information on its genetic diversity to assist in crop breeding and improvement programs. This study employed amplified fragment length polymorphism (AFLP) to investigate the genetic variation in local banana cultivars from the southern region of Oman. Using 12 primer combinations, a total of 1094 bands were scored, of which 1012 were polymorphic. Eighty-two unique markers were identified, which revealed the distinct separation of the seven cultivars. The results show that AFLP can be used to differentiate the banana cultivars. Further classification by phylogenetic, hierarchical clustering and principal component analyses showed significant differences between the clusters found with molecular markers and those created by previous studies using morphological analysis. Based on the analytical results, a consensus dendrogram of the banana cultivars is presented.
Abstract: News feeds are a potential information source that provides updates on various topics across different domains. These updates need to be collected because users interested in a specific domain need the important updates in that domain, organized from various sources. In this paper, a news summarization system is proposed for news data streams from RSS feeds and Google News. Since news stream analysis requires live content, the news data are collected continuously for our experiments. The major contributions of this work are domain-corpus-based news collection, news content extraction, hierarchical clustering of the news, and summarization of the news. Many existing news summarization systems fall short in providing dynamic content with domain-wise representation. Our proposed system addresses this by tagging the news feed with domain corpora and organizing the news streams in a hierarchical structure with topic-wise representation. The news streams are then summarized for users with a novel summarization algorithm. The proposed summarization system generates topic-wise summaries effectively for the user, and no system in the literature has handled news summarization by collecting the data dynamically and organizing the content hierarchically. The proposed system is compared with existing systems and achieves better results in generating news summaries. Online news content editors benefit greatly from this system, which gives them instant news summaries for their domains of interest.
Funding: National Natural Science Foundation of China (No. 61070033); Fundamental Research Funds for the Central Universities, China (No. 2012ZM0061).
Abstract: Developing an efficient algorithm for large-scale classification problems is a challenging topic in many applications of machine learning. In this paper, a hierarchical clustering and fixed-layer local learning (HCFLL) based support vector machine (SVM) algorithm is proposed to deal with this problem. First, HCFLL hierarchically clusters a given dataset into a modified clustering feature tree based on the ideas of unsupervised clustering and supervised clustering. It then trains an SVM locally on each labeled subtree at a fixed layer of the tree. The experimental results show that, compared with existing popular algorithms such as the core vector machine and the decision tree support vector machine, HCFLL can significantly improve training and testing speeds with comparable testing accuracy.
Funding: National Natural Science Foundation of China (60532030).
Abstract: This article proposes a novel stable clustering design method for hierarchical satellite networks, based on a 5-dimensional vector model, in order to increase stability, reduce storage overhead and effectively control delay performance. According to the stability measurement function and subject to the constraint of minimal average routing table length, the hierarchical satellite network is grouped into separate stable connected clusters to improve destruction resistance and reconstruction ability in the future integrated network. In each cluster, redundant communication links that contribute little to network stability and have only a slight influence on delay variation are deleted, so that the requirements for stability and connectivity are satisfied with optimal link resources. In addition, the idea of logical weight is introduced to select the optimal satellites for communicating with satellites in neighboring clusters. Finally, the feasibility and effectiveness of the proposed method are verified by comparing its simulated performance with that of two other typical hierarchical satellite networks, double layer satellite constellation (DLSC) and satellite over satellite (SOS).
Abstract: Graphical representation of hierarchical clustering results is of great importance in hierarchical cluster analysis of data. Unfortunately, almost all mathematical or statistical software has only a weak capability for showcasing such clustering results. In particular, most clustering results or trees cannot be drawn as a dendrogram in a resizable, rescalable and free-style fashion. By using "dynamic" rather than "static" drawing, this research works around the weak functionality that prevents clustering results from being visualized in an arbitrary manner. It introduces an algorithmic solution that adopts seamless pixel rearrangements so that dendrograms or tree diagrams can be resized and rescaled. The results show that the algorithm makes the representation of clustering outcomes a truly free visualization for hierarchical clustering and bioinformatics analysis. In particular, it can selectively visualize and/or save results in a specific size, scale and style (different views).
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61976053 and 61772134); the Fujian Province Natural Science Foundation (Grant No. 2018J01776); the Program for New Century Excellent Talents in Fujian Province University, Probability and Statistics: Theory and Application (Grant No. IRTL1704); and the Program for Innovative Research Team in Science and Technology in Fujian Province University.
Abstract: As an important branch of machine learning, clustering analysis is widely used in fields such as image pattern recognition, social network analysis and information security. In this paper, we consider the design of a clustering algorithm in the quantum scenario and propose a quantum hierarchical agglomerative clustering algorithm based on a one-dimensional discrete quantum walk with single-point phase defects. The proposed algorithm exploits two nonclassical characteristics of this kind of quantum walk, localization and ballistic effects. First, each data point is viewed as a particle that performs this kind of quantum walk with a parameter determined by its neighbors. The particles are then measured in the computational basis. Depending on the measurement result, every attribute value of the corresponding data point is modified appropriately. In this way, each data point interacts with its neighbors and moves toward a certain center point. Finally, this process is repeated several times until similar data points cluster together and form distinct classes. Simulation experiments on synthetic and real-world data demonstrate the effectiveness of the presented algorithm. Compared with some classical algorithms, the proposed algorithm achieves better clustering results. Moreover, when combined with a quantum cluster assignment method, the presented algorithm can speed up the computation.
Abstract: The complexity of large-scale network systems made of a large number of nonlinearly interconnected components restricts their modeling and analysis. In this paper, we propose a framework for hierarchical modeling of a complex network system, based on a recursive unsupervised spectral clustering method. The hierarchical model serves to facilitate the management of complexity in the analysis of real-world critical infrastructures. We exemplify this with the reliability analysis of the 380 kV Italian Power Transmission Network (IPTN). In this analysis, the classical component Importance Measures (IMs) of reliability theory are extended to make them compatible with and applicable to a complex distributed network system. Using these extended IMs, the reliability properties of the IPTN system can be evaluated within the hierarchical system model, with the aim of providing risk managers with information on the risk/safety significance of system structures and components.
Funding: Supported by the National Natural Science Foundation of China (No. 51575047).
Abstract: Location selection is a key issue in the construction of electric vehicle charging stations. It involves two problems: determining the location of a charging station and evaluating that location. To determine the charging station location, a spatial clustering algorithm is proposed and programmed. An example simulation shows the effectiveness of the spatial clustering algorithm. To evaluate the charging station location, a multi-hierarchical fuzzy method is proposed. Based on the location factors of electric vehicle charging stations, a hierarchical evaluation structure for charging station location is constructed, comprising three levels, 4 first-class factors and 14 second-class factors. The fuzzy multi-hierarchical evaluation model and algorithm are built. The analysis results show that the multi-hierarchical fuzzy method can reasonably evaluate the location of an electric vehicle charging station.
Abstract: Hierarchical clustering analysis based on statistics is one of the most important mining algorithms, but the traditional hierarchical clustering method relies on global comparison and in practice performs only Q clustering while ignoring R clustering, so it has limitations, especially when the number of samples and indexes is very large. Furthermore, because it ignores the associations between different indexes, the clustering result is neither good nor faithful. In this paper, we present the model and the algorithm of two-level hierarchical clustering, which integrates Q clustering with R clustering. Because two-level hierarchical clustering is based on the respective clustering result of each class, the classification of the indexes directly affects the accuracy of the final clustering result, so how to classify the indexes appropriately is the chief and most difficult problem to be handled in advance. Although some of the literature has addressed the classification of indexes, those articles classify the indexes only according to their superficial meaning, which is unscientific, for the following reasons. First, the superficial meaning of an index can often be read in different ways and is easily misunderstood by different people; moreover, this classification method seldom makes use of historical data, so the classification result is not very objective. Second, the superficial meaning of some indexes conveys nothing useful, so they cannot be assigned to particular classes on that basis alone. Third, this classification method requires users to have a high level of knowledge of the field, which is not always available; without it, users find it difficult to understand the meaning of some indexes.
To address this question, we first use the R clustering method to cluster the indexes, dividing the p-dimensional indexes into q classes, and then adopt the two-level clustering method to obtain the final result. The classification result is therefore more objective and accurate. Moreover, after the first step we obtain the relations among the different indexes and their interactions, and we also learn which samples can be clustered into one class under a given class of indexes. (These intermediate results are sometimes very useful.) The experiments also indicate the effectiveness and accuracy of the algorithm, and the result of R clustering can easily be reused in later practice.
Abstract: We propose two models in this paper. The concept association model is put forward to obtain the co-occurrence relationships among keywords in the documents, and the hierarchical Hamming clustering model is used to reduce the dimensionality of the category feature vector space, which addresses the extremely high dimensionality of the documents' feature space. The experimental results indicate that the concept association model can obtain the co-occurrence relations among keywords in the documents, which effectively improves the recall of the classification system. The hierarchical Hamming clustering model can reduce the dimensionality of the category feature vector efficiently; the size of the vector space is only about 10% of the original dimensionality. Key words: text classification; concept association; hierarchical clustering; Hamming clustering. CLC number: TN 915.08. Foundation item: Supported by the National 863 Project of China (2001AA142160, 2002AA145090). Biography: Su Gui-yang (1974-), male, Ph.D. candidate, research direction: information filtering and text classification.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61373137, 61373017, 61373139); the Major Program of Jiangsu Higher Education Institutions (No. 14KJA520002); the Six Industries Talent Peaks Plan of Jiangsu (No. 2013-DZXX-014); and the Jiangsu Qinglan Project.
Abstract: Network topology inference is one of the important applications of network tomography. Traditional network topology inference may disrupt normal network operation because it generates heavy data traffic. A unicast network topology inference method is proposed that uses time to live (TTL) for layering and classifies nodes layer by layer based on the similarity of node pairs. Finally, the method infers the logical network topology effectively through self-adaptive combination of previous results. Simulation results show that the proposed method achieves high topology inference accuracy while reducing measurement traffic, thus improving measurement efficiency.
Funding: Supported by the Major Science and Technology Project of Yunnan Province entitled "Research and Application of Key Technologies of Power Grid Operation Analysis and Protection Control for Improving Green Power Consumption" (202002AF080001) and the China South Power Grid Science and Technology Project entitled "Research on Load Model and Modeling Method of Yunnan Power Grid" (YNKJXM20180017).
Abstract: For the load modeling of a large power grid, the many substations it covers must be divided into several categories, and a load model is then built for each category. To address the problem of a skewed clustering tree in the classical hierarchical clustering method used for categorizing substations, a fair hierarchical clustering method is proposed in this paper. First, a fairness index is defined based on the Gini coefficient. A hierarchical clustering method is then proposed based on the fairness index. Finally, the clustering results are evaluated using the contour coefficient and a t-SNE two-dimensional plane map. The substation clustering example of a real large power grid considered in this paper illustrates that the proposed fair hierarchical clustering method can effectively address the problem of the skewed clustering tree with high accuracy.
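To show why a Gini-type quantity can flag a skewed clustering tree, the sketch below computes the plain Gini coefficient of cluster sizes. This is an assumption for illustration: the abstract says only that the fairness index is "based on" the Gini coefficient, so the paper's actual index may differ. The function name and the example sizes are hypothetical.

```python
def gini(sizes):
    """Gini coefficient of a list of non-negative cluster sizes.

    0 means all clusters have the same size; values near 1 mean that one
    cluster holds almost everything (a skewed partition).
    """
    n = len(sizes)
    total = sum(sizes)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-empty cluster")
    # Mean absolute difference over all ordered pairs, normalized by twice the mean.
    diff_sum = sum(abs(x - y) for x in sizes for y in sizes)
    return diff_sum / (2 * n * total)


balanced = [5, 5, 5, 5]   # four equally sized groups of substations
skewed = [1, 1, 1, 17]    # three singletons and one large group

print(gini(balanced))  # 0.0
print(gini(skewed))
```

A merge criterion could then penalize candidate merges that raise this value, steering the tree away from chains of singleton merges.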
Funding: This work was supported by the Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M202300502, KJQN201800539).
Abstract: In clustering algorithms, the selection of neighbors significantly affects the quality of the final clustering results. Although various neighbor relationships exist, such as K-nearest neighbors, natural neighbors, and shared neighbors, most of them can handle only a single structural relationship, and their identification accuracy is low for datasets with multiple structures. In everyday life, people's first instinct when facing something complex is to divide it into multiple parts. Partitioning the dataset into more sub-graphs is likewise a good approach to identifying complex structures. Taking inspiration from this, we propose a novel neighbor method: Shared Natural Neighbors (SNaN). To demonstrate the superiority of this neighbor method, we propose a shared natural neighbors-based hierarchical clustering algorithm for discovering arbitrary-shaped clusters (HC-SNaN). Our algorithm excels in identifying both spherical clusters and manifold clusters. Tested on synthetic and real-world datasets, HC-SNaN demonstrates significant advantages over existing clustering algorithms, particularly on datasets containing arbitrary shapes.