Adaptive Fuzzy Clustering Algorithm Based on Multi-semantic Text Representation

Cited by: 10
Abstract: Text clustering accuracy is often low for two reasons: words can carry multiple meanings, and traditional text representation is carried out independently of the clustering process. To address both problems, this paper proposes a multi-semantic representation based adaptive fuzzy C-means (MSR-AFCM) clustering algorithm. Words are first soft-clustered into multiple word clusters, each of which defines a semantic space, and the number of semantic spaces is taken as the initial number of text clusters. The semantic membership degree of each text in each text space is then computed from the semantic membership degrees of its words and used to initialize the algorithm. During the iterations, redundant text spaces are gradually eliminated according to the updated text semantic membership degrees and the distribution of the texts, which optimizes the number of clusters. Compared with other text clustering algorithms, MSR-AFCM improves clustering performance by 2.66% to 12.5% and achieves higher accuracy and a higher Rand index.
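The abstract gives the pipeline but no formulas, so the following is only a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes that a text's initial membership is the normalized sum of its words' memberships, that the update steps are the standard fuzzy C-means ones, and that a text space is "redundant" when its average membership falls below a hypothetical threshold `min_share`. The document vectors `X`, the toy vocabulary, and all function names are illustrative.

```python
import math

def init_text_membership(docs, word_membership):
    """Assumed rule: sum the word memberships of each text, then normalize."""
    k = len(next(iter(word_membership.values())))
    U = []
    for tokens in docs:
        acc = [0.0] * k
        for w in tokens:
            for j, mu in enumerate(word_membership.get(w, [0.0] * k)):
                acc[j] += mu
        total = sum(acc)
        # Fall back to a uniform row when no word of the text is known.
        U.append([a / total for a in acc] if total > 0 else [1.0 / k] * k)
    return U

def prune_clusters(U, min_share=0.05):
    """Assumed rule: drop text spaces whose average membership is below min_share."""
    n, k = len(U), len(U[0])
    share = [sum(u[j] for u in U) / n for j in range(k)]
    keep = [j for j in range(k) if share[j] >= min_share]
    if not keep:  # always keep at least the heaviest cluster
        keep = [max(range(k), key=share.__getitem__)]
    out = []
    for u in U:
        row = [u[j] for j in keep]
        s = sum(row)
        out.append([v / s for v in row] if s > 0 else [1.0 / len(keep)] * len(keep))
    return out

def update_centers(X, U, m=2.0):
    """Standard FCM center update: membership^m weighted mean of the texts."""
    centers = []
    for j in range(len(U[0])):
        w = [u[j] ** m for u in U]
        s = sum(w)
        centers.append([sum(wi * x[t] for wi, x in zip(w, X)) / s
                        for t in range(len(X[0]))])
    return centers

def update_memberships(X, centers, m=2.0):
    """Standard FCM membership update from Euclidean distances."""
    p = 2.0 / (m - 1.0)
    U = []
    for x in X:
        d = [math.dist(x, c) for c in centers]
        zeros = [j for j, dj in enumerate(d) if dj == 0.0]
        if zeros:  # text coincides with one or more centers
            U.append([1.0 / len(zeros) if j in zeros else 0.0 for j in range(len(d))])
        else:
            inv = [dj ** -p for dj in d]
            s = sum(inv)
            U.append([v / s for v in inv])
    return U

def msr_afcm_sketch(X, U0, m=2.0, min_share=0.05, iters=50, tol=1e-6):
    """Alternate pruning and FCM updates until the memberships stabilize."""
    U = U0
    for _ in range(iters):
        U = prune_clusters(U, min_share)
        centers = update_centers(X, U, m)
        new_U = update_memberships(X, centers, m)
        delta = max(abs(a - b) for r1, r2 in zip(new_U, U) for a, b in zip(r1, r2))
        U = new_U
        if delta < tol:
            break
    return U, centers

# Toy data (hypothetical): three semantic spaces, the third barely used.
word_membership = {
    "river": [0.95, 0.03, 0.02], "water": [0.90, 0.08, 0.02],
    "loan": [0.03, 0.95, 0.02], "credit": [0.05, 0.93, 0.02],
    "bank": [0.60, 0.38, 0.02],  # ambiguous word shared by two spaces
}
docs = [["river", "water", "bank"], ["river", "water"],
        ["loan", "credit", "bank"], ["loan", "credit"]]
X = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.9]]  # assumed text vectors

U0 = init_text_membership(docs, word_membership)
U, centers = msr_afcm_sketch(X, U0)
```

In this toy run the initialization starts with three text spaces because the word clustering produced three semantic spaces. The third one carries almost no membership, so the pruning step removes it and the iterations continue with two clusters. The threshold rule stands in for the paper's elimination criterion, which the abstract describes only as depending on the updated memberships and the text distribution.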
Authors: LIU Ya-xuan; WU Jiao; GU Xing-quan; YIN Xue-ting (College of Sciences, China Jiliang University, Hangzhou 310018, China; College of Standardization, China Jiliang University, Hangzhou 310018, China)
Source: Science Technology and Engineering (《科学技术与工程》), Peking University Core Journal list, 2023, No. 11, pp. 4709-4717 (9 pages)
Funding: Technical Support Special Project of the State Administration for Market Regulation (2021YJ005); Scientific Research Project of the Zhejiang Provincial Department of Education (Y201738417).
Keywords: text clustering; text representation; fuzzy clustering; membership degree (affiliation)