Abstract
To address the problem of outliers in XML data, this paper designs an algorithm for detecting outliers in semi-structured XML data. It draws on clustering analysis and on the context information among elements that is implied by the nested structure of XML data. The algorithm gathers logically related nodes into corresponding subspaces and, based on these related subspaces, computes an outlier interestingness measure, the XO-Measure, which is used to identify outlier data. Experimental results on a real-world XML data set show that the algorithm identifies outliers effectively when the data contain outliers at a certain scale.
Source
《计算机工程与设计》
CSCD
PKU Core (北大核心)
2010, Issue 18, pp. 4001-4004 (4 pages)
Computer Engineering and Design