
Data Mining Techniques for Structure of Single XML Document (cited by: 3)
Abstract: An algorithm for data mining based on the structure of XML is proposed. The method parses an XML file with the existing Java DOM parser to build an XML document tree, stores the tags of the document, level by level, as label paths, and then mines association rules over those label paths to obtain frequent transactions. Experiments show that mining efficiency improves as the minimum support increases only when the structure of the XML document is irregular.
Source: Journal of Petrochemical Universities (indexed in EI, CAS), 2007, No. 1, pp. 94-98 (5 pages)
Funding: General Program of the Science and Technology Development Plan of the Beijing Municipal Commission of Education (KM200510017006)
Keywords: XML document; label path; association rules; data mining; frequent transaction
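
The pipeline outlined in the abstract (parse to a DOM tree, record label paths, mine frequent sets) can be sketched roughly as follows. This is an illustration only, not the authors' implementation: it uses Python's `xml.dom.minidom` in place of Java DOM, and it assumes that each child element of the root forms one transaction whose items are the label paths in its subtree. The abstract does not say how transactions are defined, so that choice, the function names, and the `SAMPLE` document are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
SAMPLE = """<library>
  <book><title>A</title><author>X</author></book>
  <book><title>B</title><author>Y</author><year>2007</year></book>
  <journal><title>C</title></journal>
</library>"""


def label_paths(node, prefix=""):
    """Yield the label path (e.g. /library/book/title) of every element under node."""
    path = f"{prefix}/{node.tagName}"
    yield path
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            yield from label_paths(child, path)


def transactions(xml_text):
    """Assumption: each child element of the root is one transaction,
    represented as the set of label paths found in its subtree."""
    root = minidom.parseString(xml_text).documentElement
    root_path = f"/{root.tagName}"
    return [
        frozenset(label_paths(child, root_path))
        for child in root.childNodes
        if child.nodeType == child.ELEMENT_NODE
    ]


def frequent_itemsets(txns, min_support, max_size=2):
    """Level-wise (Apriori-style) search for sets of label paths whose
    support is at least min_support. Returns {itemset: support}."""
    frequent = {}
    candidates = {frozenset([p]) for t in txns for p in t}
    size = 1
    while candidates and size <= max_size:
        # Count how many transactions contain each candidate set.
        counts = Counter(c for t in txns for c in candidates if c <= t)
        level = {
            c: n / len(txns)
            for c, n in counts.items()
            if n / len(txns) >= min_support
        }
        frequent.update(level)
        # Join step: merge frequent sets into candidates one item larger.
        # The usual subset-pruning step is omitted; counting filters them anyway.
        size += 1
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == size}
    return frequent
```

Calling `frequent_itemsets(transactions(SAMPLE), 0.6)` keeps only the label paths, and pairs of label paths, that occur in at least 60% of the transactions. Raising `min_support` shrinks the candidate sets at each level, which is the lever the abstract's efficiency claim refers to.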


