Abstract
Sorting hierarchical (tree-structured) data is a fundamental problem in XML data processing. This paper presents EEXPSort, an energy-efficient sorting algorithm for XML documents. EEXPSort scans the XML document to generate mutually independent sorting tasks and uses a multi-core CPU to process those tasks in parallel. To improve energy efficiency, it also employs data compression, single temporary-file storage, and avoidance of subtree matching, which together reduce disk I/O and CPU time. Extensive experiments on XML documents with different characteristics show that EEXPSort significantly outperforms HERMES, the fastest existing XML sorting scheme, in both energy efficiency and performance.
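The idea of turning one scan into mutually independent sorting tasks can be illustrated with a minimal in-memory sketch. It is not the EEXPSort implementation: the abstract does not give the algorithm's details, and the sketch omits the external-memory parts (compression, the single temporary file). The names `sort_key`, `sort_children`, and `sort_xml`, and the choice of ordering by tag name and an `id` attribute, are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def sort_key(elem):
    # Hypothetical ordering criterion: tag name, then the "id" attribute.
    return (elem.tag, elem.get("id", ""))


def sort_children(parent):
    # One independent task: reorder the children of a single element.
    # It touches only this element's child list, so tasks do not conflict.
    parent[:] = sorted(parent, key=sort_key)


def sort_xml(xml_text, workers=4):
    root = ET.fromstring(xml_text)
    # A single scan yields one task per element with more than one child.
    tasks = [node for node in root.iter() if len(node) > 1]
    # Threads keep the sketch short; in CPython they do not give true
    # multi-core speedup, so a real system would use another mechanism.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(sort_children, tasks))
    return ET.tostring(root, encoding="unicode")
```

Because each task reorders only the child list of its own element, the tasks can run in any order or concurrently and still produce the same sorted document.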
Source
Computer Systems & Applications (《计算机系统应用》)
2012, Issue 12, pp. 108-112, 107 (6 pages)
Funding
National Natural Science Foundation of China (61070042)
Natural Science Foundation of Zhejiang Province (Y1090096)
Keywords
XML document
hierarchical data
energy efficiency
sorting algorithm
optimization strategy