期刊文献+

多核机群上通信高效的整数序列并行排序方法 被引量:2

Communication-efficient parallel sorting integers sequence on multi-core cluster
在线阅读 下载PDF
导出
摘要 建立一个适用于整数序列排序的数据分配模型,在多核计算节点组成的异构机群上设计通信高效的整数序列并行算法。所提出的数据分配模型依据机群中各节点不同的计算能力、通信速率和存储容量,动态计算出调度分配给各节点的数据块的大小以平衡各个节点的负载。所设计的并行排序算法利用整数序列的特性,主节点采取两轮分发数据与接收结果的方法,从节点运用分桶打包方式返回有序的整数子序列给主节点,主节点采用桶映射方法将各个有序子序列直接整合成最终有序序列,以减少需要耗费较多通信时间的数据归并操作。分析与实验测试结果表明,给出的多核机群上的整数序列并行排序算法高效,具有良好的可扩展性。 A data distribution strategy and a communication-efficient parallel algorithm for sorting integers sequence were proposed on the heterogeneous cluster with multi-core machines. The presented data distribution model properly utilized different computation speed, communication rate and memory capacity of each computing node to dynamically compute the size of the data block to be assigned to each node to balance the loads among nodes. In the proposed parallel sorting algorithm, making use of the characteristic of integers sequence, master node distributed the data blocks to the salve nodes and received the sorted subsequences with two-round mode, each salve node returned its sorted subsequence to master node by bucket- packing method, and master node linked its received sorted subsequences to form directly a final sorted sequence by the bucket mapping in order to reduce the data merge operations with large communication cost. The analysis and experimental results on the heterogeneous cluster with multi-core machines show that the presented parallel sorting integers sequence algorithm is efficient and scalable.
出处 《计算机应用》 CSCD 北大核心 2013年第3期821-824,共4页 journal of Computer Applications
基金 国家自然科学基金资助项目(60963001)
关键词 整数排序 并行算法 多核机群 数据分配 integers sorting parallel algorithm muhi-eore cluster data distribution
  • 相关文献

参考文献13

  • 1INOUE H,MORIYAMA T,KOMATSU H. AA-Sort:a new parallel sorting algorithm for multi-core SIMD processors[A].Washington,DC:IEEE Computer Society,2007.189-198.
  • 2RAMPRASAD N,BARUAH P K. Radix sort on the cell broadband engine[A].Piscataway,NJ:IEEE Press,2007.
  • 3CEDERMAN D,TSIGAS P. On sorting and load balancing on GPU[J].ACM SIGARCH Computer Architecture News,2008,(05):11-18.
  • 4GREB A,ZACHMANN G. GPU-ABiSort:optimal parallel sorting on stream architectures[A].Washington,DC:IEEE Computer Society,2006.25-29.
  • 5HAO S,DU Z,BADER D. A partition-merge based cacheconscious parallel sorting algorithm for CMP with shared cache[A].Washington,DC:IEEE Computer Society,2009.396-403.
  • 6HULT(E)N R,KESSLER C W,KELLER J. Optimized on-chip-pipelined mergesort on the Cell/B.E[A].Beilin:Springer-Verlag,2010.187-198.
  • 7SATISH N,KIM C,CHHUGANI J. Fast sort on CPUs and GPUs:a case for bandwidth oblivious SIMD sort[A].New York:acm Press,2010.351-362.
  • 8ZHONG C,QU Z Y,YANG F. Efficient and scalable threadlevel parallel algorithms for sorting multisets on multi-core systems[J].Journal of Computers,2012,(01):30-41.
  • 9ZHONG C,KE Q,LIU J. Thread-level parallel algorithm for sorting integer sequence on multi-core computers[A].Washington,DC:IEEE Computer Society,2011.37-41.
  • 10ZHONG C,FENG P,YIN M X. Sampling-based cache-efficient parallel sorting on multi-core systems[J].Journal of Computer Information Systems,2012,(08):6713-6722.

同被引文献13

引证文献2

二级引证文献2

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部