Communication-efficient parallel sorting method for integer sequences on multi-core clusters (cited by: 2)

Communication-efficient parallel sorting integers sequence on multi-core cluster

Abstract: A data distribution model for sorting integer sequences is established, and a communication-efficient parallel integer sorting algorithm is designed for heterogeneous clusters of multi-core computing nodes. The data distribution model uses the differing computing power, communication rate, and memory capacity of each node to dynamically compute the size of the data block assigned to that node, so that the load is balanced across nodes. The parallel sorting algorithm exploits the properties of integer sequences: the master node distributes data blocks to the slave nodes and receives their results in two rounds; each slave node returns its sorted integer subsequence to the master node using bucket packing; and the master node uses bucket mapping to assemble the received sorted subsequences directly into the final sorted sequence, which reduces the data merge operations that would otherwise incur a large communication cost. Analysis and experimental results on a heterogeneous multi-core cluster show that the proposed parallel integer sorting algorithm is efficient and scales well.
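The abstract does not give the algorithm's details, so the following is only a minimal single-process sketch of one possible reading of it, not the authors' method. It assumes that each node's computing power, communication rate, and memory capacity have already been combined into a single hypothetical weight, that "bucket packing" means returning sorted (value, count) pairs, and that "bucket mapping" means adding those counts into an array indexed by value. The names `block_sizes`, `worker_pack`, and `master_sort` are illustrative, and the two-round distribution is not modelled.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def block_sizes(n: int, weights: Sequence[float]) -> List[int]:
    """Split n items into contiguous blocks roughly proportional to weights.

    Assumption: each weight is a precomputed score for one node.
    """
    total = float(sum(weights))
    sizes, start, acc = [], 0, 0.0
    for k, w in enumerate(weights):
        acc += w
        # The last block absorbs any rounding error so the sizes sum to n.
        end = n if k == len(weights) - 1 else min(n, round(n * acc / total))
        sizes.append(end - start)
        start = end
    return sizes


def worker_pack(block: Sequence[int]) -> List[Tuple[int, int]]:
    """Worker side: pack a block as sorted (value, count) pairs."""
    return sorted(Counter(block).items())


def master_sort(data: Sequence[int], weights: Sequence[float]) -> List[int]:
    """Master side: distribute blocks, then map returned buckets by value."""
    if not data:
        return []
    lo, hi = min(data), max(data)
    counts = [0] * (hi - lo + 1)  # one bucket per integer value in [lo, hi]
    start = 0
    for size in block_sizes(len(data), weights):
        # On a real cluster this call would run on a remote node.
        for value, count in worker_pack(data[start:start + size]):
            counts[value - lo] += count  # direct bucket mapping, no merge
        start += size
    result: List[int] = []
    for offset, count in enumerate(counts):
        result.extend([lo + offset] * count)
    return result


if __name__ == "__main__":
    print(block_sizes(12, [1, 2, 3]))
    print(master_sort([5, 3, 9, 3, 1, 7, 5, 5], [1, 2, 1]))
```

Under these assumptions the master never compares elements from different workers; it only accumulates counts, which is practical when the integer value range is moderate relative to available memory.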

Authors: KE Qi, ZHONG Cheng, CHEN Qingyuan, LU Xiangyan

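Affiliations: School of Computer and Electronic Information, Guangxi University; School of Information and Statistics, Guangxi University of Finance and Economics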

Source: Journal of Computer Applications (计算机应用), indexed in CSCD and the Peking University Core Journals list, 2013, No. 3, pp. 821-824 (4 pages)

Funding: National Natural Science Foundation of China (60963001)

Keywords: integer sorting; parallel algorithm; multi-core cluster; data distribution

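Classification: TP338.6 [Automation and Computer Technology: Computer System Architecture]; TP301.6 [Automation and Computer Technology: Computer System Architecture]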
