Journal article

Research and Design of High Performance Interconnection Network Switch (高性能互联网络交换机研究与设计)

Cited by: 3
Abstract: A high-performance interconnection network switch is a core component of high-performance computing (HPC) systems. Scientific computations, as upper-layer applications of HPC, demand not only low latency and high bandwidth from the switch but also hardware support for collective communications such as broadcast, multicast, and barrier operations. The HyperLink switch, a core component of the Dawning 5000 interconnection network, has a 38.4 ns single-stage latency and 160 Gbps aggregate bandwidth, and simultaneously supports 16 multicast groups and 16 barrier groups. Under ideal conditions, 1024 nodes can complete multicast and barrier operations within 2 μs, which greatly accelerates scientific applications. A cycle-accurate simulation model was built to evaluate switch performance. Simulation shows that, for a 16-port input-buffered switch, 3 virtual channels are the best performance-cost choice, and that a 4 KB input buffer is sufficient for a 1 KB MTU to achieve the highest unicast throughput. A theoretical comparison between multi-rail networks and single-rail networks of the same aggregate bandwidth shows that the former can effectively reduce network latency and thus provide higher network throughput. The LogP model is used to evaluate HyperLink multicast and barrier performance; the analysis shows that the HyperLink switch scales well, easily supporting up to thousands of nodes.
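The abstract's scalability argument can be illustrated with a rough back-of-the-envelope sketch. The only figure below taken from the paper is the 38.4 ns single-stage switch latency; every other parameter (fat-tree topology, NIC overhead, the LogP values L and o) is a hypothetical placeholder, and the simplified binomial-tree cost model is a common LogP approximation, not the authors' exact analysis:

```python
import math

def hw_multicast_ns(nodes, radix=16, switch_ns=38.4, nic_ns=200.0):
    """Rough latency of one hardware multicast through a fat tree of
    radix-`radix` switches. Assumes the message climbs to the root and
    back down, paying one switch traversal per hop, plus send/receive
    NIC overhead at the endpoints (nic_ns is a hypothetical value)."""
    levels = math.ceil(math.log(nodes, radix))  # tree levels needed to reach all nodes
    hops = 2 * levels - 1                       # up to the root, then down
    return hops * switch_ns + 2 * nic_ns

def logp_sw_broadcast_ns(nodes, L_ns=1000.0, o_ns=400.0):
    """Software binomial-tree broadcast under a simplified LogP model:
    ceil(log2 P) rounds, each costing one message time L + 2o
    (send overhead + wire latency + receive overhead)."""
    rounds = math.ceil(math.log2(nodes))
    return rounds * (L_ns + 2 * o_ns)

# With these placeholder parameters, a 1024-node hardware multicast costs
# a few hundred nanoseconds, comfortably inside the 2 us bound quoted in
# the abstract, while a software tree costs an order of magnitude more.
print(hw_multicast_ns(1024), logp_sw_broadcast_ns(1024))
```

The point of the comparison is the one the abstract makes: replicating packets inside the switch removes the per-round software overheads from the critical path, which is why hardware collectives scale to thousands of nodes.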
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2008, No. 12, pp. 2069-2078 (10 pages). Indexed in EI and CSCD; Peking University core journal.
Funding: National High-Tech Research and Development Program of China ("863" Program), grant No. 2006AA01A102.
Keywords: interconnection networks; switch; collective communication; multicast; barrier; ASIC design
