A variation-aware task mapping approach is proposed for a multi-core network-on-chips with redundant cores, which includes both the design-time mapping and run-time scheduling algorithms. Firstly, a design-time geneti...A variation-aware task mapping approach is proposed for a multi-core network-on-chips with redundant cores, which includes both the design-time mapping and run-time scheduling algorithms. Firstly, a design-time genetic task mapping algorithm is proposed during the design stage to generate multiple task mapping solutions which cover a maximum range of chips. Then, during the run, one optimal task mapping solution is selected. Additionally, logical cores are mapped to physically available cores. Both core asymmetry and topological changes are considered in the proposed approach. Experimental results show that the performance yield of the proposed approach is 96% on average, and the communication cost, power consumption and peak temperature are all optimized without loss of performance yield.展开更多
文摘A variation-aware task mapping approach is proposed for a multi-core network-on-chips with redundant cores, which includes both the design-time mapping and run-time scheduling algorithms. Firstly, a design-time genetic task mapping algorithm is proposed during the design stage to generate multiple task mapping solutions which cover a maximum range of chips. Then, during the run, one optimal task mapping solution is selected. Additionally, logical cores are mapped to physically available cores. Both core asymmetry and topological changes are considered in the proposed approach. Experimental results show that the performance yield of the proposed approach is 96% on average, and the communication cost, power consumption and peak temperature are all optimized without loss of performance yield.
文摘顺序任务流(sequential task flow,STF)将对共享数据的访问表示为任务之间的依赖关系,STF运行时系统通过任务构造、依赖分析和任务依赖图(task dependence graph,TDG)生成、任务调度实现异步并行,这3个环节的开销直接影响并行程序的性能.目前以STF为核心的AceMesh运行时系统,在SW39000处理器上仅使用单主核构图、多从核执行的方式.然而,SW39000处理器离散访存性能较弱,细粒度任务构图离散访存增多,构图更容易成为瓶颈.对此,提出了一种利用多从核辅助主核进行构图的算法.首先,分析在依赖分析和TDG生成过程中的并行性,在SW39000处理器上实现了一种基于胖任务依赖图(fatTDG)的多核辅助并行构图算法PFBH(parallelized fatTDG building algorithm with helpers)并进行优化.其次,针对线程间的主存资源竞争问题,提出构图与执行并行中从核资源调节方法及参数选择.最终,在5类典型应用下进行实验测试.与单核串行构图系统相比,在细粒度任务场景下最高加速为1.75倍;与SW39000处理器上的OpenACC模型相比,AceMesh最高可达2倍加速.