Abstract
YARN is a widely used resource management system in Hadoop. It supports multiple computing frameworks, including MapReduce, Spark, and Storm, and has become a core component of the big data ecosystem. However, the existing Hadoop YARN resource schedulers guarantee resources through a reservation-based mechanism, which produces resource fragmentation and wastes resources. To improve cluster resource utilization and throughput, this paper proposes a resource allocation mechanism based on reservation and backfilling. In this mechanism, the priority of a job determines whether resources are reserved for it, and a backfill strategy lets other jobs use the reserved resources as long as doing so does not affect the execution of the job holding the reservation. Experiments show that the reservation-and-backfill scheduling mechanism effectively improves the resource utilization and throughput of a Hadoop YARN cluster.
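The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general reserve-then-backfill idea for a single node with one resource type. It is not the authors' implementation and does not use any YARN API. The names `Job`, `earliest_start`, `plan`, and the `reserve_priority` threshold are hypothetical, and the sketch assumes that job run times can be estimated in advance.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    demand: int    # resource units requested (hypothetical single resource)
    runtime: int   # estimated run time; assumed to be known in advance
    priority: int  # larger value = higher priority


def earliest_start(capacity, running, demand, now):
    """Earliest time at which `demand` units are free on the node.

    `running` is a list of (finish_time, units) pairs for active containers.
    """
    free = capacity - sum(units for _, units in running)
    if demand <= free:
        return now
    for finish, units in sorted(running):
        free += units
        if demand <= free:
            return finish
    raise ValueError("demand exceeds node capacity")


def plan(capacity, running, pending, now=0, reserve_priority=5):
    """Return (names started now, (reserved job name, start time) or None)."""
    running = list(running)  # work on a copy of the caller's list
    free = capacity - sum(units for _, units in running)
    started, reserved = [], None
    shadow = None  # start time of the reservation
    spare = 0      # units left over at `shadow` once the reserved job starts

    for job in sorted(pending, key=lambda j: -j.priority):
        if job.demand > free:
            # Only a sufficiently important job gets a reservation (one at most).
            if reserved is None and job.priority >= reserve_priority:
                shadow = earliest_start(capacity, running, job.demand, now)
                busy = sum(u for f, u in running if f > shadow)
                spare = capacity - busy - job.demand
                reserved = (job.name, shadow)
            continue
        if reserved is not None:
            ends_in_time = now + job.runtime <= shadow
            if not ends_in_time and job.demand > spare:
                continue  # backfilling this job would delay the reserved job
            if not ends_in_time:
                spare -= job.demand
        free -= job.demand
        running.append((now + job.runtime, job.demand))
        started.append(job.name)
    return started, reserved
```

For example, on a node with 10 units where containers holding 6 and 2 units finish at times 4 and 10, a priority-9 job needing 8 units cannot start and receives a reservation at time 4. A low-priority 2-unit job with run time 3 is backfilled because it finishes before time 4, while one with run time 8 is held back because it would delay the reserved job.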
Source
Computer and Modernization (《计算机与现代化》)
2017, No. 11, pp. 29-34 (6 pages)