
A Spark Scheduling Strategy for Heterogeneous Cluster (cited by: 1)

Abstract: As a mainstream distributed computing system, Spark is used to solve problems involving increasingly complex tasks. However, Spark's native scheduling strategy assumes a homogeneous cluster and is less effective on a heterogeneous one. This study aims to find a more effective task scheduling strategy and add it to the Spark source code. After investigating Spark's scheduling principles and mechanisms, we propose a stratifying algorithm and a node scheduling algorithm to optimize the native scheduling strategy. The new strategy calculates the static level of each node and comprehensively considers dynamic factors such as the length of running tasks and the CPU usage of worker nodes. In a series of comparative experiments on a heterogeneous cluster, the new strategy achieves shorter running time and lower CPU usage than the original Spark strategy, which verifies that it is the more effective of the two.
Source: Computers, Materials & Continua (SCIE, EI), 2018, Issue 6, pp. 405-417 (13 pages)
Funding: This work is supported by the National Natural Science Foundation of China (Grant Nos. 61472248, 61772337) and the SJTU-Shanghai Songheng Content Analysis Joint Lab.
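
The abstract describes node selection that combines a static node level with dynamic factors (running tasks and CPU usage). The record does not give the paper's formulas, so the following is only a minimal sketch of that general idea. The `Node` fields, the `priority` formula, and the weights `w_tasks` and `w_cpu` are all hypothetical and are not taken from the paper or from Spark's API.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    static_level: float  # hypothetical capability score; higher means a stronger node
    running_tasks: int   # number of tasks currently running on the node
    cpu_usage: float     # current CPU utilisation as a fraction in [0, 1]


def priority(node: Node, w_tasks: float = 1.0, w_cpu: float = 1.0) -> float:
    """Hypothetical score: static level discounted by the node's current load."""
    load = 1.0 + w_tasks * node.running_tasks + w_cpu * node.cpu_usage
    return node.static_level / load


def pick_node(nodes: list[Node]) -> Node:
    """Return the node with the highest priority score."""
    return max(nodes, key=priority)


# Illustrative values only: a strong but busy node and a weaker idle node.
nodes = [
    Node("fast-busy", static_level=4.0, running_tasks=3, cpu_usage=0.9),
    Node("slow-idle", static_level=2.0, running_tasks=0, cpu_usage=0.1),
]
best = pick_node(nodes)
print(best.name)
```

With these illustrative numbers the weaker but idle node scores higher than the stronger but loaded one, which is the kind of trade-off a load-aware scheduler for heterogeneous clusters has to make.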
