Abstract
ion remains significant potential. This paper proposes an enhanced MapReduce framework for the geo-distributed supercomputing Internet that minimizes data transmission across data centers. Using hierarchical scheduling, the framework improves data locality to reduce network latency and bandwidth consumption during reduce operations, thereby shortening overall job execution time. The paper introduces a mathematical model for task scheduling within the supercomputing Internet and formally describes the data transmission process among data centers. In the job scheduling phase, the framework overlaps data transfer with computation by pre-selecting data centers. In the data transmission phase, the framework aggregates data to reduce the frequency of transmission, alleviating the adverse effects of the hierarchical network architecture on transmission. A comparative analysis with existing methods for similar computational challenges and empirical evaluations demonstrate the effectiveness of the proposed framework in practice.
Funding
Supported by the National Natural Science Foundation of China (Grant Nos. 62225205, 92055213, 62302160), the Natural Science Foundation of Hunan Province (Grant No. 2024JJ6154), the Science and Technology Program of Changsha (kh2301011), and the Shenzhen Basic Research Project (Natural Science Foundation) (JCYJ20210324140002006).