Funding: Supported by the National Program on Key Basic Research Project (2020YFA0713600) and the National Natural Science Foundation of China (62272214).
Abstract: In the era of the Internet of Things, distributed computing alleviates insufficient terminal computing power by pooling the idle resources of heterogeneous devices. However, the imbalance between task execution delay and node energy consumption, together with the scheduling and adaptation challenges caused by device heterogeneity, urgently needs to be addressed. To tackle these problems, this paper constructs a multi-objective real-time task scheduling model that accounts for task real-time requirements, execution delay, system energy consumption, and node interests. The model aims to minimize the upper bound on delay and the total energy consumption while maximizing system satisfaction. A real-time task scheduling algorithm based on a bilateral matching game is proposed: a bidirectional preference mechanism between tasks and computing nodes, combined with a multi-round stable matching strategy, matches tasks to nodes accurately. Simulation results show that, compared with the baseline scheme, the proposed algorithm significantly reduces the total execution cost, effectively balances task execution delay against the energy consumption of compute nodes, and takes the interests of each compute node in the network into account.
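The bilateral matching idea in the abstract above can be illustrated with a small sketch. This is not the paper's algorithm: it is the classic task-proposing deferred-acceptance procedure, under the assumptions that each node holds at most one task, that tasks rank nodes by execution delay, and that nodes rank tasks by energy cost. The names `stable_match`, `delay`, and `energy`, and all numbers, are hypothetical.

```python
def stable_match(task_prefs, node_prefs):
    """Task-proposing deferred acceptance; each node holds at most one task."""
    # rank[n][t] = position of task t in node n's preference list (lower is better)
    rank = {n: {t: i for i, t in enumerate(p)} for n, p in node_prefs.items()}
    next_choice = {t: 0 for t in task_prefs}  # next node index each task proposes to
    held = {}                                 # node -> task it currently holds
    free = list(task_prefs)                   # tasks still looking for a node
    while free:
        t = free.pop()
        if next_choice[t] >= len(task_prefs[t]):
            continue                          # list exhausted: task stays unmatched
        n = task_prefs[t][next_choice[t]]
        next_choice[t] += 1
        current = held.get(n)
        if current is None:
            held[n] = t                       # node was idle: accept tentatively
        elif rank[n][t] < rank[n][current]:
            held[n] = t                       # node prefers the newcomer
            free.append(current)              # displaced task proposes again later
        else:
            free.append(t)                    # rejected: try the next node
    return {t: n for n, t in held.items()}


# Hypothetical costs (assumed values, not from the paper).
delay = {"t1": {"n1": 4.0, "n2": 2.0}, "t2": {"n1": 3.0, "n2": 1.0}}
energy = {"n1": {"t1": 5.0, "t2": 6.0}, "n2": {"t1": 7.0, "t2": 3.0}}

# Tasks prefer low delay; nodes prefer low energy cost.
task_prefs = {t: sorted(d, key=d.get) for t, d in delay.items()}
node_prefs = {n: sorted(e, key=e.get) for n, e in energy.items()}

matching = stable_match(task_prefs, node_prefs)
print(matching)
```

Both tasks prefer `n2` here, but `n2` prefers `t2`, so `t1` falls back to `n1`. The result is stable in the sense that no task and node would both rather be paired with each other than with their assigned partners.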
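Abstract: The National Supercomputing Internet's scientific computing agent was recently released in Tianjin. Through natural-language interaction, the agent can automatically decompose research tasks into sub-problems, schedule computing resources, invoke computational software, analyze results, and generate reports, cutting work that takes one day in the traditional mode to about one hour. Artificial intelligence is profoundly reshaping scientific research and engineering innovation. As AI for Science (AI-driven scientific research) develops further, the demand of research activities for computing power keeps growing, which also places higher requirements on how computing power is organized, scheduled, and applied.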
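Abstract: Addressing the characteristics of numerical weather prediction applications and the resource scheduling and management needs of meteorological high-performance computing, this work proposes a fine-grained resource scheduling and management method for the China Meteorological Administration's 派-曙光 high-performance computer system, built on the Slurm (Simple Linux Utility for Resource Management) job scheduling system. At the system level, the method optimizes the scheduling policy and configures resource partitions flexibly, balancing the protection of real-time operational meteorological workloads against job throughput and scheduling efficiency so that resources are used efficiently. At the user level, it introduces a quality of service (QoS) mechanism that dynamically adjusts job priorities and resource quotas, further ensuring fair resource allocation and flexible scheduling. System resource usage and job execution data show that the method keeps real-time operational workloads running stably while effectively improving the completion efficiency of research and development jobs and keeping overall system resources efficiently used. It has performed well on the 派-曙光 system and offers practical value and a useful reference for scheduling and using high-performance computing resources sensibly in complex application scenarios.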
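As a rough illustration of how a QoS level can shift job priority, the sketch below orders pending jobs by a weighted sum of normalized factors, in the spirit of Slurm's multifactor priority approach. It is not Slurm code and not the configuration described in the abstract: the weights, the QoS names `operational` and `research`, and the job data are all assumed for the example.

```python
from dataclasses import dataclass

# Hypothetical weights; a real site would tune these for its own workload.
WEIGHTS = {"qos": 10000, "age": 1000, "fairshare": 2000}

# Hypothetical QoS factors in [0, 1]: operational forecasts outrank research jobs.
QOS_FACTOR = {"operational": 1.0, "research": 0.2}


@dataclass
class Job:
    name: str
    qos: str
    age: float        # normalized waiting time in [0, 1]
    fairshare: float  # normalized fair-share factor in [0, 1]


def priority(job: Job) -> float:
    """Weighted sum of normalized factors; larger means scheduled earlier."""
    return (WEIGHTS["qos"] * QOS_FACTOR[job.qos]
            + WEIGHTS["age"] * job.age
            + WEIGHTS["fairshare"] * job.fairshare)


jobs = [
    Job("research_a", "research", age=0.9, fairshare=0.8),
    Job("forecast_00z", "operational", age=0.1, fairshare=0.5),
    Job("research_b", "research", age=0.3, fairshare=0.4),
]

# Highest priority first.
order = [j.name for j in sorted(jobs, key=priority, reverse=True)]
print(order)
```

With these assumed weights the operational forecast job is scheduled first even though it has waited the least, while waiting time and fair-share still decide the order among the research jobs.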