Abstract
As distributed computing has grown in popularity, the related technologies have advanced considerably. Building on an analysis of the basic concepts of distributed systems and fault tolerance, this paper focuses on a distributed fault-tolerant system software package designed and implemented by the authors. The software is based on Linux. It supports simple message sending and receiving among multiple nodes, monitors software and hardware faults on each node, and migrates tasks away from a failed machine according to predefined rules.
Source
Journal of the Academy of Equipment Command & Technology (《装备指挥技术学院学报》)
2005, Issue 4, pp. 84-87 (4 pages)
Funding
National Major Basic Research Project (国家重大基础研究项目)