Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 62072434 and U23B2004, and by the Innovation Funding of the Institute of Computing Technology, Chinese Academy of Sciences, under Grant Nos. E361050 and E361030.
Abstract: Edge machine learning creates a new computational paradigm by enabling the deployment of intelligent applications at the network edge. It improves application efficiency and responsiveness by performing inference and training tasks closer to data sources. However, it encounters several challenges in practice. The variance in hardware specifications and performance across devices is a major issue for training and inference tasks. Additionally, edge devices typically have limited network bandwidth and computing resources compared with data centers. Moreover, existing distributed training architectures often fail to consider resource constraints and communication efficiency in edge environments. In this paper, we propose DSparse, a method for distributed training based on sparse update in edge clusters whose devices have differing memory capacities. It aims to maximize the utilization of memory resources across all devices within a cluster. To reduce memory consumption during training, we adopt sparse update to prioritize the updating of selected layers on the devices in the cluster, which not only lowers memory usage but also reduces the volume of parameter data and the time required for parameter aggregation. Furthermore, DSparse uses a parameter aggregation mechanism based on multi-process groups, subdividing the aggregation tasks into AllReduce and Broadcast types and thereby further reducing the communication frequency of parameter aggregation. Experimental results using the MobileNetV2 model on the CIFAR-10 dataset demonstrate that DSparse reduces memory consumption by an average of 59.6% across seven devices, with a 75.4% reduction in parameter aggregation time, while maintaining model precision.
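The abstract does not spell out how aggregation tasks are split between AllReduce and Broadcast, so the following is only an illustrative sketch under one plausible reading: a layer trained by several devices is averaged among them (AllReduce-style), while a layer trained by a single device is simply copied from it (Broadcast-style). The function names `plan_aggregation` and `aggregate`, the device and layer names, and the use of a single float in place of a parameter tensor are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def plan_aggregation(updated: Dict[str, Set[str]], layers: List[str]) -> Dict[str, dict]:
    """Pick a hypothetical aggregation type for every layer.

    `updated` maps a device name to the set of layers that device trains
    under its sparse-update configuration (an assumed input format).
    """
    plan = {}
    for layer in layers:
        group = sorted(dev for dev, trained in updated.items() if layer in trained)
        if not group:
            plan[layer] = {"op": "skip", "group": []}          # frozen on every device
        elif len(group) == 1:
            plan[layer] = {"op": "broadcast", "group": group}  # single source of truth
        else:
            plan[layer] = {"op": "allreduce", "group": group}  # average among updaters
    return plan


def aggregate(plan: Dict[str, dict], params: Dict[str, Dict[str, float]]) -> None:
    """Apply the plan in place; one float stands in for a parameter tensor."""
    for layer, step in plan.items():
        if step["op"] == "skip":
            continue  # nothing changed, so nothing is communicated
        group = step["group"]
        # Average over the devices that actually trained this layer;
        # with a single device this is just that device's value.
        value = sum(params[dev][layer] for dev in group) / len(group)
        for dev in params:
            params[dev][layer] = value  # every device ends with the same value


# Three hypothetical devices with increasing memory budgets.
updated = {
    "small": {"head"},
    "medium": {"head", "block"},
    "large": {"head", "block", "conv1"},
}
layers = ["stem", "conv1", "block", "head"]
params = {
    "small": {"stem": 1.0, "conv1": 1.0, "block": 1.0, "head": 3.0},
    "medium": {"stem": 1.0, "conv1": 1.0, "block": 2.0, "head": 6.0},
    "large": {"stem": 1.0, "conv1": 5.0, "block": 4.0, "head": 9.0},
}

plan = plan_aggregation(updated, layers)
aggregate(plan, params)
```

In this toy setup the never-trained `stem` layer needs no communication at all, which is the intuition behind sparse update shrinking both the data volume and the number of aggregation steps; how DSparse actually forms its process groups is not stated in the abstract.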
Abstract: In this paper, we establish a class of sparse update algorithms based on matrix triangular factorizations for solving systems of sparse equations. The local Q-superlinear convergence of the algorithm is proved without introducing an m-step refactorization. We compare the numerical results of the new algorithm with those of known algorithms. The comparison indicates that the new algorithm is satisfactory.