Abstract: Edge Machine Learning (EdgeML) and Tiny Machine Learning (TinyML) are fast-growing fields that bring machine learning to resource-constrained devices, allowing real-time data processing and decision-making at the network's edge. However, the complexity of model conversion techniques, diverse inference mechanisms, and varied learning strategies make designing and deploying these models challenging. Additionally, deploying TinyML models on resource-constrained hardware with specific software frameworks has broadened EdgeML's applications across various sectors. These factors underscore the necessity for a comprehensive literature review, as current reviews do not systematically cover the most recent findings on these topics. Consequently, this article provides a comprehensive overview of state-of-the-art techniques in model conversion, inference mechanisms, and learning strategies within EdgeML, and in deploying these models on resource-constrained edge devices using TinyML. It identifies 90 research articles published between 2018 and 2025, categorizing them into two main areas: (1) model conversion, inference, and learning strategies in EdgeML, and (2) deploying TinyML models on resource-constrained hardware using specific software frameworks. In the first category, the synthesis of the selected research articles compares and critically reviews various model conversion techniques, inference mechanisms, and learning strategies. In the second category, the synthesis identifies and elaborates on the major development boards, software frameworks, sensors, and algorithms used in applications across six major sectors. As a result, this article provides valuable insights for researchers, practitioners, and developers, helping them choose suitable model conversion techniques, inference mechanisms, learning strategies, hardware development boards, software frameworks, sensors, and algorithms tailored to their specific needs and applications across various sectors.
Abstract: Deep learning technology is widely used in computer vision. Generally, a large amount of data is used to train the model weights in deep learning so as to obtain a model with higher accuracy. However, massive data and complex model structures demand substantial computing resources. Since people generally carry only mobile and portable devices in application scenarios, neural networks face limitations in computing resources, model size, and power consumption. Therefore, the efficient lightweight model MobileNet is used as the base network in this study and optimized. First, the accuracy of the MobileNet model is improved by adding methods such as the convolutional block attention module (CBAM) and dilated convolution. Then, the MobileNet model is compressed using pruning and weight quantization algorithms based on weight magnitude. Afterwards, methods such as Python crawlers and data augmentation are employed to create a garbage classification data set. Based on the above model optimization strategy, a garbage classification mobile application is deployed on mobile phones and Raspberry Pis, making the garbage classification task more convenient to complete.
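The magnitude-based pruning and weight quantization steps described in this abstract can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the function names, the 50% sparsity target, and the affine uint8 scheme are assumptions for demonstration.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    # Zero out the smallest-magnitude weights until the target sparsity is reached.
    flat = np.abs(weights).flatten()
    threshold = np.sort(flat)[int(sparsity * flat.size)]
    return weights * (np.abs(weights) >= threshold)

def quantize_uint8(weights):
    # Linear (affine) quantization of float weights to 8-bit integers.
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / 255.0
    if scale == 0:
        scale = 1.0
    q = np.round((weights - w_min) / scale).astype(np.uint8)
    return q, scale, w_min

def dequantize(q, scale, w_min):
    return q.astype(np.float32) * scale + w_min

w = np.array([[0.9, -0.02], [0.03, -0.8]], dtype=np.float32)
pruned = magnitude_prune(w, sparsity=0.5)   # small weights become exactly zero
q, scale, zero = quantize_uint8(pruned)     # 4x smaller storage than float32
restored = dequantize(q, scale, zero)       # approximate reconstruction
```

Pruning removes redundant connections while quantization shrinks the storage per weight; in practice both are typically followed by fine-tuning to recover accuracy.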
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 62072434 and U23B2004, and the Innovation Funding of the Institute of Computing Technology, Chinese Academy of Sciences, under Grant Nos. E361050 and E361030.
Abstract: Edge machine learning creates a new computational paradigm by enabling the deployment of intelligent applications at the network edge. It enhances application efficiency and responsiveness by performing inference and training tasks closer to data sources. However, it encounters several challenges in practice. The variance in hardware specifications and performance across different devices presents a major issue for training and inference tasks. Additionally, edge devices typically possess limited network bandwidth and computing resources compared with data centers. Moreover, existing distributed training architectures often fail to consider the constraints on resources and communication efficiency in edge environments. In this paper, we propose DSparse, a method for distributed training based on sparse update in edge clusters with varying memory capacities. It aims to maximize the utilization of memory resources across all devices within a cluster. To reduce memory consumption during training, we adopt sparse update to prioritize the updating of selected layers on the devices in the cluster, which not only lowers memory usage but also reduces the data volume of parameters and the time required for parameter aggregation. Furthermore, DSparse utilizes a parameter aggregation mechanism based on multi-process groups, subdividing the aggregation tasks into AllReduce and Broadcast types, thereby further reducing the communication frequency of parameter aggregation. Experimental results using the MobileNetV2 model on the CIFAR-10 dataset demonstrate that DSparse reduces memory consumption by an average of 59.6% across seven devices, with a 75.4% reduction in parameter aggregation time, while maintaining model precision.
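The core idea of sparse update with selective aggregation can be illustrated with a small simulation: each device updates only a chosen subset of layers, and only those layers' gradients are averaged (AllReduce-style). This sketch is not the DSparse implementation; the layer names, the selection of two layers, and the three-device setup are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

layers = ["conv1", "conv2", "conv3", "fc"]
selected = {"conv3", "fc"}   # layers chosen for sparse updating (hypothetical choice)
devices = 3

# Each device holds a gradient per layer after its local backward pass.
grads = {d: {l: rng.standard_normal(4) for l in layers} for d in range(devices)}

# AllReduce (averaging) is performed only over the selected layers;
# the frozen layers are neither updated nor communicated.
aggregated = {
    l: sum(grads[d][l] for d in range(devices)) / devices
    for l in selected
}

# Communication volume shrinks in proportion to the skipped layers.
reduction = 1 - len(selected) / len(layers)   # 0.5 in this toy setup
```

In a real edge cluster the averaging would be carried out by a collective communication library rather than in-process arithmetic, but the saving has the same source: fewer layers aggregated means fewer parameters on the wire.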
Funding: Supported by the Hong Kong Research Grants Council under Grants 17208319, 17209917, and 17259416.
Abstract: The 5G network connecting billions of Internet of things (IoT) devices will make it possible to harvest an enormous amount of real-time mobile data. Furthermore, the 5G virtualization architecture will enable cloud computing at the (network) edge. The availability of both rich data and computation power at the edge has motivated Internet companies to deploy artificial intelligence (AI) there, creating the hot area of edge-AI. Edge learning, the theme of this project, concerns training edge-AI models, which endow IoT devices with intelligence for responding to real-time events. However, the transmission of high-dimensional data from many edge devices to servers can result in excessive communication latency, creating a bottleneck for edge learning. Traditional wireless techniques designed only for radio access are ineffective in tackling this challenge. Attempts to overcome the communication bottleneck have led to the development of a new class of techniques for intelligent radio resource management (RRM), called data-importance aware RRM. Their designs feature the interplay of active machine learning and wireless communication. Specifically, the metrics that measure data importance in active learning (e.g., classification uncertainty and data diversity) are applied to RRM for efficient acquisition of distributed data in wireless networks to train AI models at servers. This article aims to provide an introduction to the emerging area of importance-aware RRM. To this end, we introduce the design principles, survey recent advancements in the area, discuss some design examples, and suggest promising research opportunities.
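The classification-uncertainty metric mentioned above is commonly computed as the entropy of a model's predicted class distribution: a near-uniform posterior signals an informative sample worth spending radio resources on. The sketch below is a hedged illustration of that principle, not a design from this article; the device names and probability vectors are invented for the example.

```python
import numpy as np

def predictive_entropy(probs):
    # Shannon entropy of a class posterior; higher entropy = more uncertain,
    # hence (in active learning) a more valuable sample to acquire.
    p = np.clip(probs, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

# Two devices, each holding one candidate sample the server could request.
confident = np.array([0.95, 0.03, 0.02])   # peaked posterior -> low importance
uncertain = np.array([0.34, 0.33, 0.33])   # near-uniform -> high importance

# Importance-aware scheduling: allocate radio resources first to the device
# whose sample the current model is most uncertain about.
priority = sorted(
    [("device_A", predictive_entropy(confident)),
     ("device_B", predictive_entropy(uncertain))],
    key=lambda t: -t[1],
)
```

Under this toy ranking, device_B is scheduled first because its sample carries more information for model training per unit of bandwidth spent.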