The conventional computing architecture faces substantial chal-lenges,including high latency and energy consumption between memory and processing units.In response,in-memory computing has emerged as a promising altern...The conventional computing architecture faces substantial chal-lenges,including high latency and energy consumption between memory and processing units.In response,in-memory computing has emerged as a promising alternative architecture,enabling computing operations within memory arrays to overcome these limitations.Memristive devices have gained significant attention as key components for in-memory computing due to their high-density arrays,rapid response times,and ability to emulate biological synapses.Among these devices,two-dimensional(2D)material-based memristor and memtransistor arrays have emerged as particularly promising candidates for next-generation in-memory computing,thanks to their exceptional performance driven by the unique properties of 2D materials,such as layered structures,mechanical flexibility,and the capability to form heterojunctions.This review delves into the state-of-the-art research on 2D material-based memristive arrays,encompassing critical aspects such as material selection,device perfor-mance metrics,array structures,and potential applications.Furthermore,it provides a comprehensive overview of the current challenges and limitations associated with these arrays,along with potential solutions.The primary objective of this review is to serve as a significant milestone in realizing next-generation in-memory computing utilizing 2D materials and bridge the gap from single-device characterization to array-level and system-level implementations of neuromorphic computing,leveraging the potential of 2D material-based memristive devices.展开更多
Facing the computing demands of Internet of things(IoT)and artificial intelligence(AI),the cost induced by moving the data between the central processing unit(CPU)and memory is the key problem and a chip featured with...Facing the computing demands of Internet of things(IoT)and artificial intelligence(AI),the cost induced by moving the data between the central processing unit(CPU)and memory is the key problem and a chip featured with flexible structural unit,ultra-low power consumption,and huge parallelism will be needed.In-memory computing,a non-von Neumann architecture fusing memory units and computing units,can eliminate the data transfer time and energy consumption while performing massive parallel computations.Prototype in-memory computing schemes modified from different memory technologies have shown orders of magnitude improvement in computing efficiency,making it be regarded as the ultimate computing paradigm.Here we review the state-of-the-art memory device technologies potential for in-memory computing,summarize their versatile applications in neural network,stochastic generation,and hybrid precision digital computing,with promising solutions for unprecedented computing tasks,and also discuss the challenges of stability and integration for general in-memory computing.展开更多
The“memory wall”of traditional von Neumann computing systems severely restricts the efficiency of data-intensive task execution,while in-memory computing(IMC)architecture is a promising approach to breaking the bott...The“memory wall”of traditional von Neumann computing systems severely restricts the efficiency of data-intensive task execution,while in-memory computing(IMC)architecture is a promising approach to breaking the bottleneck.Although variations and instability in ultra-scaled memory cells seriously degrade the calculation accuracy in IMC architectures,stochastic computing(SC)can compensate for these shortcomings due to its low sensitivity to cell disturbances.Furthermore,massive parallel computing can be processed to improve the speed and efficiency of the system.In this paper,by designing logic functions in NOR flash arrays,SC in IMC for the image edge detection is realized,demonstrating ultra-low computational complexity and power consumption(25.5 fJ/pixel at 2-bit sequence length).More impressively,the noise immunity is 6 times higher than that of the traditional binary method,showing good tolerances to cell variation and reliability degradation when implementing massive parallel computation in the array.展开更多
The in-memory computing(IMC)paradigm emerges as an effective solution to break the bottlenecks of conventional von Neumann architecture.In the current work,an approximate multiplier in spin-orbit torque magnetoresisti...The in-memory computing(IMC)paradigm emerges as an effective solution to break the bottlenecks of conventional von Neumann architecture.In the current work,an approximate multiplier in spin-orbit torque magnetoresistive random access memory(SOTMRAM)based true IMC(STIMC)architecture was presented,where computations were performed natively within the cell array instead of in peripheral circuits.Firstly,basic Boolean logic operations were realized by utilizing the feature of unipolar SOT device.Two majority gate-based imprecise compressors and an ultra-efficient approximate multiplier were then built to reduce the energy and latency.An optimized data mapping strategy facilitating bit-serial operations with an extensive degree of parallelism was also adopted.Finally,the performance enhancements by performing our approximate multiplier in image smoothing were demonstrated.Detailed simulation results show that the proposed 838 approximate multiplier could reduce the energy and latency at least by 74.2%and 44.4%compared with the existing designs.Moreover,the scheme could achieve improved peak signal-to-noise ratio(PSNR)and structural similarity index metric(SSIM),ensuring high-quality image processing outcomes.展开更多
Recurrent neural networks(RNNs)have proven to be indispensable for processing sequential and temporal data,with extensive applications in language modeling,text generation,machine translation,and time-series forecasti...Recurrent neural networks(RNNs)have proven to be indispensable for processing sequential and temporal data,with extensive applications in language modeling,text generation,machine translation,and time-series forecasting.Despite their versatility,RNNs are frequently beset by significant training expenses and slow convergence times,which impinge upon their deployment in edge AI applications.Reservoir computing(RC),a specialized RNN variant,is attracting increased attention as a cost-effective alternative for processing temporal and sequential data at the edge.RC’s distinctive advantage stems from its compatibility with emerging memristive hardware,which leverages the energy efficiency and reduced footprint of analog in-memory and in-sensor computing,offering a streamlined and energy-efficient solution.This review offers a comprehensive explanation of RC’s underlying principles,fabrication processes,and surveys recent progress in nano-memristive device based RC systems from the viewpoints of in-memory and in-sensor RC function.It covers a spectrum of memristive device,from established oxide-based memristive device to cutting-edge material science developments,providing readers with a lucid understanding of RC’s hardware implementation and fostering innovative designs for in-sensor RC systems.Lastly,we identify prevailing challenges and suggest viable solutions,paving the way for future advancements in in-sensor RC technology.展开更多
Layer pseudospins,exhibiting quantum coherence and precise multistate controllability,present significant potential for the advancement of future computing technologies.In this work,we propose an in-memory probabilist...Layer pseudospins,exhibiting quantum coherence and precise multistate controllability,present significant potential for the advancement of future computing technologies.In this work,we propose an in-memory probabilistic computing scheme based on the electrical manipulation of layer pseudospins in layered materials,by exploiting the interaction between real spins and layer pseudospins.展开更多
Combining logical function and memory characteristics of transistors is an ideal strategy for enhancing computational efficiency of transistor devices.Here,we rationally design a tri-gate two-dimensional(2D)ferroelect...Combining logical function and memory characteristics of transistors is an ideal strategy for enhancing computational efficiency of transistor devices.Here,we rationally design a tri-gate two-dimensional(2D)ferroelectric van der Waals heterostructures device based on copper indium thiophosphate(CuInP_(2)S_(6))and few layers tungsten disulfide(WS_(2)),and demonstrate its multi-functional applications in multi-valued state of data,non-volatile storage,and logic operation.By co-regulating the input signals across the tri-gate,we show that the device can switch functions flexibly at a low supply voltage of 6 V,giving rise to an ultra-high current switching ratio of 107 and a low subthreshold swing of 53.9 mV/dec.These findings offer perspectives in designing smart 2D devices with excellent functions based on ferroelectric van der Waals heterostructures.展开更多
Satellite edge computing has garnered significant attention from researchers;however,processing a large volume of tasks within multi-node satellite networks still poses considerable challenges.The sharp increase in us...Satellite edge computing has garnered significant attention from researchers;however,processing a large volume of tasks within multi-node satellite networks still poses considerable challenges.The sharp increase in user demand for latency-sensitive tasks has inevitably led to offloading bottlenecks and insufficient computational capacity on individual satellite edge servers,making it necessary to implement effective task offloading scheduling to enhance user experience.In this paper,we propose a priority-based task scheduling strategy based on a Software-Defined Network(SDN)framework for satellite-terrestrial integrated networks,which clarifies the execution order of tasks based on their priority.Subsequently,we apply a Dueling-Double Deep Q-Network(DDQN)algorithm enhanced with prioritized experience replay to derive a computation offloading strategy,improving the experience replay mechanism within the Dueling-DDQN framework.Next,we utilize the Deep Deterministic Policy Gradient(DDPG)algorithm to determine the optimal resource allocation strategy to reduce the processing latency of sub-tasks.Simulation results demonstrate that the proposed d3-DDPG algorithm outperforms other approaches,effectively reducing task processing latency and thus improving user experience and system efficiency.展开更多
Optoelectronic memristor is generating growing research interest for high efficient computing and sensing-memory applications.In this work,an optoelectronic memristor with Au/a-C:Te/Pt structure is developed.Synaptic ...Optoelectronic memristor is generating growing research interest for high efficient computing and sensing-memory applications.In this work,an optoelectronic memristor with Au/a-C:Te/Pt structure is developed.Synaptic functions,i.e.,excita-tory post-synaptic current and pair-pulse facilitation are successfully mimicked with the memristor under electrical and optical stimulations.More importantly,the device exhibited distinguishable response currents by adjusting 4-bit input electrical/opti-cal signals.A multi-mode reservoir computing(RC)system is constructed with the optoelectronic memristors to emulate human tactile-visual fusion recognition and an accuracy of 98.7%is achieved.The optoelectronic memristor provides potential for developing multi-mode RC system.展开更多
As an important complement to cloud computing, edge computing can effectively reduce the workload of the backbone network. To reduce latency and energy consumption of edge computing, deep learning is used to learn the...As an important complement to cloud computing, edge computing can effectively reduce the workload of the backbone network. To reduce latency and energy consumption of edge computing, deep learning is used to learn the task offloading strategies by interacting with the entities. In actual application scenarios, users of edge computing are always changing dynamically. However, the existing task offloading strategies cannot be applied to such dynamic scenarios. To solve this problem, we propose a novel dynamic task offloading framework for distributed edge computing, leveraging the potential of meta-reinforcement learning (MRL). Our approach formulates a multi-objective optimization problem aimed at minimizing both delay and energy consumption. We model the task offloading strategy using a directed acyclic graph (DAG). Furthermore, we propose a distributed edge computing adaptive task offloading algorithm rooted in MRL. This algorithm integrates multiple Markov decision processes (MDP) with a sequence-to-sequence (seq2seq) network, enabling it to learn and adapt task offloading strategies responsively across diverse network environments. To achieve joint optimization of delay and energy consumption, we incorporate the non-dominated sorting genetic algorithm II (NSGA-II) into our framework. Simulation results demonstrate the superiority of our proposed solution, achieving a 21% reduction in time delay and a 19% decrease in energy consumption compared to alternative task offloading schemes. Moreover, our scheme exhibits remarkable adaptability, responding swiftly to changes in various network environments.展开更多
The rise of large-scale artificial intelligence(AI)models,such as ChatGPT,Deep-Seek,and autonomous vehicle systems,has significantly advanced the boundaries of AI,enabling highly complex tasks in natural language proc...The rise of large-scale artificial intelligence(AI)models,such as ChatGPT,Deep-Seek,and autonomous vehicle systems,has significantly advanced the boundaries of AI,enabling highly complex tasks in natural language processing,image recognition,and real-time decisionmaking.However,these models demand immense computational power and are often centralized,relying on cloud-based architectures with inherent limitations in latency,privacy,and energy efficiency.To address these challenges and bring AI closer to real-world applications,such as wearable health monitoring,robotics,and immersive virtual environments,innovative hardware solutions are urgently needed.This work introduces a near-sensor edge computing(NSEC)system,built on a bilayer AlN/Si waveguide platform,to provide real-time,energy-efficient AI capabilities at the edge.Leveraging the electro-optic properties of AlN microring resonators for photonic feature extraction,coupled with Si-based thermo-optic Mach-Zehnder interferometers for neural network computations,the system represents a transformative approach to AI hardware design.Demonstrated through multimodal gesture and gait analysis,the NSEC system achieves high classification accuracies of 96.77%for gestures and 98.31%for gaits,ultra-low latency(<10 ns),and minimal energy consumption(<0.34 pJ).This groundbreaking system bridges the gap between AI models and real-world applications,enabling efficient,privacy-preserving AI solutions for healthcare,robotics,and next-generation human-machine interfaces,marking a pivotal advancement in edge computing and AI deployment.展开更多
To address the increasing demand for massive data storage and processing,brain-inspired neuromorphic comput-ing systems based on artificial synaptic devices have been actively developed in recent years.Among the vario...To address the increasing demand for massive data storage and processing,brain-inspired neuromorphic comput-ing systems based on artificial synaptic devices have been actively developed in recent years.Among the various materials inves-tigated for the fabrication of synaptic devices,silicon carbide(SiC)has emerged as a preferred choices due to its high electron mobility,superior thermal conductivity,and excellent thermal stability,which exhibits promising potential for neuromorphic applications in harsh environments.In this review,the recent progress in SiC-based synaptic devices is summarized.Firstly,an in-depth discussion is conducted regarding the categories,working mechanisms,and structural designs of these devices.Subse-quently,several application scenarios for SiC-based synaptic devices are presented.Finally,a few perspectives and directions for their future development are outlined.展开更多
The rapid advent in artificial intelligence and big data has revolutionized the dynamic requirement in the demands of the computing resource for executing specific tasks in the cloud environment.The process of achievi...The rapid advent in artificial intelligence and big data has revolutionized the dynamic requirement in the demands of the computing resource for executing specific tasks in the cloud environment.The process of achieving autonomic resource management is identified to be a herculean task due to its huge distributed and heterogeneous environment.Moreover,the cloud network needs to provide autonomic resource management and deliver potential services to the clients by complying with the requirements of Quality-of-Service(QoS)without impacting the Service Level Agreements(SLAs).However,the existing autonomic cloud resource managing frameworks are not capable in handling the resources of the cloud with its dynamic requirements.In this paper,Coot Bird Behavior Model-based Workload Aware Autonomic Resource Management Scheme(CBBM-WARMS)is proposed for handling the dynamic requirements of cloud resources through the estimation of workload that need to be policed by the cloud environment.This CBBM-WARMS initially adopted the algorithm of adaptive density peak clustering for workloads clustering of the cloud.Then,it utilized the fuzzy logic during the process of workload scheduling for achieving the determining the availability of cloud resources.It further used CBBM for potential Virtual Machine(VM)deployment that attributes towards the provision of optimal resources.It is proposed with the capability of achieving optimal QoS with minimized time,energy consumption,SLA cost and SLA violation.The experimental validation of the proposed CBBMWARMS confirms minimized SLA cost of 19.21%and reduced SLA violation rate of 18.74%,better than the compared autonomic cloud resource managing frameworks.展开更多
Recently,one of the main challenges facing the smart grid is insufficient computing resources and intermittent energy supply for various distributed components(such as monitoring systems for renewable energy power sta...Recently,one of the main challenges facing the smart grid is insufficient computing resources and intermittent energy supply for various distributed components(such as monitoring systems for renewable energy power stations).To solve the problem,we propose an energy harvesting based task scheduling and resource management framework to provide robust and low-cost edge computing services for smart grid.First,we formulate an energy consumption minimization problem with regard to task offloading,time switching,and resource allocation for mobile devices,which can be decoupled and transformed into a typical knapsack problem.Then,solutions are derived by two different algorithms.Furthermore,we deploy renewable energy and energy storage units at edge servers to tackle intermittency and instability problems.Finally,we design an energy management algorithm based on sampling average approximation for edge computing servers to derive the optimal charging/discharging strategies,number of energy storage units,and renewable energy utilization.The simulation results show the efficiency and superiority of our proposed framework.展开更多
Large language models(LLMs)have emerged as powerful tools for addressing a wide range of problems,including those in scientific computing,particularly in solving partial differential equations(PDEs).However,different ...Large language models(LLMs)have emerged as powerful tools for addressing a wide range of problems,including those in scientific computing,particularly in solving partial differential equations(PDEs).However,different models exhibit distinct strengths and preferences,resulting in varying levels of performance.In this paper,we compare the capabilities of the most advanced LLMs—DeepSeek,ChatGPT,and Claude—along with their reasoning-optimized versions in addressing computational challenges.Specifically,we evaluate their proficiency in solving traditional numerical problems in scientific computing as well as leveraging scientific machine learning techniques for PDE-based problems.We designed all our experiments so that a nontrivial decision is required,e.g,defining the proper space of input functions for neural operator learning.Our findings show that reasoning and hybrid-reasoning models consistently and significantly outperform non-reasoning ones in solving challenging problems,with ChatGPT o3-mini-high generally offering the fastest reasoning speed.展开更多
Efficient resource provisioning,allocation,and computation offloading are critical to realizing lowlatency,scalable,and energy-efficient applications in cloud,fog,and edge computing.Despite its importance,integrating ...Efficient resource provisioning,allocation,and computation offloading are critical to realizing lowlatency,scalable,and energy-efficient applications in cloud,fog,and edge computing.Despite its importance,integrating Software Defined Networks(SDN)for enhancing resource orchestration,task scheduling,and traffic management remains a relatively underexplored area with significant innovation potential.This paper provides a comprehensive review of existing mechanisms,categorizing resource provisioning approaches into static,dynamic,and user-centric models,while examining applications across domains such as IoT,healthcare,and autonomous systems.The survey highlights challenges such as scalability,interoperability,and security in managing dynamic and heterogeneous infrastructures.This exclusive research evaluates how SDN enables adaptive policy-based handling of distributed resources through advanced orchestration processes.Furthermore,proposes future directions,including AI-driven optimization techniques and hybrid orchestrationmodels.By addressing these emerging opportunities,thiswork serves as a foundational reference for advancing resource management strategies in next-generation cloud,fog,and edge computing ecosystems.This survey concludes that SDN-enabled computing environments find essential guidance in addressing upcoming management opportunities.展开更多
The number of satellites,especially those operating in Low-Earth Orbit(LEO),has been exploding in recent years.Additionally,the burgeoning development of Artificial Intelligence(AI)software and hardware has opened up ...The number of satellites,especially those operating in Low-Earth Orbit(LEO),has been exploding in recent years.Additionally,the burgeoning development of Artificial Intelligence(AI)software and hardware has opened up new industrial opportunities in both air and space,with satellite-powered computing emerging as a new computing paradigm:Orbital Edge Computing(OEC).Compared to terrestrial edge computing,the mobility of LEO satellites and their limited communication,computation,and storage resources pose challenges in designing task-specific scheduling algorithms.Previous survey papers have largely focused on terrestrial edge computing or the integration of space and ground technologies,lacking a comprehensive summary of OEC architecture,algorithms,and case studies.This paper conducts a comprehensive survey and analysis of OEC's system architecture,applications,algorithms,and simulation tools,providing a solid background for researchers in the field.By discussing OEC use cases and the challenges faced,potential research directions for future OEC research are proposed.展开更多
The emergence of different computing methods such as cloud-,fog-,and edge-based Internet of Things(IoT)systems has provided the opportunity to develop intelligent systems for disease detection.Compared to other machin...The emergence of different computing methods such as cloud-,fog-,and edge-based Internet of Things(IoT)systems has provided the opportunity to develop intelligent systems for disease detection.Compared to other machine learning models,deep learning models have gained more attention from the research community,as they have shown better results with a large volume of data compared to shallow learning.However,no comprehensive survey has been conducted on integrated IoT-and computing-based systems that deploy deep learning for disease detection.This study evaluated different machine learning and deep learning algorithms and their hybrid and optimized algorithms for IoT-based disease detection,using the most recent papers on IoT-based disease detection systems that include computing approaches,such as cloud,edge,and fog.Their analysis focused on an IoT deep learning architecture suitable for disease detection.It also recognizes the different factors that require the attention of researchers to develop better IoT disease detection systems.This study can be helpful to researchers interested in developing better IoT-based disease detection and prediction systems based on deep learning using hybrid algorithms.展开更多
Distributed computing is an important topic in the field of wireless communications and networking,and its high efficiency in handling large amounts of data is particularly noteworthy.Although distributed computing be...Distributed computing is an important topic in the field of wireless communications and networking,and its high efficiency in handling large amounts of data is particularly noteworthy.Although distributed computing benefits from its ability of processing data in parallel,the communication burden between different servers is incurred,thereby the computation process is detained.Recent researches have applied coding in distributed computing to reduce the communication burden,where repetitive computation is utilized to enable multicast opportunities so that the same coded information can be reused across different servers.To handle the computation tasks in practical heterogeneous systems,we propose a novel coding scheme to effectively mitigate the "straggling effect" in distributed computing.We assume that there are two types of servers in the system and the only difference between them is their computational capabilities,the servers with lower computational capabilities are called stragglers.Given any ratio of fast servers to slow servers and any gap of computational capabilities between them,we achieve approximately the same computation time for both fast and slow servers by assigning different amounts of computation tasks to them,thus reducing the overall computation time.Furthermore,we investigate the informationtheoretic lower bound of the inter-communication load and show that the lower bound is within a constant multiplicative gap to the upper bound achieved by our scheme.Various simulations also validate the effectiveness of the proposed scheme.展开更多
基金This work was supported by the National Research Foundation,Singapore under Award No.NRF-CRP24-2020-0002.
文摘The conventional computing architecture faces substantial chal-lenges,including high latency and energy consumption between memory and processing units.In response,in-memory computing has emerged as a promising alternative architecture,enabling computing operations within memory arrays to overcome these limitations.Memristive devices have gained significant attention as key components for in-memory computing due to their high-density arrays,rapid response times,and ability to emulate biological synapses.Among these devices,two-dimensional(2D)material-based memristor and memtransistor arrays have emerged as particularly promising candidates for next-generation in-memory computing,thanks to their exceptional performance driven by the unique properties of 2D materials,such as layered structures,mechanical flexibility,and the capability to form heterojunctions.This review delves into the state-of-the-art research on 2D material-based memristive arrays,encompassing critical aspects such as material selection,device perfor-mance metrics,array structures,and potential applications.Furthermore,it provides a comprehensive overview of the current challenges and limitations associated with these arrays,along with potential solutions.The primary objective of this review is to serve as a significant milestone in realizing next-generation in-memory computing utilizing 2D materials and bridge the gap from single-device characterization to array-level and system-level implementations of neuromorphic computing,leveraging the potential of 2D material-based memristive devices.
基金Project supported by the National Natural Science Foundation of China(Grant Nos.61925402 and 61851402)Science and Technology Commission of Shanghai Municipality,China(Grant No.19JC1416600)+1 种基金the National Key Research and Development Program of China(Grant No.2017YFB0405600)Shanghai Education Development Foundation and Shanghai Municipal Education Commission Shuguang Program,China(Grant No.18SG01).
文摘Facing the computing demands of Internet of things(IoT)and artificial intelligence(AI),the cost induced by moving the data between the central processing unit(CPU)and memory is the key problem and a chip featured with flexible structural unit,ultra-low power consumption,and huge parallelism will be needed.In-memory computing,a non-von Neumann architecture fusing memory units and computing units,can eliminate the data transfer time and energy consumption while performing massive parallel computations.Prototype in-memory computing schemes modified from different memory technologies have shown orders of magnitude improvement in computing efficiency,making it be regarded as the ultimate computing paradigm.Here we review the state-of-the-art memory device technologies potential for in-memory computing,summarize their versatile applications in neural network,stochastic generation,and hybrid precision digital computing,with promising solutions for unprecedented computing tasks,and also discuss the challenges of stability and integration for general in-memory computing.
基金supported by the National Natural Science Foundation of China(Nos.62034006,91964105,61874068)the China Key Research and Development Program(No.2016YFA0201802)+1 种基金the Natural Science Foundation of Shandong Province(No.ZR2020JQ28)Program of Qilu Young Scholars of Shandong University。
文摘The“memory wall”of traditional von Neumann computing systems severely restricts the efficiency of data-intensive task execution,while in-memory computing(IMC)architecture is a promising approach to breaking the bottleneck.Although variations and instability in ultra-scaled memory cells seriously degrade the calculation accuracy in IMC architectures,stochastic computing(SC)can compensate for these shortcomings due to its low sensitivity to cell disturbances.Furthermore,massive parallel computing can be processed to improve the speed and efficiency of the system.In this paper,by designing logic functions in NOR flash arrays,SC in IMC for the image edge detection is realized,demonstrating ultra-low computational complexity and power consumption(25.5 fJ/pixel at 2-bit sequence length).More impressively,the noise immunity is 6 times higher than that of the traditional binary method,showing good tolerances to cell variation and reliability degradation when implementing massive parallel computation in the array.
基金supported in part by National Natural Science Foundation of China(Grant Nos.62374055,12327806)supported in part by Natural Science Foundation of Wuhan(Grant No.2024040701010049).
文摘The in-memory computing(IMC)paradigm emerges as an effective solution to break the bottlenecks of conventional von Neumann architecture.In the current work,an approximate multiplier in spin-orbit torque magnetoresistive random access memory(SOTMRAM)based true IMC(STIMC)architecture was presented,where computations were performed natively within the cell array instead of in peripheral circuits.Firstly,basic Boolean logic operations were realized by utilizing the feature of unipolar SOT device.Two majority gate-based imprecise compressors and an ultra-efficient approximate multiplier were then built to reduce the energy and latency.An optimized data mapping strategy facilitating bit-serial operations with an extensive degree of parallelism was also adopted.Finally,the performance enhancements by performing our approximate multiplier in image smoothing were demonstrated.Detailed simulation results show that the proposed 838 approximate multiplier could reduce the energy and latency at least by 74.2%and 44.4%compared with the existing designs.Moreover,the scheme could achieve improved peak signal-to-noise ratio(PSNR)and structural similarity index metric(SSIM),ensuring high-quality image processing outcomes.
基金supported by National Key Research and Development Program of China(Grant No.2022YFA1405600)Beijing Natural Science Foundation(Grant No.Z210006)+3 种基金National Natural Science Foundation of China—Young Scientists Fund(Grant No.12104051,62122004)Hong Kong Research Grant Council(Grant Nos.27206321,17205922,17212923 and C1009-22GF)Shenzhen Science and Technology Innovation Commission(SGDX20220530111405040)partially supported by ACCESS—AI Chip Center for Emerging Smart Systems,sponsored by Innovation and Technology Fund(ITF),Hong Kong SAR。
文摘Recurrent neural networks(RNNs)have proven to be indispensable for processing sequential and temporal data,with extensive applications in language modeling,text generation,machine translation,and time-series forecasting.Despite their versatility,RNNs are frequently beset by significant training expenses and slow convergence times,which impinge upon their deployment in edge AI applications.Reservoir computing(RC),a specialized RNN variant,is attracting increased attention as a cost-effective alternative for processing temporal and sequential data at the edge.RC’s distinctive advantage stems from its compatibility with emerging memristive hardware,which leverages the energy efficiency and reduced footprint of analog in-memory and in-sensor computing,offering a streamlined and energy-efficient solution.This review offers a comprehensive explanation of RC’s underlying principles,fabrication processes,and surveys recent progress in nano-memristive device based RC systems from the viewpoints of in-memory and in-sensor RC function.It covers a spectrum of memristive device,from established oxide-based memristive device to cutting-edge material science developments,providing readers with a lucid understanding of RC’s hardware implementation and fostering innovative designs for in-sensor RC systems.Lastly,we identify prevailing challenges and suggest viable solutions,paving the way for future advancements in in-sensor RC technology.
基金supported by the National Natural Science Foundation of China(Grant Nos.12322407,62122036,and 62034004)the Natural Science Foundation of Jiangsu Province(Grant No.BK20233001)+5 种基金the National Key R&D Program of China(Grant Nos.2023YFF0718400 and 2023YFF1203600)the Leading-edge Technology Program of Jiangsu Natural Science Foundation(Grant No.BK20232004)the Strategic Priority Research Program of the Chinese Academy of Sciences(Grant No.XDB44000000)Innovation Program for Quantum Science and Technologysupport from the Fundamental Research Funds for the Central Universities(Grant Nos.020414380227,020414380240,and 020414380242)the e-Science Center of Collaborative Innovation Center of Advanced Microstructures。
文摘Layer pseudospins,exhibiting quantum coherence and precise multistate controllability,present significant potential for the advancement of future computing technologies.In this work,we propose an in-memory probabilistic computing scheme based on the electrical manipulation of layer pseudospins in layered materials,by exploiting the interaction between real spins and layer pseudospins.
基金supported by the National Natural Science Foundation of China(No.62104073)the China Postdoctoral Science Foundation(No.2021M691088)+1 种基金the Pearl River Talent Recruitment Program(No.2019ZT08X639)Z.C.W.acknowledges the European Research Executive Agency(Project 101079184-FUNLAYERS).
文摘Combining logical function and memory characteristics of transistors is an ideal strategy for enhancing computational efficiency of transistor devices.Here,we rationally design a tri-gate two-dimensional(2D)ferroelectric van der Waals heterostructures device based on copper indium thiophosphate(CuInP_(2)S_(6))and few layers tungsten disulfide(WS_(2)),and demonstrate its multi-functional applications in multi-valued state of data,non-volatile storage,and logic operation.By co-regulating the input signals across the tri-gate,we show that the device can switch functions flexibly at a low supply voltage of 6 V,giving rise to an ultra-high current switching ratio of 107 and a low subthreshold swing of 53.9 mV/dec.These findings offer perspectives in designing smart 2D devices with excellent functions based on ferroelectric van der Waals heterostructures.
文摘Satellite edge computing has garnered significant attention from researchers;however,processing a large volume of tasks within multi-node satellite networks still poses considerable challenges.The sharp increase in user demand for latency-sensitive tasks has inevitably led to offloading bottlenecks and insufficient computational capacity on individual satellite edge servers,making it necessary to implement effective task offloading scheduling to enhance user experience.In this paper,we propose a priority-based task scheduling strategy based on a Software-Defined Network(SDN)framework for satellite-terrestrial integrated networks,which clarifies the execution order of tasks based on their priority.Subsequently,we apply a Dueling-Double Deep Q-Network(DDQN)algorithm enhanced with prioritized experience replay to derive a computation offloading strategy,improving the experience replay mechanism within the Dueling-DDQN framework.Next,we utilize the Deep Deterministic Policy Gradient(DDPG)algorithm to determine the optimal resource allocation strategy to reduce the processing latency of sub-tasks.Simulation results demonstrate that the proposed d3-DDPG algorithm outperforms other approaches,effectively reducing task processing latency and thus improving user experience and system efficiency.
基金supported by the"Science and Technology Development Plan Project of Jilin Province,China"(Grant No.20240101018JJ)the Fundamental Research Funds for the Central Universities(Grant No.2412023YQ004)the National Natural Science Foundation of China(Grant Nos.52072065,52272140,52372137,and U23A20568).
文摘Optoelectronic memristor is generating growing research interest for high efficient computing and sensing-memory applications.In this work,an optoelectronic memristor with Au/a-C:Te/Pt structure is developed.Synaptic functions,i.e.,excita-tory post-synaptic current and pair-pulse facilitation are successfully mimicked with the memristor under electrical and optical stimulations.More importantly,the device exhibited distinguishable response currents by adjusting 4-bit input electrical/opti-cal signals.A multi-mode reservoir computing(RC)system is constructed with the optoelectronic memristors to emulate human tactile-visual fusion recognition and an accuracy of 98.7%is achieved.The optoelectronic memristor provides potential for developing multi-mode RC system.
基金funded by the Fundamental Research Funds for the Central Universities(J2023-024,J2023-027).
文摘As an important complement to cloud computing, edge computing can effectively reduce the workload of the backbone network. To reduce latency and energy consumption of edge computing, deep learning is used to learn the task offloading strategies by interacting with the entities. In actual application scenarios, users of edge computing are always changing dynamically. However, the existing task offloading strategies cannot be applied to such dynamic scenarios. To solve this problem, we propose a novel dynamic task offloading framework for distributed edge computing, leveraging the potential of meta-reinforcement learning (MRL). Our approach formulates a multi-objective optimization problem aimed at minimizing both delay and energy consumption. We model the task offloading strategy using a directed acyclic graph (DAG). Furthermore, we propose a distributed edge computing adaptive task offloading algorithm rooted in MRL. This algorithm integrates multiple Markov decision processes (MDP) with a sequence-to-sequence (seq2seq) network, enabling it to learn and adapt task offloading strategies responsively across diverse network environments. To achieve joint optimization of delay and energy consumption, we incorporate the non-dominated sorting genetic algorithm II (NSGA-II) into our framework. Simulation results demonstrate the superiority of our proposed solution, achieving a 21% reduction in time delay and a 19% decrease in energy consumption compared to alternative task offloading schemes. Moreover, our scheme exhibits remarkable adaptability, responding swiftly to changes in various network environments.
基金the National Research Foundation(NRF)Singapore mid-sized center grant(NRF-MSG-2023-0002)FrontierCRP grant(NRF-F-CRP-2024-0006)+2 种基金A*STAR Singapore MTC RIE2025 project(M24W1NS005)IAF-PP project(M23M5a0069)Ministry of Education(MOE)Singapore Tier 2 project(MOE-T2EP50220-0014).
文摘The rise of large-scale artificial intelligence(AI)models,such as ChatGPT,Deep-Seek,and autonomous vehicle systems,has significantly advanced the boundaries of AI,enabling highly complex tasks in natural language processing,image recognition,and real-time decisionmaking.However,these models demand immense computational power and are often centralized,relying on cloud-based architectures with inherent limitations in latency,privacy,and energy efficiency.To address these challenges and bring AI closer to real-world applications,such as wearable health monitoring,robotics,and immersive virtual environments,innovative hardware solutions are urgently needed.This work introduces a near-sensor edge computing(NSEC)system,built on a bilayer AlN/Si waveguide platform,to provide real-time,energy-efficient AI capabilities at the edge.Leveraging the electro-optic properties of AlN microring resonators for photonic feature extraction,coupled with Si-based thermo-optic Mach-Zehnder interferometers for neural network computations,the system represents a transformative approach to AI hardware design.Demonstrated through multimodal gesture and gait analysis,the NSEC system achieves high classification accuracies of 96.77%for gestures and 98.31%for gaits,ultra-low latency(<10 ns),and minimal energy consumption(<0.34 pJ).This groundbreaking system bridges the gap between AI models and real-world applications,enabling efficient,privacy-preserving AI solutions for healthcare,robotics,and next-generation human-machine interfaces,marking a pivotal advancement in edge computing and AI deployment.
基金supported by the Natural Science Foundation of Zhejiang Province(Grant No.LQ24F040007)the National Natural Science Foundation of China(Grant No.U22A2075)the Opening Project of State Key Laboratory of Polymer Materials Engineering(Sichuan University)(Grant No.sklpme2024-1-21).
文摘To address the increasing demand for massive data storage and processing,brain-inspired neuromorphic comput-ing systems based on artificial synaptic devices have been actively developed in recent years.Among the various materials inves-tigated for the fabrication of synaptic devices,silicon carbide(SiC)has emerged as a preferred choices due to its high electron mobility,superior thermal conductivity,and excellent thermal stability,which exhibits promising potential for neuromorphic applications in harsh environments.In this review,the recent progress in SiC-based synaptic devices is summarized.Firstly,an in-depth discussion is conducted regarding the categories,working mechanisms,and structural designs of these devices.Subse-quently,several application scenarios for SiC-based synaptic devices are presented.Finally,a few perspectives and directions for their future development are outlined.
文摘The rapid advent in artificial intelligence and big data has revolutionized the dynamic requirement in the demands of the computing resource for executing specific tasks in the cloud environment.The process of achieving autonomic resource management is identified to be a herculean task due to its huge distributed and heterogeneous environment.Moreover,the cloud network needs to provide autonomic resource management and deliver potential services to the clients by complying with the requirements of Quality-of-Service(QoS)without impacting the Service Level Agreements(SLAs).However,the existing autonomic cloud resource managing frameworks are not capable in handling the resources of the cloud with its dynamic requirements.In this paper,Coot Bird Behavior Model-based Workload Aware Autonomic Resource Management Scheme(CBBM-WARMS)is proposed for handling the dynamic requirements of cloud resources through the estimation of workload that need to be policed by the cloud environment.This CBBM-WARMS initially adopted the algorithm of adaptive density peak clustering for workloads clustering of the cloud.Then,it utilized the fuzzy logic during the process of workload scheduling for achieving the determining the availability of cloud resources.It further used CBBM for potential Virtual Machine(VM)deployment that attributes towards the provision of optimal resources.It is proposed with the capability of achieving optimal QoS with minimized time,energy consumption,SLA cost and SLA violation.The experimental validation of the proposed CBBMWARMS confirms minimized SLA cost of 19.21%and reduced SLA violation rate of 18.74%,better than the compared autonomic cloud resource managing frameworks.
基金supported in part by the National Natural Science Foundation of China under Grant No.61473066in part by the Natural Science Foundation of Hebei Province under Grant No.F2021501020+2 种基金in part by the S&T Program of Qinhuangdao under Grant No.202401A195in part by the Science Research Project of Hebei Education Department under Grant No.QN2025008in part by the Innovation Capability Improvement Plan Project of Hebei Province under Grant No.22567637H
文摘Recently,one of the main challenges facing the smart grid is insufficient computing resources and intermittent energy supply for various distributed components(such as monitoring systems for renewable energy power stations).To solve the problem,we propose an energy harvesting based task scheduling and resource management framework to provide robust and low-cost edge computing services for smart grid.First,we formulate an energy consumption minimization problem with regard to task offloading,time switching,and resource allocation for mobile devices,which can be decoupled and transformed into a typical knapsack problem.Then,solutions are derived by two different algorithms.Furthermore,we deploy renewable energy and energy storage units at edge servers to tackle intermittency and instability problems.Finally,we design an energy management algorithm based on sampling average approximation for edge computing servers to derive the optimal charging/discharging strategies,number of energy storage units,and renewable energy utilization.The simulation results show the efficiency and superiority of our proposed framework.
基金supported by the ONR Vannevar Bush Faculty Fellowship(Grant No.N00014-22-1-2795).
文摘Large language models(LLMs)have emerged as powerful tools for addressing a wide range of problems,including those in scientific computing,particularly in solving partial differential equations(PDEs).However,different models exhibit distinct strengths and preferences,resulting in varying levels of performance.In this paper,we compare the capabilities of the most advanced LLMs—DeepSeek,ChatGPT,and Claude—along with their reasoning-optimized versions in addressing computational challenges.Specifically,we evaluate their proficiency in solving traditional numerical problems in scientific computing as well as leveraging scientific machine learning techniques for PDE-based problems.We designed all our experiments so that a nontrivial decision is required,e.g,defining the proper space of input functions for neural operator learning.Our findings show that reasoning and hybrid-reasoning models consistently and significantly outperform non-reasoning ones in solving challenging problems,with ChatGPT o3-mini-high generally offering the fastest reasoning speed.
文摘Efficient resource provisioning,allocation,and computation offloading are critical to realizing lowlatency,scalable,and energy-efficient applications in cloud,fog,and edge computing.Despite its importance,integrating Software Defined Networks(SDN)for enhancing resource orchestration,task scheduling,and traffic management remains a relatively underexplored area with significant innovation potential.This paper provides a comprehensive review of existing mechanisms,categorizing resource provisioning approaches into static,dynamic,and user-centric models,while examining applications across domains such as IoT,healthcare,and autonomous systems.The survey highlights challenges such as scalability,interoperability,and security in managing dynamic and heterogeneous infrastructures.This exclusive research evaluates how SDN enables adaptive policy-based handling of distributed resources through advanced orchestration processes.Furthermore,proposes future directions,including AI-driven optimization techniques and hybrid orchestrationmodels.By addressing these emerging opportunities,thiswork serves as a foundational reference for advancing resource management strategies in next-generation cloud,fog,and edge computing ecosystems.This survey concludes that SDN-enabled computing environments find essential guidance in addressing upcoming management opportunities.
基金funded by the Hong Kong-Macao-Taiwan Science and Technology Cooperation Project of the Science and Technology Innovation Action Plan in Shanghai,China(23510760200)the Oriental Talent Youth Program of Shanghai,China(No.Y3DFRCZL01)+1 种基金the Outstanding Program of the Youth Innovation Promotion Association of the Chinese Academy of Sciences(No.Y2023080)the Strategic Priority Research Program of the Chinese Academy of Sciences Category A(No.XDA0360404).
文摘The number of satellites,especially those operating in Low-Earth Orbit(LEO),has been exploding in recent years.Additionally,the burgeoning development of Artificial Intelligence(AI)software and hardware has opened up new industrial opportunities in both air and space,with satellite-powered computing emerging as a new computing paradigm:Orbital Edge Computing(OEC).Compared to terrestrial edge computing,the mobility of LEO satellites and their limited communication,computation,and storage resources pose challenges in designing task-specific scheduling algorithms.Previous survey papers have largely focused on terrestrial edge computing or the integration of space and ground technologies,lacking a comprehensive summary of OEC architecture,algorithms,and case studies.This paper conducts a comprehensive survey and analysis of OEC's system architecture,applications,algorithms,and simulation tools,providing a solid background for researchers in the field.By discussing OEC use cases and the challenges faced,potential research directions for future OEC research are proposed.
文摘The emergence of different computing methods such as cloud-,fog-,and edge-based Internet of Things(IoT)systems has provided the opportunity to develop intelligent systems for disease detection.Compared to other machine learning models,deep learning models have gained more attention from the research community,as they have shown better results with a large volume of data compared to shallow learning.However,no comprehensive survey has been conducted on integrated IoT-and computing-based systems that deploy deep learning for disease detection.This study evaluated different machine learning and deep learning algorithms and their hybrid and optimized algorithms for IoT-based disease detection,using the most recent papers on IoT-based disease detection systems that include computing approaches,such as cloud,edge,and fog.Their analysis focused on an IoT deep learning architecture suitable for disease detection.It also recognizes the different factors that require the attention of researchers to develop better IoT disease detection systems.This study can be helpful to researchers interested in developing better IoT-based disease detection and prediction systems based on deep learning using hybrid algorithms.
基金supported by NSF China(No.T2421002,62061146002,62020106005)。
文摘Distributed computing is an important topic in the field of wireless communications and networking,and its high efficiency in handling large amounts of data is particularly noteworthy.Although distributed computing benefits from its ability of processing data in parallel,the communication burden between different servers is incurred,thereby the computation process is detained.Recent researches have applied coding in distributed computing to reduce the communication burden,where repetitive computation is utilized to enable multicast opportunities so that the same coded information can be reused across different servers.To handle the computation tasks in practical heterogeneous systems,we propose a novel coding scheme to effectively mitigate the "straggling effect" in distributed computing.We assume that there are two types of servers in the system and the only difference between them is their computational capabilities,the servers with lower computational capabilities are called stragglers.Given any ratio of fast servers to slow servers and any gap of computational capabilities between them,we achieve approximately the same computation time for both fast and slow servers by assigning different amounts of computation tasks to them,thus reducing the overall computation time.Furthermore,we investigate the informationtheoretic lower bound of the inter-communication load and show that the lower bound is within a constant multiplicative gap to the upper bound achieved by our scheme.Various simulations also validate the effectiveness of the proposed scheme.