5 articles found
Optimizing energy efficiency of CNN-based object detection with dynamic voltage and frequency scaling
1
Authors: Weixiong Jiang, Heng Yu, Jiale Zhang, Jiaxuan Wu, Shaobo Luo, Yajun Ha. Journal of Semiconductors (EI, CAS, CSCD), 2020, No. 2, pp. 83-92 (10 pages)
On the one hand, accelerating convolutional neural networks (CNNs) on FPGAs requires ever-increasing energy efficiency in the edge computing paradigm. On the other hand, unlike normal digital algorithms, CNNs maintain their high robustness even with limited timing errors. By taking advantage of this unique feature, we propose to use dynamic voltage and frequency scaling (DVFS) to further optimize the energy efficiency for CNNs. First, we have developed a DVFS framework on FPGAs. Second, we apply the DVFS to SkyNet, a state-of-the-art neural network targeting object detection. Third, we analyze the impact of DVFS on CNNs in terms of performance, power, energy efficiency and accuracy. Compared to the state-of-the-art, experimental results show that we have achieved 38% improvement in energy efficiency without any loss in accuracy. Results also show that we can achieve 47% improvement in energy efficiency if we allow 0.11% relaxation in accuracy.
Keywords: CNN; FPGA; DVFS; object detection
A routing algorithm for FPGAs with time-multiplexed interconnects
2
Authors: Ruiqi Luo, Xiaolei Chen, Yajun Ha. Journal of Semiconductors (EI, CAS, CSCD), 2020, No. 2, pp. 73-82 (10 pages)
Previous studies show that interconnects occupy a large portion of the timing budget and area in FPGAs. In this work, we propose a time-multiplexing technique on FPGA interconnects. In order to fully exploit this interconnect architecture, we propose a time-multiplexed routing algorithm that can actively identify qualified nets and schedule them to multiplexable wires. We validate the algorithm by using the router to implement 20 benchmark circuits on time-multiplexed FPGAs. We achieve a 38% smaller minimum channel width and 3.8% smaller circuit critical path delay compared with the state-of-the-art architecture router when a wire can be time-multiplexed six times in a cycle.
Keywords: field programmable gate arrays; digital integrated circuits; routing algorithm design and analysis
Preface to the Special Issue on Reconfigurable Computing for Energy Efficient AI Microchip Technologies
3
Authors: Haigang Yang, Yajun Ha, Lingli Wang, Wei Zhang, Yingyan Lin. Journal of Semiconductors (EI, CAS, CSCD), 2020, No. 2, p. 3 (1 page)
Many artificial intelligence (AI) processing tasks, especially those related to deep neural networks (DNNs), are both computation and memory intensive. Yet traditional computing platforms such as CPUs are increasingly facing difficulties in dealing with those massive processing workloads. Reconfigurable computing (RC) features the ability to perform computations in hardware to increase execution capabilities, and at the same time retain much of the flexibility of a software solution. Microchip design based on reconfigurable computing models and principles has emerged as an effective means to ensure that AI applications can be accelerated to meet not only the performance and throughput targets but also the power and energy efficiency requirements.
Keywords: HARDWARE; EXECUTION; artificial
HDD-RAM:A 40-nm 0.35 V 25 MHz Half-Select Disturb-Free Memory With Data-Aware 10T SRAM
4
Authors: YIFEI LI, JIAN CHEN, YUQI WANG, WENFENG ZHAO, YUHAO SHU, YAJUN HA. Integrated Circuits and Systems, 2025, No. 4, pp. 205-216 (12 pages)
Ultra-low-voltage SRAM is an indispensable component that is increasingly adopted in energy-efficient computing systems. However, it comes at the cost of increased sensitivity to soft errors. To address this issue, bit-interleaving SRAM is widely used to mitigate soft errors. But it suffers from half-select disturbance. Previous works address such disturbance by using a dedicated write port or enhanced write assist scheme. However, these works may decrease write margin, induce high cell-level write latency, or incur architecture-level time/timing overhead. In this paper, we develop a high-speed bit-interleaving half-select disturb-free memory with data-aware 10T SRAM. First, we present an isolated and decoupled topology with dedicated write control to improve stability. Second, we present a data-aware write path with enhanced write-ability that effectively reduces the write access time. A 40-nm 4-Kb test chip has been fabricated to validate the optimizations above. Measurement results show that our half-select disturb-free test chip achieves a peak operating frequency of 25 MHz and an energy consumption of 0.168 fJ/bit with a supply voltage of 0.35 V. Compared with the state-of-the-art designs, it has achieved a speed up of 2.72× and an energy saving of 93.8%.
Keywords: Ultra-low-voltage SRAM; soft error; bit-interleaving; half-select disturb-free
Overview of Cryogenic CMOS Based Computing Systems (cited by: 1)
5
Authors: YUHAO SHU, BIN NING, YIFEI LI, ZHAODONG LYU, JINCHENG WANG, LINTAO LAN, YUXIN ZHOU, MENGRU ZHANG, HONGTU ZHANG, YAJUN HA. Integrated Circuits and Systems, 2024, No. 4, pp. 167-177 (11 pages)
As integrated circuits advance into the post-Moore era, the improvement of computing performance encounters several challenges, making it difficult to meet the ever-growing computing demands. Cryogenic complementary metal oxide semiconductor (CMOS) based computing systems have emerged as a promising solution for overcoming the existing computing performance bottleneck. By cooling the circuitry to cryogenic temperatures, device leakage and wire resistance can be significantly reduced, leading to further improvements in energy efficiency and performance. Here, we conduct a comprehensive review of the cryogenic CMOS based computing systems across multiple optimization layers, including the CMOS process, modeling, electronic design automation (EDA), circuits, and architecture. Moreover, this review identifies potential future works and applications.
Keywords: Architecture; cryogenic CMOS; cryogenic circuits; cryogenic computing; cryogenic memory; device model; in-memory computing; neural network; quantum computing