On-device artificial intelligence (AI) accelerators capable of not only inference but also training neural network models are in increasing demand in the industrial AI field, where frequent retraining is crucial due to frequent production changes. Batch normalization (BN) is fundamental to training convolutional neural networks (CNNs), but its implementation in compact accelerator chips remains challenging due to computational complexity, particularly in calculating statistical parameters and gradients across mini-batches. Existing accelerator architectures either compromise the training accuracy of CNNs through approximations or require substantial computational resources, limiting their practical deployment. We present a hardware-optimized BN accelerator that maintains training accuracy while significantly reducing computational overhead through three novel techniques: (1) resource sharing for efficient resource utilization across forward and backward passes, (2) interleaved buffering for reduced dynamic random-access memory (DRAM) access latencies, and (3) zero-skipping for minimal gradient computation. Implemented on a VCU118 field-programmable gate array (FPGA) at 100 MHz and validated using You Only Look Once version 2-tiny (YOLOv2-tiny) on the PASCAL Visual Object Classes (VOC) dataset, our normalization accelerator achieves a 72% reduction in processing time and 83% lower power consumption compared to a software normalization implementation on a 2.4 GHz Intel central processing unit (CPU), while maintaining accuracy (0.51% mean Average Precision (mAP) drop at 32-bit floating point (FP32), 1.35% at 16-bit brain floating point (bfloat16)). When integrated into a neural processing unit (NPU), the design demonstrates 63% and 97% performance improvements over AMD CPU and Reduced Instruction Set Computing-V (RISC-V) implementations, respectively. These results confirm that our proposed BN hardware design enables efficient, high-accuracy, and power-saving on-device training for modern CNNs. Our results demonstrate that efficient hardware implementation of standard batch normalization is achievable without sacrificing accuracy, enabling practical on-device CNN training with significantly reduced computational and power requirements.
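For readers unfamiliar with the arithmetic behind the abstract above, the following is a minimal software sketch of standard batch normalization for a single channel, with a backward pass that skips zero-valued upstream gradients when accumulating the two reduction sums. It is not the accelerator's design: the function names `bn_forward` and `bn_backward` are hypothetical, and treating "zero-skipping" as skipping zero upstream gradients is an assumption about what the term means here.

```python
import math

def bn_forward(x, gamma, beta, eps=1e-5):
    """Standard batch normalization over one channel (x is a list of floats)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n      # biased mini-batch variance
    inv_std = 1.0 / math.sqrt(var + eps)
    x_hat = [(v - mean) * inv_std for v in x]      # normalized activations
    y = [gamma * h + beta for h in x_hat]          # scale and shift
    return y, (x_hat, inv_std)

def bn_backward(dy, cache, gamma):
    """Backward pass; zero upstream gradients are skipped in the reductions.

    Skipping is an illustrative assumption: a zero gradient adds nothing to
    either sum, so omitting it leaves the result unchanged.
    """
    x_hat, inv_std = cache
    n = len(dy)
    dbeta = 0.0
    dgamma = 0.0
    for g, h in zip(dy, x_hat):
        if g == 0.0:
            continue                               # zero-skipping
        dbeta += g
        dgamma += g * h
    scale = gamma * inv_std / n
    dx = [scale * (n * g - dbeta - h * dgamma) for g, h in zip(dy, x_hat)]
    return dx, dgamma, dbeta
```

The forward and backward passes reuse the same normalized values and the same kind of reduction over the mini-batch, which suggests why sharing hardware between the two passes is attractive.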
The aim of this article is to explore potential directions for the development of artificial intelligence (AI). It points out that, while current AI can handle the statistical properties of complex systems, it has difficulty effectively processing and fully representing their spatiotemporal complexity patterns. The article also discusses a potential path of AI development in the engineering domain. Based on the existing understanding of the principles of multilevel complexity, this article suggests that consistency among the logical structures of datasets, AI models, model-building software, and hardware will be an important AI development direction and is worthy of careful consideration.
Spiking neural networks (SNNs) represent a paradigm shift toward discrete, event-driven neural computation that mirrors biological brain mechanisms. This survey systematically examines current SNN research, focusing on training methodologies, hardware implementations, and practical applications. We analyze four major training paradigms: ANN-to-SNN conversion, direct gradient-based training, spike-timing-dependent plasticity (STDP), and hybrid approaches. Our review encompasses major specialized hardware platforms: Intel Loihi, IBM TrueNorth, SpiNNaker, and BrainScaleS, analyzing their capabilities and constraints. We survey applications spanning computer vision, robotics, edge computing, and brain-computer interfaces, identifying where SNNs provide compelling advantages. Our comparative analysis reveals that SNNs offer significant energy efficiency improvements (1000-10000× reduction) and natural temporal processing, while facing challenges in scalability and training complexity. We identify critical research directions including improved gradient estimation, standardized benchmarking protocols, and hardware-software co-design approaches. This survey provides researchers and practitioners with a comprehensive understanding of current SNN capabilities, limitations, and future prospects.
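To make "discrete, event-driven computation" concrete, here is a minimal sketch of a discrete-time leaky integrate-and-fire (LIF) neuron, a textbook spiking model. The survey abstract does not name a specific neuron model, so the choice of LIF, the function name `lif_run`, and the `threshold` and `leak` values are all illustrative assumptions.

```python
def lif_run(inputs, threshold=1.0, leak=0.9):
    """Simulate one leaky integrate-and-fire neuron over a sequence of inputs.

    Returns a list of 0/1 spike events, one per time step.
    """
    v = 0.0                      # membrane potential
    spikes = []
    for current in inputs:
        v = leak * v + current   # leak, then integrate the input
        if v >= threshold:
            spikes.append(1)     # emit a spike event
            v = 0.0              # hard reset after firing
        else:
            spikes.append(0)
    return spikes
```

For example, `lif_run([0.6, 0.6, 0.0, 0.0])` fires only at the second step. Downstream work is needed only when a spike occurs, which is the usual argument for the energy efficiency of spiking hardware.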
In-optical-sensor computing architectures based on neuro-inspired optical sensor arrays have become key milestones for in-sensor artificial intelligence (AI) technology, enabling intelligent vision sensing and extensive data processing. These architectures must demonstrate potential advantages in terms of mass production and complementary metal-oxide-semiconductor compatibility. Here, we introduce a visible-light-driven neuromorphic vision system that integrates front-end retinomorphic photosensors with a back-end artificial neural network (ANN), employing a single neuro-inspired indium-gallium-zinc-oxide (IGZO) phototransistor (NIP) featuring an aluminum sensitization layer (ASL). By methodically adjusting the ASL coverage on IGZO phototransistors, a fast-switching response type and a synaptic response type of IGZO phototransistor are successfully developed. Notably, the fabricated NIP shows remarkable retina-like photoinduced synaptic plasticity under wavelengths up to 635 nm, with over 256 states, weight update nonlinearity below 0.1, and a dynamic range of 64.01. Owing to this technology, a 6×6 neuro-inspired optical image sensor array with the NIP can perform highly integrated sensing, memory, and preprocessing functions, including contrast enhancement and handwritten digit image recognition. The demonstrated prototype highlights the potential for efficient hardware implementations in in-sensor AI technologies.
1 Introduction. Embodied Artificial Intelligence (Embodied AI) has recently become a key research focus [1]. It emphasizes agents' abilities to perceive, comprehend, and act in physical worlds to complete tasks. Simulation platforms are essential in this area, as they simulate agent behaviors in set environments and tasks, thereby accelerating algorithm validation and optimization. However, constructing such a platform presents several challenges.
A hardware demodulation method for 2-D edge detection is proposed. The filtering step and the differential step are implemented using a hardware circuit. This demodulation circuit simplifies the edge finder and reduces the measuring cycle. A calibration method for scale setting is also presented, and by measuring some calibrated objects, the demodulation errors and the error correction table are obtained.
This paper presents an integrated online learning system to evolve programmable logic array (PLA) controllers for navigating an autonomous robot in a two-dimensional environment. The integrated online learning system consists of two learning modules: one is the module of reinforcement learning based on temporal difference learning, and the other is the module of evolutionary learning based on genetic algorithms. The control rules extracted from the module of reinforcement learning can be used as input to the module of evolutionary learning, and quickly implemented by the PLA through online evolution. Online evolution has shown promise as a method for learning systems in complex environments. The evolved PLA controllers can successfully navigate the robot to a target in the two-dimensional environment while avoiding collisions with randomly positioned obstacles.
Lisfranc injuries can be difficult to identify and treat, and their proper surgical management is the subject of significant debate. A narrative literature review was performed using the PubMed and Google Scholar databases to identify recent studies evaluating open reduction internal fixation (ORIF) vs primary arthrodesis for Lisfranc injuries to further elucidate optimal surgical management. Additional focus was placed on removal of hardware after ORIF to identify the need for routine hardware removal, as an additional surgery may guide surgeon decision-making. This review showed inconclusive data on the superiority of ORIF vs arthrodesis, as multiple conflicting results exist, though it established that functional results are similar between these options. Though both are generally accepted treatment options, there are no well-designed randomized controlled trials directly comparing the two. Retention of hardware after ORIF has been shown to be tolerated, though there is a significant risk of the need for unplanned removal due to pain and hardware breakage.
Although there exist a few good schemes to protect the kernel hooks of operating systems, attackers are still able to circumvent existing defense mechanisms with spurious context information. To address this challenge, this paper proposes a framework, called HookIMA, to detect compromised kernel hooks by using hardware debugging features. The key contribution of the work is that context information is captured from hardware instead of from relatively vulnerable kernel data. Using commodity hardware, a proof-of-concept prototype system of HookIMA has been developed. This prototype handles 3,082 dynamic control-flow transfers with related hooks in the kernel space. Experiments show that HookIMA is capable of detecting compromised kernel hooks caused by kernel rootkits. Performance evaluations with UnixBench indicate that the runtime overhead introduced by HookIMA is about 21.5%.
The self-healing strategy is a key component in designing the bio-inspired embryonics circuit with the structure of cell arrays. However, the existing self-healing strategies of embryonics circuits mainly focus on permanent faults inside the modules of cells, such as the function module and the configuration register, while little attention is paid to transient faults. From the point of view of hardware utilization efficiency, permanently eliminating a cell that suffers only a transient fault, which can be repaired by a configuration mechanism, would be a huge waste of hardware resources. A new self-healing strategy, the Fault-Cell Reutilization Self-healing Strategy (FCRSS), which presents a method for reusing transient fault cells, is proposed in this paper. The circuit structures of all the modules in the cells are described in detail. In the new strategy, the two processes of elimination and reconfiguration are combined. Within the process of fault-cell elimination, cells with transient faults in the embryonics circuit array can be reused simultaneously to replace the functions of the cells on their left side in the same row. Therefore, transient fault cells in a transparent state can be reconfigured to realize fault-cell reutilization. Finally, a circuit simulation, resource consumption, a reliability analysis, and a detailed normalization analysis are presented. The FCRSS can improve the hardware utilization rate and system reliability at the expense of a small amount of hardware resources and reconfiguration time. Following the conclusion, a method of determining the optimal self-healing strategy according to the environmental conditions is presented.
With the wide application of electronic hardware in aircraft, such as air-to-ground communication, satellite communication, and positioning systems, aircraft hardware is facing great security pressure. Focusing on the security problem of aircraft hardware, this paper proposes a supervisory control architecture based on a secure System-on-a-Chip (SoC) system. The proposed architecture is attack-immune and trustworthy, and can support trusted escrow applications and Dynamic Integrity Measurement (DIM) without interference. This architecture is characterized by Trusted Monitoring System (TMS) hardware isolated from the Main Processor System (MPS) and a secure, unidirectional access channel from the TMS to the running memory of the MPS. Based on this architecture, the DIM program running on the TMS is used to measure and call the Lightweight Measurement Agent (LMA) program running on the MPS. By this method, the Operating System (OS) kernel, key software, and data of the MPS can be dynamically measured without disturbance, which makes it difficult for adversaries to attack through software. In addition, this architecture has been fully verified on an FPGA prototype system. Compared with existing systems, our architecture achieves higher security and is more efficient on DIM, and can fully supervise the running of the applications and OS of aircraft hardware.
The interpretation of images of spines fixed with metallic hardware forms an increasing bulk of daily practice in a busy imaging department. Radiologists are required to be familiar with the instrumentation and operative options used in spinal fixation and fusion procedures, especially those used at their own institution. This is critical in evaluating the position of implants and potential complications associated with the operative approaches and spinal fixation devices used. Thus, the radiologist can play an important role in patient care and outcome. This review outlines the advantages and disadvantages of commonly used imaging methods and reports on the best yield for each modality and how to overcome the problematic issues associated with the presence of metallic hardware during imaging. Baseline radiographs are essential as they are the reference point for evaluation of future studies should patients develop symptoms suggesting possible complications. They may justify further imaging workup with computed tomography, magnetic resonance, and/or nuclear medicine studies, as the evaluation of a patient with a spinal implant involves a multi-modality approach. This review describes imaging features of potential complications associated with spinal fusion surgery as well as the instrumentation used. This basic knowledge aims to help radiologists approach everyday practice in clinical imaging.
For polar codes, the performance of successive cancellation list (SCL) decoding is capable of approaching that of maximum likelihood decoding. However, the existing hardware architectures for SCL decoding suffer from high hardware complexity due to calculating L decoding paths simultaneously, which makes them unfriendly to devices with limited logic resources, such as field-programmable gate arrays (FPGAs). In this paper, we propose a list-serial pipelined hardware architecture with low complexity for SCL decoding, where the serial calculation and the pipelined operation are elegantly combined to strike a balance between complexity and latency. Moreover, we employ only one successive cancellation (SC) decoder core without L×L crossbars, and reduce the number of inputs of the metric sorter from 2L to L+2. Finally, the FPGA implementations show that the hardware resource consumption is significantly reduced with negligible decoding performance loss.
In harsh-environment applications such as earth-orbiting and deep-space satellites and underwater vehicles, and under strong electromagnetic interference and temperature stress, circuit faults occur easily. Without self-repair over the mission duration, circuit faults will inevitably lead to serious losses of availability or impede mission success. Traditional fault-repair methods based on redundant fault-tolerant techniques are straightforward to implement, yet their area, power, and weight costs can be excessive. Moreover, they rely on plug-in or component-level circuits to realize redundant backup, which limits their applicability. Hence, a novel self-repair technology based on evolvable hardware (EHW) and reparation balance technology (RBT) is proposed. Its cost is low, and fault self-repair of various circuits and devices can be realized through dynamic configuration. Making full use of the fault signals, a correcting circuit can be found through the EHW technique to realize the balance and compensation of the fault output signals. In this paper, the self-repair model based on EHW and RBT is analyzed, the specific self-repair strategy is studied, the corresponding fault self-repair circuit system is designed, and typical faults are simulated and analyzed in combination with actual electronic devices. Simulation results demonstrate that the proposed fault self-repair strategy is feasible. Compared to traditional techniques, fault self-repair based on EHW consumes fewer hardware resources, and the scope of fault self-repair is expanded significantly.
Since traditional fault tolerance methods for electronic systems are based on redundant fault tolerance techniques, and their structures are fixed when the circuits are designed, their self-adaptive ability is limited. In order to solve these problems, a novel circuit self-adaptive design technique based on evolvable hardware (EHW) is proposed. It features robustness, self-organization, and self-adaptation. It can adapt to a complex environment through dynamic configuration of the circuit. In this paper, the proposed technique is simulated, and the consumption of hardware resources and the number of convergence iterations are studied. The effectiveness and superiority of the proposed technique are verified. The designed circuit has the ability of resistible redundant-state interference (RRSI). The proposed technique has broad application prospects and great significance.
Emerging memristive devices offer enormous advantages for applications such as non-volatile memories and in-memory computing (IMC), but there is also a rising interest in using memristive technologies for security applications in the era of the internet of things (IoT). In this review article, low-power design techniques based on emerging memristive technology for hardware security primitives and systems are presented, with the aim of achieving secure hardware systems in IoT. By reviewing the state of the art in three highlighted memristive application areas, i.e., memristive non-volatile memory, memristive reconfigurable logic computing, and memristive artificial intelligence computing, their application-level impacts on novel implementations of secret key generation, crypto functions, and machine learning attacks are explored, respectively. For low-power security applications in IoT, it is essential to understand how to best realize cryptographic circuitry using memristive circuits, to assess the implications of memristive crypto implementations on security, and to develop novel computing paradigms that will enhance their security. This review article aims to help researchers explore security solutions, analyze new possible threats, and develop corresponding protections for secure hardware systems based on low-cost memristive circuit designs.
Scientific research requires the collection of data in order to study, monitor, analyze, describe, or understand a particular process or event. Data collection efforts are often a compromise: manual measurements can be time-consuming and labor-intensive, resulting in data being collected at a low frequency, while automating the data-collection process can reduce labor requirements and increase the frequency of measurements, but at the added expense of electronic data-collecting instrumentation. Rapid advances in electronic technologies have resulted in a variety of new and inexpensive sensing, monitoring, and control capabilities which offer opportunities for implementation in agricultural and natural-resource research applications. An Open Source Hardware project called Arduino consists of a programmable microcontroller development platform, expansion capability through add-on boards, and a programming development environment for creating custom microcontroller software. All circuit-board and electronic component specifications, as well as the programming software, are open-source and freely available for anyone to use or modify. Inexpensive sensors and the Arduino development platform were used to develop several inexpensive, automated sensing and datalogging systems for use in agricultural and natural-resources related research projects. Systems were developed and implemented to monitor soil-moisture status of field crops for irrigation scheduling and crop-water use studies, to measure daily evaporation-pan water levels for quantifying evaporative demand, and to monitor environmental parameters under forested conditions. These studies demonstrate the usefulness of automated measurements, and offer guidance for other researchers in developing inexpensive sensing and monitoring systems to further their research.
The configuration method of an embryonic array (EA) directly affects its reliability and hardware consumption. At present, EA configuration design lacks a quantitative analysis method. In order to reasonably optimize EA configuration design, an EA configuration optimization design method is proposed, which is based on the constraints of EA hardware consumption and reliability. Through the analysis of the EA working process and composition, quantitative analyses of EA reliability and hardware consumption are completed. Based on the constraints of EA hardware consumption and reliability, the mathematical model of EA configuration optimization design is established, which transforms EA configuration optimization design into an integer nonlinear programming problem. According to the differences in the fitness values of the individuals awaiting mutation in the population, an adaptive mutation operator and crossover operator are selected, and a novel Modified Adaptive Differential Evolution (MADE) algorithm is proposed, which is used to solve the EA configuration optimization design problem. Simulation experiments and analysis indicate that MADE is able to effectively improve the speed, accuracy, and stability of the algorithm. Moreover, the proposed EA configuration optimization design method can select the most reasonable EA configuration design, and plays an important guiding role in EA optimization design.
Hardware/software partitioning is an important step in the design of embedded systems. In this paper, the hardware/software partitioning problem is modeled as a constrained binary integer programming problem, which is further converted equivalently to an unconstrained binary integer programming problem by a penalty method. A local search method, HSFM, is developed to obtain a discrete local minimizer of the unconstrained binary integer programming problem. Next, an auxiliary function, which has the same global optimal solutions as the unconstrained binary integer programming problem, is constructed, and its properties are studied. We show that applying HSFM to minimize the auxiliary function can successfully escape from previous local optima by increasing the parameter value. Finally, a discrete dynamic convexized method is developed to solve the hardware/software partitioning problem. Computational results and comparisons indicate that the proposed algorithm can obtain high-quality solutions.
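The penalty-method conversion described in the abstract above can be illustrated with a small sketch. The problem model here (minimize total execution time subject to a hardware-area limit), the function names `penalized_cost` and `one_flip_descent`, and the penalty weight `mu` are all assumptions made for illustration. The local search is a generic one-flip descent, not the paper's HSFM, and the auxiliary function and the dynamic convexized method are not reproduced.

```python
def penalized_cost(x, sw_time, hw_time, hw_area, area_limit, mu):
    """Unconstrained objective: total time plus a penalty for exceeding the area limit.

    x[i] == 1 maps task i to hardware, x[i] == 0 maps it to software.
    """
    time = sum(h if b else s for b, s, h in zip(x, sw_time, hw_time))
    area = sum(a for b, a in zip(x, hw_area) if b)
    return time + mu * max(0, area - area_limit)

def one_flip_descent(x, cost):
    """Flip single bits while the cost strictly decreases (a discrete local search)."""
    x = list(x)
    best = cost(x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            x[i] ^= 1              # tentatively move task i to the other side
            c = cost(x)
            if c < best:
                best = c
                improved = True
            else:
                x[i] ^= 1          # undo the flip
    return x, best
```

As a usage example with three hypothetical tasks, software times `[10, 8, 6]`, hardware times `[2, 3, 5]`, hardware areas `[4, 3, 1]`, an area limit of 5, and `mu = 100`, starting from the all-software assignment `[0, 0, 0]` the descent ends at `[1, 0, 1]` with cost 15. A search of this kind stops at the first local minimizer it reaches, which is the limitation the abstract's auxiliary function is meant to address.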
This paper presents a simple yet effective decoding method for general quasi-cyclic low-density parity-check (QC-LDPC) codes, which not only achieves high hardware utility efficiency (HUE), but also brings about a great reduction in memory blocks without any performance degradation. The main idea is to split the check matrix into several row blocks, then to perform the improved message passing computations sequentially block by block. With the improved decoding algorithm, the sequential tie between the two-phase computations is broken, so that the two-phase computations can be overlapped, which yields high HUE. Two overlapping schemes are also presented, each of which suits a different situation. In addition, an efficient memory arrangement scheme is proposed to reduce the large memory block requirement of the LDPC decoder. As an example, for the 0.4 rate LDPC code selected from Chinese Digital TV Terrestrial Broadcasting (DTTB), our decoding saves over 80% of memory blocks compared with the conventional decoding, and the decoder achieves 0.97 HUE. Finally, the 0.4 rate LDPC decoder is implemented on an FPGA device EP2S30 (speed grade -5). Using 8 row processing units, the decoder can achieve a maximum net throughput of 28.5 Mbps at 20 iterations.
Funding (on-device BN accelerator work): supported by the National Research Foundation of Korea (NRF) grant for RLRC funded by the Korea government (MSIT) (No. 2022R1A5A8026986, RLRC); supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01304, Development of Self-Learnable Mobile Recursive Neural Network Processor Technology); supported by the MSIT (Ministry of Science and ICT), Republic of Korea, under the Grand Information Technology Research Center support program (IITP-2024-2020-0-01462, Grand-ICT) supervised by the IITP; supported by the Korea Technology and Information Promotion Agency for SMEs (TIPA); supported by the Korean government (Ministry of SMEs and Startups)'s Smart Manufacturing Innovation R&D (RS-2024-00434259).
文摘On-device Artificial Intelligence(AI)accelerators capable of not only inference but also training neural network models are in increasing demand in the industrial AI field,where frequent retraining is crucial due to frequent production changes.Batch normalization(BN)is fundamental to training convolutional neural networks(CNNs),but its implementation in compact accelerator chips remains challenging due to computational complexity,particularly in calculating statistical parameters and gradients across mini-batches.Existing accelerator architectures either compromise the training accuracy of CNNs through approximations or require substantial computational resources,limiting their practical deployment.We present a hardware-optimized BN accelerator that maintains training accuracy while significantly reducing computational overhead through three novel techniques:(1)resourcesharing for efficient resource utilization across forward and backward passes,(2)interleaved buffering for reduced dynamic random-access memory(DRAM)access latencies,and(3)zero-skipping for minimal gradient computation.Implemented on a VCU118 Field Programmable Gate Array(FPGA)on 100 MHz and validated using You Only Look Once version 2-tiny(YOLOv2-tiny)on the PASCALVisualObjectClasses(VOC)dataset,our normalization accelerator achieves a 72%reduction in processing time and 83%lower power consumption compared to a 2.4 GHz Intel Central Processing Unit(CPU)software normalization implementation,while maintaining accuracy(0.51%mean Average Precision(mAP)drop at floating-point 32 bits(FP32),1.35%at brain floating-point 16 bits(bfloat16)).When integrated into a neural processing unit(NPU),the design demonstrates 63%and 97%performance improvements over AMD CPU and Reduced Instruction Set Computing-V(RISC-V)implementations,respectively.These results confirm that our proposed BN hardware design enables efficient,high-accuracy,and power-saving on-device training for modern CNNs.Our results demonstrate that 
efficient hardware implementation of standard batch normalization is achievable without sacrificing accuracy,enabling practical on-device CNN training with significantly reduced computational and power requirements.
文摘The aim of this article is to explore potential directions for the development of artificial intelligence(AI).It points out that,while current AI can handle the statistical properties of complex systems,it has difficulty effectively processing and fully representing their spatiotemporal complexity patterns.The article also discusses a potential path of AI development in the engineering domain.Based on the existing understanding of the principles of multilevel com-plexity,this article suggests that consistency among the logical structures of datasets,AI models,model-building software,and hardware will be an important AI development direction and is worthy of careful consideration.
Abstract: Spiking neural networks (SNNs) represent a paradigm shift toward discrete, event-driven neural computation that mirrors biological brain mechanisms. This survey systematically examines current SNN research, focusing on training methodologies, hardware implementations, and practical applications. We analyze four major training paradigms: ANN-to-SNN conversion, direct gradient-based training, spike-timing-dependent plasticity (STDP), and hybrid approaches. Our review encompasses major specialized hardware platforms: Intel Loihi, IBM TrueNorth, SpiNNaker, and BrainScaleS, analyzing their capabilities and constraints. We survey applications spanning computer vision, robotics, edge computing, and brain-computer interfaces, identifying where SNNs provide compelling advantages. Our comparative analysis reveals that SNNs offer significant energy efficiency improvements (1000-10000× reduction) and natural temporal processing, while facing challenges in scalability and training complexity. We identify critical research directions including improved gradient estimation, standardized benchmarking protocols, and hardware-software co-design approaches. This survey provides researchers and practitioners with a comprehensive understanding of current SNN capabilities, limitations, and future prospects.
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (Grant No. RS-2023-00256917) and Samsung Display.
Abstract: In-optical-sensor computing architectures based on neuro-inspired optical sensor arrays have become key milestones for in-sensor artificial intelligence (AI) technology, enabling intelligent vision sensing and extensive data processing. These architectures must demonstrate potential advantages in terms of mass production and complementary metal oxide semiconductor compatibility. Here, we introduce a visible-light-driven neuromorphic vision system that integrates front-end retinomorphic photosensors with a back-end artificial neural network (ANN), employing a single neuro-inspired indium-gallium-zinc-oxide phototransistor (NIP) featuring an aluminum sensitization layer (ASL). By methodically adjusting the ASL coverage on IGZO phototransistors, a fast-switching response type and a synaptic response type of IGZO phototransistor are successfully developed. Notably, the fabricated NIP shows a remarkable retina-like photoinduced synaptic plasticity under wavelengths up to 635 nm, with over 256 states, weight update nonlinearity below 0.1, and a dynamic range of 64.01. Owing to this technology, a 6×6 neuro-inspired optical image sensor array with the NIP can perform highly integrated sensing, memory, and preprocessing functions, including contrast enhancement and handwritten digit image recognition. The demonstrated prototype highlights the potential for efficient hardware implementations in in-sensor AI technologies.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62322601).
Abstract: 1 Introduction. Embodied Artificial Intelligence (Embodied AI) has recently become a key research focus [1]. It emphasizes agents' abilities to perceive, comprehend, and act in physical worlds to complete tasks. Simulation platforms are essential in this area, as they simulate agent behaviors in set environments and tasks, thereby accelerating algorithm validation and optimization. However, constructing such a platform presents several challenges.
Abstract: A hardware demodulation method for 2-D edge detection is proposed. The filtering step and the differential step are implemented in a hardware circuit. This demodulation circuit simplifies the edge finder and reduces the measuring cycle. A calibration method for scale setting is also presented, and by measuring some calibrated objects, the demodulation errors and the error correction table are obtained.
Abstract: This paper presents an integrated online learning system to evolve programmable logic array (PLA) controllers for navigating an autonomous robot in a two-dimensional environment. The integrated online learning system consists of two learning modules: one is a reinforcement learning module based on temporal difference learning, and the other is an evolutionary learning module based on genetic algorithms. The control rules extracted from the reinforcement learning module can be used as input to the evolutionary learning module and quickly implemented by the PLA through online evolution. Online evolution has shown promise as a learning method for systems in complex environments. The evolved PLA controllers can successfully navigate the robot to a target in the two-dimensional environment while avoiding collisions with randomly positioned obstacles.
Abstract: Lisfranc injuries can be difficult to identify and treat, and their proper surgical management is the subject of significant debate. A narrative literature review was performed using the PubMed and Google Scholar databases to identify recent studies evaluating open reduction internal fixation (ORIF) vs primary arthrodesis for Lisfranc injuries to further elucidate optimal surgical management. Additional focus was placed on removal of hardware after ORIF to identify the need for routine hardware removal, as an additional surgery may guide surgeon decision-making. This review showed inconclusive data on the superiority of ORIF vs arthrodesis, as multiple conflicting results exist, though it established that functional results are similar between these options. Though both are generally accepted treatment options, there are no well-designed randomized controlled trials directly comparing the two. Retention of hardware after ORIF has been shown to be tolerated, though there is a significant risk of the need for unplanned removal due to pain and hardware breakage.
Funding: The authors would like to thank the anonymous reviewers for their insightful comments that have helped improve the presentation of this paper. The work was supported partially by the National Natural Science Foundation of China under Grants No. 61070192, No. 91018008, and No. 61170240; the National High-Tech Research Development Program of China under Grant No. 2007AA01ZA14; and the Natural Science Foundation of Beijing under Grant No. 4122041.
Abstract: Although there exist a few good schemes to protect the kernel hooks of operating systems, attackers are still able to circumvent existing defense mechanisms with spurious context information. To address this challenge, this paper proposes a framework, called HooklMA, to detect compromised kernel hooks by using hardware debugging features. The key contribution of the work is that context information is captured from hardware instead of from relatively vulnerable kernel data. Using commodity hardware, a proof-of-concept prototype system of HooklMA has been developed. This prototype handles 3 082 dynamic control-flow transfers with related hooks in the kernel space. Experiments show that HooklMA is capable of detecting compromised kernel hooks caused by kernel rootkits. Performance evaluations with UnixBench indicate that the runtime overhead introduced by HooklMA is about 21.5%.
Funding: Co-supported by the National Natural Science Foundation of China (Nos. 61202001, 61402226) and the Fundamental Research Funds for the Central Universities of NUAA of China (Nos. NS2018026, NS2012024).
Abstract: The self-healing strategy is a key component in designing the bio-inspired embryonics circuit with the structure of cell arrays. However, the existing self-healing strategies of embryonics circuits mainly focus on permanent faults inside the modules of cells, such as the function module and the configuration register, while little attention is paid to transient faults. From the point of view of hardware utilization efficiency, permanently eliminating a cell that only suffers a transient fault, which can be repaired by a configuration mechanism, is a huge waste of hardware resources. A new self-healing strategy, the Fault-Cell Reutilization Self-healing Strategy (FCRSS), which presents a method for reusing transient fault cells, is proposed in this paper. The circuit structures of all the modules in the cells are described in detail. In the new strategy, the two processes of elimination and reconfiguration are combined. Within the process of fault-cell elimination, cells with transient faults in the embryonics circuit array can be reused simultaneously to replace the functions of the cells on their left side in the same row. Therefore, transient fault cells in a transparent state can be reconfigured to realize fault-cell reutilization. Finally, a circuit simulation, resource consumption, a reliability analysis and a detailed normalization analysis are presented. The FCRSS can improve the hardware utilization rate and system reliability at the expense of a small amount of hardware resources and reconfiguration time. Following the conclusion, the method of determining the optimal self-healing strategy is presented according to the environmental conditions.
Funding: Supported by the National Key Research and Development Program of China (No. 2017YFB0802502), by the Aeronautical Science Foundation (No. 2017ZC51038), by the National Natural Science Foundation of China (Nos. 62002006, 61702028, 61672083, 61370190, 61772538, 61532021, 61472429, and 61402029), by the Foundation of Science and Technology on Information Assurance Laboratory (No. 1421120305162112006), by the National Cryptography Development Fund (No. MMJJ20170106), by the Defense Industrial Technology Development Program (No. JCKY2016204A102), and by the Liaoning Collaboration Innovation Center For CSLE, China.
Abstract: With the wide application of electronic hardware in aircraft, such as air-to-ground communication, satellite communication, and positioning systems, aircraft hardware is facing great security pressure. Focusing on the security problem of aircraft hardware, this paper proposes a supervisory control architecture based on a secure System-on-a-Chip (SoC) system. The proposed architecture is attack-immune and trustworthy, and can support trusted escrow applications and Dynamic Integrity Measurement (DIM) without interference. This architecture is characterized by Trusted Monitoring System (TMS) hardware isolated from the Main Processor System (MPS) and by a secure, unidirectional access channel from the TMS to the running memory of the MPS. Based on this architecture, the DIM program running on the TMS is used to measure and call the Lightweight Measurement Agent (LMA) program running on the MPS. By this method, the Operating System (OS) kernel, key software and data of the MPS can be dynamically measured without disturbance, which makes it difficult for adversaries to attack through software. In addition, this architecture has been fully verified on an FPGA prototype system. Compared with existing systems, our architecture achieves higher security and is more efficient at DIM, and can fully supervise the running of applications and the aircraft hardware OS.
Abstract: The interpretation of spinal images fixed with metallic hardware forms an increasing bulk of daily practice in a busy imaging department. Radiologists are required to be familiar with the instrumentation and operative options used in spinal fixation and fusion procedures, especially in their own institution. This is critical in evaluating the position of implants and potential complications associated with the operative approaches and spinal fixation devices used. Thus, the radiologist can play an important role in patient care and outcome. This review outlines the advantages and disadvantages of commonly used imaging methods and reports on the best yield for each modality and how to overcome the problematic issues associated with the presence of metallic hardware during imaging. Baseline radiographs are essential as they are the baseline point for evaluation of future studies should patients develop symptoms suggesting possible complications. They may justify further imaging workup with computed tomography, magnetic resonance and/or nuclear medicine studies, as the evaluation of a patient with a spinal implant involves a multi-modality approach. This review describes imaging features of potential complications associated with spinal fusion surgery as well as the instrumentation used. This basic knowledge aims to help radiologists approach everyday practice in clinical imaging.
Funding: Supported in part by the National Key R&D Program of China (No. 2019YFB1803400).
Abstract: For polar codes, the performance of successive cancellation list (SCL) decoding is capable of approaching that of maximum likelihood decoding. However, the existing hardware architectures for SCL decoding suffer from high hardware complexity because they calculate L decoding paths simultaneously, which makes them unfriendly to devices with limited logic resources, such as field programmable gate arrays (FPGAs). In this paper, we propose a list-serial pipelined hardware architecture with low complexity for SCL decoding, where serial calculation and pipelined operation are combined to strike a balance between complexity and latency. Moreover, we employ only one successive cancellation (SC) decoder core without L×L crossbars, and reduce the number of inputs of the metric sorter from 2L to L+2. Finally, the FPGA implementations show that the hardware resource consumption is significantly reduced with negligible decoding performance loss.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61271153, 61372039).
Abstract: In harsh natural-environment applications such as earth-orbiting and deep space satellites and underwater sea vehicles, and under strong electromagnetic interference and temperature stress, circuit faults occur easily. Without self-repair over the mission duration, circuit faults will inevitably lead to serious losses of availability or impede mission success. Traditional fault-repair methods based on redundant fault-tolerant techniques are straightforward to implement, yet their area, power and weight cost can be excessive. Moreover, they utilize all plug-in or component level circuits to realize redundant backup, such that their applicability is limited. Hence, a novel self-repair technology based on evolvable hardware (EHW) and reparation balance technology (RBT) is proposed. Its cost is low, and fault self-repair of various circuits and devices can be realized through dynamic configuration. By making full use of the fault signals, a correcting circuit can be found through the EHW technique to balance and compensate the faulty output signals. In this paper, the self-repair model based on the EHW and RBT techniques is analyzed, the specific self-repair strategy is studied, the corresponding circuit fault self-repair system is designed, and typical faults are simulated and analyzed in combination with actual electronic devices. Simulation results demonstrate that the proposed fault self-repair strategy is feasible. Compared to traditional techniques, fault self-repair based on EHW consumes fewer hardware resources, and the scope of fault self-repair is expanded significantly.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61271153 and 61372039).
Abstract: Since traditional fault tolerance methods for electronic systems are based on redundant fault tolerance techniques, and their structures are fixed when circuits are designed, their self-adaptive ability is limited. In order to solve these problems, a novel circuit self-adaptive design technique based on evolvable hardware (EHW) is proposed. It features robustness, self-organization and self-adaption. It can adapt to a complex environment through dynamic configuration of the circuit. In this paper, the proposed technique is simulated, and the consumption of hardware resources and the number of convergence iterations are studied. The effectiveness and superiority of the proposed technique are verified. The designed circuit has the ability of resistible redundant-state interference (RRSI). The proposed technique has broad application prospects and great significance.
Funding: Supported by the DFG (German Research Foundation) Priority Program Nano Security, Project MemCrypto (Projektnummer 439827659 / funding id DU 1896/2–1, PO 1220/15–1), and by the Fraunhofer Internal Programs under Grant No. Attract 600768.
Abstract: Emerging memristive devices offer enormous advantages for applications such as non-volatile memories and in-memory computing (IMC), but there is a rising interest in using memristive technologies for security applications in the era of the internet of things (IoT). In this review article, low-power design techniques based on emerging memristive technology for hardware security primitives and systems are presented, with the aim of achieving secure hardware systems in IoT. By reviewing the state of the art in three highlighted memristive application areas, i.e. memristive non-volatile memory, memristive reconfigurable logic computing and memristive artificial intelligent computing, their application-level impacts on the novel implementations of secret key generation, crypto functions and machine learning attacks are explored, respectively. For low-power security applications in IoT, it is essential to understand how to best realize cryptographic circuitry using memristive circuitries, to assess the implications of memristive crypto implementations on security, and to develop novel computing paradigms that will enhance their security. This review article aims to help researchers to explore security solutions, to analyze new possible threats and to develop corresponding protections for secure hardware systems based on low-cost memristive circuit designs.
Abstract: Scientific research requires the collection of data in order to study, monitor, analyze, describe, or understand a particular process or event. Data collection efforts are often a compromise: manual measurements can be time-consuming and labor-intensive, resulting in data being collected at a low frequency, while automating the data-collection process can reduce labor requirements and increase the frequency of measurements, but at the cost of added expense of electronic data-collecting instrumentation. Rapid advances in electronic technologies have resulted in a variety of new and inexpensive sensing, monitoring, and control capabilities which offer opportunities for implementation in agricultural and natural-resource research applications. An Open Source Hardware project called Arduino consists of a programmable microcontroller development platform, expansion capability through add-on boards, and a programming development environment for creating custom microcontroller software. All circuit-board and electronic component specifications, as well as the programming software, are open-source and freely available for anyone to use or modify. Inexpensive sensors and the Arduino development platform were used to develop several inexpensive, automated sensing and datalogging systems for use in agricultural and natural-resources related research projects. Systems were developed and implemented to monitor soil-moisture status of field crops for irrigation scheduling and crop-water use studies, to measure daily evaporation-pan water levels for quantifying evaporative demand, and to monitor environmental parameters under forested conditions. These studies demonstrate the usefulness of automated measurements, and offer guidance for other researchers in developing inexpensive sensing and monitoring systems to further their research.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61372039 and 61601495).
Abstract: The configuration method chosen for an Embryonic Array (EA) directly affects its reliability and hardware consumption. At present, EA configuration design lacks a quantitative analysis method. In order to reasonably optimize EA configuration design, an EA configuration optimization design method is proposed, which is based on the constraints of EA hardware consumption and reliability. Through analysis of the EA working process and composition, quantitative analyses of EA reliability and hardware consumption are completed. Based on the constraints of EA hardware consumption and reliability, the mathematical model of EA configuration optimization design is established, which converts EA configuration optimization design into an integer nonlinear programming problem. According to the differences in fitness value among the individuals awaiting mutation in the population, an adaptive mutation operator and crossover operator are selected, and a novel Modified Adaptive Differential Evolution (MADE) algorithm is proposed, which is used to solve the EA configuration optimization design problem. Simulation experiments and analysis indicate that MADE is able to effectively improve the speed, accuracy and stability of the algorithm. Moreover, the proposed EA configuration optimization design method can select the most reasonable EA configuration design and play an important guiding role in EA optimization design.
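For readers unfamiliar with differential evolution on integer problems, the sketch below shows the classic DE/rand/1/bin scheme with rounding to integer bounds. It is a baseline illustration only, not the MADE algorithm: the fixed scale factor `f` and crossover rate `cr` stand in for the adaptive operators the abstract describes, and the function name and parameters are hypothetical.

```python
import random

def differential_evolution(fitness, bounds, pop_size=20, generations=100,
                           f=0.5, cr=0.9, seed=0):
    """Minimize fitness over integer vectors within inclusive (lo, hi) bounds.

    Requires pop_size >= 4 so three distinct donors exist for each target.
    """
    rng = random.Random(seed)
    dim = len(bounds)

    def clip_round(v, lo, hi):
        # Round the real-valued mutant to an integer and keep it in range.
        return min(hi, max(lo, int(round(v))))

    pop = [[rng.randint(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # forces at least one mutated gene
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if rng.random() < cr or d == j_rand:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    trial.append(clip_round(v, lo, hi))
                else:
                    trial.append(pop[i][d])
            s = fitness(trial)
            if s <= scores[i]:           # greedy one-to-one selection
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]
```

An adaptive variant in the spirit of the abstract would adjust `f` and `cr` per individual according to its fitness instead of using fixed values. How MADE does this is not stated in the abstract.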
Funding: Supported by the National Natural Science Foundation of China (11301255), the Fund by Collaborative Innovation Center of IoT Industrialization and Intelligent Production, Minjiang University (IIC1703), the Foundation of Minjiang University (MYK17032), and the Program for New Century Excellent Talents in Fujian Province University.
Abstract: Hardware/software partitioning is an important step in the design of embedded systems. In this paper, the hardware/software partitioning problem is modeled as a constrained binary integer programming problem, which is further converted equivalently to an unconstrained binary integer programming problem by a penalty method. A local search method, HSFM, is developed to obtain a discrete local minimizer of the unconstrained binary integer programming problem. Next, an auxiliary function, which has the same global optimal solutions as the unconstrained binary integer programming problem, is constructed, and its properties are studied. We show that applying HSFM to minimize the auxiliary function can successfully escape from previous local optima by increasing the parameter value. Finally, a discrete dynamic convexized method is developed to solve the hardware/software partitioning problem. Computational results and comparisons indicate that the proposed algorithm can obtain high-quality solutions.
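The penalty conversion and the notion of a discrete local minimizer can be illustrated with a small sketch. The model below (minimize hardware area subject to a total-time limit) and the plain 1-flip search are assumptions made for illustration. They are not the paper's formulation or its HSFM method, and all names are hypothetical.

```python
def penalized_cost(x, hw_cost, sw_time, hw_time, time_limit, mu):
    """Unconstrained objective: hardware area plus mu times the time overrun.

    x[i] = 1 maps task i to hardware, x[i] = 0 maps it to software.
    """
    area = sum(c for c, xi in zip(hw_cost, x) if xi)
    time = sum(h if xi else s for h, s, xi in zip(hw_time, sw_time, x))
    violation = max(0.0, time - time_limit)
    return area + mu * violation

def one_flip_local_search(x, cost):
    """Flip single bits while a flip lowers the cost.

    Returns a point that no single-bit flip can improve, i.e. a discrete
    local minimizer with respect to the 1-flip neighborhood.
    """
    x = list(x)
    best = cost(x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            x[i] ^= 1                 # tentatively move task i
            c = cost(x)
            if c < best:
                best = c
                improved = True
            else:
                x[i] ^= 1             # undo a non-improving flip
    return x, best
```

For the penalized problem to match the constrained one, `mu` has to be large enough that no infeasible assignment beats the best feasible one. A search like this can also stall at a local minimizer, which is the limitation the auxiliary function in the abstract is meant to address.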
Funding: Science and Technology on Avionics Integration Laboratory and Aeronautical Science Foundation of China (20115551022).
Abstract: This paper presents a simple yet effective decoding method for general quasi-cyclic low-density parity-check (QC-LDPC) codes, which not only achieves high hardware utility efficiency (HUE), but also brings about a great reduction in memory blocks without any performance degradation. The main idea is to split the check matrix into several row blocks, then to perform the improved message passing computations sequentially block by block. With the improved decoding algorithm, the sequential tie between the two-phase computations is broken, so that the two-phase computations can be overlapped, which brings high HUE. Two overlapping schemes are also presented, each of which suits a different situation. In addition, an efficient memory arrangement scheme is proposed to reduce the large memory block requirement of the LDPC decoder. As an example, for the 0.4 rate LDPC code selected from Chinese Digital TV Terrestrial Broadcasting (DTTB), our decoding saves over 80% of memory blocks compared with conventional decoding, and the decoder achieves 0.97 HUE. Finally, the 0.4 rate LDPC decoder is implemented on an FPGA device EP2S30 (speed grade -5). Using 8 row processing units, the decoder can achieve a maximum net throughput of 28.5 Mbps at 20 iterations.