Power dissipation has become one of the key optimization targets in logic design on field programmable gate arrays (FPGAs), so power estimation is necessary for logic design optimization. At present, the power estimation tools provided with FPGA electronic design automation (EDA) suites rely on signal activity data, produced by logic simulation with test vectors, to determine the toggle rate of each signal and block. The accuracy of power estimation depends heavily on the quality of the test vectors, especially their pattern coverage. Because a probability distribution can describe uncertain signals, this work provides an algorithm that estimates FPGA power more effectively and accurately by using signal probability distributions rather than test vectors.
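As a rough illustration of estimating power from signal probabilities instead of test vectors, the sketch below uses two textbook approximations: a temporally independent signal that is 1 with probability p toggles with probability 2p(1 - p) per clock cycle, and dynamic power is 1/2 · (toggle rate) · C · V² · f. This is not the algorithm of the abstract above; the net names, capacitances, supply voltage, and clock frequency are made-up values chosen only for the example.

```python
def toggle_probability(p_one):
    """Per-cycle toggle probability of a temporally independent signal
    that is logic 1 with probability p_one (textbook approximation)."""
    return 2.0 * p_one * (1.0 - p_one)


def dynamic_power(nets, vdd, f_clk):
    """Sum 1/2 * toggle_rate * C * Vdd^2 * f over all nets.

    nets maps a (hypothetical) net name to (p_one, capacitance_in_farads).
    """
    total = 0.0
    for p_one, cap in nets.values():
        total += 0.5 * toggle_probability(p_one) * cap * vdd ** 2 * f_clk
    return total


# Illustrative values only: three nets, 1.2 V supply, 100 MHz clock.
nets = {
    "clk_en": (0.50, 20e-15),
    "data0": (0.25, 35e-15),
    "rst": (0.00, 15e-15),  # never asserted, so it never toggles
}
power_w = dynamic_power(nets, vdd=1.2, f_clk=100e6)
print(f"estimated dynamic power: {power_w * 1e6:.3f} uW")
```

A probability-based estimate like this needs no simulation run, which is why its accuracy does not depend on test-vector coverage; it does depend on how well the independence assumption holds for the design.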
Nuclear industries face unfavorable circumstances such as component obsolescence and aging of instrumentation and control (I&C) systems, and the nuclear community is striving to resolve this issue fundamentally. Various studies have been conducted to address component obsolescence in I&C systems. FPGA (field programmable gate array) technology is replacing microprocessor-based equipment, whose varied software and hardware accelerate aging and obsolescence in the I&C systems of nuclear power plants (NPPs). FPGAs are highlighted as an alternative means for obsolete control systems. When engineers design NPP control systems with FPGAs, it is important to follow the system development life cycle and to conduct verification and validation activities for FPGA-based applications used in NPPs. Because the verification and validation process is more important than the design process, engineers should consider the characteristics of FPGAs, HDL (hardware description language) programming, fault modes, and optimization techniques, and these characteristics should be reflected in the verification and validation activities. As a minimum requirement, system designers require that HDL-programmed applications be developed in accordance with the system development life cycle and the HPD design process. In the verification and validation processes, review, test, and analysis activities should be properly conducted.
We present a method of time coding with ABAB synchronization timing control for real-time 3D superresolution range-gated imaging (3DSRGI). To meet the high precision required for time delay and pulse width in ABAB synchronization time sequencing, phase shifting is used to achieve ns-scale delay and width accuracy without resorting to high clock frequencies. Theoretical analysis and experiments show that 1 ns delay and width precision is obtained by our timing control unit, which is based on a single field-programmable gate array with a 5 ns clock cycle. Finally, a prototype 3DSRGI experiment is demonstrated at a 10 Hz video rate with 696 pixels × 520 pixels.
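One plausible way to picture how a 5 ns clock can deliver 1 ns timing precision is to split each requested delay into whole clock cycles plus a fine remainder supplied by a phase-shifted copy of the clock. The sketch below shows only that arithmetic; the constant names and the assumption that the phase step equals exactly 1 ns (one fifth of the period, or 72 degrees) are illustrative and are not taken from the paper's timing control unit.

```python
CLOCK_PERIOD_NS = 5  # clock cycle stated in the abstract
PHASE_STEP_NS = 1    # assumed fine step: one fifth of the period


def split_delay(delay_ns):
    """Split an integer delay in ns into coarse clock cycles, fine phase
    steps, and the matching clock phase offset in degrees."""
    if delay_ns < 0:
        raise ValueError("delay must be non-negative")
    coarse_cycles, remainder_ns = divmod(delay_ns, CLOCK_PERIOD_NS)
    fine_steps = remainder_ns // PHASE_STEP_NS
    phase_deg = 360 * fine_steps * PHASE_STEP_NS // CLOCK_PERIOD_NS
    return coarse_cycles, fine_steps, phase_deg


# A 23 ns delay becomes 4 full cycles (20 ns) plus 3 fine steps (3 ns).
print(split_delay(23))  # (4, 3, 216)
```

The coarse part would be handled by an ordinary counter, and the fine part by selecting among phase-shifted clocks, so the counter itself never has to run faster than the base clock.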
Sub-lines are one-dimensional diffraction patterns representing the light beams emerging from horizontal planes of an object image. Past research has demonstrated that sub-line generation can be encapsulated as a multi-bank filtering process and implemented with a field programmable gate array (FPGA) device. Because the complexity of the filters is high, their length and the number of input pins have to be reduced substantially, which degrades the reconstructed images. We propose an enhanced method that overcomes this problem by binarizing the filter coefficients and half-toning the pixel intensities of the object image. Experimental evaluation reveals that the reconstructed images produced by our method are superior to those obtained with the parent method.
Recently, large Transformer models have achieved impressive results in various natural language processing tasks, but they require enormous numbers of parameters and intensive computation, necessitating deployment on multi-device systems. Current solutions introduce complicated topologies with dedicated high-bandwidth interconnects to reduce communication overhead. To deal with the complexity of the system architecture and reduce the overhead of inter-device communication, this paper proposes SALTM, a multi-device system based on a unidirectional ring topology and a 2-D model partitioning method that takes quantization and pruning into account. First, a 1-D model partitioning method is proposed to reduce the amount of communication. Then, the block distributed to each device is further partitioned in the orthogonal direction, introducing a task-level pipeline that overlaps communication and computation. To further explore SALTM's performance on a real large model such as GPT-3, we develop an analytical model to evaluate performance and communication overhead. Our simulation shows that a BERT model with 110 million parameters, implemented by SALTM on four FPGAs, can achieve 9.65× and 1.12× speedups compared with a CPU and a GPU, respectively. The simulation also shows that the execution time of 4-FPGA SALTM is 1.52× that of an ideal system with infinite inter-device bandwidth. For GPT-3 with 175 billion parameters, our analytical model predicts that SALTM comprising 16 VC1502 FPGAs and 16 A30 GPUs can achieve inference latencies of 287 ms and 164 ms, respectively.
Funding: Supported by the Key Techniques of FPGA Architecture under Grant No. 9140A08010106QT9201
Abstract: A novel test approach for interconnect resources (IRs) in field programmable gate arrays (FPGAs) is proposed. In this approach, the switch boxes (SBs) of the IRs are used to test the IRs. Furthermore, configurable logic blocks (CLBs) in the FPGA are employed to enhance driving capability, and the position of a faulty IR can be determined by monitoring the SBs associated with that IR. As a result, the IRs can be scanned maximally with a minimum number of configuration patterns. In the experiment, an in-house FPGA test system based on system-on-chip (SoC) hardware/software verification technology was applied to test the Xilinx XC4000E family. The experimental results show that the IRs in the FPGA can be tested with 6 test patterns.
Funding: The National Natural Science Foundation of China under Grant No. 10674008; Biomed-X Center of Peking University.
Abstract: Accurate calibration of stiffness and position is crucial to quantitative measurement with optical tweezers. In this paper, we present a new calibration scheme for optical tweezers that covers both stiffness and position calibration. In our system, acousto-optic deflectors (AODs) are used as the laser beam manipulating component. The AODs are controlled by a field programmable gate array (FPGA) connected to a computer over universal serial bus (USB). Our results agree well with existing theory and other experimental results.
Funding: Supported by the National Natural Science Foundation of China (40774013)
Abstract: Satellite laser ranging (SLR) is one of the major space geodetic instruments and has various applications in earth science. In this paper, we discuss several issues in implementing the key technologies of a high-repetition-rate SLR system. Compared with traditional technology, a kHz laser with 8 ps pulse width can significantly improve the data quantity and quality of high-repetition-rate SLR. The characteristics of high-repetition-rate laser ranging and the key technologies are presented, including an event timer with picosecond precision and the generation of the range gate signal. All of them are based on field programmable gate arrays (FPGAs) and were tested on the China mobile SLR system TROS1000. Finally, observations of the satellite Beacon-C are given.
Abstract: A novel, efficient partial-sharing channelization structure with odd and even stacking is designed and implemented. The proposed structure has two special design features. First, through intensive channel overlap, the structure achieves good parameter estimation accuracy and a high probability of complete interception for non-cooperative wideband signals. Second, the partial-sharing design developed in this paper greatly reduces the computational burden compared with traditional directly implemented structures. Experiments and numerical simulations are conducted to evaluate the proposed structure, and they show its improvements over traditional methods in terms of field programmable gate array (FPGA) resource consumption and parameter estimation accuracy.
Funding: Supported by the National Natural Science Foundation of China (41371402) and the Fundamental Research Funds for the Central Universities (2015211020201)
Abstract: To reduce the instability and imbalance of physical unclonable function (PUF) responses caused by the metastability and bias of the arbiter, this paper uses an improved balanced D flip-flop (DFF), derived from the unbalanced DFF, to reduce the bias in the response output, and enhances the security of the PUF by connecting two balanced DFFs in series. The experimental results show that two cascaded balanced DFFs improve the stability of the DFF and that the output of two balanced DFFs is more reliable. The entropy of the output is fixed at 98.7%.
Abstract: A novel matching method for simultaneous multi-target recognition is proposed that jointly considers prior knowledge of the targets' scattering and the polarization parameters of the radar echoes. Matching coefficients are calculated to make the decision. MATLAB simulations show that several targets can be accurately recognized simultaneously, and a high recognition probability is achieved in Monte Carlo simulations. The total execution time can be remarkably reduced by implementing the matching procedure on a field programmable gate array (FPGA).
Abstract: A method for simulating the echo signal of a pseudorandom code P. M PP radar is proposed. It makes use of pre-generated Doppler simulation data selected according to the relative motion parameters of the radar and the target. It solves the problems of high-precision distance simulation and high-speed digital phase shifting. In addition, a dynamic digital video-frequency target signal simulator for the radar is designed. Simulation results for the critical unit and the output waveform are given. The test results satisfy the system requirements.
Abstract: An optical fiber control and transmission module is designed and realized on a Virtex-7 field programmable gate array (FPGA) and can be applied in multi-channel broadband digital receivers. The module consists of a sampling data transfer submodule and a multi-channel synchronous sampling control submodule. Sampling data transmission over a 4× fiber link channel is realized with a self-defined transfer protocol. The measured maximum data rate is 4.97 Gbyte/s. By connecting coherent clocks to the transmitter and receiver endpoints and using the self-defined transfer protocol, multi-channel sampling control signals transferred over optical fibers can be received synchronously by each analog-to-digital converter (ADC) with high accuracy and strong anti-interference ability. The module designed in this paper has reference value for increasing the transmission bandwidth and the synchronous sampling accuracy of multi-channel broadband digital receivers.
Funding: Supported by the National Natural Science Foundation of China (60801052), the Aeronautical Science Foundation of China (2009ZC52036), the Ph.D. Programs Foundation of China's Ministry of Education (200802871056), and the Nanjing University of Aeronautics & Astronautics Research Funding (NS2010109, NS2010114)
Abstract: A novel architecture for computing the fast Fourier transform (FFT) on programmable devices is presented. To improve the system operation speed, a hybrid parallel FFT algorithm is used. Results indicate that using an 8×8 parallel structure to realize the 64-point FFT leads to an 8 times higher processing speed compared with counterparts employing other series of techniques.
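To make the 8×8 structure more concrete, the sketch below shows the standard Cooley-Tukey factorization of a 64-point DFT into two stages of eight 8-point DFTs joined by twiddle factors. It is an illustration of that general decomposition under our own naming, not the hybrid parallel algorithm of the paper, and it runs sequentially; in hardware the eight small transforms of each stage are the part that could run in parallel.

```python
import cmath


def dft(x):
    """Direct O(n^2) DFT, used for the 8-point blocks and as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


def fft64_8x8(x):
    """64-point DFT via an 8 x 8 decomposition.

    Input index:  n = 8*a + b   (a, b in 0..7)
    Output index: k = k1 + 8*k2 (k1, k2 in 0..7)
    """
    size, rows, cols = 64, 8, 8
    if len(x) != size:
        raise ValueError("expected 64 samples")

    # Stage 1: for each b, an 8-point DFT over a (samples spaced 8 apart).
    stage1 = [dft([x[cols * a + b] for a in range(rows)]) for b in range(cols)]

    # Twiddle factors W_64^(b*k1) between the two stages.
    for b in range(cols):
        for k1 in range(rows):
            stage1[b][k1] *= cmath.exp(-2j * cmath.pi * b * k1 / size)

    # Stage 2: for each k1, an 8-point DFT over b yields outputs k1 + 8*k2.
    out = [0j] * size
    for k1 in range(rows):
        column = dft([stage1[b][k1] for b in range(cols)])
        for k2 in range(cols):
            out[k1 + rows * k2] = column[k2]
    return out


# Compare against the direct 64-point DFT on an arbitrary test signal.
signal = [complex(i % 7, (3 * i) % 5) for i in range(64)]
max_err = max(abs(p - q) for p, q in zip(fft64_8x8(signal), dft(signal)))
print(f"max difference from direct DFT: {max_err:.2e}")
```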
Funding: The National High Technology Development of China (863-818-01-02)
Abstract: This paper presents the realization of a synchro controller for HF ground wave radar based on an FPGA programmable logic device, making the best use of the strengths of the FPGA. The design implements data communication between the PC and the synchro controller over the PC bus, which carries the synchronization signal parameters to the RAM of the synchro controller. Then, on the principle that the required waveform is the result of comparing a counter value with the signal parameters, all waveforms needed by the HF ground wave radar are produced. Moreover, the waveforms are produced in a time-shared manner to save resources.
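The counter-compare principle described above can be sketched in a few lines: a free-running counter is compared against stored parameters, and the comparison result is the output waveform. The parameter names (`start`, `width`, `period`) are hypothetical stand-ins for whatever the controller actually stores in its RAM; this is a software model of the idea, not the paper's design.

```python
def generate_wave(start, width, period, n_ticks):
    """Model a counter-compare pulse generator.

    The output is 1 while start <= counter < start + width, and the counter
    wraps every `period` ticks. All parameters are in clock ticks.
    """
    wave = []
    counter = 0
    for _ in range(n_ticks):
        wave.append(1 if start <= counter < start + width else 0)
        counter = (counter + 1) % period
    return wave


# A 3-tick pulse starting at tick 2 of an 8-tick period, shown for 10 ticks.
print(generate_wave(start=2, width=3, period=8, n_ticks=10))
# [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
```

Because every waveform is defined only by its parameters, one counter and comparator can be reused for several signals by swapping parameter sets, which is one way to read the time-sharing remark in the abstract.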
Abstract: In recent years, deep neural networks have become a fascinating and influential research subject, and they play a critical role in video processing and analytics. Since video analytics is predominantly hardware centric, implementing deep neural networks in hardware deserves more research attention. However, the computational complexity and resource requirements of deep neural networks are increasing exponentially over time. Convolutional neural networks (CNNs) are among the most popular deep learning architectures, especially for image classification and video analytics. But these algorithms need an efficient implementation strategy to support more real-time computation when handling video in hardware. Field programmable gate arrays (FPGAs) are thought to be more advantageous than graphics processing units (GPUs) for implementing convolutional neural networks in terms of energy efficiency and low computational complexity. Still, an intelligent architecture is required to implement CNNs on FPGAs for video processing. This paper introduces a modern high-performance, energy-efficient Bat Pruned Ensembled Convolutional networks (BPEC-CNN) architecture for processing video in hardware. The system integrates Bat Evolutionary Pruned layers for the CNN and implements new shared Distributed Filtering Structures (DFS) for handling the filter layers of the CNN with a pipelined data path in the FPGA. In addition, the proposed system adopts a hardware-software co-design methodology for energy efficiency and lower computational complexity. Extensive experiments are carried out using CASIA video datasets with ARTIX-7 FPGA boards (number), and various algorithm-centric parameters such as accuracy, sensitivity, and specificity and architecture-centric parameters such as power, area, and throughput are analyzed. These results are then compared with existing pruned CNN architectures such as CNN-Prunner, and the proposed architecture shows 25% better performance than the existing architectures.
Abstract: For an SOI-FPGA (silicon-on-insulator field programmable gate array) (VS1000) fabricated with a 0.5 μm SOI-CMOS (silicon-on-insulator complementary metal-oxide-semiconductor) process, a complete integrated platform of FPGA computer-aided design (CAD) tools (VDK) is developed, which can convert a Verilog HDL (hardware description language) description into a bitstream and finally download it into an FPGA. Experiments and testing verify that this FPGA CAD toolset works well and efficiently.
Abstract: Previous studies show that interconnects occupy a large portion of the timing budget and area in FPGAs. In this work, we propose a time-multiplexing technique for FPGA interconnects. To fully exploit this interconnect architecture, we propose a time-multiplexed routing algorithm that can actively identify qualified nets and schedule them onto multiplexable wires. We validate the algorithm by using the router to implement 20 benchmark circuits on time-multiplexed FPGAs. We achieve a 38% smaller minimum channel width and a 3.8% smaller circuit critical path delay compared with the state-of-the-art architecture router when a wire can be time-multiplexed six times in a cycle.
Abstract: With the progress of railway technology, railway transportation is becoming faster, more efficient, and more intelligent. High-speed trains, a major part of railway transportation, bear directly on passenger safety, so reliability is very important in such vital systems. In this paper, a dependable FPGA-based speed controller core has been developed for high-speed trains. To improve reliability and mitigate single-upset faults in the basic speed controller, this paper proposes a new method based on hardware redundancy. In the proposed Hybrid Dual Duplex Redundancy (HDDR) method, the original controller is quadruplicated and the correct values are voted through a comparator and an error detection unit. We analyze the proposed system with Reliability, Availability, Mean time to failure and Security (RAMS) theory to evaluate the effectiveness of the scheme. Theoretical analysis shows that the Mean Time To Failure (MTTF) of the proposed system is 2.5 times better than that of traditional Triple Modular Redundancy (TMR). Furthermore, fault injection experiments reveal that the capability of tolerating Single Event Upsets (SEUs) in the proposed method increases up to 7.5 times with respect to a regular speed controller.
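As an illustration of the two redundancy schemes compared in the abstract above, the following minimal sketch contrasts a TMR majority vote with a dual-duplex selection over four replicas. The function names and the pairing rule are assumptions made for illustration; the abstract does not describe how the HDDR comparator and error detection unit are actually built.

```python
from collections import Counter
from typing import Optional, Sequence


def tmr_vote(outputs: Sequence[int]) -> Optional[int]:
    """Return the majority value of three replica outputs, or None if all differ."""
    if len(outputs) != 3:
        raise ValueError("TMR expects exactly three replica outputs")
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= 2 else None


def dual_duplex_select(outputs: Sequence[int]) -> Optional[int]:
    """Hypothetical dual-duplex selection over four replicas.

    Replicas (0, 1) and (2, 3) form two self-checking pairs. A pair is
    trusted only if its two outputs agree; the first trusted pair wins.
    Returns None when neither pair agrees internally.
    """
    if len(outputs) != 4:
        raise ValueError("dual duplex expects exactly four replica outputs")
    for a, b in ((outputs[0], outputs[1]), (outputs[2], outputs[3])):
        if a == b:
            return a
    return None
```

In this toy model, a single corrupted replica is masked by both schemes, while the dual-duplex variant additionally flags a pair whose members disagree instead of outvoting it.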
Abstract: Field Programmable Gate Arrays (FPGAs) offer high capability for implementing complex systems and are currently an attractive solution for space system electronics. However, FPGAs are susceptible to radiation-induced Single-Event Upsets (SEUs). To ensure reliable operation of FPGA-based systems in a harsh radiation environment, various SEU mitigation techniques have been proposed. In this paper, we propose a system that uses the dynamic partial reconfiguration capability of modern devices to evaluate the effect of SEU faults in an FPGA. The proposed approach combines the fault injection controller with the host FPGA, which minimizes hardware complexity. All SEU injection and evaluation tasks are performed by a soft core realized inside the host FPGA. Experimental results on some standard benchmark circuits reveal that the proposed system speeds up the fault injection campaign 50 times compared with the conventional method.
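The core step of an SEU injection campaign is to flip one configuration bit and compare the circuit's response with a fault-free run. The sketch below models that step on a plain byte buffer. The frame representation, the function names, and the `evaluate` callback are assumptions for illustration; no real FPGA reconfiguration interface is modeled here.

```python
from typing import Callable, List


def flip_bit(frame: bytes, bit_index: int) -> bytes:
    """Return a copy of frame with one bit inverted (a modeled SEU)."""
    if not 0 <= bit_index < len(frame) * 8:
        raise IndexError("bit_index outside frame")
    corrupted = bytearray(frame)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def injection_campaign(frame: bytes, evaluate: Callable[[bytes], int]) -> List[int]:
    """Flip every bit in turn and return the indices whose flip changes the output.

    evaluate is a stand-in for running the circuit under test with the
    given configuration data and observing its response.
    """
    golden = evaluate(frame)  # fault-free reference response
    return [i for i in range(len(frame) * 8)
            if evaluate(flip_bit(frame, i)) != golden]
```

Flipping the same bit twice restores the original frame, which mirrors how an injector would repair a fault before moving on to the next bit.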
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61176025 and No. 61006027; the Fundamental Research Funds for the Central Universities under Grant No. ZYGX2012J003; National Laboratory of Analogue Integrated Circuit Grants under Grant No. 9140C0901101002 and No. 9140C0901101003; and the New Century Excellent Talents Program under Grant No. NCET-10-0297.
Abstract: Power dissipation has become one of the key optimization criteria in logic design on field programmable gate arrays (FPGAs), so power estimation is necessary for logic design optimization. Today, the power estimation tools provided with FPGA electronic design automation (EDA) software rely on signal activity data, created by logic simulation with test vectors, to determine the toggle rate of each signal and block. The accuracy of power estimation therefore depends heavily on the quality of the test vectors, especially their pattern coverage. Because a probability distribution can describe uncertain signals, this work provides an algorithm that estimates FPGA power more effectively and accurately by using signal probability distributions rather than test vectors.
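One common way to derive a toggle rate from a signal probability, rather than from simulated test vectors, is the temporal-independence model: a signal that is 1 with probability p toggles with probability 2p(1 - p) per clock cycle. The sketch below applies that textbook model to the standard dynamic power expression. It is not the algorithm of the paper, and all names and parameter values are illustrative assumptions.

```python
def toggle_probability(p_one: float) -> float:
    """Per-cycle toggle probability of a signal that is 1 with probability p_one.

    Assumes consecutive cycles are statistically independent.
    """
    if not 0.0 <= p_one <= 1.0:
        raise ValueError("p_one must lie in [0, 1]")
    return 2.0 * p_one * (1.0 - p_one)


def dynamic_power(p_one: float, capacitance_f: float,
                  vdd_v: float, clock_hz: float) -> float:
    """Average dynamic power in watts of one net: 0.5 * toggles * C * V^2 * f."""
    return 0.5 * toggle_probability(p_one) * capacitance_f * vdd_v ** 2 * clock_hz
```

Under this model, activity peaks at p = 0.5 and vanishes for signals that are stuck at 0 or 1, which is why the estimate needs only per-signal probabilities.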
Abstract: The nuclear industry faces unfavorable circumstances such as component obsolescence and aging of instrumentation and control systems, and the nuclear community is striving to resolve this issue at its root. Various studies have been conducted to address component obsolescence in instrumentation and control systems. FPGA (field programmable gate array) technology is replacing high-end microprocessor-based equipment, whose varied software and hardware accelerate aging and obsolescence in the I&C (instrumentation and control) systems of nuclear power plants. FPGAs are highlighted as an alternative for obsolete control systems. When engineers design the control systems of NPPs (nuclear power plants) with FPGAs, it is important to follow the system development life cycle and to conduct verification and validation activities for FPGA-based applications intended for use in NPPs. Because the verification and validation process is more important than the design process, engineers should consider the characteristics of FPGAs, HDL (hardware description language) programming, fault modes, and optimization techniques, and these characteristics should be reflected in the verification and validation activities. As a minimum requirement, system designers require that HDL-programmed applications be developed in accordance with the system development life cycle and the HPD design process. In the verification and validation processes, review, test, and analysis activities should be properly conducted.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61205019 and 61475150.
Abstract: We present a method of time coding with ABAB synchronization timing control for real-time 3D superresolution range-gated imaging (3DSRGI). To meet the high precision required for time delay and pulse width in ABAB synchronization time sequencing, phase shifting is implemented to achieve nanosecond-scale delay and width accuracy without resorting to high clock frequencies. Theoretical analysis and experiments show that a delay and width precision of 1 ns is obtained by our timing control unit, which is based on a single field-programmable gate array with a 5 ns clock cycle. Finally, a prototype 3DSRGI experiment is demonstrated at a 10 Hz video rate with 696 pixels × 520 pixels.
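One way to reach 1 ns resolution from a 5 ns clock is to combine a coarse cycle counter with a choice among phase-shifted copies of the clock. The sketch below assumes five clock phases spaced 1 ns apart (72 degrees each). The abstract does not state how many phases the timing control unit uses, so the phase count and all names are illustrative assumptions.

```python
from typing import Tuple

CLOCK_PERIOD_NS = 5  # from the abstract: 5 ns clock cycle
PHASE_STEP_NS = 1    # assumed: five clock phases, 1 ns apart
NUM_PHASES = CLOCK_PERIOD_NS // PHASE_STEP_NS


def split_delay(delay_ns: int) -> Tuple[int, int]:
    """Split a delay into (whole clock cycles, index of the phase-shifted clock)."""
    if delay_ns < 0:
        raise ValueError("delay must be non-negative")
    coarse_cycles, remainder_ns = divmod(delay_ns, CLOCK_PERIOD_NS)
    return coarse_cycles, remainder_ns // PHASE_STEP_NS


def realized_delay_ns(coarse_cycles: int, phase_index: int) -> int:
    """Delay produced by counting coarse cycles on the selected clock phase."""
    return coarse_cycles * CLOCK_PERIOD_NS + phase_index * PHASE_STEP_NS
```

For example, a 23 ns delay splits into four full clock cycles plus the third phase offset, so the counter never has to run faster than the 5 ns clock.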
Abstract: Sub-lines are one-dimensional diffraction patterns representing the light beams emerging from horizontal planes of an object image. Past research has demonstrated that sub-line generation can be encapsulated as a multi-bank filtering process and implemented with a field programmable gate array (FPGA) device. Because the complexity of the filters is high, their length and the number of input pins have to be reduced substantially, which degrades the reconstructed images. We propose an enhanced method that overcomes the problem by binarizing the filter coefficients and half-toning the pixel intensities of the object image. Experimental evaluation reveals that our method yields reconstructed images superior to those obtained with the parent method.
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant XDB44000000.
Abstract: Recently, large Transformer models have achieved impressive results in various natural language processing tasks, but they require enormous numbers of parameters and intensive computation, necessitating deployment on multi-device systems. Current solutions introduce complicated topologies with dedicated high-bandwidth interconnects to reduce communication overhead. To deal with the complexity of the system architecture and reduce the overhead of inter-device communication, this paper proposes SALTM, a multi-device system based on a unidirectional ring topology and a 2-D model partitioning method that considers quantization and pruning. First, a 1-D model partitioning method is proposed to reduce the amount of communication. Then, the block distributed on each device is further partitioned in the orthogonal direction, introducing a task-level pipeline to overlap communication and computation. To further explore SALTM's performance on a real large model such as GPT-3, we develop an analytical model to evaluate the performance and communication overhead. Our simulation shows that a BERT model with 110 million parameters, implemented by SALTM on four FPGAs, can achieve 9.65× and 1.12× speedups compared to CPU and GPU, respectively. The simulation also shows that the execution time of 4-FPGA SALTM is 1.52× that of an ideal system with infinite inter-device bandwidth. For GPT-3 with 175 billion parameters, our analytical model predicts that SALTM comprising 16 VC1502 FPGAs and 16 A30 GPUs can achieve inference latency of 287 ms and 164 ms, respectively.
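To make the idea of 1-D partitioning on a unidirectional ring concrete, the sketch below splits a weight matrix by rows across devices, lets each device compute its slice of a matrix-vector product, and then simulates an all-gather in which every device only sends to its successor. This is a generic illustration under stated assumptions, not SALTM's actual partitioning or pipeline; the function names and the row-wise split are hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]


def partition_rows(weights: Matrix, num_devices: int) -> List[Matrix]:
    """Split weight rows into contiguous, near-equal blocks, one per device."""
    base, extra = divmod(len(weights), num_devices)
    blocks, start = [], 0
    for d in range(num_devices):
        size = base + (1 if d < extra else 0)
        blocks.append(weights[start:start + size])
        start += size
    return blocks


def matvec(block: Matrix, x: List[float]) -> List[float]:
    """Local computation on one device: its rows times the input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in block]


def ring_all_gather(partials: List[List[float]]) -> List[List[float]]:
    """Simulate an all-gather on a unidirectional ring of n devices.

    In each of n - 1 steps, device d forwards the chunk it received most
    recently to device (d + 1) % n. Returns the full vector seen by each device.
    """
    n = len(partials)
    gathered: List[Dict[int, List[float]]] = [{d: partials[d]} for d in range(n)]
    to_send = list(range(n))  # id of the chunk each device forwards next
    for _ in range(n - 1):
        sends = [(d, to_send[d]) for d in range(n)]  # snapshot before updating
        for src, chunk in sends:
            dst = (src + 1) % n
            gathered[dst][chunk] = partials[chunk]
            to_send[dst] = chunk
    return [[v for c in range(n) for v in gathered[d][c]] for d in range(n)]
```

A short usage example: with `blocks = partition_rows(weights, 3)` and `partials = [matvec(b, x) for b in blocks]`, every entry of `ring_all_gather(partials)` equals the unpartitioned product `matvec(weights, x)`.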