Chaotic systems have been intensively studied for their roles in many applications, such as cryptography, secure communications, nonlinear controls, etc. However, the limited complexity of existing chaotic systems wea...Chaotic systems have been intensively studied for their roles in many applications, such as cryptography, secure communications, nonlinear controls, etc. However, the limited complexity of existing chaotic systems weakens chaos-based practical applications. Designing chaotic maps with high complexity is attractive. This paper proposes the exponential sine chaotification model(ESCM), a method of using the exponential sine function as a nonlinear transform model, to enhance the complexity of chaotic maps. To verify the performance of the ESCM, we firstly demonstrated it through theoretical analysis. Then, to exhibit the high efficiency and usability of ESCM, we applied ESCM to one-dimensional(1D) and multidimensional(MD) chaotic systems. The effects were examined by the Lyapunov exponent and it was found that enhanced chaotic maps have much more complicated dynamic behaviors compared to their originals. To validate the simplicity of ESCM in hardware implementation, we simulated three enhanced chaotic maps using a digital signal processor(DSP). To explore the ESCM in practical application, we applied ESCM to image encryption. The results verified that the ESCM can make previous chaos maps competitive for usage in image encryption.展开更多
Random Projection Code (RPC) is a mechanism that combines channel coding and modulation together and realizes rate adaptation in the receiving end. Random projection code’s mapping matrix has significant influences o...Random Projection Code (RPC) is a mechanism that combines channel coding and modulation together and realizes rate adaptation in the receiving end. Random projection code’s mapping matrix has significant influences on decoding performance as well as hardware implementation complexity. To reduce hardware implementation complexity, we design a quasi-cyclic mapping matrix for RPC codes. Compared with other construction approaches, our design gets rid of data filter component, thus reducing chip area 7284.95 um2, and power consumption 331.46 uW in 0.13 um fabrication. Our simulation results show that our method does not cause any performance loss and even gets 0.2 dB to 0.5 dB gain at BER 10-4.展开更多
This article proposes an approach to the formalization of tasks and conditions for the hardware implementation of quasi-continuous observation devices with discrete receivers in remote sensing systems.Observation devi...This article proposes an approach to the formalization of tasks and conditions for the hardware implementation of quasi-continuous observation devices with discrete receivers in remote sensing systems.Observation devices with a matrix are used in medicine,ecology,aerospace photography,and geodesy,among other fields.In the discrete receivers,the sampling of an image in the matrix receiver into pixels leads to a decrease in the spatial information of the object.In a greater extent,these disadvantages can be avoided by using photosensitive matrix with a regularly changing(controlled)density of elementary receivers-matrix(RCDOER-matrix).Currently,there is no substantiation of the tasks and conditions for the hardware implementation of RCDOER-matrix.The algorithmic formation of a quasi-continuous image of observation devices with the RCDOER-matrix is proposed.The algorithm used a formal pixel-by-pixel description of the signals in the image.This algorithm formalizes the requirements for creating a photosensitive RCDOER-matrix of a certain size,as well as for changing the mechanism for forming and saving a frame with observation results.The application of the developed method will allow multiplying the pixel size of the image relative to the pixel size of the RCDOER-matrix.Developed algorithms for RCDOER-matrix are supplemented by formalizing the tasks that arise when creating prototypes.In addition,the conditions for hardware implementation are proposed,which ensure the completeness of registration of the observation picture,and allow avoiding excessive pixel measurements.Thus,the results of the research carried out approximate the practical application of RCDOER-matrix.展开更多
Cognitive radio (CR) is a technology that provides a promising new way to improve the efficiency of the use of the electromagnetic spectrum that available. Spectrum sensing helps in the detection of spectrum holes (un...Cognitive radio (CR) is a technology that provides a promising new way to improve the efficiency of the use of the electromagnetic spectrum that available. Spectrum sensing helps in the detection of spectrum holes (unused channels of the band), and instantly move into vacant channels while avoiding occupied ones. An energy detector with baseband sampling for CR is presented with mathematical analyses for an additive white Gaussian noise (AWGN) channels. A brief overview of the energy detection based spectrum sensing for CR technology is introduced. Practical implementation issues on Texas Instruments TMS320C6713 floating point DSP board are presented. Novelties of this work came from a derivation of probability of detection and probability of false alarm for the baseband energy detector without including the sampling theorems and the associated approximation.展开更多
In order to eliminate the energy waste caused by the traditional static hardware multithreaded processor used in real-time embedded system working in the low workload situation, the energy efficiency of the hardware m...In order to eliminate the energy waste caused by the traditional static hardware multithreaded processor used in real-time embedded system working in the low workload situation, the energy efficiency of the hardware multithread is discussed and a novel dynamic multithreaded architecture is proposed. The proposed architecture saves the energy wasted by removing idle threads without manipulation on the original architecture, fulfills a seamless switching mechanism which protects active threads and avoids pipeline stall during power mode switching. The report of an implemented dynamic multithreaded processor with 45 nm process from synthesis tool indicates that the area of dynamic multithreaded architecture is only 2.27% higher than the static one in achieving dynamic power dissipation, and consumes 1.3% more power in the same peak performance.展开更多
The SubBytes (S-box) transformation is the most crucial operation in the AES algorithm, significantly impacting the implementation performance of AES chips. To design a high-performance S-box, a segmented optimization...The SubBytes (S-box) transformation is the most crucial operation in the AES algorithm, significantly impacting the implementation performance of AES chips. To design a high-performance S-box, a segmented optimization implementation of the S-box is proposed based on the composite field inverse operation in this paper. This proposed S-box implementation is modeled using Verilog language and synthesized using Design Complier software under the premise of ensuring the correctness of the simulation result. The synthesis results show that, compared to several current S-box implementation schemes, the proposed implementation of the S-box significantly reduces the area overhead and critical path delay, then gets higher hardware efficiency. This provides strong support for realizing efficient and compact S-box ASIC designs.展开更多
The efficient implementation of the Advanced Encryption Standard(AES)is crucial for network data security.This paper presents novel hardware implementations of the AES S-box,a core component,using tower field represen...The efficient implementation of the Advanced Encryption Standard(AES)is crucial for network data security.This paper presents novel hardware implementations of the AES S-box,a core component,using tower field representations and Boolean Satisfiability(SAT)solvers.Our research makes several significant contri-butions to the field.Firstly,we have optimized the GF(24)inversion,achieving a remarkable 31.35%area reduction(15.33 GE)compared to the best known implementations.Secondly,we have enhanced multiplication implementa-tions for transformation matrices using a SAT-method based on local solutions.This approach has yielded notable improvements,such as a 22.22%reduction in area(42.00 GE)for the top transformation matrix in GF((24)2)-type S-box implementation.Furthermore,we have proposed new implementations of GF(((22)2)2)-type and GF((24)2)-type S-boxes,with the GF(((22)2)2)-type demonstrating superior performance.This implementation offers two variants:a small area variant that sets new area records,and a fast variant that establishes new benchmarks in Area-Execution-Time(AET)and energy consumption.Our approach significantly improves upon existing S-box implementations,offering advancements in area,speed,and energy consumption.These optimizations contribute to more efficient and secure AES implementations,potentially enhancing various cryptographic applications in the field of network security.展开更多
For polar codes,the performance of successive cancellation list(SCL)decoding is capable of approaching that of maximum likelihood decoding.However,the existing hardware architectures for the SCL decoding suffer from h...For polar codes,the performance of successive cancellation list(SCL)decoding is capable of approaching that of maximum likelihood decoding.However,the existing hardware architectures for the SCL decoding suffer from high hardware complexity due to calculating L decoding paths simultaneously,which are unfriendly to the devices with limited logical resources,such as field programmable gate arrays(FPGAs).In this paper,we propose a list-serial pipelined hardware architecture with low complexity for the SCL decoding,where the serial calculation and the pipelined operation are elegantly combined to strike a balance between the complexity and the latency.Moreover,we employ only one successive cancellation(SC)decoder core without L×L crossbars,and reduce the number of inputs of the metric sorter from 2L to L+2.Finally,the FPGA implementations show that the hardware resource consumption is significantly reduced with negligible decoding performance loss.展开更多
Rapid single flux quantum(RSFQ)circuits are a kind of superconducting digital circuits,having properties of a natural gate-level pipelining synchronous sequential circuit,which demonstrates high energy efficiency and ...Rapid single flux quantum(RSFQ)circuits are a kind of superconducting digital circuits,having properties of a natural gate-level pipelining synchronous sequential circuit,which demonstrates high energy efficiency and high throughput advantage.We find that the high-throughput and high-speed performance of RSFQ circuits can take the advantage of a hardware implementation of the encryption algorithm,whereas these are rarely applied to this field.Among the available encryption algorithms,the advanced encryption standard(AES)algorithm is an advanced encryption standard algorithm.It is currently the most widely used symmetric cryptography algorithm.In this work,we aim to demonstrate the SubByte operation of an AES-128 algorithm using RSFQ circuits based on the SIMIT Nb0_(3) process.We design an AES S-box circuit in the RSFQ logic,and compare its operational frequency,power dissipation,and throughput with those of the CMOS-based circuit post-simulated in the same structure.The complete RSFQ S-box circuit costs a total of 42237 Josephson junctions with nearly 130 Gbps throughput under the maximum simulated frequency of 16.28 GHz.Our analysis shows that the frequency and throughput of the RSFQ-based S-box are about four times higher than those of the CMOS-based S-box.Further,we design and fabricate a few typical modules of the S-box.Subsequent measurements demonstrate the correct functioning of the modules in both low and high frequencies up to 28.8 GHz.展开更多
It is necessary to know the status of adhesion conditions between wheel and rail for efficient accelerating and decelerating of railroad vehicle.The proper estimation of adhesion conditions and their real-time impleme...It is necessary to know the status of adhesion conditions between wheel and rail for efficient accelerating and decelerating of railroad vehicle.The proper estimation of adhesion conditions and their real-time implementation is considered a challenge for scholars.In this paper,the development of simulation model of extended Kalman filter(EKF)in MATLAB/Simulink is presented to estimate various railway wheelset parameters in different contact conditions of track.Due to concurrent in nature,the Xilinx®System-on-Chip Zynq Field Programmable Gate Array(FPGA)device is chosen to check the onboard estimation ofwheel-rail interaction parameters by using the National Instruments(NI)myRIO®development board.The NImyRIO®development board is flexible to deal with nonlinearities,uncertain changes,and fastchanging dynamics in real-time occurring in wheel-rail contact conditions during vehicle operation.The simulated dataset of the railway nonlinear wheelsetmodel is tested on FPGA-based EKF with different track conditions and with accelerating and decelerating operations of the vehicle.The proposed model-based estimation of railway wheelset parameters is synthesized on FPGA and its simulation is carried out for functional verification on FPGA.The obtained simulation results are aligned with the simulation results obtained through MATLAB.To the best of our knowledge,this is the first time study that presents the implementation of a model-based estimation of railway wheelset parameters on FPGA and its functional verification.The functional behavior of the FPGA-based estimator shows that these results are the addition of current knowledge in the field of the railway.展开更多
The advances of digital arithmetic techniques permit computer designers to implement high speed application specific chips. The currently produced digital circuits have demonstrated high performance in terms of severa...The advances of digital arithmetic techniques permit computer designers to implement high speed application specific chips. The currently produced digital circuits have demonstrated high performance in terms of several criteria, such as, high clock rate, short input/output delay, small silicon area, and low power dissipation. In this paper, we implement several sinusoidal generation methods to optimize their performance and output using advanced digital arithmetic techniques. In this paper, the implementations of advanced digital oscillator structures with and without pipelining are proposed. The synthesis results of the implementation with pipelining have proven that it is superior to other sinusoidal generation methods in terms of the maximum frequency and signal resolution. Hence, this method is used in the design of the proposed digital oscillator chip.展开更多
The implementation method of the IEEE 802.11 Medium Access Control (MAC) protocol is mainly based on DSP (Digital Signal Processor)/ ARM (Advanced Reduced instruction set computer Machine) processor or DSP/ARM IP (Int...The implementation method of the IEEE 802.11 Medium Access Control (MAC) protocol is mainly based on DSP (Digital Signal Processor)/ ARM (Advanced Reduced instruction set computer Machine) processor or DSP/ARM IP (Intellectual Property) core. This paper presents a method based on Nios II soft-core processor embedded in Altera’s Cyclone FPGA (Field Programmable Gate Array) and MicroC/OS-II RTOS (Real-Time Operation System). The benefits and drawbacks of above methods are compared, and then the method presented in this paper is described. The hardware and software partitioning are discussed; the hardware architecture is also illustrated and the MAC software programming is described in detail. The presented method has some advantages, such as low cost, easy-implementation and very suitable for the implementation of IEEE 802.11 MAC in research stage.展开更多
In this paper, a novel Medium Access Control (MAC) protocol for industrial Wireless Local Area Networks (WLANs) is proposed and studied. The main challenge in industry automation systems is the ultra-low network laten...In this paper, a novel Medium Access Control (MAC) protocol for industrial Wireless Local Area Networks (WLANs) is proposed and studied. The main challenge in industry automation systems is the ultra-low network latency with a target upper bound in the order of 1 ms while maintaining high network reliability and availability. The novelty of the proposed wireless MAC protocol resides in its similar latency performance as its counterpart in wired industrial LAN. First, the functional design of the MAC protocol is introduced. Then its performance results gained from hardware implementation (SystemC and VHDL) on an FPGA platform are presented. Finally, a real-time communication module which achieves the ultra-low latency required in industrial automation is described.展开更多
We propose a novel high-performance hardware architecture of processor for elliptic curve scalar multiplication based on the Lopez-Dahab algorithm over GF(2^163) in polynomial basis representation. The processor can...We propose a novel high-performance hardware architecture of processor for elliptic curve scalar multiplication based on the Lopez-Dahab algorithm over GF(2^163) in polynomial basis representation. The processor can do all the operations using an efficient modular arithmetic logic unit, which includes an addition unit, a square and a carefully designed multiplication unit. In the proposed architecture, multiplication, addition, and square can be performed in parallel by the decomposition of computation. The point addition and point doubling iteration operations can be performed in six multiplications by optimization and solution of data dependency. The implementation results based on Xilinx VirtexⅡ XC2V6000 FPGA show that the proposed design can do random elliptic curve scalar multiplication GF(2^163) in 34.11 μs, occupying 2821 registers and 13 376 LUTs.展开更多
This paper describes two single-chip——complex programmable logic devices/field programmable gate arrays(CPLD/FPGA)——implementations of the new advanced encryption standard (AES) algorithm based on the basic iterat...This paper describes two single-chip——complex programmable logic devices/field programmable gate arrays(CPLD/FPGA)——implementations of the new advanced encryption standard (AES) algorithm based on the basic iteration architecture (design [A]) and the hybrid pipelining architecture (design [B]). Design [A] is an encryption-and-decryption implementation based on the basic iteration architecture. This design not only supports 128-bit, 192-bit, 256-bit keys, but saves hardware resources because of the iteration architecture and sharing technology. Design [B] is a method of the 2×2 hybrid pipelining architecture. Based on the AES interleaved mode of operation, the design successfully accomplishes the algorithm, which operates in the feedback mode (cipher block chaining). It not only guarantees security of encryption/decryption, but obtains high data throughput of 1.05 Gb/s. The two designs have been realized on Aitera′s EP20k300EBC652-1 devices.展开更多
Traditional optical neural networks necessitate separate components for linear operations and nonlinear activations,increasing system complexity and energy consumption.Here,we demonstrate a dual-function electro-optic...Traditional optical neural networks necessitate separate components for linear operations and nonlinear activations,increasing system complexity and energy consumption.Here,we demonstrate a dual-function electro-optical modulator based on graphene-coated silicon waveguides that simultaneously performs both weight mapping and nonlinear activation functions within a single device.By exploiting the voltage-dependent optical transmission characteristics of graphene,our modulator achieves a transmission range from 0.14 to 1.0 with a voltage-response curve that closely approximates a sigmoid function(fitting error<±6%).Assessment analysis on standard machine learning datasets reveals reconstruction accuracies of approximately 86.7%after 60 training iterations,with accuracy remaining above 80%under significant noise perturbations.This approach substantially reduces hardware complexity and energy overhead,representing a significant advancement toward practical integrated photonic neural computing systems.展开更多
The accuracy of present flatness predictive method is limited and it just belongs to software simulation. In order to improve it, a novel flatness predictive model via T-S cloud reasoning network implemented by digita...The accuracy of present flatness predictive method is limited and it just belongs to software simulation. In order to improve it, a novel flatness predictive model via T-S cloud reasoning network implemented by digital signal processor(DSP) is proposed. First, the combination of genetic algorithm(GA) and simulated annealing algorithm(SAA) is put forward, called GA-SA algorithm, which can make full use of the global search ability of GA and local search ability of SA. Later, based on T-S cloud reasoning neural network, flatness predictive model is designed in DSP. And it is applied to 900 HC reversible cold rolling mill. Experimental results demonstrate that the flatness predictive model via T-S cloud reasoning network can run on the hardware DSP TMS320 F2812 with high accuracy and robustness by using GA-SA algorithm to optimize the model parameter.展开更多
With the rapidly increasing number of mobile devices being used as essential terminals or platforms for communication, security threats now target the whole telecommunication infrastructure and become increasingly ser...With the rapidly increasing number of mobile devices being used as essential terminals or platforms for communication, security threats now target the whole telecommunication infrastructure and become increasingly serious. Network probing tools, which are deployed as a bypass device at a mobile core network gateway, can collect and analyze all the traffic for security detection. However, due to the ever-increasing link speed, it is of vital importance to offioad the processing pressure of the detection system. In this paper, we design and evaluate a real-time pre-processing system, which includes a hardware accelerator and a multi-core processor. The implemented prototype can quickly restore each encapsulated packet and effectively distribute traffic to multiple back-end detection systems. We demonstrate the prototype in a well-deployed network environment with large volumes of real data. Experimental results show that our system can achieve at least 18 Gb/s with no packet loss with all kinds of communication protocols.展开更多
基金Project supported by the National Natural Science Foundation of China (Grant No. 51507023)Chongqing Municipal Natural Science Foundation (Grant No. cstc2020jcyjmsxm X0726)the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K202100506)。
文摘Chaotic systems have been intensively studied for their roles in many applications, such as cryptography, secure communications, nonlinear controls, etc. However, the limited complexity of existing chaotic systems weakens chaos-based practical applications. Designing chaotic maps with high complexity is attractive. This paper proposes the exponential sine chaotification model(ESCM), a method of using the exponential sine function as a nonlinear transform model, to enhance the complexity of chaotic maps. To verify the performance of the ESCM, we firstly demonstrated it through theoretical analysis. Then, to exhibit the high efficiency and usability of ESCM, we applied ESCM to one-dimensional(1D) and multidimensional(MD) chaotic systems. The effects were examined by the Lyapunov exponent and it was found that enhanced chaotic maps have much more complicated dynamic behaviors compared to their originals. To validate the simplicity of ESCM in hardware implementation, we simulated three enhanced chaotic maps using a digital signal processor(DSP). To explore the ESCM in practical application, we applied ESCM to image encryption. The results verified that the ESCM can make previous chaos maps competitive for usage in image encryption.
文摘Random Projection Code (RPC) is a mechanism that combines channel coding and modulation together and realizes rate adaptation in the receiving end. Random projection code’s mapping matrix has significant influences on decoding performance as well as hardware implementation complexity. To reduce hardware implementation complexity, we design a quasi-cyclic mapping matrix for RPC codes. Compared with other construction approaches, our design gets rid of data filter component, thus reducing chip area 7284.95 um2, and power consumption 331.46 uW in 0.13 um fabrication. Our simulation results show that our method does not cause any performance loss and even gets 0.2 dB to 0.5 dB gain at BER 10-4.
文摘This article proposes an approach to the formalization of tasks and conditions for the hardware implementation of quasi-continuous observation devices with discrete receivers in remote sensing systems.Observation devices with a matrix are used in medicine,ecology,aerospace photography,and geodesy,among other fields.In the discrete receivers,the sampling of an image in the matrix receiver into pixels leads to a decrease in the spatial information of the object.In a greater extent,these disadvantages can be avoided by using photosensitive matrix with a regularly changing(controlled)density of elementary receivers-matrix(RCDOER-matrix).Currently,there is no substantiation of the tasks and conditions for the hardware implementation of RCDOER-matrix.The algorithmic formation of a quasi-continuous image of observation devices with the RCDOER-matrix is proposed.The algorithm used a formal pixel-by-pixel description of the signals in the image.This algorithm formalizes the requirements for creating a photosensitive RCDOER-matrix of a certain size,as well as for changing the mechanism for forming and saving a frame with observation results.The application of the developed method will allow multiplying the pixel size of the image relative to the pixel size of the RCDOER-matrix.Developed algorithms for RCDOER-matrix are supplemented by formalizing the tasks that arise when creating prototypes.In addition,the conditions for hardware implementation are proposed,which ensure the completeness of registration of the observation picture,and allow avoiding excessive pixel measurements.Thus,the results of the research carried out approximate the practical application of RCDOER-matrix.
文摘Cognitive radio (CR) is a technology that provides a promising new way to improve the efficiency of the use of the electromagnetic spectrum that available. Spectrum sensing helps in the detection of spectrum holes (unused channels of the band), and instantly move into vacant channels while avoiding occupied ones. An energy detector with baseband sampling for CR is presented with mathematical analyses for an additive white Gaussian noise (AWGN) channels. A brief overview of the energy detection based spectrum sensing for CR technology is introduced. Practical implementation issues on Texas Instruments TMS320C6713 floating point DSP board are presented. Novelties of this work came from a derivation of probability of detection and probability of false alarm for the baseband energy detector without including the sampling theorems and the associated approximation.
基金supported partially by the National High Technical Research and Development Program of China (863 Program) under Grants No. 2011AA040101, No. 2008AA01Z134the National Natural Science Foundation of China under Grants No. 61003251, No. 61172049, No. 61173150+2 种基金the Doctoral Fund of Ministry of Education of China under Grant No. 20100006110015Beijing Municipal Natural Science Foundation under Grant No. Z111100054011078the 2012 Ladder Plan Project of Beijing Key Laboratory of Knowledge Engineering for Materials Science under Grant No. Z121101002812005
文摘In order to eliminate the energy waste caused by the traditional static hardware multithreaded processor used in real-time embedded system working in the low workload situation, the energy efficiency of the hardware multithread is discussed and a novel dynamic multithreaded architecture is proposed. The proposed architecture saves the energy wasted by removing idle threads without manipulation on the original architecture, fulfills a seamless switching mechanism which protects active threads and avoids pipeline stall during power mode switching. The report of an implemented dynamic multithreaded processor with 45 nm process from synthesis tool indicates that the area of dynamic multithreaded architecture is only 2.27% higher than the static one in achieving dynamic power dissipation, and consumes 1.3% more power in the same peak performance.
文摘The SubBytes (S-box) transformation is the most crucial operation in the AES algorithm, significantly impacting the implementation performance of AES chips. To design a high-performance S-box, a segmented optimization implementation of the S-box is proposed based on the composite field inverse operation in this paper. This proposed S-box implementation is modeled using Verilog language and synthesized using Design Complier software under the premise of ensuring the correctness of the simulation result. The synthesis results show that, compared to several current S-box implementation schemes, the proposed implementation of the S-box significantly reduces the area overhead and critical path delay, then gets higher hardware efficiency. This provides strong support for realizing efficient and compact S-box ASIC designs.
基金supported in part by the National Natural Science Foundation of China(No.62162016)in part by the Innovation Project of Guangxi Graduate Education(Nos.YCBZ2023132 and YCSW2023304).
文摘The efficient implementation of the Advanced Encryption Standard(AES)is crucial for network data security.This paper presents novel hardware implementations of the AES S-box,a core component,using tower field representations and Boolean Satisfiability(SAT)solvers.Our research makes several significant contri-butions to the field.Firstly,we have optimized the GF(24)inversion,achieving a remarkable 31.35%area reduction(15.33 GE)compared to the best known implementations.Secondly,we have enhanced multiplication implementa-tions for transformation matrices using a SAT-method based on local solutions.This approach has yielded notable improvements,such as a 22.22%reduction in area(42.00 GE)for the top transformation matrix in GF((24)2)-type S-box implementation.Furthermore,we have proposed new implementations of GF(((22)2)2)-type and GF((24)2)-type S-boxes,with the GF(((22)2)2)-type demonstrating superior performance.This implementation offers two variants:a small area variant that sets new area records,and a fast variant that establishes new benchmarks in Area-Execution-Time(AET)and energy consumption.Our approach significantly improves upon existing S-box implementations,offering advancements in area,speed,and energy consumption.These optimizations contribute to more efficient and secure AES implementations,potentially enhancing various cryptographic applications in the field of network security.
基金supported in part by the National Key R&D Program of China(No.2019YFB1803400)。
文摘For polar codes,the performance of successive cancellation list(SCL)decoding is capable of approaching that of maximum likelihood decoding.However,the existing hardware architectures for the SCL decoding suffer from high hardware complexity due to calculating L decoding paths simultaneously,which are unfriendly to the devices with limited logical resources,such as field programmable gate arrays(FPGAs).In this paper,we propose a list-serial pipelined hardware architecture with low complexity for the SCL decoding,where the serial calculation and the pipelined operation are elegantly combined to strike a balance between the complexity and the latency.Moreover,we employ only one successive cancellation(SC)decoder core without L×L crossbars,and reduce the number of inputs of the metric sorter from 2L to L+2.Finally,the FPGA implementations show that the hardware resource consumption is significantly reduced with negligible decoding performance loss.
基金This work was supported by the National Natural Science Foundation of China(Grant No.92164101)the National Natural Science Foundation of China(Grant No.62171437)+2 种基金the Strategic Priority Research Program of the Chinese Academy of Sciences(Grant No.XDA18000000)Shanghai Science and Technology Committee(Grant No.21DZ1101000)the National Key R&D Program of China(Grant No.2021YFB0300400).
文摘Rapid single flux quantum(RSFQ)circuits are a kind of superconducting digital circuits,having properties of a natural gate-level pipelining synchronous sequential circuit,which demonstrates high energy efficiency and high throughput advantage.We find that the high-throughput and high-speed performance of RSFQ circuits can take the advantage of a hardware implementation of the encryption algorithm,whereas these are rarely applied to this field.Among the available encryption algorithms,the advanced encryption standard(AES)algorithm is an advanced encryption standard algorithm.It is currently the most widely used symmetric cryptography algorithm.In this work,we aim to demonstrate the SubByte operation of an AES-128 algorithm using RSFQ circuits based on the SIMIT Nb0_(3) process.We design an AES S-box circuit in the RSFQ logic,and compare its operational frequency,power dissipation,and throughput with those of the CMOS-based circuit post-simulated in the same structure.The complete RSFQ S-box circuit costs a total of 42237 Josephson junctions with nearly 130 Gbps throughput under the maximum simulated frequency of 16.28 GHz.Our analysis shows that the frequency and throughput of the RSFQ-based S-box are about four times higher than those of the CMOS-based S-box.Further,we design and fabricate a few typical modules of the S-box.Subsequent measurements demonstrate the correct functioning of the modules in both low and high frequencies up to 28.8 GHz.
文摘It is necessary to know the status of adhesion conditions between wheel and rail for efficient accelerating and decelerating of railroad vehicle.The proper estimation of adhesion conditions and their real-time implementation is considered a challenge for scholars.In this paper,the development of simulation model of extended Kalman filter(EKF)in MATLAB/Simulink is presented to estimate various railway wheelset parameters in different contact conditions of track.Due to concurrent in nature,the Xilinx®System-on-Chip Zynq Field Programmable Gate Array(FPGA)device is chosen to check the onboard estimation ofwheel-rail interaction parameters by using the National Instruments(NI)myRIO®development board.The NImyRIO®development board is flexible to deal with nonlinearities,uncertain changes,and fastchanging dynamics in real-time occurring in wheel-rail contact conditions during vehicle operation.The simulated dataset of the railway nonlinear wheelsetmodel is tested on FPGA-based EKF with different track conditions and with accelerating and decelerating operations of the vehicle.The proposed model-based estimation of railway wheelset parameters is synthesized on FPGA and its simulation is carried out for functional verification on FPGA.The obtained simulation results are aligned with the simulation results obtained through MATLAB.To the best of our knowledge,this is the first time study that presents the implementation of a model-based estimation of railway wheelset parameters on FPGA and its functional verification.The functional behavior of the FPGA-based estimator shows that these results are the addition of current knowledge in the field of the railway.
文摘The advances of digital arithmetic techniques permit computer designers to implement high speed application specific chips. The currently produced digital circuits have demonstrated high performance in terms of several criteria, such as, high clock rate, short input/output delay, small silicon area, and low power dissipation. In this paper, we implement several sinusoidal generation methods to optimize their performance and output using advanced digital arithmetic techniques. In this paper, the implementations of advanced digital oscillator structures with and without pipelining are proposed. The synthesis results of the implementation with pipelining have proven that it is superior to other sinusoidal generation methods in terms of the maximum frequency and signal resolution. Hence, this method is used in the design of the proposed digital oscillator chip.
文摘The implementation method of the IEEE 802.11 Medium Access Control (MAC) protocol is mainly based on DSP (Digital Signal Processor)/ ARM (Advanced Reduced instruction set computer Machine) processor or DSP/ARM IP (Intellectual Property) core. This paper presents a method based on Nios II soft-core processor embedded in Altera’s Cyclone FPGA (Field Programmable Gate Array) and MicroC/OS-II RTOS (Real-Time Operation System). The benefits and drawbacks of above methods are compared, and then the method presented in this paper is described. The hardware and software partitioning are discussed; the hardware architecture is also illustrated and the MAC software programming is described in detail. The presented method has some advantages, such as low cost, easy-implementation and very suitable for the implementation of IEEE 802.11 MAC in research stage.
基金funding from the German Federal Ministry for Education and Research(2015-2017)under the grant agreement No.16KIS0179 also referred as DEAL
文摘In this paper, a novel Medium Access Control (MAC) protocol for industrial Wireless Local Area Networks (WLANs) is proposed and studied. The main challenge in industry automation systems is the ultra-low network latency with a target upper bound in the order of 1 ms while maintaining high network reliability and availability. The novelty of the proposed wireless MAC protocol resides in its similar latency performance as its counterpart in wired industrial LAN. First, the functional design of the MAC protocol is introduced. Then its performance results gained from hardware implementation (SystemC and VHDL) on an FPGA platform are presented. Finally, a real-time communication module which achieves the ultra-low latency required in industrial automation is described.
基金supported by the Hi-Tech Research and Development Program (863) of China (No. 2006AA01Z226)the Research Foun dation of Huazhong University of Science and Technology, China (No. 2006Z001B)
文摘We propose a novel high-performance hardware architecture of processor for elliptic curve scalar multiplication based on the Lopez-Dahab algorithm over GF(2^163) in polynomial basis representation. The processor can do all the operations using an efficient modular arithmetic logic unit, which includes an addition unit, a square and a carefully designed multiplication unit. In the proposed architecture, multiplication, addition, and square can be performed in parallel by the decomposition of computation. The point addition and point doubling iteration operations can be performed in six multiplications by optimization and solution of data dependency. The implementation results based on Xilinx VirtexⅡ XC2V6000 FPGA show that the proposed design can do random elliptic curve scalar multiplication GF(2^163) in 34.11 μs, occupying 2821 registers and 13 376 LUTs.
文摘This paper describes two single-chip——complex programmable logic devices/field programmable gate arrays(CPLD/FPGA)——implementations of the new advanced encryption standard (AES) algorithm based on the basic iteration architecture (design [A]) and the hybrid pipelining architecture (design [B]). Design [A] is an encryption-and-decryption implementation based on the basic iteration architecture. This design not only supports 128-bit, 192-bit, 256-bit keys, but saves hardware resources because of the iteration architecture and sharing technology. Design [B] is a method of the 2×2 hybrid pipelining architecture. Based on the AES interleaved mode of operation, the design successfully accomplishes the algorithm, which operates in the feedback mode (cipher block chaining). It not only guarantees security of encryption/decryption, but obtains high data throughput of 1.05 Gb/s. The two designs have been realized on Aitera′s EP20k300EBC652-1 devices.
基金supported in part by the National Natural Science Foundation of China(Grant No.62401276)in part by the Natural Science Research Start-up Foundation of Recruiting Talents of Nanjing University of Posts and Telecommunications(Grant No.NY223161)。
文摘Traditional optical neural networks necessitate separate components for linear operations and nonlinear activations,increasing system complexity and energy consumption.Here,we demonstrate a dual-function electro-optical modulator based on graphene-coated silicon waveguides that simultaneously performs both weight mapping and nonlinear activation functions within a single device.By exploiting the voltage-dependent optical transmission characteristics of graphene,our modulator achieves a transmission range from 0.14 to 1.0 with a voltage-response curve that closely approximates a sigmoid function(fitting error<±6%).Assessment analysis on standard machine learning datasets reveals reconstruction accuracies of approximately 86.7%after 60 training iterations,with accuracy remaining above 80%under significant noise perturbations.This approach substantially reduces hardware complexity and energy overhead,representing a significant advancement toward practical integrated photonic neural computing systems.
基金Project(E2015203354)supported by Natural Science Foundation of Steel United Research Fund of Hebei Province,ChinaProject(ZD2016100)supported by the Science and the Technology Research Key Project of High School of Hebei Province,China+1 种基金Project(LJRC013)supported by the University Innovation Team of Hebei Province Leading Talent Cultivation,ChinaProject(16LGY015)supported by the Basic Research Special Breeding of Yanshan University,China
文摘The accuracy of present flatness predictive method is limited and it just belongs to software simulation. In order to improve it, a novel flatness predictive model via T-S cloud reasoning network implemented by digital signal processor(DSP) is proposed. First, the combination of genetic algorithm(GA) and simulated annealing algorithm(SAA) is put forward, called GA-SA algorithm, which can make full use of the global search ability of GA and local search ability of SA. Later, based on T-S cloud reasoning neural network, flatness predictive model is designed in DSP. And it is applied to 900 HC reversible cold rolling mill. Experimental results demonstrate that the flatness predictive model via T-S cloud reasoning network can run on the hardware DSP TMS320 F2812 with high accuracy and robustness by using GA-SA algorithm to optimize the model parameter.
基金supported by the National High-Tech R&D Program(863)of China(No.2012AA013002)
文摘With the rapidly increasing number of mobile devices being used as essential terminals or platforms for communication, security threats now target the whole telecommunication infrastructure and become increasingly serious. Network probing tools, which are deployed as a bypass device at a mobile core network gateway, can collect and analyze all the traffic for security detection. However, due to the ever-increasing link speed, it is of vital importance to offioad the processing pressure of the detection system. In this paper, we design and evaluate a real-time pre-processing system, which includes a hardware accelerator and a multi-core processor. The implemented prototype can quickly restore each encapsulated packet and effectively distribute traffic to multiple back-end detection systems. We demonstrate the prototype in a well-deployed network environment with large volumes of real data. Experimental results show that our system can achieve at least 18 Gb/s with no packet loss with all kinds of communication protocols.