As an important branch of information security algorithms,the efficient and flexible implementation of stream ciphers is vital.Existing implementation methods,such as FPGA,GPP and ASIC,provide a good support,but they ...As an important branch of information security algorithms,the efficient and flexible implementation of stream ciphers is vital.Existing implementation methods,such as FPGA,GPP and ASIC,provide a good support,but they could not achieve a better tradeoff between high speed processing and high flexibility.ASIC has fast processing speed,but its flexibility is poor,GPP has high flexibility,but the processing speed is slow,FPGA has high flexibility and processing speed,but the resource utilization is very low.This paper studies a stream cryptographic processor which can efficiently and flexibly implement a variety of stream cipher algorithms.By analyzing the structure model,processing characteristics and storage characteristics of stream ciphers,a reconfigurable stream cryptographic processor with special instructions based on VLIW is presented,which has separate/cluster storage structure and is oriented to stream cipher operations.The proposed instruction structure can effectively support stream cipher processing with multiple data bit widths,parallelism among stream cipher processing with different data bit widths,and parallelism among branch control and stream cipher processing with high instruction level parallelism;the designed separate/clustered special bit registers and general register heaps,key register heaps can satisfy cryptographic requirements.So the proposed processor not only flexibly accomplishes the combination of multiple basic stream cipher operations to finish stream cipher algorithms.It has been implemented with 0.18μm CMOS technology,the test results show that the frequency can reach 200 MHz,and power consumption is 310 mw.Ten kinds of stream ciphers were realized in the processor.The key stream generation throughput of Grain-80,W7,MICKEY,ACHTERBAHN and Shrink algorithm is 100 Mbps,66.67 Mbps,66.67 Mbps,50 Mbps and 800 Mbps,respectively.The test result shows that the processor presented can achieve good tradeoff between high performance and flexibility of stream ciphers.展开更多
An Efficient and flexible implementation of block ciphers is critical to achieve information security processing.Existing implementation methods such as GPP,FPGA and cryptographic application-specific ASIC provide the...An Efficient and flexible implementation of block ciphers is critical to achieve information security processing.Existing implementation methods such as GPP,FPGA and cryptographic application-specific ASIC provide the broad range of support.However,these methods could not achieve a good tradeoff between high-speed processing and flexibility.In this paper,we present a reconfigurable VLIW processor architecture targeted at block cipher processing,analyze basic operations and storage characteristics,and propose the multi-cluster register-file structure for block ciphers.As for the same operation element of block ciphers,we adopt reconfigurable technology for multiple cryptographic processing units and interconnection scheme.The proposed processor not only flexibly accomplishes the combination of multiple basic cryptographic operations,but also realizes dynamic configuration for cryptographic processing units.It has been implemented with0.18μm CMOS technology,the test results show that the frequency can reach 350 MHz.and power consumption is 420 mw.Ten kinds of block and hash ciphers were realized in the processor.The encryption throughput of AES,DES,IDEA,and SHA-1 algorithm is1554 Mbps,448Mbps,785 Mbps,and 424 Mbps respectively,the test result shows that our processor's encryption performance is significantly higher than other designs.展开更多
The requirement of the flexible and effective implementation of the Elliptic Curve Cryptography (ECC) has become more and more exigent since its dominant position in the public-key cryptography application.Based on an...The requirement of the flexible and effective implementation of the Elliptic Curve Cryptography (ECC) has become more and more exigent since its dominant position in the public-key cryptography application.Based on analyzing the basic structure features of Elliptic Curve Cryptography (ECC) algorithms,the parallel schedule algorithm of point addition and doubling is presented.And based on parallel schedule algorithm,the Application Specific Instruction-Set Co-Processor of ECC that adopting VLIW architecture is also proposed in this paper.The coprocessor for ECC is implemented and validated using Altera’s FPGA.The experimental result shows that our proposed coprocessor has advantage in high performance and flexibility.展开更多
The rapid development of multimedia techniques has increased the demands on multimedia processors. This paper presents a new design method to quickly design high performance processors for new multimedia applications....The rapid development of multimedia techniques has increased the demands on multimedia processors. This paper presents a new design method to quickly design high performance processors for new multimedia applications. In this approach, a configurable processor based on the very long instruction-set word architecture is used as the basic core for designers to easily configure new processor cores for multimedia algorithm. Specific instructions designed for multimedia applications efficiently improve the performance of the target processor. Functions not implemented in the digital signal processor (DSP) core can be easily integrated into the target processor as user-defined hardware to increase the performance. Several examples are given based on the architecture. The results show that the processor performance is enhanced approximately 4 times on the H.263 codec and that the processor outperforms both DSPs and single instruction multiple data (SIMD) multimedia extension architectures by up to 8 times when computing the 2-D-IDCT.展开更多
We present a novel audio-processing platform, FlexEngine, which is composed of a 24-bit applicationspecific instruction-set processor (ASIP) and five dedicated accelerators. Acceleration instructions, compact instru...We present a novel audio-processing platform, FlexEngine, which is composed of a 24-bit applicationspecific instruction-set processor (ASIP) and five dedicated accelerators. Acceleration instructions, compact instructions and repeat instruction are added into the ASIP's instruction set to deal with some core tasks of hearing aid algorithms. The five configurable accelerators are used to execute several of the most common functions of hearing aids. Moreover, several low power strategies, such as clock gating, data isolation, memory partition, bypass mode, sleep mode, are also applied in this platform for power reduction. The proposed platform is implemented in CMOS 130 nm technology, and test results show that power consumption of FlexEngine is 0.863 mW with the clock frequency of 8 MHz at Vdd = 1.0 V.展开更多
基金supported by National Natural Science Foundation of China with granted No.61404175
文摘As an important branch of information security algorithms,the efficient and flexible implementation of stream ciphers is vital.Existing implementation methods,such as FPGA,GPP and ASIC,provide a good support,but they could not achieve a better tradeoff between high speed processing and high flexibility.ASIC has fast processing speed,but its flexibility is poor,GPP has high flexibility,but the processing speed is slow,FPGA has high flexibility and processing speed,but the resource utilization is very low.This paper studies a stream cryptographic processor which can efficiently and flexibly implement a variety of stream cipher algorithms.By analyzing the structure model,processing characteristics and storage characteristics of stream ciphers,a reconfigurable stream cryptographic processor with special instructions based on VLIW is presented,which has separate/cluster storage structure and is oriented to stream cipher operations.The proposed instruction structure can effectively support stream cipher processing with multiple data bit widths,parallelism among stream cipher processing with different data bit widths,and parallelism among branch control and stream cipher processing with high instruction level parallelism;the designed separate/clustered special bit registers and general register heaps,key register heaps can satisfy cryptographic requirements.So the proposed processor not only flexibly accomplishes the combination of multiple basic stream cipher operations to finish stream cipher algorithms.It has been implemented with 0.18μm CMOS technology,the test results show that the frequency can reach 200 MHz,and power consumption is 310 mw.Ten kinds of stream ciphers were realized in the processor.The key stream generation throughput of Grain-80,W7,MICKEY,ACHTERBAHN and Shrink algorithm is 100 Mbps,66.67 Mbps,66.67 Mbps,50 Mbps and 800 Mbps,respectively.The test result shows that the processor presented can achieve good tradeoff between high performance and flexibility of stream ciphers.
基金supported by National Natural Science Foundation of China with granted No.61404175
文摘An Efficient and flexible implementation of block ciphers is critical to achieve information security processing.Existing implementation methods such as GPP,FPGA and cryptographic application-specific ASIC provide the broad range of support.However,these methods could not achieve a good tradeoff between high-speed processing and flexibility.In this paper,we present a reconfigurable VLIW processor architecture targeted at block cipher processing,analyze basic operations and storage characteristics,and propose the multi-cluster register-file structure for block ciphers.As for the same operation element of block ciphers,we adopt reconfigurable technology for multiple cryptographic processing units and interconnection scheme.The proposed processor not only flexibly accomplishes the combination of multiple basic cryptographic operations,but also realizes dynamic configuration for cryptographic processing units.It has been implemented with0.18μm CMOS technology,the test results show that the frequency can reach 350 MHz.and power consumption is 420 mw.Ten kinds of block and hash ciphers were realized in the processor.The encryption throughput of AES,DES,IDEA,and SHA-1 algorithm is1554 Mbps,448Mbps,785 Mbps,and 424 Mbps respectively,the test result shows that our processor's encryption performance is significantly higher than other designs.
基金supported by the national high technology research and development 863 program of China.(2008AA01Z103)
文摘The requirement of the flexible and effective implementation of the Elliptic Curve Cryptography (ECC) has become more and more exigent since its dominant position in the public-key cryptography application.Based on analyzing the basic structure features of Elliptic Curve Cryptography (ECC) algorithms,the parallel schedule algorithm of point addition and doubling is presented.And based on parallel schedule algorithm,the Application Specific Instruction-Set Co-Processor of ECC that adopting VLIW architecture is also proposed in this paper.The coprocessor for ECC is implemented and validated using Altera’s FPGA.The experimental result shows that our proposed coprocessor has advantage in high performance and flexibility.
基金Supported by the National Natural Science Foundation of China (No. 60236020)the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20050003083)
文摘The rapid development of multimedia techniques has increased the demands on multimedia processors. This paper presents a new design method to quickly design high performance processors for new multimedia applications. In this approach, a configurable processor based on the very long instruction-set word architecture is used as the basic core for designers to easily configure new processor cores for multimedia algorithm. Specific instructions designed for multimedia applications efficiently improve the performance of the target processor. Functions not implemented in the digital signal processor (DSP) core can be easily integrated into the target processor as user-defined hardware to increase the performance. Several examples are given based on the architecture. The results show that the processor performance is enhanced approximately 4 times on the H.263 codec and that the processor outperforms both DSPs and single instruction multiple data (SIMD) multimedia extension architectures by up to 8 times when computing the 2-D-IDCT.
基金supported by the Advanced Research Program of the Chinese Academy of Sciences(No.XDA06020401)the National Natural Science Foundation of China(No.61306039)
文摘We present a novel audio-processing platform, FlexEngine, which is composed of a 24-bit applicationspecific instruction-set processor (ASIP) and five dedicated accelerators. Acceleration instructions, compact instructions and repeat instruction are added into the ASIP's instruction set to deal with some core tasks of hearing aid algorithms. The five configurable accelerators are used to execute several of the most common functions of hearing aids. Moreover, several low power strategies, such as clock gating, data isolation, memory partition, bypass mode, sleep mode, are also applied in this platform for power reduction. The proposed platform is implemented in CMOS 130 nm technology, and test results show that power consumption of FlexEngine is 0.863 mW with the clock frequency of 8 MHz at Vdd = 1.0 V.