Abstract
The FFT algorithm is widely used in computer science, and adaptive FFT software packages are favored by researchers and users for their good portability. Loongson 3A is a quad-core CPU developed independently by the Institute of Computing Technology, Chinese Academy of Sciences; it uses a RISC architecture and is compatible with the MIPS instruction set. This paper studies three adaptive FFT software packages: FFTW, UHFFT, and SPIRAL. We first summarize the similarities and differences between FFTW and UHFFT in terms of their search frameworks and code generators, then describe the three-layer architecture by which SPIRAL automatically generates optimized code. We then benchmark the three packages on the domestically developed Loongson 3A CPU and analyze and compare the results in light of Loongson's architectural features. Finally, we summarize the general approach taken by current adaptive FFT software packages, which suggests directions for developing adaptive FFT packages in future work.
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals (北大核心)
2012, Issue 12, pp. 281-285 (5 pages)
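Funding
Supported by the National Natural Science Foundation of China (61133005)
the National High Technology Research and Development Program of China (863 Program) (2009AA01A129, 2009AA01A134)
and the National Science and Technology Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software (2009ZX01036-001-002)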