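Abstract: Tensor transposition is a basic tensor operation primitive that is widely used in signal processing, scientific computing, deep learning, and other fields, and it plays an important role in tensor-data-intensive applications and high-performance computing. As energy efficiency becomes increasingly important in high-performance computing systems, accelerators based on digital signal processors (DSPs) have been integrated into general-purpose computing systems. However, traditional tensor transposition libraries designed for multi-core CPUs and GPUs cannot fully adapt to DSPs because of architectural differences. On the one hand, the vectorized computing potential of the DSP architecture has not been fully exploited; on the other hand, its complex on-chip memory system and multi-level shared memory structure pose significant challenges for parallel tensor program design. Targeting the architectural characteristics of domestically developed (Chinese) multi-core DSPs, this work proposes the ftmTT algorithm and designs and implements a general-purpose tensor transposition library for multi-core DSP architectures. ftmTT designs efficient memory access patterns adapted to the DSP architecture in order to fully exploit its parallelization and vectorization potential. Its core innovations are: 1) a blocking strategy that converts high-dimensional tensor transposition into the matrix transposition kernel operations provided by the multi-core DSP platform; 2) a scheme based on DMA point-to-point transfers that coalesces memory accesses to tensor data blocks to reduce data movement overhead; 3) a double-buffering design that asynchronously overlaps transposition computation with DMA transfers to hide communication behind computation, ultimately achieving high-performance parallel tensor transposition on multi-core DSPs. Experiments on the domestic multi-core DSP platform FT-M7032 show that the ftmTT tensor transposition algorithm achieves up to 75.96% of the theoretical bandwidth, which corresponds to 99.23% of the STREAM bandwidth of the FT-M7032 platform.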
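The blocking idea in innovation 1) can be illustrated with a minimal sketch. This is not the ftmTT implementation: it is plain Python on a flat row-major list, with no DMA transfers, vectorization, or double buffering, and the names `transpose_blocked`, `permute_021`, `permute_201`, and the `tile` parameter are hypothetical. It only shows how a higher-dimensional permutation can be expressed as tiled 2D matrix transposes, where each tile stands in for one call to a platform-provided matrix transposition kernel.

```python
def transpose_blocked(src, rows, cols, tile=4):
    """Transpose a row-major rows x cols matrix (flat list), one tile at a time."""
    dst = [None] * (rows * cols)
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            # One tile: a stand-in for a single small matrix-transpose kernel call.
            for r in range(r0, min(r0 + tile, rows)):
                for c in range(c0, min(c0 + tile, cols)):
                    dst[c * rows + r] = src[r * cols + c]
    return dst


def permute_021(src, d0, d1, d2, tile=4):
    """Permute a row-major (d0, d1, d2) tensor to (d0, d2, d1).

    Each of the d0 slices is an independent d1 x d2 matrix transpose.
    """
    plane = d1 * d2
    out = []
    for i in range(d0):
        out.extend(transpose_blocked(src[i * plane:(i + 1) * plane], d1, d2, tile))
    return out


def permute_201(src, d0, d1, d2, tile=4):
    """Permute a row-major (d0, d1, d2) tensor to (d2, d0, d1).

    Fusing the first two axes turns the tensor into a (d0*d1) x d2 matrix,
    so the whole permutation is a single matrix transpose.
    """
    return transpose_blocked(src, d0 * d1, d2, tile)
```

In this sketch the independent slices in `permute_021` are the natural unit to distribute across cores, and the tile loop is where an asynchronous copy of the next tile could overlap with the transposition of the current one, as the double-buffering design in the abstract describes.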
Abstract: Space-Based Solar Power (SBSP) presents a promising solution for achieving carbon neutrality and Renewable Electricity 100% (RE100) goals by offering a stable and continuous energy supply. However, its commercialization faces significant obstacles due to the technical challenges of long-distance microwave Wireless Power Transmission (WPT) from geostationary orbit. Even ground-based kilometer-scale WPT experiments remain difficult because of limited testing infrastructure, high costs, and strict electromagnetic wave regulations. Since the 1975 NASA-Raytheon experiment, which successfully recovered 30 kW of power over 1.55 km, there has been little progress in extending the transmission distance or increasing the retrieved power. This study proposes a cost-effective methodology for conducting long-range WPT experiments in constrained environments by utilizing existing infrastructure. A deep space antenna operating at 2.08 GHz with an output power of 2.3 kW and a gain of 55.3 dBi was used as the transmitter. Two test configurations were implemented: a 1.81 km ground-to-air test using an aerostat to elevate the receiver and a 1.82 km ground-to-ground test using a ladder truck positioned on a plateau. The rectenna consists of a lightweight 3×3 patch antenna array (0.9 m × 0.9 m), accompanied by a steering device and LED indicators to verify power reception. The aerostat-based test achieved a power density of 154.6 mW/m², which corresponds to approximately 6.2% of the theoretical maximum. The performance gap is primarily attributed to near-field interference, detuning of the patch antenna, rectifier mismatch, and alignment issues. These limitations are expected to be mitigated through improved patch antenna fabrication, a transition from GaN to GaAs rectifiers optimized for lower input power, and the implementation of an automated alignment system. With these enhancements, the recovered power is expected to improve by approximately four to five times. The results demonstrate a practical and scalable framework for long-range WPT experiments under constrained conditions and provide key insights for advancing SBSP technology.