期刊文献+
共找到3篇文章
< 1 >
每页显示 20 50 100
Network Sorting Algorithm of Multi-Frequency Signal with Adaptive SNR
1
作者 Xinyong Yu Ying Guo +2 位作者 Kunfeng Zhang Lei Li Hongguang Li 《Journal of Beijing Institute of Technology》 EI CAS 2018年第2期206-212,共7页
An signal noise ratio( SNR) adaptive sorting algorithm using the time-frequency( TF)sparsity of frequency-hopping( FH) signal is proposed in this paper. Firstly,the Gabor transformation is used as TF transformat... An signal noise ratio( SNR) adaptive sorting algorithm using the time-frequency( TF)sparsity of frequency-hopping( FH) signal is proposed in this paper. Firstly,the Gabor transformation is used as TF transformation in the system and a sorting model is established under undetermined condition; then the SNR adaptive pivot threshold setting method is used to find the TF single source. The mixed matrix is estimated according to the TF matrix of single source. Lastly,signal sorting is realized through improved subspace projection combined with relative power deviation of source. Theoretical analysis and simulation results showthat this algorithm has good effectiveness and performance. 展开更多
关键词 frequency-hopping(FH) under-determined adaptive signal noise ratio(SNR) time-frequency(TF) signal source network sorting
在线阅读 下载PDF
Fine-grained vectorized merge sorting on RISC-V:from register to cache
2
作者 Jin Zhang Jincheng Zhou +2 位作者 Xiang Zhang Di Ma Chunye Gong 《CCF Transactions on High Performance Computing》 2025年第1期58-71,共14页
Merge sort as a divide-sort-merge paradigm has been widely applied in computer science fields.As modern reduced instruction set computing architectures like the fifth generation(RISC-V)regard multiple registers as a v... Merge sort as a divide-sort-merge paradigm has been widely applied in computer science fields.As modern reduced instruction set computing architectures like the fifth generation(RISC-V)regard multiple registers as a vector register group for wide instruction parallelism,optimizing merge sort with this vectorized property is becoming increasingly common.In this paper,we overhaul the divide-sort-merge paradigm,from its register-level sort to the cache-aware merge,to develop a finegrained RISC-V vectorized merge sort(RVMS).From the register-level view,the inline vectorized transpose instruction is missed in RISC-V,so implementing it efficiently is non-trivial.Besides,the vectorized comparisons do not always work well in the merging networks.Both issues primarily stem from the expensive data shuffle instruction.To bypass it,RVMS strides to take register data as the proxy of data shuffle to accelerate the transpose operation,and meanwhile replaces vectorized comparisons with scalar cousin for more light real value swap.On the other hand,as cache-aware merge makes larger data merge in the cache,most merge schemes have two drawbacks:the in-cache merge usually has low cache utilization,while the out-of-cache merging network remains an ineffectively symmetric structure.To this end,we propose the half-merge scheme to employ the auxiliary space of in-place merge to halve the footprint of naie merge sort,and meanwhile copy one sequence to this space to avoid the former data exchange.Furthermore,an asymmetric merging network is developed to adapt to two different input sizes.Experiments on the RISC-V processor SG2042 show that four fine-grained optimization schemes including register strided transpose,hybrid merging network,half-merge strategy,and asymmetric merging network,improve performance by 4.05%,19.88%,12.23%,and 11.04%respectively.Importantly,the overall performance is 1.34x faster than the parallel sorting in the Boost C++library,and 1.85x faster than std::sort. 展开更多
关键词 Parallel sort sorting network SIMD RISC-V
在线阅读 下载PDF
thSORT:an efficient parallel sorting algorithm on multi‑core DSPs
3
作者 Mouzhi Yang Peng Zhang +2 位作者 Jianbin Fang Weifeng Liu Chun Huang 《CCF Transactions on High Performance Computing》 2024年第5期503-518,共16页
Multi-core architecture has become the main trend in high performance computing(HPC)because of its powerful parallel computing capability.Due to energy efficiency constraints,energy-efficient multi-core digital signal... Multi-core architecture has become the main trend in high performance computing(HPC)because of its powerful parallel computing capability.Due to energy efficiency constraints,energy-efficient multi-core digital signal processors(DSPs)have become an alternative architecture in HPC systems.FT-M7032 is a CPU-DSP heterogeneous processor that integrates 16 CPU cores for running operating systems and four multi-core general purpose DSP(GPDSP)clusters for providing high performance.Sorting is a fundamental operation in computer science with numerous applications and has been studied extensively,but high-performance parallel sorting algorithms are typically architecture-specific.To our knowledge,little attention has been paid to optimizing the sorting on the low-power multicore DSPs.In this paper,we propose thSORT,an efficient bitonic sorting algorithm for FT-M7032.Our algorithm consists of two parts:single-core DSP sorting and multi-core DSP sorting,both aiming to tap the features of FT-M7032.We implement a vector micro-kernel for bitonic sort and propose a multi-level algorithm to merge the results of the micro-kernel.When compared to the CPU baseline,our implementation is 1.43×faster than the parallel sorting of the Boost C++Libraries,and is 2.15×faster than std::sort. 展开更多
关键词 Bitonic sorting network Multi-core DSPs Parallel sorting
在线阅读 下载PDF
上一页 1 下一页 到第
使用帮助 返回顶部