Abstract: Fine scalability can provide not only precise rate control for constant bitrate (CBR) traffic, but also accurate quality control for variable bitrate (VBR) traffic. Motion JPEG2000 is a codec that provides fine scalability within its bitstreams. An efficient rate control approach for Motion JPEG2000 under resource constraints was proposed, utilizing a single buffer and two kinds of thresholds; it offers good results for constant-quality video.
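The abstract above describes rate control built on a single buffer with two thresholds. A minimal sketch of how such a dual-threshold scheme might adjust the per-frame bit budget is shown below; the function name, scaling factors, and threshold values are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of dual-threshold, buffer-based rate control for a
# fine-scalable codec such as Motion JPEG2000. The thresholds and scaling
# factors below are illustrative assumptions only.

def allocate_frame_bits(buffer_fullness, buffer_size, base_bits,
                        low_thresh=0.25, high_thresh=0.75):
    """Scale the per-frame bit budget using two buffer-fullness thresholds.

    Below the low threshold the buffer risks underflow, so more bits are
    granted; above the high threshold it risks overflow, so bits are cut.
    Between the two thresholds the base allocation is kept, which is what
    yields near-constant quality in steady state.
    """
    fill = buffer_fullness / buffer_size
    if fill < low_thresh:
        return int(base_bits * 1.25)   # buffer draining: spend more bits
    if fill > high_thresh:
        return int(base_bits * 0.75)   # buffer filling: spend fewer bits
    return base_bits                   # steady state: constant quality

# A half-full buffer keeps the base allocation.
print(allocate_frame_bits(5000, 10000, 4000))  # → 4000
```

Fine scalability makes this workable in practice: a JPEG2000 codestream can be truncated at fine granularity, so the encoder can actually hit whatever budget the controller hands back.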
Funding: Supported in part by the National Natural Science Foundation of China under grant numbers 62172391, 62032023, and T2125013.
Abstract: In recent years, various network architectures based on the Transformer model have achieved significant success in natural language processing and are increasingly being applied to other fields, underscoring the importance of accelerating Transformer models. Models based on the Transformer architecture typically contain a vast number of parameters and impose substantial computational demands. Training and inference for these models require significant computational resources, placing considerable demands on computational backends. Developing software ecosystems across different platforms requires substantial development effort, making research into cross-platform code generation technology for Transformer models particularly important. In this work, we propose HiperTI, a high-performance system for cross-platform code generation based on MLIR, facilitating the inference of large Transformer models. The GEMM code generated by HiperTI matches cuBLAS in performance on NVIDIA A100 GPUs, while its Attention computation achieves twice the performance of Triton. Additionally, on the Hygon DCU Z100, the Attention kernel from HiperTI demonstrates a 20% average performance improvement over PyTorch.
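For reference, the Attention computation that HiperTI generates fused kernels for is standard scaled dot-product attention. The NumPy sketch below shows the math for a single head only; it is an illustrative reference under that assumption, not HiperTI's generated code.

```python
import numpy as np

def attention(Q, K, V):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # (seq_q, seq_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)   # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ V                             # weighted sum of value rows

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # → (4, 8)
```

Kernel generators fuse these steps (the GEMMs, the softmax, and the final matmul) into one pass to avoid materializing the full score matrix in memory, which is where the reported speedups over generic backends come from.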