Loop-Invariant Code Motion Algorithm Based on Loop Cost Analysis (基于循环代价分析的循环不变量外提算法)

Abstract: Loop-invariant code motion (LICM) is a common compiler optimization for loop structures. It moves invariant computations out of the loop body to eliminate repeated computation and thus speed up program execution. In the LLVM compiler, however, the traditional LICM algorithm hoists all loop invariants out of the loop body. Once the number of hoisted invariants reaches a certain level, this causes register spilling, which introduces additional memory-access cost inside the loop and turns the transformation into a negative optimization. To address this problem, a loop cost analysis algorithm is introduced on top of the traditional LLVM LICM algorithm. It computes the running cost of each loop invariant inside the loop body and the spill cost that hoisting it may incur, evaluates the resulting benefit, and hoists only those loop invariants that yield a positive benefit. This reduces repeated computation in the loop body while avoiding the risk of introducing extra overhead. On the domestic Sunway (申威) 831 (SW831) processor platform, evaluation on typical test cases shows that, at loop counts on the order of ten million iterations, the new algorithm improves performance by more than 17% over the traditional LICM algorithm. A comprehensive evaluation using the SPEC CPU2017 benchmark suite (SPECspeed 2017 Integer suite), the DKbench benchmark suite for the Perl interpreter, and the pyperformance benchmark suite for the Python interpreter shows performance improvements of 0.4%, 0.63%, and 1%, respectively, over the traditional LICM algorithm.
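The hoisting decision described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's actual cost model, so everything below is an assumption made for illustration: the `Invariant` record, the `select_hoists` function, the rule that each hoisted value occupies one register, and the single flat `spill_cost` charged once the free registers run out are all hypothetical and are not LLVM APIs.

```python
from dataclasses import dataclass


@dataclass
class Invariant:
    name: str
    in_loop_cost: int  # assumed per-iteration cost of recomputing it inside the loop


def select_hoists(invariants, free_regs, spill_cost):
    """Split invariants into (hoisted, kept) using a toy benefit model.

    Assumptions (hypothetical, not the paper's model):
      * every hoisted value stays live in one register across the loop;
      * once `free_regs` registers are taken, hoisting one more value
        forces a spill that costs `spill_cost` per iteration.
    """
    hoisted, kept = [], []
    # Consider the most expensive invariants first so they claim the free registers.
    for inv in sorted(invariants, key=lambda i: i.in_loop_cost, reverse=True):
        overflow_cost = 0 if len(hoisted) < free_regs else spill_cost
        benefit = inv.in_loop_cost - overflow_cost
        if benefit > 0:
            hoisted.append(inv.name)  # positive benefit: move out of the loop
        else:
            kept.append(inv.name)     # hoisting would cost more than it saves
    return hoisted, kept


candidates = [
    Invariant("a", 5),
    Invariant("b", 1),
    Invariant("c", 3),
    Invariant("d", 1),
]
hoisted, kept = select_hoists(candidates, free_regs=2, spill_cost=2)
print(hoisted, kept)
```

With two free registers and a spill cost of 2, the two expensive invariants are hoisted, while the two cheap ones stay in the loop because recomputing them is cheaper than a spill and reload. A traditional hoist-everything policy corresponds to treating `spill_cost` as 0.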

Authors: JIANG Jun (姜军); ZHAI Yanhe (翟彦河); ZENG Zhiheng (曾志恒); GU Yichao (顾轶超); HUANG Liangming (黄亮明) (Wuxi Institute of Advanced Technology, Wuxi, Jiangsu 214122, China)

Affiliation: Wuxi Institute of Advanced Technology (无锡先进技术研究院)

Source: Computer Science (《计算机科学》), Peking University Core Journal (北大核心), 2025, No. 6, pp. 44-51 (8 pages)

Funding: Key Support Project of the Ministry of Science and Technology (GG20210701).

Keywords: LLVM compiler; Compilation optimization; Loop-invariant code motion; Register overflow; Loop cost analysis

Classification: TP314 [Automation and Computer Technology: Computer Software and Theory]





