Funding: Supported by the National Key R&D Program (Grant No. 2023YFB3001801) and the National Natural Science Foundation of China (Nos. 62322201, 62072018, U23B2020, U22A2028).
Abstract: Lock-related synchronization performance issues, such as overly large critical sections and improper lock usage, are inevitable in scientific computing. Even skilled programmers struggle with the complicated reports produced by existing lock behavior profilers, let alone the scientists who make up most scientific computing programmers. Moreover, ARM-based supercomputers have appeared on the TOP500 list, yet lock behavior profiling tools that support ARM have not received the attention they deserve. Based on a "one step for all" workflow comprising problem identification, problem analysis, and solution generation, this paper presents an end-to-end, fine-grained lock behavior profiling tool that supports both the ARM and x86 architectures. Specifically, this paper introduces a priority function that quantifies the priority of distinct solutions, and users can adjust the weights of its metrics. Compared with existing work that relies on library interception and replacement or on x86-based analysis frameworks, the tool's fine-grained analysis, highly usable reports, high portability, and strong compatibility make it an efficient means for scientific computing programmers to find and optimize lock-related performance bugs.