Abstract: Large Language Models (LLMs) are increasingly applied in the field of code translation. However, existing evaluation methodologies suffer from two major limitations: (1) the high overlap between test data and pretraining corpora, which introduces significant bias in performance evaluation; and (2) mainstream metrics focus primarily on surface-level accuracy, failing to uncover the underlying factors that constrain model capabilities. To address these issues, this paper presents TCode (Translation-Oriented Code Evaluation benchmark), a complexity-controllable, contamination-free benchmark dataset for code translation, alongside a dedicated static feature sensitivity evaluation framework. The dataset is carefully designed to control complexity along multiple dimensions, including syntactic nesting and expression intricacy, enabling both broad coverage and fine-grained differentiation of sample difficulty. This design supports precise evaluation of model capabilities across a wide spectrum of translation challenges. The proposed evaluation framework introduces a correlation-driven analysis mechanism based on static program features, enabling predictive modeling of translation success from two perspectives: Code Form Complexity (e.g., code length and character density) and Semantic Modeling Complexity (e.g., syntactic depth, control-flow nesting, and type system complexity). Empirical evaluations across representative LLMs, including Qwen2.5-72B and Llama3.3-70B, demonstrate that although state-of-the-art models achieve over 80% compilation success on simple samples, their accuracy drops sharply to below 40% on complex cases. Further correlation analysis indicates that Semantic Modeling Complexity alone is associated with up to 60% of the variance in translation success, with static program features exhibiting nonlinear threshold effects that highlight clear capability boundaries. This study departs from the traditional accuracy-centric evaluation paradigm and, for the first time, systematically characterizes the capabilities of large language models in translation tasks through the lens of static program features. The findings provide actionable insights for model refinement and training strategy development.
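The correlation-driven analysis described in this abstract can be illustrated with a minimal sketch. It is not the TCode framework itself: the brace-counting depth measure, the helper names (`max_nesting_depth`, `static_features`, `pearson_r`), and the four toy samples with their success labels are all hypothetical, chosen only to show how a static feature can be correlated with translation success and how the squared correlation reads as a share of variance.

```python
from math import sqrt


def max_nesting_depth(code: str) -> int:
    """Deepest curly-brace nesting level (a crude stand-in for syntactic depth)."""
    depth = deepest = 0
    for ch in code:
        if ch == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}":
            depth = max(depth - 1, 0)
    return deepest


def static_features(code: str) -> dict:
    """Two illustrative static features: raw length and nesting depth."""
    return {"length": len(code), "nesting": max_nesting_depth(code)}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Hypothetical samples: (source snippet, 1 if the translation succeeded else 0).
samples = [
    ("int f() { return 1; }", 1),
    ("int g(int x) { if (x) { return x; } return 0; }", 1),
    ("void h() { for (;;) { if (1) { while (0) { } } } }", 0),
    ("void k() { { { { { } } } } }", 0),
]

nesting = [static_features(src)["nesting"] for src, _ in samples]
success = [ok for _, ok in samples]

r = pearson_r(nesting, success)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")  # r^2 is the share of variance associated with nesting
```

On real data, a linear coefficient like this would miss the nonlinear threshold effects the abstract mentions, so binned success rates or a nonlinear model would be the more natural next step.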
Funding: Supported in part by the National Key Research and Development Program of China (Grant No. 2021YFF0704000).
Abstract: Because the NVIDIA GPU and the new-generation Sunway architecture each have their own hardware characteristics and low-level programming model, automatically translating mature CUDA kernels to Sunway ATHREAD kernels is realistic but challenging work. To address this issue, swCUDA, an automatic parallel code translation framework, is proposed. To that end, we create a scale affine translation to transform the CUDA thread hierarchy into the Sunway index, a directive-based memory hierarchy and data redirection optimization to assign optimal memory usage and data stride strategies, and a directive-based grouping-calculation-asynchronous-reduction (GCAR) algorithm to provide a general solution for the random access issue. swCUDA uses the code generator ANTLR as its compiler frontend to parse CUDA kernels and integrates the novel algorithms into the nodes of the abstract syntax tree (AST) according to the directives. Automatic translation is performed on the entire Polybench suite and the NBody simulation benchmark. We obtain an average 40x speedup over the baseline on the Sunway architecture, an average 15x speedup over an x86 CPU, and an average of 27 percent higher than the NVIDIA GPU. Further, swCUDA is applied to translate the major kernels of the real-world application Gromacs. The translated version achieves up to 17x speedup.
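The idea of mapping a CUDA thread hierarchy onto a fixed set of worker cores can be sketched as follows. This is only an illustration of an affine index remapping under stated assumptions, not swCUDA's scale affine translation: the cyclic distribution, the function names, and the `NUM_CPES = 64` constant (assumed here as the number of compute cores in one Sunway core group) are all choices made for the example, and the "cores" are simulated sequentially in plain Python.

```python
NUM_CPES = 64  # assumed worker-core count; hypothetical constant for this sketch


def cuda_global_index(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Flatten a 1-D CUDA (blockIdx.x, threadIdx.x) pair into a global index."""
    return block_idx * block_dim + thread_idx


def indices_for_core(core_id: int, grid_dim: int, block_dim: int,
                     num_cores: int = NUM_CPES) -> range:
    """Global indices handled by one core: gid = core_id + k * num_cores."""
    total_threads = grid_dim * block_dim
    return range(core_id, total_threads, num_cores)


def run_vector_add(a, b, grid_dim: int, block_dim: int,
                   num_cores: int = NUM_CPES):
    """Emulate a CUDA vector-add kernel with a per-core strided loop."""
    out = [0] * len(a)
    for core_id in range(num_cores):  # sequential here; concurrent on real hardware
        for gid in indices_for_core(core_id, grid_dim, block_dim, num_cores):
            if gid < len(a):  # same bounds guard a CUDA kernel would use
                out[gid] = a[gid] + b[gid]
    return out


if __name__ == "__main__":
    a = list(range(1000))
    b = [2 * x for x in a]
    result = run_vector_add(a, b, grid_dim=8, block_dim=128)
    print(result[:5])
```

Each simulated core walks an affine sequence of the original global thread indices, so every CUDA thread's work is executed exactly once regardless of how the grid size compares to the core count.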
Funding: The authors thank the National Key R&D Project from the Ministry of Science and Technology (Nos. 2021YFA1201604 and 2021YFA1201601), the China National Postdoctoral Program for Innovative Talents (No. BX20230357), the Fundamental Research Funds for the Central Universities (No. E3E46807X2), the China Postdoctoral Science Foundation (No. 2023M743445), the National Key R&D Program of China (No. 2023YFB2604600), and the National Natural Science Foundation of China (No. 52302115).
Abstract: Developing lightweight, green, and flexible wearable electronics with high sensitivity and multifunctional sensing capabilities is of great significance for outdoor activities such as mountaineering and animal tracking and protection. This work proposes a silk fibroin fiber-based triboelectric nanogenerator (SF-TENG) that harvests tiny amounts of energy from human fingertip tapping and acts as a self-powered tactile sensor. The SF-TENG adopts a green, efficient, and low-cost fabrication strategy, in which a breathable and electropositive silk fibroin fiber membrane and a silver conductive layer are prepared by electrostatic spinning and magnetron sputtering, respectively, and combined with a conductive cloth and a breathable tape to form a flexible sensor that can be attached to human skin. The thin and soft portable TENG device, with a thickness of only 0.3 mm and a mass of 354 mg at dimensions of 4.5 cm × 4.5 cm, can generate a maximum power density of 1.0 mW·m⁻². Furthermore, the SF-TENG has an excellent sensitivity of 1.767 mV·Pa⁻¹ with good cyclic stability. These superior sensing characteristics provide new avenues for Morse code applications in outdoor wearable autonomous communication. The proposed SF-TENG offers promising solutions for multi-scenario outdoor sports, human-machine interface interaction, and security systems.
Abstract: An automated manufacturing system is characterized by flexibility. It aims to produce a variety of products with virtually no time lost in changing over from one part to the next. In this paper, the Machining Process Simulator GMPS is introduced, which can be used as a support environment for the machining process. It can be executed off-line or on-line in manufacturing systems to predict collisions of the tool with machined workpieces, fixtures, or pallets. First, the functional model of GMPS is described; then the key techniques adopted in the simulator are introduced. Finally, an application of GMPS in the CIMS ERC of China is presented.
Funding: Supported by the National R&D Program of China (2024YFB4505603), the Jiangsu Province Key R&D Program (Grant No. BG2024028), and the National Natural Science Foundation of China (U23B2020, 62302479, 62232015).
Abstract: This survey has provided a systematic overview of the emerging field of LLM-enabled compilation by addressing several key research questions. We first answered how LLMs are being integrated by proposing a comprehensive, multi-dimensional taxonomy that categorizes works based on their Design Philosophy (Selector, Translator, Generator), their LLM Methodology, their operational Level of Code Abstraction, and the specific Task Type they address. In answering what advancements these approaches offer, we identified three primary benefits: the democratization of compiler development, the discovery of novel optimization strategies, and the broadening of the compiler's traditional scope. Finally, in addressing the field's challenges and opportunities, we highlighted the critical hurdles of ensuring correctness and achieving scalability, while identifying the development of hybrid systems as the most promising path forward. By providing these answers, this survey serves as a foundational roadmap for researchers and practitioners, charting the course for a new generation of LLM-powered, intelligent, adaptive, and synergistic compilation tools.