Stride prefetching is recognized as an important technique for improving memory access performance. Prior work usually profiles or analyzes program behavior offline and uses the identified stride patterns to guide compilation by injecting prefetch instructions at appropriate places. Some studies attempt to enable stride prefetching in runtime systems through online profiling, but they either cannot discover cross-procedural prefetch opportunities or require special support from hardware or garbage collection. In this paper, we present a prefetch engine for the JVM (Java Virtual Machine). It first identifies candidate load operations during just-in-time (JIT) compilation and then instruments the compiled code to profile the addresses of those loads. The runtime profile is collected in a trace buffer, which triggers a prefetch controller upon a protection fault. The prefetch controller analyzes the trace to discover stride patterns and then modifies the compiled code to inject prefetch instructions in place of the instrumentation. One major advantage of this engine is that it can detect striding loads at any place in the code, for both regular and irregular code, without being limited to plain loop or procedure scopes. In fact, we found that cross-procedural patterns account for about 30% of all prefetches in representative Java benchmarks. Another major advantage is that the engine's runtime overhead (the maximum is below 4.0%) is much smaller than the benefit it brings. Our evaluation with the Apache Harmony JVM shows that the engine achieves an average speed-up of 6.2% on SPECJVM98 and DaCapo on an Intel Pentium 4 platform, in spite of the runtime overhead.
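The trace-analysis step described in this abstract can be illustrated with a minimal sketch. It is not the paper's prefetch controller: the function names `dominant_stride` and `prefetch_address`, the `min_ratio` threshold, and the prefetch `distance` are all assumptions made for illustration. The sketch only shows the general idea of finding a dominant address delta in a recorded trace of load addresses.

```python
from collections import Counter


def dominant_stride(addresses, min_ratio=0.75):
    """Return the dominant non-zero stride in an address trace, or None.

    Hypothetical helper: a stride is accepted only if it accounts for at
    least `min_ratio` of the consecutive address deltas (assumed threshold).
    """
    if len(addresses) < 3:
        return None  # too few samples to call anything a pattern
    # Deltas between consecutive profiled load addresses.
    deltas = [b - a for a, b in zip(addresses, addresses[1:])]
    stride, count = Counter(deltas).most_common(1)[0]
    if stride == 0 or count / len(deltas) < min_ratio:
        return None
    return stride


def prefetch_address(current, stride, distance=4):
    """Address to prefetch `distance` strides ahead (assumed policy)."""
    return current + stride * distance


if __name__ == "__main__":
    trace = [0x1000, 0x1040, 0x1080, 0x10C0, 0x1100]
    s = dominant_stride(trace)
    print(s, hex(prefetch_address(trace[-1], s)))
```

In a real engine the trace would come from the instrumented loads and the result would drive code patching; here the trace is simply a list of integers.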
1 Introduction. Visual reusable components are the foundation of visual programming. The current mainstream reusable component models (CORBA, OLE/ActiveX, JavaBeans [2], etc.) are not well suited to serve as visual reusable component models [1]. The visual reusable component model that the authors built while developing a reusable Chinese report component in the IBM VisualAge for Smalltalk [3] environment (hereafter VA Smalltalk) has the following characteristics: ...
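Time management algorithms are key to the performance of RTI time management services. To address the possible deadlock that arises when the commonly used Frederick algorithm computes GALT (greatest available logical time), as well as message delay in simulation systems, the concept of federate scale is defined and, combined with the idea of dynamically adjusting the lookahead, a dynamic scale labeling algorithm is proposed and analyzed. The analysis shows that the algorithm not only reduces message delay but also resolves the deadlock problem in time management. Tests on a guided munition flight visual simulation system show that the algorithm improves the simulation results and the performance of the simulation system.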
Dynamic optimization relies on runtime profile information to improve the performance of program execution. Traditional profiling techniques incur significant overhead and are not suitable for dynamic optimization. This paper proposes a new profiling technique that combines the strengths of software and hardware to achieve near-zero-overhead profiling. The compiler passes profiling requests to the hardware as a few bits of information in branch instructions, and the processor executes profiling operations asynchronously in available free slots or on dedicated hardware. The compiler instrumentation for this technique is implemented in an Itanium research compiler. The results show that accurate block profiling adds very little overhead to the user program in terms of program scheduling cycles. For example, the average overhead is 0.6% for the SPECint95 benchmarks. The hardware support required for the new profiling is practical. The technique is extended to collect edge profiles for continuous phase transition detection. It is believed that the hardware-software collaborative scheme will enable many profile-driven dynamic optimizations for EPIC processors such as the Itanium processors.
Inter-process communication (IPC) provides a message-passing mechanism for information exchange between applications. It has long been believed that IPC can be abused by malware writers to launch collusive information leaks using two or more applications. Much work on privacy protection focuses on simple information leaks caused by individual applications and lacks effective approaches to preventing collusive information leaks caused by IPC between multiple processes. In this paper, we propose a hybrid approach, based on information flow control, to prevent collusive information leaks. Our approach combines static information flow analysis with dynamic runtime checking. Information leaks caused by individual processes are prevented through static information flow control, and dynamic checking is done at runtime to prevent collusive information leaks. Such a combination may effectively reduce the runtime overhead of pure dynamic checking and reduce the false alarms of pure static analysis. We develop this approach on an abstract and simplified programming model and formalize a novel definition of the leak-freedom property as our target security property. A simulation-based proof technique is used to prove that our approach guarantees leak-freedom. All proofs are mechanized in Coq.
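The division of labor between static analysis and runtime checking can be pictured with a small sketch. This is not the paper's programming model or its Coq formalization. The names `Process`, `Message`, `ipc_send`, and `LeakBlocked`, and the idea of a per-process boolean summary produced by static analysis, are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


class LeakBlocked(Exception):
    """Raised when a send would hand secret data to a publishing receiver."""


@dataclass(frozen=True)
class Message:
    payload: str
    secret: bool  # runtime label: True if derived from private data


@dataclass(frozen=True)
class Process:
    name: str
    # Assumed summary from a static analysis: does this process forward
    # received data to a public sink?
    forwards_to_public: bool


def ipc_send(receiver: Process, msg: Message) -> str:
    """Dynamic check at the IPC boundary (hypothetical policy)."""
    if msg.secret and receiver.forwards_to_public:
        raise LeakBlocked(f"{receiver.name} may publish secret data")
    return f"delivered to {receiver.name}"


if __name__ == "__main__":
    vault = Process("vault", forwards_to_public=False)
    print(ipc_send(vault, Message("pin", secret=True)))
```

The only runtime work in this sketch is one label comparison per send, which is the kind of saving over pure dynamic tracking that the abstract alludes to.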
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60673146, 60603049, 60736012, and 60703017; the National High Technology Development 863 Program of China under Grant Nos. 2006AA010201 and 2007AA01Z114; and the National Basic Research Program of China under Grant No. 2005CB321601.