1. LwRustIP: Memory-safe and efficient embedded networking stack with ownership semantics
Authors: Guangyong Shang, Guangpeng Qi, Jianing Ren, Xianqi Jin, Wanjiang Shen, Junchao Li, Runyu Pan. High-Confidence Computing, 2026, No. 1, pp. 99-108 (10 pages)
As modern embedded systems are increasingly network connected, their protocol stacks expose themselves as a surface that is frequently attacked. While C-based implementations such as LwIP are efficient, their lack of memory safety induces critical vulnerabilities such as buffer overflows, dangling pointers, and use-after-free, leading to remote code execution or privilege escalation. In this paper, we present LwRustIP, a memory-safe embedded networking stack reimplemented in Rust and compatible with LwIP. We also share our development experience. LwRustIP replaces unsafe linked-list memory management with a custom allocator that honors the Rust ownership semantics, leverages zero-copy techniques for inter-layer packet handoffs, and applies lock-free object pools for concurrent buffer management. These design choices ensure memory safety while maintaining performance comparable to traditional C-based implementations. We deploy LwRustIP on ARM-based embedded platforms and evaluate its correctness, performance, and memory safety. Experimental results show that LwRustIP achieves memory safety without incurring measurable performance overhead compared to the original C-based implementation. Our experience highlights the practical challenges and benefits of using Rust for low-level system components and offers guidance for future efforts in memory-safe reengineering of legacy C codebases.
Keywords: Embedded network protocol stack; Memory safety; Zero-copy architecture; Rust ownership semantics
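The abstract above mentions zero-copy inter-layer packet handoffs built on Rust ownership semantics. The following is only a minimal sketch of that general idea, not LwRustIP's actual code: the `PacketBuf` type and the `link_layer_rx` and `network_layer_rx` functions are hypothetical names introduced here for illustration, and the sketch assumes an untagged Ethernet frame carrying an IPv4 header without options.

```rust
/// Hypothetical packet buffer that owns its bytes (illustrative only,
/// not the LwRustIP API).
pub struct PacketBuf {
    data: Vec<u8>,
    /// Start of the current layer's view into `data`.
    offset: usize,
}

impl PacketBuf {
    pub fn new(data: Vec<u8>) -> Self {
        PacketBuf { data, offset: 0 }
    }

    /// Consumes the buffer and returns it with `n` header bytes stripped.
    /// Only the offset advances; the payload bytes are never copied.
    /// Returns `None` if fewer than `n` bytes remain.
    pub fn strip_header(mut self, n: usize) -> Option<PacketBuf> {
        if n > self.data.len() - self.offset {
            return None;
        }
        self.offset += n;
        Some(self)
    }

    /// Borrowed view of the bytes visible to the current layer.
    pub fn payload(&self) -> &[u8] {
        &self.data[self.offset..]
    }
}

/// Length of an Ethernet header without a VLAN tag.
const ETH_HEADER_LEN: usize = 14;
/// Length of an IPv4 header without options.
const IPV4_MIN_HEADER_LEN: usize = 20;

/// The link layer takes ownership of the frame and hands the same
/// allocation upward. The caller can no longer touch `frame`, which is
/// what rules out use-after-handoff bugs at compile time.
pub fn link_layer_rx(frame: PacketBuf) -> Option<PacketBuf> {
    frame.strip_header(ETH_HEADER_LEN)
}

/// The network layer does the same for the (assumed option-free) IPv4 header.
pub fn network_layer_rx(packet: PacketBuf) -> Option<PacketBuf> {
    packet.strip_header(IPV4_MIN_HEADER_LEN)
}
```

Because each layer takes the buffer by value, the compiler rejects any attempt by a lower layer to reuse a packet it has already passed up, while the underlying allocation stays in place for the whole receive path.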
2. Minimizing transformer inference overhead using controlling element on Shenwei AI accelerator
Authors: Yulong ZHAO, Chunzhi WU, Yizhuo WANG, Lufei ZHANG, Yaguang ZHANG, Wenyuan SHEN, Hao FAN, Hankang FANG, Yi QIN, Xin LIU. Frontiers of Information Technology & Electronic Engineering, 2025, No. 4, pp. 605-622 (18 pages)
Transformer models have become a cornerstone of various natural language processing (NLP) tasks. However, the substantial computational overhead during the inference remains a significant challenge, limiting their deployment in practical applications. In this study, we address this challenge by minimizing the inference overhead in transformer models using the controlling element on artificial intelligence (AI) accelerators. Our work is anchored by four key contributions. First, we conduct a comprehensive analysis of the overhead composition within the transformer inference process, identifying the primary bottlenecks. Second, we leverage the management processing element (MPE) of the Shenwei AI (SWAI) accelerator, implementing a three-tier scheduling framework that significantly reduces the number of host-device launches to approximately 1/10000 of the original PyTorch-GPU setup. Third, we introduce a zero-copy memory management technique using segment-page fusion, which significantly reduces memory access latency and improves overall inference efficiency. Finally, we develop a fast model loading method that eliminates redundant computations during model verification and initialization, reducing the total loading time for large models from 22128.31 ms to 1041.72 ms. Our contributions significantly enhance the optimization of transformer models, enabling more efficient and expedited inference processes on AI accelerators.
Keywords: Transformer inference optimization; Three-tier scheduling; Zero-copy memory management; Fast model loading