分布式锁是分布式存储系统的重要组件,锁协议的性能对系统整体的性能有关键性影响。远程直接内存访问(remote direct memory access,RDMA)是一种新兴的数据中心网络技术,它支持单边网络通信原语,可以降低系统CPU开销,同时具备低延迟、...分布式锁是分布式存储系统的重要组件,锁协议的性能对系统整体的性能有关键性影响。远程直接内存访问(remote direct memory access,RDMA)是一种新兴的数据中心网络技术,它支持单边网络通信原语,可以降低系统CPU开销,同时具备低延迟、高吞吐的性能特性,为设计高速分布式锁协议提供了新机遇。然而,设计基于RDMA的分布式锁协议面临诸多挑战。着重在保证高性能的前提下解决扩展性和公平性挑战,提出一种RDMA网络中的高性能分布式锁协议FeLock,它利用多种类型的RDMA网络通信原语,使客户端不仅能与服务端通信加解锁,还能与其他客户端直接通信以移交锁所有权,同时实现了高性能、公平性和性能的扩展性。具体地,为保证高性能,FeLock引入了节点粒度锁管理机制,缩减锁协议在关键路径上的网络往返次数。为实现扩展性,FeLock引入了轮转移交机制,将所有节点排成1个环,客户端按照其在环中的顺序依次移交锁的所有权。为实现公平性和避免客户端饥饿,FeLock引入了节点信用机制,限制节点连续加锁的次数,避免其他节点上的客户端无法加锁。实验显示,FeLock相比于现有单边RDMA锁协议(如DSLR)表现出相似或更高的性能,并且具有更好的公平性和扩展性。在3~120个客户端的环境下,FeLock的吞吐量是DSLR的1.01~7.51倍,公平性提升至多2.24倍。展开更多
The power Internet of Things(IoT)is a significant trend in technology and a requirement for national strategic development.With the deepening digital transformation of the power grid,China’s power system has initiall...The power Internet of Things(IoT)is a significant trend in technology and a requirement for national strategic development.With the deepening digital transformation of the power grid,China’s power system has initially built a power IoT architecture comprising a perception,network,and platform application layer.However,owing to the structural complexity of the power system,the construction of the power IoT continues to face problems such as complex access management of massive heterogeneous equipment,diverse IoT protocol access methods,high concurrency of network communications,and weak data security protection.To address these issues,this study optimizes the existing architecture of the power IoT and designs an integrated management framework for the access of multi-source heterogeneous data in the power IoT,comprising cloud,pipe,edge,and terminal parts.It further reviews and analyzes the key technologies involved in the power IoT,such as the unified management of the physical model,high concurrent access,multi-protocol access,multi-source heterogeneous data storage management,and data security control,to provide a more flexible,efficient,secure,and easy-to-use solution for multi-source heterogeneous data access in the power IoT.展开更多
Data access delay has become the prominent performance bottleneck of high-end computing systems. The key to reducing data access delay in system design is to diminish data stall time. Memory locality and concurrency a...Data access delay has become the prominent performance bottleneck of high-end computing systems. The key to reducing data access delay in system design is to diminish data stall time. Memory locality and concurrency are the two essential factors influencing the performance of modern memory systems. However, existing studies in reducing data stall time rarely focus on utilizing data access concurrency because the impact of memory concurrency on overall memory system performance is not well understood. In this study, a pair of novel data stall time models, the L-C model for the combined effort of locality and concurrency and the P-M model for the effect of pure miss on data stall time, are presented. The models provide a new understanding of data access delay and provide new directions for performance optimization. Based on these new models, a summary table of advanced cache optimizations is presented. It has 38 entries contributed by data concurrency while only has 21 entries contributed by data locality, which shows the value of data concurrency. The L-C and P-M models and their associated results and opportunities introduced in this study are important and necessary for future data-centric architecture and algorithm design of modern computing systems.展开更多
近似串匹配技术在网络信息搜索、数字图书馆、模式识别、文本挖掘、IP路由查找、网络入侵检测、生物信息学、音乐研究计算等领域具有广泛的应用.基于CREW-PRAM(parallel random access machine with concurrent read and exclusive wri...近似串匹配技术在网络信息搜索、数字图书馆、模式识别、文本挖掘、IP路由查找、网络入侵检测、生物信息学、音乐研究计算等领域具有广泛的应用.基于CREW-PRAM(parallel random access machine with concurrent read and exclusive write)模型,采用波前式并行推进的方法直接计算编辑距离矩阵D,设计了一个允许k-差别的近似串匹配动态规划并行算法,该算法使用(m+1)个处理器,时间复杂度为O(n),算法理论上达到线性加速;采取水平和斜向双并行计算编辑距离矩阵D的方法,设计了一个使用a(m+1)个处理器和O(n/a+m)时间的、可伸缩的、允许k-差别的近似串匹配动态规划并行算法,+<11mna.基于分治策略,通过灵活拆分总线和合并子总线动态重构光总线系统,并充分利用光总线的消息播送技术和并行计算前缀和的方法,实现了汉明距离的并行计算,设计了两个基于LARPBS(linear arrays with reconfigurable pipelined bus system)模型的通信高效、可扩放的允许k-误配的近似串匹配并行算法,其中一个算法使用n个处理器,时间为O(m);另一个为常数时间算法,使用mn个处理器.展开更多
文摘分布式锁是分布式存储系统的重要组件,锁协议的性能对系统整体的性能有关键性影响。远程直接内存访问(remote direct memory access,RDMA)是一种新兴的数据中心网络技术,它支持单边网络通信原语,可以降低系统CPU开销,同时具备低延迟、高吞吐的性能特性,为设计高速分布式锁协议提供了新机遇。然而,设计基于RDMA的分布式锁协议面临诸多挑战。着重在保证高性能的前提下解决扩展性和公平性挑战,提出一种RDMA网络中的高性能分布式锁协议FeLock,它利用多种类型的RDMA网络通信原语,使客户端不仅能与服务端通信加解锁,还能与其他客户端直接通信以移交锁所有权,同时实现了高性能、公平性和性能的扩展性。具体地,为保证高性能,FeLock引入了节点粒度锁管理机制,缩减锁协议在关键路径上的网络往返次数。为实现扩展性,FeLock引入了轮转移交机制,将所有节点排成1个环,客户端按照其在环中的顺序依次移交锁的所有权。为实现公平性和避免客户端饥饿,FeLock引入了节点信用机制,限制节点连续加锁的次数,避免其他节点上的客户端无法加锁。实验显示,FeLock相比于现有单边RDMA锁协议(如DSLR)表现出相似或更高的性能,并且具有更好的公平性和扩展性。在3~120个客户端的环境下,FeLock的吞吐量是DSLR的1.01~7.51倍,公平性提升至多2.24倍。
基金supported by the National Key Research and Development Program of China(grant number 2019YFE0123600)。
文摘The power Internet of Things(IoT)is a significant trend in technology and a requirement for national strategic development.With the deepening digital transformation of the power grid,China’s power system has initially built a power IoT architecture comprising a perception,network,and platform application layer.However,owing to the structural complexity of the power system,the construction of the power IoT continues to face problems such as complex access management of massive heterogeneous equipment,diverse IoT protocol access methods,high concurrency of network communications,and weak data security protection.To address these issues,this study optimizes the existing architecture of the power IoT and designs an integrated management framework for the access of multi-source heterogeneous data in the power IoT,comprising cloud,pipe,edge,and terminal parts.It further reviews and analyzes the key technologies involved in the power IoT,such as the unified management of the physical model,high concurrent access,multi-protocol access,multi-source heterogeneous data storage management,and data security control,to provide a more flexible,efficient,secure,and easy-to-use solution for multi-source heterogeneous data access in the power IoT.
基金The work was supported in part by the National Science Foundation of USA under Grant Nos. CNS-1162540, CCF-0937877, and CNS-0751200. We would like to thank the Scalable Computing Software (SCS) group in the Illi- nois Institute of Technology and anonymous reviewers for their valuable and professional comments on earlier drafts of this work.
文摘Data access delay has become the prominent performance bottleneck of high-end computing systems. The key to reducing data access delay in system design is to diminish data stall time. Memory locality and concurrency are the two essential factors influencing the performance of modern memory systems. However, existing studies in reducing data stall time rarely focus on utilizing data access concurrency because the impact of memory concurrency on overall memory system performance is not well understood. In this study, a pair of novel data stall time models, the L-C model for the combined effort of locality and concurrency and the P-M model for the effect of pure miss on data stall time, are presented. The models provide a new understanding of data access delay and provide new directions for performance optimization. Based on these new models, a summary table of advanced cache optimizations is presented. It has 38 entries contributed by data concurrency while only has 21 entries contributed by data locality, which shows the value of data concurrency. The L-C and P-M models and their associated results and opportunities introduced in this study are important and necessary for future data-centric architecture and algorithm design of modern computing systems.
文摘近似串匹配技术在网络信息搜索、数字图书馆、模式识别、文本挖掘、IP路由查找、网络入侵检测、生物信息学、音乐研究计算等领域具有广泛的应用.基于CREW-PRAM(parallel random access machine with concurrent read and exclusive write)模型,采用波前式并行推进的方法直接计算编辑距离矩阵D,设计了一个允许k-差别的近似串匹配动态规划并行算法,该算法使用(m+1)个处理器,时间复杂度为O(n),算法理论上达到线性加速;采取水平和斜向双并行计算编辑距离矩阵D的方法,设计了一个使用a(m+1)个处理器和O(n/a+m)时间的、可伸缩的、允许k-差别的近似串匹配动态规划并行算法,+<11mna.基于分治策略,通过灵活拆分总线和合并子总线动态重构光总线系统,并充分利用光总线的消息播送技术和并行计算前缀和的方法,实现了汉明距离的并行计算,设计了两个基于LARPBS(linear arrays with reconfigurable pipelined bus system)模型的通信高效、可扩放的允许k-误配的近似串匹配并行算法,其中一个算法使用n个处理器,时间为O(m);另一个为常数时间算法,使用mn个处理器.