Funding: Supported by the National Key Research and Development Program of China (2018YFB1004401), the National Natural Science Foundation of China for Young Scientists (Grant No. 61502392), and the General Program of the National Natural Science Foundation of China (61472323).
Abstract: The emergence of non-volatile memory (NVM) has introduced new opportunities for performance optimization in existing storage systems. To better exploit its byte-addressability and near-DRAM performance, NVM can be attached to the memory bus and accessed via load/store instructions rather than through the conventional block interface. In this scenario, a cache line (usually 64 bytes) becomes the data transfer unit between volatile and non-volatile devices. However, the failure-atomicity unit of a write on NVM is the memory bit width (usually 8 bytes). This mismatch between the data transfer unit and the atomicity unit may introduce write amplification and compromise the data consistency of node-based data structures such as B+-trees. In this paper, we propose WOBTree, a Write-Optimized B+-Tree for NVM, to address the mismatch problem without expensive logging. WOBTree reduces the update granularity from a tree node to a much smaller subnode and carefully orders the write operations within it to ensure crash consistency and reduce write amplification. Experimental results show that, compared with previous persistent B+-tree solutions, WOBTree reduces write amplification by up to 86× and improves write performance by up to 61× while maintaining similar search performance.
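The effect of update granularity on write amplification can be illustrated with a rough model. The sketch below is not WOBTree's layout or algorithm; it assumes a plain sorted array of 16-byte entries (8-byte key plus 8-byte value) with no node header, where an insert shifts the following entries right and every modified 64-byte cache line must be flushed. The names `flushed_bytes` and `write_amplification` are hypothetical.

```python
CACHE_LINE = 64   # bytes; the transfer unit named in the abstract
ENTRY_SIZE = 16   # bytes; assumed 8-byte key + 8-byte value


def flushed_bytes(num_entries, insert_pos,
                  entry_size=ENTRY_SIZE, line=CACHE_LINE):
    """Bytes flushed when inserting at insert_pos into a sorted
    array that currently holds num_entries entries."""
    if not 0 <= insert_pos <= num_entries:
        raise ValueError("insert_pos out of range")
    # Everything from the insert slot to the new end of the array changes.
    first_byte = insert_pos * entry_size
    last_byte = (num_entries + 1) * entry_size - 1
    first_line = first_byte // line
    last_line = last_byte // line
    return (last_line - first_line + 1) * line


def write_amplification(num_entries, insert_pos,
                        entry_size=ENTRY_SIZE, line=CACHE_LINE):
    """Flushed bytes divided by the bytes logically inserted."""
    return flushed_bytes(num_entries, insert_pos, entry_size, line) / entry_size


# A 1 KB node (63 entries present) versus a one-cache-line subnode
# (3 entries present), inserting at the front in both cases.
node_wa = write_amplification(63, 0)
subnode_wa = write_amplification(3, 0)
```

Under these assumptions the worst-case insert into the large node dirties all 16 of its cache lines, while the same insert into a one-line subnode dirties a single line, which is the intuition behind shrinking the update unit.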
Funding: This work is supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000202 and the National Natural Science Foundation of China under Grant Nos. 61303056 and 61379042.
Abstract: Key-value (KV) stores have become a backbone of large-scale applications in today's data centers. Write-optimized data structures such as the Log-Structured Merge-tree (LSM-tree) and its variants are widely used in KV storage systems such as BigTable and RocksDB. A conventional LSM-tree organizes KV items into multiple, successively larger components and uses compaction to push KV items from a smaller component to the adjacent larger component until the KV items reach the largest component. Unfortunately, the current compaction scheme incurs significant write amplification due to repeated KV item reads and writes, which results in poor throughput. We propose a new compaction scheme, delayed compaction (dCompaction), that decreases write amplification. dCompaction postpones some compactions and gathers them into the following compaction. In this way, it avoids KV item reads and writes during compaction and consequently improves the throughput of LSM-tree based KV stores. We implement dCompaction on RocksDB and conduct extensive experiments. Validation using the YCSB framework shows that, compared with RocksDB, dCompaction improves write performance by about 40% while delivering comparable read performance.
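Why gathering compactions reduces write amplification can be seen in a toy two-component model. This is only an illustrative sketch, not the dCompaction algorithm or RocksDB's compaction logic: it assumes every flushed run has unit size, all keys are distinct, and each merge rewrites the entire larger component. The function name `total_written` and the `batch` parameter are hypothetical.

```python
def total_written(num_flushes, batch):
    """Units written to storage when every `batch` flushed runs are
    merged together into a single larger component.

    batch=1 models merging after every flush; a larger batch models
    postponing merges and performing them together.
    """
    if batch < 1:
        raise ValueError("batch must be at least 1")
    large = 0     # current size of the larger component
    pending = 0   # flushed runs waiting to be merged
    written = 0
    for _ in range(num_flushes):
        written += 1          # writing the flushed run itself
        pending += 1
        if pending == batch:
            large += pending
            written += large  # the merge rewrites the whole larger component
            pending = 0
    return written


eager = total_written(8, batch=1)    # merge after every flush
delayed = total_written(8, batch=4)  # merge once per four flushes
eager_wa = eager / 8
delayed_wa = delayed / 8
```

In this model the eager policy rewrites the growing component eight times, whereas the delayed policy rewrites it twice, so the same eight units of user data cost far fewer storage writes.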