Funding: Supported by the Spanish Ministry of Science and Innovation under Grant TIN2015-65316, the Generalitat de Catalunya under Contract 2014-SGR-1051, the Serrapilheira Institute (Grant number Serra-1709-16621), as well as the European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement no. 671951 (NEXTGenIO) for the extensions added after the MASCOTS paper.
Abstract: Recent proposals of emerging data storage devices make it necessary to reevaluate all levels of the storage hierarchy in order to optimize the performance of the software stack. However, these new devices are not always widely available, so early experiments may be impossible. Emulators aim to mimic the behavior of a component as closely as possible. Nonetheless, emulating new, fast storage devices is challenging because of time perception. In this work, we propose an approach to emulate storage devices using virtual machines (VMs), allowing a new device to be evaluated within a real system. We use a technique called freezing time, which pauses a VM to manipulate its clock and hide the real I/O completion time. Our approach is implemented at the hypervisor level and is transparent to the guest operating system and applications. We evaluate the technique on a real system, using regular magnetic disks to emulate faster storage devices. Our method showed a latency error of 6.5% compared to a real device. Moreover, decoupled experiments at two laboratories, the Barcelona Supercomputing Center (BSC) in Spain and the Center of Computer Science and Free Software (C3SL) in Brazil, demonstrated that our approach is reproducible and promising for the virtual evaluation of next-gen storage devices.
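The freezing-time idea in the abstract above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the authors' hypervisor implementation: the class `FrozenTimeEmulator`, its fields, and all latency values are hypothetical and chosen for illustration. It tracks a host clock and a guest clock separately, and during an I/O request only the latency of the emulated device is added to the guest clock.

```python
from dataclasses import dataclass


@dataclass
class FrozenTimeEmulator:
    """Toy model of freezing time (hypothetical names, times in microseconds)."""
    target_latency_us: float  # latency of the device being emulated (assumed value)
    host_us: float = 0.0      # wall-clock time on the host
    guest_us: float = 0.0     # time as perceived inside the VM

    def run_guest(self, duration_us: float) -> None:
        """Guest computes normally: both clocks advance together."""
        self.host_us += duration_us
        self.guest_us += duration_us

    def do_io(self, real_latency_us: float) -> float:
        """Serve one I/O on the slow backing device while the guest is paused.

        Returns the I/O latency that the guest observes.
        """
        start = self.guest_us
        # Guest is frozen: only the host clock sees the real device latency.
        self.host_us += real_latency_us
        # On resume, the guest clock moves forward by the emulated latency only.
        self.guest_us += self.target_latency_us
        return self.guest_us - start


# Illustrative numbers only: an 8 ms disk access presented as a 100 us access.
emu = FrozenTimeEmulator(target_latency_us=100.0)
emu.run_guest(50.0)
observed = emu.do_io(real_latency_us=8000.0)
print(observed)                   # 100.0
print(emu.host_us, emu.guest_us)  # 8050.0 150.0
```

In this model the guest never perceives the 8000 us spent on the real disk. A real implementation would have to pause the VM's virtual CPUs and adjust its virtual clock sources at the hypervisor level, which the sketch does not attempt.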
Funding: This work has been funded by the German Research Foundation (DFG) through the Priority Programme 1648 "Software for Exascale Computing" and the ADA-FS project. It was also partially supported by the Spanish Ministry of Science and Innovation under Grant No. TIN2015-65316, the Generalitat de Catalunya under Contract 2014-SGR-1051, as well as the European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement No. 671951 (NEXTGenIO).
Abstract: Many scientific fields increasingly use high-performance computing (HPC) to process and analyze massive amounts of experimental data, while storage systems in today's HPC environments have to cope with new access patterns. These patterns include many metadata operations, small I/O requests, or randomized file I/O, whereas general-purpose parallel file systems have been optimized for sequential shared access to large files. Burst buffer file systems create a separate file system that applications can use to store temporary data. They aggregate node-local storage available within the compute nodes or use dedicated SSD clusters, and they offer a peak bandwidth higher than that of the backend parallel file system without interfering with it. However, burst buffer file systems typically offer many features that a scientific application, running in isolation for a limited amount of time, does not require. We present GekkoFS, a temporary, highly scalable file system that has been specifically optimized for the aforementioned use cases. GekkoFS provides relaxed POSIX semantics, offering only the features that most (not all) applications actually require. GekkoFS is therefore able to provide scalable I/O performance and reaches millions of metadata operations even with a small number of nodes, significantly outperforming common parallel file systems.