
RESEARCH PRODUCT

Using On-Demand File Systems in HPC Environments

Marc-André Vef, André Brinkmann, Wolfgang E. Nagel, Marco Berghoff, Achim Streit, Sebastian Oeste, Mehmet Soysal, Thorsten Zirwes

subject

Metadata, File system, Computer science, On demand, Distributed computing, Data processing & computer science, Lustre (file system), ddc:004, computer.software_genre, computer, Global file system, Bottleneck, BeeGFS

description

In modern HPC systems, parallel (distributed) file systems provide fast access to the storage infrastructure. However, I/O performance in large-scale HPC systems has failed to keep pace with the growth in computational power. As a result, the I/O subsystem, which must also cope with a large number of demanding metadata operations, is often the bottleneck of the entire HPC system. In some cases, a single badly behaving application can slow down the entire HPC system and disrupt other applications that use the same I/O subsystem. Such situations are likely to become more frequent as HPC systems grow larger and more powerful. In this work, we present a simple solution for applications with very high I/O demands: creating a private parallel file system on demand for an HPC job, using the node-local storage devices, e.g. solid-state drives (SSDs). We show that this feature is easy to add to an existing HPC environment and requires only minimal changes to the system configuration. We conclude that the impact on running applications is manageable and that, for applications that generate a high load, the advantages outweigh the disadvantages. We show that in some cases applications may run slower, but in these cases the reduced load on the global file system outweighs the slowdown.
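
The idea of a job-private file system built from node-local storage can be illustrated with a minimal planning sketch. It is not the authors' implementation: the function name, the default paths, and the role layout (first node as metadata server, every node acting as both storage server and client) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OnDemandFsPlan:
    """Role assignment for a private, job-scoped parallel file system."""
    metadata_nodes: List[str]
    storage_nodes: List[str]
    client_nodes: List[str]
    local_storage_path: str  # node-local device (e.g. an SSD) backing the file system
    mount_point: str         # where the job sees the private file system


def plan_on_demand_fs(job_nodes: List[str],
                      local_storage_path: str = "/mnt/local-ssd",  # hypothetical path
                      mount_point: str = "/mnt/ondemand",          # hypothetical path
                      num_metadata_nodes: int = 1) -> OnDemandFsPlan:
    """Derive a file system layout from the nodes allocated to one job.

    Assumed layout (illustrative only): the first node(s) host the metadata
    service, and every node contributes its local storage and mounts the
    resulting file system as a client.
    """
    if not job_nodes:
        raise ValueError("job has no nodes")
    # Schedulers may list a node once per task; keep each node once, in order.
    nodes = list(dict.fromkeys(job_nodes))
    n_meta = max(1, min(num_metadata_nodes, len(nodes)))
    return OnDemandFsPlan(
        metadata_nodes=nodes[:n_meta],
        storage_nodes=nodes,
        client_nodes=nodes,
        local_storage_path=local_storage_path,
        mount_point=mount_point,
    )


if __name__ == "__main__":
    plan = plan_on_demand_fs(["node01", "node02", "node02", "node03"])
    print(plan)
```

In such a setup the plan would be computed in the job prologue, the file system started on the listed nodes, and everything torn down again in the epilogue, so that the job's heavy I/O never reaches the global file system.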

https://dx.doi.org/10.5445/ir/1000097459