0000000000225786

AUTHOR

Ramon Nou

0000-0003-0464-1473

showing 3 related works from this author

GekkoFS - A Temporary Distributed File System for HPC Applications

2018

We present GekkoFS, a temporary, highly-scalable burst buffer file system which has been specifically optimized for new access patterns of data-intensive High-Performance Computing (HPC) applications. The file system provides relaxed POSIX semantics, only offering features which are actually required by most (not all) applications. It is able to provide scalable I/O performance and reaches millions of metadata operations already for a small number of nodes, significantly outperforming the capabilities of general-purpose parallel file systems. The work has been funded by the German Research Foundation (DFG) through the ADA-FS project as part of the Priority Programme 1648. It is also support…

File system020203 distributed computingBurst buffersParallel processing (Electronic computers)Computer scienceProcessament en paral·lel (Ordinadors)020207 software engineering02 engineering and technologyBuffer storage (Computer science)computer.software_genreData structureDistributed file systemsMetadataParallel processing (DSP implementation)POSIXServerScalabilityHPC0202 electrical engineering electronic engineering information engineeringOperating systemHigh performance computingDistributed File System:Informàtica::Arquitectura de computadors::Arquitectures paral·leles [Àrees temàtiques de la UPC]computerCàlcul intensiu (Informàtica)2018 IEEE International Conference on Cluster Computing (CLUSTER)
researchProduct

Streamlining distributed Deep Learning I/O with ad hoc file systems

2021

With evolving techniques to parallelize Deep Learning (DL) and the growing amount of training data and model complexity, High-Performance Computing (HPC) has become increasingly important for machine learning engineers. Although many compute clusters already use learning accelerators or GPUs, HPC storage systems are not suitable for the I/O requirements of DL workflows. Therefore, users typically copy the whole training data to the worker nodes or distribute partitions. Because DL depends on randomized input data, prior work stated that partitioning impacts DL accuracy. Their solutions focused mainly on training I/O performance on a high-speed network but did not cover the data stage-in pro…

Data setWorkflowDistributed databaseProcess (engineering)Computer sciencebusiness.industryDeep learningDistributed computingComputer data storageData deduplicationArtificial intelligenceGlobal Namespacebusiness2021 IEEE International Conference on Cluster Computing (CLUSTER)
researchProduct

GekkoFS — A Temporary Burst Buffer File System for HPC Applications

2020

Many scientific fields increasingly use high-performance computing (HPC) to process and analyze massive amounts of experimental data while storage systems in today’s HPC environments have to cope with new access patterns. These patterns include many metadata operations, small I/O requests, or randomized file I/O, while general-purpose parallel file systems have been optimized for sequential shared access to large files. Burst buffer file systems create a separate file system that applications can use to store temporary data. They aggregate node-local storage available within the compute nodes or use dedicated SSD clusters and offer a peak bandwidth higher than that of the backend parallel f…

Information storage and retrieval systemsPOSIXFile systemBurst buffersComputer scienceProcess (computing)computer.software_genreDistributed file systemsComputer Science ApplicationsTheoretical Computer ScienceMetadataInformació -- Sistemes d'emmagatzematge i recuperacióComputational Theory and MathematicsHardware and ArchitecturePOSIXHPC:Informàtica::Sistemes d'informació::Emmagatzematge i recuperació de la informació [Àrees temàtiques de la UPC]ScalabilityOperating systemBandwidth (computing)High performance computingIsolation (database systems)Càlcul intensiu (Informàtica)computerSoftwareJournal of Computer Science and Technology
researchProduct