
AUTHOR

Tim Süß

showing 11 related works from this author

Improving LSM-trie performance by parallel search

2020

Keywords: trie; parallel computing; parallel search. Venue: Software: Practice and Experience

Accelerating Application Migration in HPC

2016

It is predicted that the number of cores per node will rapidly increase with the upcoming era of exascale supercomputers. As a result, multiple applications will have to share one node and compete for the (often scarce) resources available on it. Furthermore, the growing number of hardware components causes a decrease in the mean time between failures. Application migration between nodes has been proposed as a tool to mitigate these two problems: bottlenecks due to resource sharing can be addressed by load-balancing schemes that migrate applications, and hardware errors can often be tolerated by the system if faulty nodes are detected and processes are migrated ahead of time.

Keywords: mean time between failures; load balancing (computing); computer security; shared resource; virtual machine; computer network

Pure Functions in C: A Small Keyword for Automatic Parallelization

2017

The need for parallel task execution has been steadily growing in recent years since manufacturers mainly improve processor performance by increasing the number of installed cores instead of scaling the processor’s frequency. To make use of this potential, an essential technique to increase the parallelism of a program is to parallelize loops. Several automatic loop nest parallelizers have been developed in the past such as PluTo. The main restriction of these tools is that the loops must be statically analyzable which, among other things, disallows function calls within the loops. In this article, we present a seemingly simple extension to the C programming language which marks fun…
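
The extension itself is not shown in the abstract; as a rough illustration of the underlying idea, the sketch below uses GCC/Clang's existing `pure` function attribute (an assumption standing in for the paper's proposed keyword) together with an OpenMP pragma: once a call is known to be side-effect free, the loop iterations become independent and may be run in parallel.

```c
#include <stdio.h>
#include <math.h>

/* Hypothetical stand-in for the paper's keyword: the GCC/Clang `pure`
 * attribute is the programmer's promise that the function has no observable
 * side effects, so calls inside a loop cannot interfere with each other. */
__attribute__((pure))
static double score(double x)
{
    return sqrt(x) * 0.5 + x * x;   /* result depends only on the argument */
}

int main(void)
{
    enum { N = 1000000 };
    static double out[N];

    /* Because score() is side-effect free, the iterations are independent
     * and the loop can safely be parallelized (shown here with OpenMP). */
    #pragma omp parallel for
    for (long i = 0; i < N; ++i)
        out[i] = score((double)i);

    printf("out[42] = %f\n", out[42]);
    return 0;
}
```

Compile with `-fopenmp -lm`; without OpenMP the pragma is simply ignored and the loop runs sequentially.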

Keywords: automatic parallelization; parallel computing; loops; side effect (computer science); polytope model; compiler; toolchain; theory of computation. Venue: 2017 IEEE International Conference on Cluster Computing (CLUSTER)

A configurable rule based classful token bucket filter network request scheduler for the lustre file system

2017

HPC file systems today work in a best-effort manner where individual applications can flood the file system with requests, effectively leading to a denial of service for all other tasks. This paper presents a classful Token Bucket Filter (TBF) policy for the Lustre file system. The TBF enforces Remote Procedure Call (RPC) rate limitations based on (potentially complex) Quality of Service (QoS) rules. The QoS rules are enforced in Lustre's Object Storage Servers, where each request is assigned to an automatically created QoS class. The proposed QoS implementation for Lustre enables various features for each class including the support for high-priority and real-time requests even under heavy …
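
As a rough sketch of the classful token-bucket principle described here (not Lustre's actual NRS TBF code), each QoS class can be modeled as a bucket refilled at its configured RPC rate; a request is dispatched only if a token is available. Class names and rates below are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Minimal sketch of a classful token bucket, assuming one bucket per QoS class. */
struct tbf_class {
    const char *name;      /* e.g. a rule matching a job ID or client NID */
    double      rate;      /* allowed requests per second */
    double      burst;     /* bucket depth: max tokens that can accumulate */
    double      tokens;    /* currently available tokens */
    uint64_t    last_ns;   /* time of the last refill, in nanoseconds */
};

/* Refill tokens for the elapsed time, then try to consume one token for an
 * incoming request. Returns true if the request may be dispatched now. */
static bool tbf_admit(struct tbf_class *c, uint64_t now_ns)
{
    double elapsed = (now_ns - c->last_ns) / 1e9;
    c->last_ns = now_ns;
    c->tokens += elapsed * c->rate;
    if (c->tokens > c->burst)
        c->tokens = c->burst;
    if (c->tokens >= 1.0) {
        c->tokens -= 1.0;
        return true;            /* dispatch the RPC */
    }
    return false;               /* keep the RPC queued in this class */
}

int main(void)
{
    struct tbf_class batch = { "batch_jobs", 100.0, 20.0, 20.0, 0 };
    uint64_t t = 0;
    int admitted = 0;

    /* 1000 requests arriving 1 ms apart: the class is limited to ~100 RPC/s. */
    for (int i = 0; i < 1000; ++i, t += 1000000ull)
        admitted += tbf_admit(&batch, t);
    printf("admitted %d of 1000 requests\n", admitted);
    return 0;
}
```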

Keywords: file system; quality of service; denial-of-service attack; rule-based system; object storage; remote procedure call; Lustre (file system). Venue: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

Asynchronous Occlusion Culling on Heterogeneous PC Clusters for Distributed 3D Scenes

2012

We present a parallel rendering system for heterogeneous PC clusters to visualize massive models. A single, powerful visualization node is supported by a group of backend nodes with weak graphics performance. While the visualization node renders the visible objects, the backend nodes asynchronously perform visibility tests and supply the front end with visible scene objects. The visualization node stores only currently visible objects in its memory, while the scene is distributed among the backend nodes’ memories without redundancy. To efficiently compute the occlusion tests even though each backend node stores only a fraction of the original geometry, we complete the scene by adding h…
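
A much simplified, purely illustrative form of such a visibility test (not the system's actual implementation) is a conservative software check of an object's screen-space bounding rectangle against a depth buffer:

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sketch only: a conservative occlusion test as a backend node
 * might run it in software. An object is treated as potentially visible if
 * any pixel of its screen-space bounding rectangle lies in front of the
 * depth already stored for that pixel. */
bool box_potentially_visible(const float *depth, int width,
                             int x0, int y0, int x1, int y1,
                             float nearest_box_depth)
{
    for (int y = y0; y <= y1; ++y)
        for (int x = x0; x <= x1; ++x)
            if (nearest_box_depth < depth[(size_t)y * width + x])
                return true;   /* not fully hidden: report object to the front end */
    return false;              /* occluded: the front end need not load or render it */
}
```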

Keywords: parallel rendering; visibility; visualization; front and back ends; asynchronous communication; redundancy; computer graphics

Deriving and comparing deduplication techniques using a model-based classification

2015

Data deduplication has been a hot research topic and a large number of systems have been developed. These systems are usually seen as an inherently linked set of characteristics. However, a detailed analysis reveals independent concepts that can be used in other systems. In this work, we perform this analysis on the main representatives of deduplication systems. We embed the results in a model, which exposes two as yet unexplored combinations of characteristics. In addition, the model enables a comprehensive evaluation of the representatives and the two new systems. We perform this evaluation based on real-world data sets.
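
As a hedged illustration of the chunk-and-fingerprint core shared by the surveyed systems: split data into chunks, fingerprint each chunk, and store a chunk only if its fingerprint is new. Fixed-size chunks and a toy FNV-1a fingerprint stand in here for the content-defined chunking and cryptographic hashes real systems typically use.

```c
#include <stdint.h>
#include <stdio.h>

#define CHUNK_SIZE 8
#define MAX_CHUNKS 1024

/* Toy FNV-1a fingerprint; real systems use cryptographic hashes. */
static uint64_t fingerprint(const unsigned char *p, size_t n)
{
    uint64_t h = 1469598103934665603ull;
    for (size_t i = 0; i < n; ++i) { h ^= p[i]; h *= 1099511628211ull; }
    return h;
}

int main(void)
{
    const unsigned char data[] = "ABCDEFGHABCDEFGHabcdefghABCDEFGH";
    uint64_t index[MAX_CHUNKS];                 /* fingerprints already stored */
    size_t stored = 0, total = 0;

    for (size_t off = 0; off + CHUNK_SIZE <= sizeof(data) - 1; off += CHUNK_SIZE) {
        uint64_t fp = fingerprint(data + off, CHUNK_SIZE);
        int duplicate = 0;
        for (size_t i = 0; i < stored; ++i)
            if (index[i] == fp) { duplicate = 1; break; }
        if (!duplicate)
            index[stored++] = fp;               /* new chunk: write it exactly once */
        ++total;
    }
    printf("%zu of %zu chunks were unique\n", stored, total);
    return 0;
}
```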

Keywords: data deduplication; data mining; real-world data. Venue: Proceedings of the Tenth European Conference on Computer Systems

Hyperion

2019

Indexes are essential in data management systems to increase the speed of data retrievals. Widespread data structures to provide fast and memory-efficient indexes are prefix tries. Implementations like Judy, ART, or HOT optimize their internal alignments for cache and vector unit efficiency. While these measures usually improve the performance substantially, they can have a negative impact on memory efficiency. In this paper we present Hyperion, a trie-based main-memory key-value store achieving extreme space efficiency. In contrast to other data structures, Hyperion does not depend on CPU vector units, but scans the data structure linearly. Combined with a custom memory allocator, Hyperion…
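
For illustration only, the sketch below shows a plain byte-wise prefix trie with linear traversal and no vector units; Hyperion's compact container layout and custom allocator are far more elaborate and in particular avoid the wasteful 256-pointer nodes used here.

```c
#include <stdlib.h>
#include <stdio.h>

/* Illustrative byte-wise prefix trie: insert and lookup by walking one byte
 * of the key per node, with no SIMD and no compaction. */
struct trie_node {
    struct trie_node *child[256];
    const char       *value;        /* non-NULL marks the end of a stored key */
};

static struct trie_node *node_new(void)
{
    return calloc(1, sizeof(struct trie_node));
}

static void trie_put(struct trie_node *root, const char *key, const char *value)
{
    for (const unsigned char *p = (const unsigned char *)key; *p; ++p) {
        if (!root->child[*p])
            root->child[*p] = node_new();
        root = root->child[*p];
    }
    root->value = value;
}

static const char *trie_get(struct trie_node *root, const char *key)
{
    for (const unsigned char *p = (const unsigned char *)key; root && *p; ++p)
        root = root->child[*p];
    return root ? root->value : NULL;
}

int main(void)
{
    struct trie_node *root = node_new();
    trie_put(root, "hyperion", "trie-based key-value store");
    trie_put(root, "hyper", "prefix of hyperion");
    printf("%s\n", trie_get(root, "hyperion"));
    printf("%s\n", trie_get(root, "hyper"));
    return 0;
}
```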

Keywords: trie; search tree; data structure; range query (data structures); memory management; memory footprint; cache. Venue: Proceedings of the 2019 International Conference on Management of Data

LPCC

2019

Most high-performance computing (HPC) clusters use a global parallel file system to enable high data throughput. The parallel file system is typically centralized and its storage media are physically separated from the compute cluster. Compute nodes as clients of the parallel file system are often additionally equipped with SSDs. The node-internal storage media are rarely well-integrated into the I/O and compute workflows. How to make full and flexible use of these storage media is therefore a valuable research question. In this paper, we propose a hierarchical Persistent Client Caching (LPCC) mechanism for the Lustre file system. LPCC provides two modes: RW-PCC builds a read-write cache on…
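
A hedged sketch of the client-side caching idea, with hypothetical paths and none of LPCC's actual consistency machinery: try the node-local SSD copy first and fall back to the Lustre mount on a miss.

```c
#include <stdio.h>
#include <unistd.h>

/* Paths and policy are assumptions for this example; the real LPCC lives
 * inside Lustre and keeps the cached copy consistent with the file system. */
#define LOCAL_CACHE_DIR "/mnt/nvme/pcc"
#define LUSTRE_MOUNT    "/mnt/lustre"

static FILE *cached_open(const char *relative_path, const char *mode)
{
    char local[4096], global[4096];

    snprintf(local, sizeof(local), "%s/%s", LOCAL_CACHE_DIR, relative_path);
    snprintf(global, sizeof(global), "%s/%s", LUSTRE_MOUNT, relative_path);

    if (access(local, R_OK) == 0)          /* cache hit: serve from the local SSD */
        return fopen(local, mode);
    return fopen(global, mode);            /* cache miss: go to the parallel FS */
}

int main(void)
{
    FILE *f = cached_open("results/run-001.dat", "r");
    if (f) {
        puts("opened (from the cache or from Lustre)");
        fclose(f);
    }
    return 0;
}
```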

Keywords: file system; computer cluster; hierarchical storage management; Lustre (file system); cache. Venue: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

GekkoFS — A Temporary Burst Buffer File System for HPC Applications

2020

Many scientific fields increasingly use high-performance computing (HPC) to process and analyze massive amounts of experimental data, and the storage systems in today’s HPC environments have to cope with new access patterns. These patterns include many metadata operations, small I/O requests, or randomized file I/O, while general-purpose parallel file systems have been optimized for sequential shared access to large files. Burst buffer file systems create a separate file system that applications can use to store temporary data. They aggregate node-local storage available within the compute nodes or use dedicated SSD clusters and offer a peak bandwidth higher than that of the backend parallel f…
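
One way such a file system can spread work across the participating nodes is to hash file paths. The sketch below is an assumption-laden illustration (djb2 hash, modulo placement), not GekkoFS's actual distribution function.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative path hashing to decide which node is responsible for a file. */
static uint64_t hash_path(const char *path)
{
    uint64_t h = 5381;                               /* djb2 over the path bytes */
    for (; *path; ++path)
        h = h * 33 + (unsigned char)*path;
    return h;
}

static int responsible_node(const char *path, int num_nodes)
{
    return (int)(hash_path(path) % (uint64_t)num_nodes);
}

int main(void)
{
    const char *files[] = { "/job42/checkpoint.0", "/job42/checkpoint.1",
                            "/job42/out/result.h5" };
    for (int i = 0; i < 3; ++i)
        printf("%-22s -> node %d of 16\n", files[i], responsible_node(files[i], 16));
    return 0;
}
```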

Keywords: burst buffers; distributed file systems; file system; POSIX; metadata; scalability; bandwidth; high-performance computing; isolation; information storage and retrieval. Venue: Journal of Computer Science and Technology

Scheduling shared continuous resources on many-cores

2014

We consider the problem of scheduling a number of jobs on m identical processors sharing a continuously divisible resource. Each job j comes with a resource requirement r_j ∈ [0, 1]. The job can be processed at full speed if granted its full resource requirement. If receiving only an x-portion of r_j, it is processed at an x-fraction of the full speed. Our goal is to find a resource assignment that minimizes the makespan (i.e., the latest completion time). Variants of such problems, relating the resource assignment of jobs to their processing speeds, have been studied under the term discrete-continuous scheduling. Known results are either very pessimistic or heuristic in nature. In this paper, …
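
To make the speed-scaling model concrete, a small made-up example: a job with base processing time p_j that receives only the share s_j ≤ r_j runs at speed s_j / r_j and therefore takes p_j · r_j / s_j time.

```c
#include <stdio.h>

/* Worked toy example for the model above. Two jobs run concurrently on two
 * processors but must share one unit of the continuous resource, so each can
 * get at most s = 0.5 although it requires r = 0.6. Values are illustrative. */
int main(void)
{
    double p[] = { 4.0, 4.0 };          /* base processing times */
    double r[] = { 0.6, 0.6 };          /* resource requirements */
    double s[] = { 0.5, 0.5 };          /* granted shares, s_j <= r_j */

    double makespan = 0.0;
    for (int j = 0; j < 2; ++j) {
        double t = p[j] * r[j] / s[j];  /* slowdown factor is r_j / s_j */
        if (t > makespan)
            makespan = t;
        printf("job %d: runs at %.2f of full speed, finishes at %.2f\n",
               j, s[j] / r[j], t);
    }
    printf("makespan = %.2f\n", makespan);
    return 0;
}
```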

Keywords: multiprocessor scheduling; job shop scheduling; resource assignment; approximation algorithm; completion time; mathematical optimization. Venue: Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures

Migration Techniques in HPC Environments

2014

Process migration is an important feature in modern computing centers, as it allows for a more efficient use and maintenance of hardware. Especially in virtualized infrastructures, it is successfully exploited by schemes for load balancing and energy efficiency. The tools and techniques can be divided into three groups: process-level migration, virtual machine migration, and container-based migration.

Keywords: process migration; live migration; virtual machine; load balancing (computing); energy efficiency; distributed computing