RESEARCH PRODUCT
Accelerating Application Migration in HPC
Lars Nagel, Tim Süß, Stefan Lankes, Simon Pickartz, André Brinkmann, Ramy Gad
subject
Mean time between failures; Computer science; business.industry; 020206 networking & telecommunications; 02 engineering and technology; Load balancing (computing); Computer security; computer.software_genre; Shared resource; Virtual machine; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; business; computer; Computer network
description
The number of cores per node is predicted to increase rapidly with the upcoming era of exascale supercomputers. As a result, multiple applications will have to share one node and compete for the (often scarce) resources available on it. Furthermore, the growing number of hardware components decreases the mean time between failures. Application migration between nodes has been proposed as a tool to mitigate these two problems: bottlenecks due to resource sharing can be addressed by load-balancing schemes that migrate applications, and hardware errors can often be tolerated by the system if faulty nodes are detected and processes are migrated ahead of time.
year | journal | country | edition | language |
---|---|---|---|---|
2016-01-01 | | | | |