0000000000368017
AUTHOR
Louis-claude Canon
Online Scheduling of Task Graphs on Hybrid Platforms
Modern computing platforms commonly include accelerators. We target the problem of scheduling applications modeled as task graphs on hybrid platforms made of two types of resources, such as CPUs and GPUs. We consider that task graphs are uncovered dynamically, and that the scheduler has information only on the available tasks, i.e., tasks whose predecessors have all been completed. Each task can be processed by either a CPU or a GPU, and the corresponding processing times are known. Our study extends a previous \(4\sqrt{m/k}\)-competitive online algorithm [2], where m is the number of CPUs and k the number of GPUs (\(m\ge k\)). We prove that no online algorithm can have a competitive ratio …
Scheduling on Two Types of Resources: a Survey
International audience; We study the problem of executing an application represented by a precedence task graph on a parallel machine composed of standard computing cores and accelerators. Contrary to most existing approaches, we distinguish the allocation and the scheduling phases and we mainly focus on the allocation part of the problem: choose the most appropriate type of computing unit for each task. We address both off-line and on-line settings and design generic scheduling approaches. In the first case, we establish strong lower bounds on the worst-case performance of a known approach based on Linear Programming for solving the allocation problem. Then, we refine the scheduling phase …
Scheduling independent stochastic tasks under deadline and budget constraints
This article discusses scheduling strategies for the problem of maximizing the expected number of tasks that can be executed on a cloud platform within a given budget and under a deadline constraint. The execution times of tasks follow independent and identically distributed probability laws. The main questions are how many processors to enroll and whether and when to interrupt tasks that have been executing for some time. We provide complexity results and an asymptotically optimal strategy for the problem instance with discrete probability distributions and without deadline. We extend the latter strategy for the general case with continuous distributions and a deadline and we design an ef…
Online Scheduling of Task Graphs on Heterogeneous Platforms
Modern computing platforms commonly include accelerators. We target the problem of scheduling applications modeled as task graphs on hybrid platforms made of two types of resources, such as CPUs and GPUs. We consider that task graphs are uncovered dynamically, and that the scheduler has information only on the available tasks, i.e., tasks whose predecessors have all been completed. Each task can be processed by either a CPU or a GPU, and the corresponding processing times are known. Our study extends a previous $4\sqrt{m/k}$ 4 m / k -competitive online algorithm by Amaris et al. [1] , where $m$ m is the number of CPUs and $k$ k the number of GPUs ( $m\geq k$ m ≥ k ). We prove that no online…
Scheduling independent stochastic tasks on heterogeneous cloud platforms
International audience; This work introduces scheduling strategies to maximize the expected number of independent tasks that can be executed on a cloud platform within a given budget and under a deadline constraint. The cloud platform is composed of several types of virtual machines (VMs), where each type has a unitexecution cost that depends upon its characteristics. The amount of budget spent during the execution of a task on a given VM is the product of its execution length by the unit execution cost of that VM. The execution lengths of tasks follow a variety of standard probability distributions (exponential, uniform, halfnormal, etc.), which is known beforehand and whose mean and stand…
A Generic Approach to Scheduling and Checkpointing Workflows
This work deals with scheduling and checkpointing strategies to execute scientific workflows on failure-prone large-scale platforms. To the best of our knowledge, this work is the first to target fail-stop errors for arbitrary workflows. Most previous work addresses soft errors, which corrupt the task being executed by a processor but do not cause the entire memory of that processor to be lost, contrarily to fail-stop errors. We revisit classical mapping heuristics such as HEFT and MinMin and complement them with several checkpointing strategies. The objective is to derive an efficient trade-off between checkpointing every task (CkptAll), which is an overkill when failures are rare events, …
Checkpointing Workflows for Fail-Stop Errors
International audience; We consider the problem of orchestrating the exe- cution of workflow applications structured as Directed Acyclic Graphs (DAGs) on parallel computing platforms that are subject to fail-stop failures. The objective is to minimize expected overall execution time, or makespan. A solution to this problem consists of a schedule of the workflow tasks on the available processors and of a decision of which application data to checkpoint to stable storage, so as to mitigate the impact of processor failures. For general DAGs this problem is hopelessly intractable. In fact, given a solution, computing its expected makespan is still a difficult problem. To address this challenge,…
Code for simulating online scheduling of task graphs on hybrid platforms
Code used to simulate a hybrid computing platform (consisting of CPUs and GPUs) to test several online algorithms that discover the task graph as it is unveiled and to compare them to the classical HEFT offline scheduler. The related study addresses online scheduling of task graphs on a hybrid platform composed of two types of processors (called CPU and GPU for convenience).This dataset consist of a single .zip artifact file consisting of:/src: source code for the simulator to assess online strategies/simu: source code to launch the simulation and generate plots/artifacts: overview documents with instructions on how to run code and system requirements/Europar: documents relating to the Euro…