AUTHOR
Moritz Schlarb
Unified Parallel C++
Abstract Although MPI is commonly used for parallel programming on distributed-memory systems, Partitioned Global Address Space (PGAS) approaches are gaining attention for programming modern multi-core CPU clusters. They feature a hybrid memory abstraction: distributed memory is viewed as a shared memory that is partitioned among nodes in order to simplify programming. In this chapter you will learn about Unified Parallel C++ (UPC++), a library-based extension of C++ that combines the advantages of the PGAS and object-oriented paradigms. The examples included in this chapter will help you understand the main features of PGAS languages and how they can simplify the task of programming par…
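To give a flavor of the PGAS model, the following is a minimal sketch in UPC++ (assuming the UPC++ v1.0 library API; the program is illustrative and not taken from the chapter). Rank 0 allocates an integer in its partition of the global address space, the resulting global pointer is broadcast, and every rank then fetches the value with a one-sided remote get.

#include <upcxx/upcxx.hpp>
#include <iostream>

int main() {
    upcxx::init();
    upcxx::global_ptr<int> gptr;              // null on every rank initially
    if (upcxx::rank_me() == 0)
        gptr = upcxx::new_<int>(42);          // allocate in rank 0's shared segment
    gptr = upcxx::broadcast(gptr, 0).wait();  // share the global pointer with all ranks
    int value = upcxx::rget(gptr).wait();     // one-sided remote read, no receive call needed
    std::cout << "rank " << upcxx::rank_me() << " of " << upcxx::rank_n()
              << " read " << value << std::endl;
    upcxx::barrier();                         // ensure all reads are done before freeing
    if (upcxx::rank_me() == 0)
        upcxx::delete_(gptr);                 // free the shared allocation
    upcxx::finalize();
    return 0;
}

Launched with, e.g., upcxx-run -n 4 ./a.out, every rank prints 42 although only rank 0 owns the storage, which is exactly the shared view of distributed memory described above.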
SAUCE: A web application for interactive teaching and learning of parallel programming
Abstract Prevalent hardware trends towards parallel architectures and algorithms create a growing demand for graduate students familiar with the programming of concurrent software. However, learning parallel programming is challenging due to complex communication and memory access patterns as well as the avoidance of common pitfalls such as dead-locks and race conditions. Hence, the learning process has to be supported by adequate software solutions in order to enable future computer scientists and engineers to write robust and efficient code. This paper discusses a selection of well-known parallel algorithms based on C++11 threads, OpenMP, MPI, and CUDA that can be interactively embedded i…
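As an illustration of the race-condition pitfall mentioned above (a hypothetical example of ours, not code from the paper), the following C++11 program spawns four threads that increment a shared counter; without the std::mutex the result would be non-deterministic.

#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    long counter = 0;   // shared state
    std::mutex m;       // protects counter

    auto work = [&counter, &m] {
        for (int i = 0; i < 100000; ++i) {
            std::lock_guard<std::mutex> lock(m); // remove this lock and the
            ++counter;                           // unsynchronized ++ races
        }
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t)
        threads.emplace_back(work);
    for (auto &th : threads)
        th.join();      // spawn-and-join pattern

    std::cout << counter << std::endl; // deterministically 400000
    return 0;
}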
SAUCE: A Web-Based Automated Assessment Tool for Teaching Parallel Programming
Many curricula for undergraduate studies in computer science provide a lecture on the fundamentals of parallel programming, such as multi-threaded computation on shared-memory architectures using POSIX threads or OpenMP. The complex structure of parallel programs can be challenging, especially for inexperienced students. Thus, there is a latent need for software supporting the learning process. Subsequent lectures may cover more advanced parallelization techniques such as the Message Passing Interface (MPI) and the Compute Unified Device Architecture (CUDA). Unfortunately, the majority of students cannot easily access MPI clusters or modern hardware accelerators in order to effectivel…
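To make the shared-memory fundamentals concrete, here is a minimal OpenMP sketch of the kind such a lecture typically covers (our example, not taken from the paper): a parallel sum whose reduction clause gives each thread a private partial sum and so avoids the data race a naive shared accumulator would introduce.

#include <cstdio>
#include <omp.h>
#include <vector>

int main() {
    std::vector<double> v(1 << 20, 1.0);
    double sum = 0.0;

    // each thread accumulates into a private copy of sum;
    // OpenMP combines the partial results at the end of the loop
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < static_cast<long>(v.size()); ++i)
        sum += v[i];

    std::printf("sum = %.0f (up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}

Compiled with g++ -fopenmp, the output is sum = 1048576 regardless of the thread count.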
Advanced C++11 Multithreading
Abstract The previous chapter introduced the basic concepts of multithreading using the C++11 threading API, starting with basic spawn-and-join approaches and finishing with non-trivial synchronization based on mutexes and condition variables. However, the major bottleneck of application performance is usually caused by contention for a shared resource. In the case of mutex-based programming, all participating threads usually try to acquire the same lock in parallel, which effectively serializes the program for lightweight operations such as increment/decrement or updates of a single scalar value. Fortunately, modern CPUs provide dedicated instructions that allow for the efficient execution of unin…
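A sketch of the contrast drawn here (our code, under the assumption that the truncated sentence refers to atomic operations): std::atomic replaces the lock from the mutex example above, and fetch_add compiles down to a dedicated atomic CPU instruction, so the threads no longer serialize on a mutex for a single scalar update.

#include <atomic>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    std::atomic<long> counter{0};  // typically lock-free for scalar types

    auto work = [&counter] {
        for (int i = 0; i < 100000; ++i)
            counter.fetch_add(1, std::memory_order_relaxed); // one atomic
                                                             // instruction, no lock
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t)
        threads.emplace_back(work);
    for (auto &th : threads)
        th.join();

    std::cout << counter.load() << std::endl; // 400000 without any mutex
    return 0;
}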