Search results for "multiprocessor"

showing 10 items of 18 documents

Online Scheduling of Task Graphs on Hybrid Platforms

2018

Modern computing platforms commonly include accelerators. We target the problem of scheduling applications modeled as task graphs on hybrid platforms made of two types of resources, such as CPUs and GPUs. We consider that task graphs are uncovered dynamically, and that the scheduler has information only on the available tasks, i.e., tasks whose predecessors have all been completed. Each task can be processed by either a CPU or a GPU, and the corresponding processing times are known. Our study extends a previous \(4\sqrt{m/k}\)-competitive online algorithm [2], where m is the number of CPUs and k the number of GPUs (\(m\ge k\)). We prove that no online algorithm can have a competitive ratio …

020203 distributed computingCompetitive analysisonline algorithmsComputer scienceHeuristicSchedulingSymmetric multiprocessor system02 engineering and technologyParallel computingUpper and lower boundsheterogeneous computingGraph020202 computer hardware & architectureScheduling (computing)task graphs0202 electrical engineering electronic engineering information engineeringOnline algorithm[INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]
researchProduct

Parallel Pairwise Epistasis Detection on Heterogeneous Computing Architectures

2016

This is a post-peer-review, pre-copyedit version of an article published in IEEE Transactions on Parallel and Distributed Systems. The final authenticated version is available online at: http://dx.doi.org/10.1109/TPDS.2015.2460247. [Abstract] Development of new methods to detect pairwise epistasis, such as SNP-SNP interactions, in Genome-Wide Association Studies is an important task in bioinformatics as they can help to explain genetic influences on diseases. As these studies are time consuming operations, some tools exploit the characteristics of different hardware accelerators (such as GPUs and Xeon Phi coprocessors) to reduce the runtime. Nevertheless, all these approaches are not able t…

0301 basic medicineCoprocessorComputer science0206 medical engineeringAccelerationData modelsSymmetric multiprocessor systemComputational modeling02 engineering and technologyParallel computingSupercomputer03 medical and health sciencesTask (computing)030104 developmental biologyCoprocessorsComputational Theory and MathematicsHardware and ArchitectureSignal ProcessingGeneticsPairwise comparisonComputer architectureGraphics processing units020602 bioinformaticsXeon Phi
researchProduct

SWhybrid: A Hybrid-Parallel Framework for Large-Scale Protein Sequence Database Search

2017

Computer architectures continue to develop rapidly towards massively parallel and heterogeneous systems. Thus, easily extensible yet highly efficient parallelization approaches for a variety of platforms are urgently needed. In this paper, we present SWhybrid, a hybrid computing framework for large-scale biological sequence database search on heterogeneous computing environments with multi-core or many-core processing units (PUs) based on the Smith- Waterman (SW) algorithm. To incorporate a diverse set of PUs such as combinations of CPUs, GPUs and Xeon Phis, we abstract them as SIMD vector execution units with different number of lanes. We propose a machine model, associated with a unified …

0301 basic medicineXeonSequence databasebusiness.industryComputer scienceInterface (computing)Symmetric multiprocessor systemParallel computingSet (abstract data type)03 medical and health sciences030104 developmental biologySoftwareComputer architectureSIMDbusinessMassively parallel2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
researchProduct

A Fast GPU-Based Motion Estimation Algorithm for H.264/AVC

2012

H.264/AVC is the most recent predictive video compression standard to outperform other existing video coding standards by means of higher computational complexity. In recent years, heterogeneous computing has emerged as a cost-efficient solution for high-performance computing. In the literature, several algorithms have been proposed to accelerate video compression, but so far there have not been many solutions that deal with video codecs using heterogeneous systems. This paper proposes an algorithm to perform H.264/AVC inter prediction. The proposed algorithm performs the motion estimation, both with full-pixel and sub-pixel accuracy, using CUDA to assist the CPU, obtaining remarkable time …

CUDAComputational complexity theoryComputer scienceMotion estimationComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISIONCodecSymmetric multiprocessor systemImage processingData_CODINGANDINFORMATIONTHEORYCentral processing unitParallel computingData compression
researchProduct

Exploring parallel capabilities of an innovative numerical method for recovering image velocity vectors field

2010

In this paper an efficient method devoted to estimate the velocity vectors field is investigated. The method is based on a quasi-interpolant operator and involves a large amount of computation. The operations characterizing the computational scheme are ideal for parallel processing because they are local, regular and repetitive. Therefore, the spatial parallelism of the process is studied to rapidly proceed in the computation on distributed multiprocessor systems. The process has shown to be synchronous, with good task balancing and requiring a small amount of data transfer.

ComputationNumerical analysisProcess (computing)MultiprocessingField (computer science)Computational scienceComputer Science ApplicationsSettore MAT/08 - Analisi NumericaOperator (computer programming)Parallel processing (DSP implementation)Modeling and SimulationModelling and SimulationImage velocity vectors field Quasi-interpolant operator B-spline functions Distributed multiprocessor systemsAlgorithmMathematicsData transmissionMathematical and Computer Modelling
researchProduct

Online Scheduling of Task Graphs on Heterogeneous Platforms

2020

Modern computing platforms commonly include accelerators. We target the problem of scheduling applications modeled as task graphs on hybrid platforms made of two types of resources, such as CPUs and GPUs. We consider that task graphs are uncovered dynamically, and that the scheduler has information only on the available tasks, i.e., tasks whose predecessors have all been completed. Each task can be processed by either a CPU or a GPU, and the corresponding processing times are known. Our study extends a previous $4\sqrt{m/k}$ 4 m / k -competitive online algorithm by Amaris et al. [1] , where $m$ m is the number of CPUs and $k$ k the number of GPUs ( $m\geq k$ m ≥ k ). We prove that no online…

Discrete mathematics[INFO.INFO-CC]Computer Science [cs]/Computational Complexity [cs.CC]020203 distributed computingScheduleCompetitive analysisComputer scienceHeuristicSchedulingOnline algorithmsProcessor schedulingSymmetric multiprocessor system02 engineering and technologyUpper and lower boundsGraphScheduling (computing)Computational Theory and MathematicsHardware and ArchitectureSignal Processing0202 electrical engineering electronic engineering information engineeringTask analysisTask graphsHeterogeneous computingOnline algorithm[INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]
researchProduct

SYSTOLIC GENERATION OF k-ARY TREES

1999

The only parallel generating algorithms for k-ary trees are those of Akl and Stojmenović in 1996 and of Vajnovszki and Phillips in 1997. In the first of them, trees are represented by an inversion table and the processor model is a linear aray multicomputer. In the second, trees are represented by bitstrings and the algorithm executes on a shared memory multiprocessor. In this paper we give a parallel generating algorithm for k-ary trees represented by generalized P–sequences for execution on a linear array multicomputer.

Hardware and ArchitectureShared memory multiprocessorProcessor modelWeight-balanced treeParallel algorithmParallel computingInversion tableSoftwareTheoretical Computer ScienceLinear arrayMathematicsVector processorParallel Processing Letters
researchProduct

Community-driven computational biology with Debian Linux

2011

Background The Open Source movement and its technologies are popular in the bioinformatics community because they provide freely available tools and resources for research. In order to feed the steady demand for updates on software and associated data, a service infrastructure is required for sharing and providing these tools to heterogeneous computing environments. Results The Debian Med initiative provides ready and coherent software packages for medical informatics and bioinformatics. These packages can be used together in Taverna workflows via the UseCase plugin to manage execution on local or remote machines. If such packages are available in cloud computing environments, the underlyin…

InternetTheoretical computer scienceComputer sciencebusiness.industryApplied MathematicsComputational BiologySymmetric multiprocessor systemBiochemistryComputer Science ApplicationsProceedingsSoftwareStructural BiologyThe InternetbusinessSoftware engineeringMolecular BiologySoftwareBMC Bioinformatics
researchProduct

Heterogeneous PBLAS: Optimization of PBLAS for Heterogeneous Computational Clusters

2008

This paper presents a package, called Heterogeneous PBLAS (HeteroPBLAS), which is built on top of PBLAS and provides optimized parallel basic linear algebra subprograms for heterogeneous computational clusters. We present the user interface and the software hierarchy of the first research implementation of HeteroPBLAS. This is the first step towards the development of a parallel linear algebra package for heterogeneous computational clusters. We demonstrate the efficiency of the HeteroPBLAS programs on a homogeneous computing cluster and a heterogeneous computing cluster.

Kernel (linear algebra)ScaLAPACKComputer scienceComputer clusterLinear algebraCluster (physics)Concurrent computingSymmetric multiprocessor systemParallel computingBasic Linear Algebra SubprogramsComputational science2008 International Symposium on Parallel and Distributed Computing
researchProduct

Scheduling shared continuous resources on many-cores

2014

We consider the problem of scheduling a number of jobs on m identical processors sharing a continuously divisible resource. Each job j comes with a resource requirement rj∈[0,1]. The job can be processed at full speed if granted its full resource requirement. If receiving only an x-portion of r_j, it is processed at an x-fraction of the full speed. Our goal is to find a resource assignment that minimizes the makespan (i.e., the latest completion time). Variants of such problems, relating the resource assignment of jobs to their processing speeds, have been studied under the term discrete-continuous scheduling. Known results are either very pessimistic or heuristic in nature. In this paper, …

Mathematical optimizationJob shop schedulingComputer scienceDistributed computingApproximation algorithmJob assignmentUnit sizeCompletion timeResource assignmentMultiprocessor schedulingScheduling (computing)Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures
researchProduct