Search results for "multiprocessor"

showing 10 items of 18 documents

The Acts project: track reconstruction software for HL-LHC and beyond

2019

The reconstruction of trajectories of the charged particles in the tracking detectors of high energy physics experiments is one of the most difficult and complex tasks of event reconstruction at particle colliders. As pattern recognition algorithms exhibit combinatorial scaling to high track multiplicities, they become the largest contributor to the CPU consumption within event reconstruction, particularly at current and future hadron colliders such as the LHC, HL-LHC and FCC-hh. Current algorithms provide an extremely high standard of physics and computing performance and have been tested on billions of simulated and recorded data events. However, most algorithms were first written 20 year…

Multi-core processor010308 nuclear & particles physicsEvent (computing)track data analysisPhysicsQC1-999Complex event processing01 natural sciencesprogrammingComputing and ComputersComputer engineeringMultithreading0103 physical sciencesmultiprocessorCERN LHC Coll: upgradeProgramming paradigmThread safety[INFO]Computer Science [cs]data managementReference implementation010306 general physicsnumerical calculationsperformanceactivity reportEvent reconstruction

researchProduct

Heterogeneous PBLAS: Optimization of PBLAS for Heterogeneous Computational Clusters

2008

This paper presents a package, called Heterogeneous PBLAS (HeteroPBLAS), which is built on top of PBLAS and provides optimized parallel basic linear algebra subprograms for heterogeneous computational clusters. We present the user interface and the software hierarchy of the first research implementation of HeteroPBLAS. This is the first step towards the development of a parallel linear algebra package for heterogeneous computational clusters. We demonstrate the efficiency of the HeteroPBLAS programs on a homogeneous computing cluster and a heterogeneous computing cluster.

Kernel (linear algebra)ScaLAPACKComputer scienceComputer clusterLinear algebraCluster (physics)Concurrent computingSymmetric multiprocessor systemParallel computingBasic Linear Algebra SubprogramsComputational science2008 International Symposium on Parallel and Distributed Computing

researchProduct

Wireless versus Wired Network-on-Chip to Enable the Multi- Tenant Multi-FPGAs in Cloud

2021

The new era of computing is not CPU-centric but enriched with all the heterogeneous computing resources including the reconfigurable fabric. In multi-FPGA architecture, either deployed within a data center or as a standalone model, inter-FPGA communication is crucial. Network-on-chip exhibits a promising performance for the integration of one FPGA. A sustainable communication architecture requires stable performance as the number of applications or users grows. Wireless network-on-chip has the potential to be that communication architecture, as it boasts the same performance capability as wired solutions in addition to its multicast capacities. We conducted an exploratory study to investiga…

Network on a chipMulticastComputer architecturebusiness.industryComputer scienceWirelessSymmetric multiprocessor systemData centerCloud computingArchitecturebusinessField-programmable gate array2021 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS)

researchProduct

Online Scheduling of Task Graphs on Hybrid Platforms

2018

Modern computing platforms commonly include accelerators. We target the problem of scheduling applications modeled as task graphs on hybrid platforms made of two types of resources, such as CPUs and GPUs. We consider that task graphs are uncovered dynamically, and that the scheduler has information only on the available tasks, i.e., tasks whose predecessors have all been completed. Each task can be processed by either a CPU or a GPU, and the corresponding processing times are known. Our study extends a previous $4\sqrt{m/k}$-competitive online algorithm [2], where m is the number of CPUs and k the number of GPUs ($m\ge k$). We prove that no online algorithm can have a competitive ratio …

020203 distributed computingCompetitive analysisonline algorithmsComputer scienceHeuristicSchedulingSymmetric multiprocessor system02 engineering and technologyParallel computingUpper and lower boundsheterogeneous computingGraph020202 computer hardware & architectureScheduling (computing)task graphs0202 electrical engineering electronic engineering information engineeringOnline algorithm[INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]

researchProduct

The Mu3e Data Acquisition

2020

The Mu3e experiment aims to find or exclude the lepton flavour violating decay $\mu^+\to e^+e^-e^+$ with a sensitivity of one in 10$^{16}$ muon decays. The first phase of the experiment is currently under construction at the Paul Scherrer Institute (PSI, Switzerland), where beams with up to 10$^8$ muons per second are available. The detector will consist of an ultra-thin pixel tracker made from High-Voltage Monolithic Active Pixel Sensors (HV-MAPS), complemented by scintillating tiles and fibres for precise timing measurements. The experiment produces about 100 Gbit/s of zero-suppressed data which are transported to a filter farm using a network of FPGAs and fast optical links. On the filte…

Nuclear and High Energy PhysicsParticle physicsPhysics - Instrumentation and DetectorsMesonPhysics::Instrumentation and Detectorsdata acquisitionfibre: opticalFOS: Physical scienceshigh energy physics instrumentationprinted circuits7. Clean energycomputer: networkOptical fiber communicationData acquisitionsemiconductor detector: pixelOptical switchesmultiprocessor: graphicshardwareSensitivity (control systems)muon+: decay[PHYS.PHYS.PHYS-INS-DET]Physics [physics]/Physics [physics]/Instrumentation and Detectors [physics.ins-det]Electrical and Electronic EngineeringGeneralLiterature_REFERENCE(e.g.dictionariesencyclopediasglossaries)scintillation counterFPGAClocksPhysicsData acquisition (DAQ)MuonPixelMesonsDetectorlepton: flavor: violationField programmable gate arraysDetectorsInstrumentation and Detectors (physics.ins-det)sensitivityNuclear Energy and EngineeringFilter (video)field programmable gate arrays (FPGAs)Data acquisition (DAQ); field programmable gate arrays (FPGAs); high energy physics instrumentation; printed circuitselectronics: readoutHigh Energy Physics::ExperimentLeptonelectronics: design

researchProduct

A Fast GPU-Based Motion Estimation Algorithm for H.264/AVC

2012

H.264/AVC is the most recent predictive video compression standard to outperform other existing video coding standards by means of higher computational complexity. In recent years, heterogeneous computing has emerged as a cost-efficient solution for high-performance computing. In the literature, several algorithms have been proposed to accelerate video compression, but so far there have not been many solutions that deal with video codecs using heterogeneous systems. This paper proposes an algorithm to perform H.264/AVC inter prediction. The proposed algorithm performs the motion estimation, both with full-pixel and sub-pixel accuracy, using CUDA to assist the CPU, obtaining remarkable time …

CUDAComputational complexity theoryComputer scienceMotion estimationComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISIONCodecSymmetric multiprocessor systemImage processingData_CODINGANDINFORMATIONTHEORYCentral processing unitParallel computingData compression

researchProduct

Real-time data processing in the ALICE High Level Trigger at the LHC

2019

At the Large Hadron Collider at CERN in Geneva, Switzerland, atomic nuclei are collided at ultra-relativistic energies. Many final-state particles are produced in each collision and their properties are measured by the ALICE detector. The detector signals induced by the produced particles are digitized leading to data rates that are in excess of 48 GB/$s$. The ALICE High Level Trigger (HLT) system pioneered the use of FPGA- and GPU-based algorithms to reconstruct charged-particle trajectories and reduce the data size in real time. The results of the reconstruction of the collision events, available online, are used for high level data quality and detector-performance monitoring and real-tim…

calibration ; ALICE ; trigger ; monitoring ; quality ; data management ; programming ; FPGA ; multiprocessor: graphics ; performancePhysics - Instrumentation and DetectorsHigh level triggerPhysics::Instrumentation and DetectorsLevel datatutkimuslaitteetFPGA; GPUDetector calibrationGPUFOS: Physical sciencesGeneral Physics and AstronomyhiukkasfysiikkaPhysics and Astronomy(all)01 natural sciencesprogramming010305 fluids & plasmasCombinatoricsALICE0103 physical sciencesmultiprocessor: graphics[INFO]Computer Science [cs][PHYS.PHYS.PHYS-INS-DET]Physics [physics]/Physics [physics]/Instrumentation and Detectors [physics.ins-det]Detectors and Experimental Techniques010306 general physicsNuclear Experimentphysics.ins-detFPGAcomputer.programming_languagePhysicsLarge Hadron ColliderFPGA; GPU; TRACKsignaalinkäsittelyInstrumentation and Detectors (physics.ins-det)triggercalibrationmonitoringdatailmaisimetqualityHardware and ArchitectureTRACKHigh Energy Physics::Experimentdata managementAlice (programming language)computerperformance

researchProduct

Community-driven computational biology with Debian Linux

2011

Background The Open Source movement and its technologies are popular in the bioinformatics community because they provide freely available tools and resources for research. In order to feed the steady demand for updates on software and associated data, a service infrastructure is required for sharing and providing these tools to heterogeneous computing environments. Results The Debian Med initiative provides ready and coherent software packages for medical informatics and bioinformatics. These packages can be used together in Taverna workflows via the UseCase plugin to manage execution on local or remote machines. If such packages are available in cloud computing environments, the underlyin…

InternetTheoretical computer scienceComputer sciencebusiness.industryApplied MathematicsComputational BiologySymmetric multiprocessor systemBiochemistryComputer Science ApplicationsProceedingsSoftwareStructural BiologyThe InternetbusinessSoftware engineeringMolecular BiologySoftwareBMC Bioinformatics

researchProduct

Scalable Dense Factorizations for Heterogeneous Computational Clusters

2008

This paper discusses the design and the implementation of the LU factorization routines included in the Heterogeneous ScaLAPACK library, which is built on top of ScaLAPACK. These routines are used in the factorization and solution of a dense system of linear equations. They are implemented using optimized PBLAS, BLACS and BLAS libraries for heterogeneous computational clusters. We present the details of the implementation as well as performance results on a heterogeneous computing cluster.

ScaLAPACKComputer scienceMathematicsofComputing_NUMERICALANALYSISSymmetric multiprocessor systemParallel computingLU decompositionComputational sciencelaw.inventionMatrix decompositionFactorizationlawScalabilityLinear algebraConcurrent computing2008 International Symposium on Parallel and Distributed Computing

researchProduct

Two Job Cyclic Scheduling with Incompatibility Constraints

2001

The present paper deals with the problem of scheduling several repeated occurrences of two jobs over a finite or infinite time horizon in order to maximize the yielded profit. The constraints of the problem are the incompatibilities between some pairs of tasks which require a same resource.

Rate-monotonic schedulingMathematical optimizationJob shop schedulingComputer scienceStrategy and ManagementDistributed computingFlow shop schedulingDynamic priority schedulingManagement Science and Operations ResearchFair-share schedulingMultiprocessor schedulingComputer Science ApplicationsNurse scheduling problemManagement of Technology and InnovationTwo-level schedulingBusiness and International ManagementComputer Science::Operating Systems

researchProduct