Search results for "Parallel"

showing 10 items of 667 documents

An implicitly parallel EDA based on restricted Boltzmann machines

2014

We present a parallel version of RBM-EDA. RBM-EDA is an Estimation of Distribution Algorithm (EDA) that models dependencies between decision variables using a Restricted Boltzmann Machine (RBM). In contrast to other EDAs, RBM-EDA mainly uses matrix-matrix multiplications for model estimation and sampling. Hence, for implementation, standard libraries for linear algebra can be used. This allows an easy parallelization and leads to a high utilization of parallel architectures. The probabilistic model of the parallel version and the version on a single core are identical. We explore the speedups gained from running RBM-EDA on a Graphics Processing Unit. For problems of bounded difficulty like …

Restricted Boltzmann machine; Speedup; Estimation of distribution algorithm; Artificial neural network; Computer science; Linear algebra; Graphics processing unit; Boltzmann machine; Parallel computing; Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation
researchProduct

Excursus XI. The Profaned Temple (Ezek 8:1-18) and the Valley of Dry Bones (Ezek 37:1-14) in the Light of Hebrew Rhetoric

2021

The context of the research was that commentators on the Book of Ezekiel do not agree on the structure of the texts under study and propose differing outlines. The aim of the research was to discover the structure that the ancient inspired author embedded in the text. To achieve this aim, the method of Hebrew rhetoric developed by Roland Meynet was applied. The research revealed that the Profaned Temple has a parallel-concentric structure consisting of 9 elements (A, B, C, D, E, D’, C’, B’, A’), while the Valley of Dry Bones also has a parallel-concentric structure, made up of 5 elements (A, B, C, B’, A’). The results made it possible to draw a common conclusion for the two …

Roland Meynet; prophet Ezekiel; Book of Ezekiel; Profaned Temple; Valley of Dry Bones; parallel-concentric structure; Hebrew Rhetoric
researchProduct

Scalable Dense Factorizations for Heterogeneous Computational Clusters

2008

This paper discusses the design and the implementation of the LU factorization routines included in the Heterogeneous ScaLAPACK library, which is built on top of ScaLAPACK. These routines are used in the factorization and solution of a dense system of linear equations. They are implemented using optimized PBLAS, BLACS and BLAS libraries for heterogeneous computational clusters. We present the details of the implementation as well as performance results on a heterogeneous computing cluster.

ScaLAPACK; Computer science; MathematicsofComputing_NUMERICALANALYSIS; Symmetric multiprocessor system; Parallel computing; LU decomposition; Computational science; Matrix decomposition; Factorization; Scalability; Linear algebra; Concurrent computing; 2008 International Symposium on Parallel and Distributed Computing
researchProduct

Large-scale genome-wide association studies on a GPU cluster using a CUDA-accelerated PGAS programming model

2015

Detecting epistasis, such as 2-SNP interactions, in genome-wide association studies (GWAS) is an important but time-consuming operation. Consequently, GPUs have already been used to accelerate these studies, reducing the runtime for moderately-sized datasets to less than 1 hour. However, single-GPU approaches cannot perform large-scale GWAS in reasonable time. In this work we present multiEpistSearch, a tool to detect epistasis that works on GPU clusters. While CUDA is used for parallelization within each GPU, the workload distribution among GPUs is performed with Unified Parallel C++ (UPC++), a novel extension of C++ that follows the Partitioned Global Address Space (PGAS) model…

Scale (ratio); Bioinformatics; Computer science; PGAS; GPU; CUDA; Genome-wide association study; Parallel computing; GPU cluster; Software_PROGRAMMINGTECHNIQUES; Theoretical Computer Science; Computational science; Hardware and Architecture; Unified Parallel C; Programming paradigm; Partitioned global address space; UPC++; Software; The International Journal of High Performance Computing Applications
researchProduct

Large-Scale Clustering of Short Reads for Metagenomics On GPUs

2013

Scale (ratio); Computer science; Metagenomics; Parallel computing; Cluster analysis; Computational science
researchProduct

Checkpointing Workflows for Fail-Stop Errors

2017

We consider the problem of orchestrating the execution of workflow applications structured as Directed Acyclic Graphs (DAGs) on parallel computing platforms that are subject to fail-stop failures. The objective is to minimize expected overall execution time, or makespan. A solution to this problem consists of a schedule of the workflow tasks on the available processors and of a decision of which application data to checkpoint to stable storage, so as to mitigate the impact of processor failures. For general DAGs this problem is hopelessly intractable. In fact, given a solution, computing its expected makespan is still a difficult problem. To address this challenge,…

Schedule; Computer science; workflow; Distributed computing; [INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]; 010103 numerical & computational mathematics; 02 engineering and technology; Parallel computing; [INFO.INFO-SE] Computer Science [cs]/Software Engineering [cs.SE]; 01 natural sciences; Theoretical Computer Science; [INFO.INFO-IU] Computer Science [cs]/Ubiquitous Computing; [INFO.INFO-CR] Computer Science [cs]/Cryptography and Security [cs.CR]; checkpoint; fail-stop error; 0202 electrical engineering electronic engineering information engineering; Overhead (computing); [INFO] Computer Science [cs]; 0101 mathematics; resilience; Class (computer programming); 020203 distributed computing; Job shop scheduling; Probabilistic logic; 020206 networking & telecommunications; [INFO.INFO-MO] Computer Science [cs]/Modeling and Simulation; Dynamic programming; Task (computing); [INFO.INFO-PF] Computer Science [cs]/Performance [cs.PF]; Workflow; Computational Theory and Mathematics; Hardware and Architecture; [INFO.INFO-MA] Computer Science [cs]/Multiagent Systems [cs.MA]; Task analysis; [INFO.INFO-ET] Computer Science [cs]/Emerging Technologies [cs.ET]; [INFO.INFO-DC] Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Software
researchProduct

Serial In-network Processing for Large Stationary Wireless Sensor Networks

2017

In wireless sensor networks, a serial processing algorithm browses nodes one by one and can perform different tasks such as: creating a schedule among nodes, querying or gathering data from nodes, supplying nodes with data, etc. Apart from the fact that serial algorithms totally avoid collisions, numerous recent works have confirmed that these algorithms reduce communications and considerably save energy and time in large-dense networks. Yet, due to the path construction complexity, the proposed algorithms are not optimal and their performances can be further enhanced. To do so, in the present paper, we propose a new serial processing algorithm that, in most of the case…

Schedule; Visual sensor network; Computer science; 020206 networking & telecommunications; 02 engineering and technology; [INFO.INFO-SE] Computer Science [cs]/Software Engineering [cs.SE]; [INFO.INFO-MO] Computer Science [cs]/Modeling and Simulation; 020202 computer hardware & architecture; Serial memory processing; [INFO.INFO-IU] Computer Science [cs]/Ubiquitous Computing; Key distribution in wireless sensor networks; [INFO.INFO-CR] Computer Science [cs]/Cryptography and Security [cs.CR]; [INFO.INFO-MA] Computer Science [cs]/Multiagent Systems [cs.MA]; Sensor node; Scalability; 0202 electrical engineering electronic engineering information engineering; Mobile wireless sensor network; [INFO.INFO-ET] Computer Science [cs]/Emerging Technologies [cs.ET]; [INFO.INFO-DC] Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Wireless sensor network; Computer network
researchProduct

Simulation of parallel mechanisms for motion cueing generation in vehicle simulators using AM-FM bi-modulated signals

2018

The use of robotic motion platforms in vehicle simulators is relatively common. However, the process of testing and tuning the so-called washout algorithms, used for motion cueing generation in motion-based vehicle simulators, is complex. This process can be reduced in cost, simplified, improved, shortened and performed more safely if virtual motion platforms are used instead of real devices. This paper deals with identifying a method to perform a fast but reliable simulation of parallel mechanisms to be used for motion cueing generation. The method relies on the use of Laplacian polynomial transfer function models by means of using AM-FM bi-modulated signals as reference inputs to achie…

Scheme (programming language); 0209 industrial biotechnology; Polynomial; Computer science; Mechanical Engineering; Process (computing); Parallel manipulator; System identification; 02 engineering and technology; Transfer function; Motion (physics); Computer Science Applications; 020901 industrial engineering & automation; Control and Systems Engineering; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Electrical and Electronic Engineering; AM/FM/GIS; Simulation; Mechatronics
researchProduct

Iterative moment method for electromagnetic transients in grounding systems on CRAY T3D

1996

In this paper the parallel aspects of an electromagnetic model for transients in grounding systems based on an iterative scheme are investigated in a multiprocessor environment. Coarse-grain and fine-grain parallel solutions have been developed on the CRAY T3D, housed at CINECA, equipped with 64 processors working in space-sharing mode. The performance of the two parallel approaches, implemented according to the work-sharing parallel paradigm, has been evaluated for different problem sizes using a variable number of processors.

Scheme (programming language); Moment (mathematics); Fine grain; Computer science; Ground; Conjugate gradient method; Multiprocessing; Parallel computing; Computational science
researchProduct

A Parallel Implementation of the Tree-Structured Self-Organizing Map

2002

This paper presents how Self-Organizing Maps (SOMs) can be trained efficiently using several simultaneously executing threads on a shared memory Symmetric MultiProcessing (SMP) computer. The training method is a batch version of the Tree-Structured Self-Organizing Map. We note that SMP-type parallel training is very useful for large data sets obtained from nature, the process industry or large document collections, since we do not encounter similar model size limitations as with hardware SOM implementations.

Self-organizing map; Tree (data structure); Theoretical computer science; Shared memory; Computer science; Symmetric multiprocessing; Message Passing Interface; Batch processing; Multiprocessing; Parallel computing; Thread (computing); Implementation
researchProduct