Search results for "Multi-core processor"

showing 10 items of 35 documents

Multicore optical fibres for astrophotonics

2011

We report progress towards multimode (MM) fibre filters for suppressing the OH emission that hinders ground-based observation of the early Universe. Fibre Bragg gratings (FBGs) can filter these narrow spectral lines in single-mode (SM) fibres [1]. Implementing them in MM fibres well-matched to astronomical instruments requires transitions between the MM fibre and several SM fibres [2]. Such hand-crafted “photonic lanterns” require many identical FBGs to be made and spliced in place. Instead we are pursuing the idea in multicore (MC) fibres, Fig. 1(a). The FBG is written at once in all the SM cores. The fibre is jacketed with low-index glass and tapered to form the core and cladding of a MM …

Multi-core processor; Optical fiber; Materials science; Multi-mode optical fiber; business.industry; Cladding (fiber optics); Spectral line; law.invention; Subwavelength-diameter optical fibre; Optics; Fiber Bragg grating; law; Photonics; business; 2011 Conference on Lasers and Electro-Optics Europe and 12th European Quantum Electronics Conference (CLEO EUROPE/EQEC)

researchProduct

Experimental Study of Six Different Implementations of Parallel Matrix Multiplication on Heterogeneous Computational Clusters of Multicore Processors

2010

Two strategies of distribution of computations can be used to implement parallel solvers for dense linear algebra problems for Heterogeneous Computational Clusters of Multicore Processors (HCoMs). These strategies are called Heterogeneous Process Distribution Strategy (HPS) and Heterogeneous Data Distribution Strategy (HDS). They are not novel and have been researched thoroughly. However, the advent of multicores necessitates enhancements to them. In this paper, we present these enhancements. Our study is based on experiments using six applications to perform Parallel Matrix-matrix Multiplication (PMM) on an HCoM employing the two distribution strategies.

Multi-core processor; Parallel processing (DSP implementation); Computer science; Computation; Linear algebra; Parallel algorithm; Concurrent computing; Multiplication; Parallel computing; Matrix multiplication; 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing

researchProduct

Accelerating collision detection for large-scale crowd simulation on multi-core and many-core architectures

2013

The computing capabilities of current multi-core and many-core architectures have been used in crowd simulations for both enhancing crowd rendering and simulating continuum crowds. However, improving the scalability of crowd simulation systems by exploiting the inherent parallelism of these architectures is still an open issue. In this paper, we propose different parallelization strategies for the collision check procedure that takes place in agent-based simulations. These strategies are designed to exploit the parallelism in both multi-core and many-core architectures such as graphics processing units (GPUs). As for the many-core implementations, we analyse the bottlenecks of a previous G…

Multi-core processor; Speedup; Computer science; Parallel computing; Collision; Theoretical Computer Science; Rendering (computer graphics); Crowds; Hardware and Architecture; Scalability; Collision detection; Crowd simulation; General-purpose computing on graphics processing units; Software; The International Journal of High Performance Computing Applications

researchProduct

Suffix Array Construction on Multi-GPU Systems

2019

Suffix arrays are prevalent data structures that are fundamental to a wide range of applications, including bioinformatics, data compression, and information retrieval. Therefore, various algorithms for (parallel) suffix array construction on both CPUs and GPUs have been proposed over the years. Although they provide significant speedup over their CPU-based counterparts, existing GPU implementations share a common disadvantage: input text sizes are limited by the scarce memory of a single GPU. In this paper, we overcome the aforementioned memory limitations by exploiting multi-GPU nodes featuring fast NVLink interconnects. In order to achieve high performance for this communication-intensive task, we …

Multi-core processor; Speedup; Computer science; Suffix array; 0102 computer and information sciences; 02 engineering and technology; Parallel computing; Data structure; 01 natural sciences; law.invention; CUDA; Shared memory; 010201 computation theory & mathematics; law; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Suffix; Data compression; Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing

researchProduct
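
For readers unfamiliar with the data structure named in the entry above, the following is a minimal single-threaded sketch of suffix array construction by prefix doubling. It is an illustration only, not the multi-GPU method of the paper; the function name `suffix_array` is a hypothetical choice made for this example.

```python
def suffix_array(text):
    """Return the start positions of all suffixes of `text` in sorted order.

    Illustrative prefix-doubling sketch (hypothetical helper, not the
    paper's multi-GPU algorithm).
    """
    n = len(text)
    if n == 0:
        return []
    sa = list(range(n))
    # Initial rank of each suffix: the code point of its first character.
    rank = [ord(c) for c in text]
    k = 1
    while True:
        # Sort key: rank of the first k characters, then rank of the next k.
        # A suffix shorter than 2k gets -1, so it sorts before longer ones.
        def key(i):
            return (rank[i], rank[i + k] if i + k < n else -1)

        sa.sort(key=key)
        # Re-rank: suffixes with equal keys share a rank.
        new_rank = [0] * n
        for pos in range(1, n):
            differs = key(sa[pos]) != key(sa[pos - 1])
            new_rank[sa[pos]] = new_rank[sa[pos - 1]] + differs
        rank = new_rank
        # All ranks distinct: the order is final.
        if rank[sa[-1]] == n - 1:
            return sa
        k *= 2
```

Each round doubles the compared prefix length, so at most about log2(n) sorting rounds are needed. For example, `suffix_array("banana")` yields `[5, 3, 1, 0, 4, 2]`.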

Flexible VLIW processor based on FPGA for real-time image processing

2011

Modern FPGA chips, with their larger memory capacity and reconfigurability potential, are opening new frontiers in rapid prototyping of embedded systems. With the advent of high-density FPGAs, it is now possible to implement a high-performance Very Long Instruction Word (VLIW) processor core in an FPGA. With a VLIW architecture, processor effectiveness depends on the ability of compilers to extract sufficient Instruction Level Parallelism (ILP) from program code. This paper describes research results on enabling the VLIW processor model for real-time processing applications by exploiting FPGA technology. Our goals are to keep the flexibility of processors in order to shorten the developm…

Multi-core processor; business.industry; Computer science; Application-specific instruction-set processor; Reconfigurability; Instruction set; Computer architecture; Very long instruction word; Embedded system; VHDL; business; Instruction-level parallelism; computer; computer.programming_language; FPGA prototype; Proceedings of the 2011 Conference on Design & Architectures for Signal & Image Processing (DASIP)

researchProduct

Design and Implementation of a Low-cost Embedded Iris Recognition System on a Dual-core Processor Platform

2012

The design of a low-cost embedded iris recognition system is described in this paper. Firstly, we develop a simple and effective iris image acquisition unit, which is cheap and easy to use. This is achieved through both hardware design and the development of an image evaluation algorithm. Secondly, the iris recognition algorithm is introduced, including iris segmentation, image normalization, feature extraction, and code matching. The algorithm implementation architecture is based on an embedded dual-core processor platform, the Texas Instruments TMS320DM6446 evaluation module (DaVinci), which contains an ARM core and a DSP core in one chip. Thirdly, the evaluation experiments are performed on the est…

Multi-core processor; business.industry; Computer science; Feature extraction; Iris recognition; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Normalization (image processing); Segmentation; IRIS (biosensor); General Medicine; business; Digital signal processing; Computer hardware; IFAC Proceedings Volumes

researchProduct

GPU accelerated Monte Carlo simulation of the 2D and 3D Ising model

2009

The compute unified device architecture (CUDA) is a programming approach for performing scientific calculations on a graphics processing unit (GPU) as a data-parallel computing device. The programming interface allows algorithms to be implemented using extensions to the standard C language. With a continuously increasing number of cores in combination with high memory bandwidth, a recent GPU offers incredible resources for general-purpose computing. First, we apply this new technology to Monte Carlo simulations of the two-dimensional ferromagnetic square lattice Ising model. By implementing a variant of the checkerboard algorithm, results are obtained up to 60 times faster on the GPU than on a curren…

Numerical Analysis; Multi-core processor; Physics and Astronomy (miscellaneous); Computer science; Applied Mathematics; Monte Carlo method; Graphics processing unit; Square-lattice Ising model; Computer Science Applications; Computational science; Computational Mathematics; CUDA; Modeling and Simulation; Ising model; Statistical physics; General-purpose computing on graphics processing units; Lattice model (physics); Journal of Computational Physics

researchProduct
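
The checkerboard update mentioned in the entry above can be sketched on a CPU as follows. This is a minimal serial illustration, assuming a square lattice with periodic boundaries, an even side length, coupling J = 1 and no external field; the names `checkerboard_sweep` and `magnetization` are hypothetical, and the paper's actual CUDA variant is not reproduced here.

```python
import math
import random


def checkerboard_sweep(spins, beta, rng):
    """Run one Metropolis sweep over an n x n lattice, one sublattice at a time.

    Assumptions (not from the source): periodic boundaries, even n, J = 1.
    """
    n = len(spins)
    for color in (0, 1):
        # Sites of one color have no neighbors of the same color, so their
        # updates are independent; a GPU could assign one thread per site.
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != color:
                    continue
                s = spins[i][j]
                neighbors = (spins[(i - 1) % n][j] + spins[(i + 1) % n][j]
                             + spins[i][(j - 1) % n] + spins[i][(j + 1) % n])
                delta_e = 2 * s * neighbors  # energy change if s is flipped
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -s
    return spins


def magnetization(spins):
    """Return the mean spin of the lattice."""
    n = len(spins)
    return sum(map(sum, spins)) / (n * n)
```

A typical use would start from `spins = [[1] * 8 for _ in range(8)]`, call `checkerboard_sweep(spins, beta, random.Random(0))` repeatedly, and record `magnetization(spins)` after each sweep.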

Hybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS

2014

This is a post-peer-review, pre-copyedit version of an article published in Lecture Notes in Computer Science. The final authenticated version is available online at: https://doi.org/10.1007/978-3-319-09873-9_57 [Abstract] High-throughput genotyping technologies allow the collection of up to a few million genetic markers (such as SNPs) of an individual within a few minutes. Detecting epistasis, such as 2-SNP interactions, in Genome-Wide Association Studies is an important but time-consuming operation, since statistical computations have to be performed for each pair of measured markers. In this work we present EpistSearch, a parallelized tool that, following the log-linear model appr…

POSIX Threads; Multi-core processor; Bioinformatics; Computer science; Computation; CUDA; Parallel computing; Pthreads; Acceleration; ComputingMethodologies_PATTERNRECOGNITION; Titan (supercomputer); Filter (video); Epistasis; GWAS

researchProduct

WiseEye: A Platform to Manage and Experiment on Smart Camera Networks

2016

Embedded vision is probably on the verge of phenomenal expansion. Smart cameras embed processing units that are more and more powerful. In the last decade, high-speed image processing could be implemented on specifically designed architectures [1]; nevertheless, the design time of such systems was quite high, and so was the time to market. Since then, powerful chips (i.e. Systems on Chip) and quick prototyping methodologies have been constantly emerging [2],[3],[4], enabling more complex algorithms to be implemented faster. Moreover, smart cameras that embed flexible and powerful multi-core processors or Graphics Processing Units (GPUs) are now available and …

Real-time Image processing; fall detection; Smart Camera; Multi-core processor; GPU; smart building; [INFO.INFO-ES] Computer Science [cs]/Embedded Systems; control access; photoplethysmography

researchProduct

SWAPHI: Smith-Waterman Protein Database Search on Xeon Phi Coprocessors

2014

The maximal sensitivity of the Smith-Waterman (SW) algorithm has enabled its wide use in biological sequence database search. Unfortunately, the high sensitivity comes at the expense of quadratic time complexity, which makes the algorithm computationally demanding for big databases. In this paper, we present SWAPHI, the first parallelized algorithm employing Xeon Phi coprocessors to accelerate SW protein database search. SWAPHI is designed based on the scale-and-vectorize approach, i.e. it boosts alignment speed by effectively utilizing both the coarse-grained parallelism from the many co-processing cores (scale) and the fine-grained parallelism from the 512-bit wide single instruction, mul…

Smith–Waterman algorithm; FOS: Computer and information sciences; Multi-core processor; Coprocessor; Speedup; Sequence database; Computer science; Parallel computing; Intrinsics; Computer Science - Distributed Parallel and Cluster Computing; Scalability; SIMD; Distributed Parallel and Cluster Computing (cs.DC); Xeon Phi

researchProduct
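
The quadratic time complexity noted in the entry above comes from filling a dynamic-programming table with one cell per pair of sequence positions. The sketch below shows a plain scalar version of that recurrence. It assumes a simple match/mismatch score and a linear gap penalty, chosen here only for illustration; the function name `smith_waterman_score` and the default scores are hypothetical, and SWAPHI's vectorized implementation and scoring scheme are not reproduced.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Illustrative scoring (an assumption): fixed match/mismatch scores and a
    linear gap penalty. Runs in O(len(a) * len(b)) time.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best
```

Only two rows are kept in memory because each cell depends on the current and previous row alone. With the default scores, `smith_waterman_score("ACGT", "ACGT")` returns 8.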