Search results for " computing"

showing 10 items of 2075 documents

CUDA-enabled Sparse Matrix–Vector Multiplication on GPUs using atomic operations

2013

We propose the Sliced Coordinate Format (SCOO) for Sparse Matrix-Vector Multiplication on GPUs. An associated CUDA implementation which takes advantage of atomic operations is presented. We propose partitioning methods to transform a given sparse matrix into SCOO format. An efficient Dual-GPU implementation which overlaps computation and communication is described. Extensive performance comparisons of SCOO with other formats on GPUs and CPUs are provided. Existing formats for Sparse Matrix-Vector Multiplication (SpMV) on the GPU are outperforming their corresponding implementations on multi-core CPUs. In this paper, we present a new format called Sliced COO (SCOO) and an efficient CUDA i…

Speedup; Computer Networks and Communications; Computer science; Sparse matrix-vector multiplication; Parallel computing; Computer Graphics and Computer-Aided Design; Theoretical Computer Science; Matrix (mathematics); CUDA; Artificial Intelligence; Hardware and Architecture; Benchmark (computing); Multiplication; General-purpose computing on graphics processing units; Software; Sparse matrix; Parallel Computing

researchProduct

Finding near-perfect parameters for hardware and code optimizations with automatic multi-objective design space explorations

2012

Summary: In the design process of computer systems or processor architectures, typically many different parameters are exposed to configure, tune, and optimize every component of a system. For evaluations and before production, it is desirable to know the best setting for all parameters. Processing speed is no longer the only objective that needs to be optimized; power consumption, area, and so on have become very important. Thus, the best configurations have to be found with respect to multiple objectives. In this article, we use a multi-objective design space exploration tool called Framework for Automatic Design Space Exploration (FADSE) to automatically find near-optimal configurations in …

Speedup; Computer Networks and Communications; Design space exploration; Computer science; business.industry; Parallel computing; Program optimization; Multi-objective optimization; Computer Science Applications; Theoretical Computer Science; Microarchitecture; Computational Theory and Mathematics; Scalability; Code (cryptography); Engineering design process; business; Software; Computer hardware; Concurrency and Computation: Practice and Experience

researchProduct

CliffoSor: A Parallel Embedded Architecture for Geometric Algebra and Computer Graphics

2006

Geometric object representations and their transformations are the two key aspects in computer graphics applications. Traditionally, compute-intensive matrix calculations are involved to model and render 3D scenery. Geometric algebra (a.k.a. Clifford algebra) is gaining growing attention for its natural way to model geometric facts, coupled with its being a powerful analytical tool for symbolic calculations. In this paper, the architecture of CliffoSor (Clifford Processor) is introduced. CliffoSor is an embedded parallel coprocessing core that offers direct hardware support to Clifford algebra operators. A prototype implementation on an FPGA board is detailed. Initial test results show more …

Speedup; Computer science; Clifford algebra; Solid modeling; Parallel computing; Computational geometry; Application software; computer.software_genre; Computational science; Computer graphics; Geometric algebra; ComputingMethodologies_SYMBOLICANDALGEBRAICMANIPULATION; Representation (mathematics); computer

researchProduct

cuBool: Bit-Parallel Boolean Matrix Factorization on CUDA-Enabled Accelerators

2018

Boolean Matrix Factorization (BMF) is a commonly used technique in the field of unsupervised data analytics. The goal is to decompose a ground truth matrix C into a product of two matrices A and B, being either an exact or approximate rank k factorization of C. Both exact and approximate factorization are time-consuming tasks due to their combinatorial complexity. In this paper, we introduce a massively parallel implementation of BMF, namely cuBool, in order to significantly speed up factorization of huge Boolean matrices. Our approach is based on alternately adjusting rows and columns of A and B using thousands of lightweight CUDA threads. The massively parallel manipulation of entries …

Speedup; Rank (linear algebra); Computer science; 02 engineering and technology; Parallel computing; Matrix decomposition; CUDA; Matrix (mathematics); Factorization; 020204 information systems; Singular value decomposition; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Massively parallel; Integer (computer science); 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS)

researchProduct

Reconfigurable Accelerator for the Word-Matching Stage of BLASTN

2013

BLAST is one of the most popular sequence analysis tools used by molecular biologists. It is designed to efficiently find similar regions between two sequences that have biological significance. However, because the size of genomic databases is growing rapidly, the computation time of BLAST, when performing a complete genomic database search, is continuously increasing. Thus, there is a clear need to accelerate this process. In this paper, we present a new approach for genomic sequence database scanning utilizing reconfigurable field programmable gate array (FPGA)-based hardware. In order to derive an efficient structure for BLASTN, we propose a reconfigurable architecture to accelerate the…

Speedup; Sequence database; Hardware and Architecture; Computer science; Sequence analysis; Genomics; Parallel computing; Electrical and Electronic Engineering; Data structure; Genomic databases; Software; Reconfigurable computing; Word (computer architecture); IEEE Transactions on Very Large Scale Integration (VLSI) Systems

researchProduct

Quantum Machine Learning: A tutorial

2021

This tutorial provides an overview of Quantum Machine Learning (QML), a relatively novel discipline that brings together concepts from Machine Learning (ML), Quantum Computing (QC) and Quantum Information (QI). The great development experienced by QC, partly due to the involvement of giant technology companies, as well as the popularity and success of ML, have been responsible for making QML one of the main streams for researchers working on the fuzzy borders between Physics, Mathematics and Computer Science. A possible, although arguably coarse, classification of QML methods may be based on those approaches that make use of ML in a quantum experimentation environment and those others that take…

Speedup; Theoretical computer science; Quantum machine learning; Computer science; Cognitive Neuroscience; Quantum reinforcement learning; Quantum computing; Fuzzy logic; Popularity; Computer Science Applications; Computational speed-up; Development (topology); Artificial Intelligence; Quantum clustering; Quantum information; Quantum; Quantum-inspired learning algorithms; Quantum computer; Quantum autoencoders

researchProduct

Alignment-Free Sequence Comparison over Hadoop for Computational Biology

2015

Sequence comparison, i.e., the assessment of how similar two biological sequences are to each other, is a fundamental and routine task in Computational Biology and Bioinformatics. Classically, alignment methods are the de facto standard for such an assessment. In fact, considerable research efforts for the development of efficient algorithms, both on classic and parallel architectures, have been carried out in the past 50 years. Due to the growing amount of sequence data being produced, a new class of methods has emerged: alignment-free methods. Research in this area has become very intense in the past few years, stimulated by the advent of Next Generation Sequencing technologies, since those…

Speedup; Theoretical computer science; Settore INF/01 - Informatica; Computer science; Alignment-free sequence comparison and analysis; Distributed computing; Hadoop; MapReduce; Software; Mathematics (all); Hardware and Architecture; Sequence alignment; Context (language use); Computational biology; DNA sequencing; Task (project management); Relevance (information retrieval); Pattern matching

researchProduct

Fast spiking neural network architecture for low-cost FPGA devices

2012

Spiking Neural Networks (SNN) consist of fully interconnected computation units (neurons) based on spike processing. This type of network resembles those found in biological systems studied by neuroscientists. This paper shows a hardware implementation for SNN. First, SNN require the inputs to be spikes, so a conversion system (encoding) from digital values into spikes is necessary. For travelling spikes, each neuron interconnection is characterized by weights and delays, requiring internal neuron processing by a Postsynaptic Potential (PSP) function and membrane potential threshold evaluation for postsynaptic output spike generation. In order to model a real biological system by arti…

Spiking neural network; Reduction (complexity); Interconnection; Computer science; business.industry; Computation; Encoding (memory); Real-time computing; Spike (software development); Function (mathematics); Field-programmable gate array; business; Computer hardware; 7th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC)

researchProduct

Invariant aspects in M-commerce environments

2005

Mobile phones and other small and powerful portable devices have revolutionized personal communication and affected the lifestyles of people in the industrialized world. According to credible estimates, in a few years there will be over two billion such portable devices in use. An emerging trend is electronic commerce performed using mobile terminals over wireless networks, often called mobile commerce or M-commerce. Mobile commerce environments are characterized by high complexity, including myriads of technical and organizational aspects. This property makes it difficult to distinguish the more fundamental issues, structures, and concepts in mobile commerce from the hype. To captu…

Standardization; business.industry; Computer science; Wireless network; Mobile commerce; Mobile computing; Business model; High complexity; media_common.cataloged_instance; Invariant (mathematics); European union; Telecommunications; business; media_common; Proceedings of the 6th international conference on Mobile data management

researchProduct

A Network Formation Game Approach to Study BitTorrent Tit-for-Tat

2007

The Tit-for-Tat strategy implemented in BitTorrent (BT) clients is generally considered robust to selfish behaviours. The authors of [1] support this belief by studying how Tit-for-Tat can affect selfish peers who are able to set their upload bandwidth. They show that there is a "good" Nash Equilibrium at which each peer uploads at the maximum rate. In this paper we consider a different game where BT clients can change the number of connections to open in order to improve their performance. We study this game using the analytical framework of network formation games [2]. In particular, we characterize the set of pairwise stable networks the peers can form and how the peers can dynamically reach…

Star network; Computer science; business.industry; Distributed computing; computer.file_format; Network formation; Tit for tat; Upload; symbols.namesake; Nash equilibrium; symbols; Pairwise comparison; Set (psychology); business; BitTorrent; computer; Computer network

researchProduct