Search results for "scala"

showing 10 items of 1416 documents

Massively Parallel ANS Decoding on GPUs

2019

In recent years, graphics processors have enabled significant advances in the fields of big data and streamed deep learning. In order to keep control of rapidly growing amounts of data and to achieve sufficient throughput rates, compression features are a key part of many applications including popular deep learning pipelines. However, as most of the respective APIs rely on CPU-based preprocessing for decoding, data decompression frequently becomes a bottleneck in accelerated compute systems. This establishes the need for efficient GPU-based solutions for decompression. Asymmetric numeral systems (ANS) represent a modern approach to entropy coding, combining superior compression results wit…

020203 distributed computing; Computer science; 020206 networking & telecommunications; Data_CODINGANDINFORMATIONTHEORY; 02 engineering and technology; Parallel computing; CUDA; Scalability; 0202 electrical engineering electronic engineering information engineering; Codec; SIMD; Entropy encoding; Massively parallel; Decoding methods; Data compression
Proceedings of the 48th International Conference on Parallel Processing
researchProduct

Fault-Tolerant Network-on-Chip Design for Mesh-of-Tree Topology Using Particle Swarm Optimization

2018

As chip sizes scale down, the density of Intellectual Property (IP) cores integrated on a chip has increased rapidly. Communication between these IP cores on a chip is highly challenging. To overcome this issue, Network-on-Chip (NoC) has been proposed to provide an efficient and scalable communication architecture. At the deep sub-micron level, NoCs are prone to faults, which can occur in any NoC component. To build reliable and robust systems, it is necessary to apply efficient fault-tolerant techniques. In this paper, we present a flexible spare core placement in Mesh-of-Tree (MoT) topology using Particle Swarm Optimization (PSO) by considering IP core failures…

020203 distributed computing; Computer science; Distributed computing; Particle swarm optimization; Topology (electrical circuits); Fault tolerance; Hardware_PERFORMANCEANDRELIABILITY; 02 engineering and technology; Network topology; Chip; 020204 information systems; Scalability; Hardware_INTEGRATEDCIRCUITS; 0202 electrical engineering electronic engineering information engineering; Benchmark (computing); Overhead (computing)
TENCON 2018 - 2018 IEEE Region 10 Conference
researchProduct

WarpDrive: Massively Parallel Hashing on Multi-GPU Nodes

2018

Hash maps are among the most versatile data structures in computer science because of their compact data layout and expected constant time complexity for insertion and querying. However, the associated memory access patterns during the probing phase are highly irregular, resulting in strongly memory-bound implementations. Massively parallel accelerators such as CUDA-enabled GPUs may overcome this limitation by virtue of their fast video memory, which offers almost one TB/s of bandwidth compared with less than 100 GB/s for the main memory modules of state-of-the-art CPUs. Unfortunately, the size of hash maps supported by existing single-GPU hashing implementations is restricted by the limited amount of …

020203 distributed computing; Computer science; Hash function; 0102 computer and information sciences; 02 engineering and technology; Parallel computing; Data structure; 01 natural sciences; Hash table; Electronic mail; Memory management; 010201 computation theory & mathematics; Scalability; 0202 electrical engineering electronic engineering information engineering; Massively parallel; Time complexity
2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
researchProduct

SWMapper: Scalable Read Mapper on SunWay TaihuLight

2020

With the rapid development of next-generation sequencing (NGS) technologies, high-throughput sequencing platforms continuously produce large amounts of short-read DNA data at low cost. Read mapping is a performance-critical task, being one of the first stages required for many different types of NGS analysis pipelines. We present SWMapper, a scalable and efficient read mapper for the Sunway TaihuLight supercomputer. A number of optimization techniques, centered around a memory-efficient succinct hash index data structure, are proposed to achieve high performance on its heterogeneous architecture, including seed filtration, duplicate removal, dynamic scheduling, asynchronous data tra…

020203 distributed computing; Speedup; Xeon; Computer science; Hash function; 020206 networking & telecommunications; 02 engineering and technology; Parallel computing; Supercomputer; Data structure; DNA sequencing; chemistry.chemical_compound; chemistry; Scalability; 0202 electrical engineering electronic engineering information engineering; DNA; Sunway TaihuLight
49th International Conference on Parallel Processing - ICPP
researchProduct

A Distributed Multi-Authority Attribute Based Encryption Scheme for Secure Sharing of Personal Health Records

2017

Personal health records (PHR) are an emerging health information exchange model, which helps PHR owners manage their health data efficiently. Typically, PHRs are outsourced and stored in third-party cloud platforms. Although outsourcing private health data to third-party platforms is an appealing solution for PHR owners, it may lead to significant privacy concerns, because there is a higher risk of leaking private data to unauthorized parties. As a way of ensuring PHR owners' control of their outsourced PHR data, attribute based encryption (ABE) mechanisms have been considered because such schemes facilitate a mechanism of sharing encrypted data among a set of intende…

020205 medical informatics; Revocation; business.industry; Computer science; Internet privacy; Cloud computing; Access control; Health information exchange; 02 engineering and technology; Encryption; Computer security; computer.software_genre; Outsourcing; Scalability; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Attribute-based encryption; business; computer
Proceedings of the 22nd ACM on Symposium on Access Control Models and Technologies
researchProduct

Scalable implementation of measuring distances in a Riemannian manifold based on the Fisher Information metric

2019

This paper focuses on the scalability of the Fisher Information manifold by applying techniques of distributed computing. The main objective is to investigate methodologies to improve two bottlenecks associated with the measurement of distances in a Riemannian manifold formed by the Fisher Information metric. The first bottleneck is the quadratic increase in the number of pairwise distances. The second is the computation of global distances, approximated through a fully connected network of the observed pairwise distances, where the challenge is the computation of the all-sources shortest path (ASSP). The scalable implementation for the pairwise distances is performed in Spark. The scalable…

0209 industrial biotechnology; Computer science; 02 engineering and technology; Riemannian manifold; Bottleneck; Manifold; symbols.namesake; 020901 industrial engineering & automation; Shortest path problem; Spark (mathematics); Scalability; 0202 electrical engineering electronic engineering information engineering; symbols; 020201 artificial intelligence & image processing; Fisher information; Algorithm; Dijkstra's algorithm; Fisher information metric
2019 International Joint Conference on Neural Networks (IJCNN)
researchProduct

Scalability of GPU-Processed 3D Distance Maps for Industrial Environments

2018

This paper contains a benchmark analysis of the open-source library GPU-Voxels together with the Robot Operating System (ROS) in a large-scale industrial robotics environment. Six sensor nodes with embedded computing generate real-time point cloud data as ROS topics. The overall data from all sensor nodes is processed by a combination of CPU and GPU on a central ROS node. Experimental results demonstrate that the system is able to handle frame rates of 10 and 20 Hz with voxel sizes of 4, 6, 8 and 12 cm without saturation of the CPU or the GPU used by the GPU-Voxels library. The results in this paper show that ROS, in combination with GPU-Voxels, can be used as a viable solution for real-time …

0209 industrial biotechnology; Computer science; Node (networking); Point cloud; 02 engineering and technology; computer.software_genre; Frame rate; Computational science; 020901 industrial engineering & automation; Voxel; Scalability; 0202 electrical engineering electronic engineering information engineering; Benchmark (computing); 020201 artificial intelligence & image processing; Collision detection; Central processing unit; computer; ComputingMethodologies_COMPUTERGRAPHICS
2018 14th IEEE/ASME International Conference on Mechatronic and Embedded Systems and Applications (MESA)
researchProduct

Do Randomized Algorithms Improve the Efficiency of Minimal Learning Machine?

2020

Minimal Learning Machine (MLM) is a recently popularized supervised learning method, which is composed of distance-regression and multilateration steps. The computational complexity of MLM is dominated by the solution of an ordinary least-squares problem. Several different solvers can be applied to the resulting linear problem. In this paper, a thorough comparison of possible and recently proposed, especially randomized, algorithms is carried out for this problem with a representative set of regression datasets. In addition, we compare MLM with shallow and deep feedforward neural network models and study the effects of the number of observations and the number of features with a special dat…

0209 industrial biotechnology; random projection; lcsh:Computer engineering. Computer hardware; Computational complexity theory; Computer science; Random projection; lcsh:TK7885-7895; 02 engineering and technology; Machine learning; computer.software_genre; supervised learning; approximate algorithms; Set (abstract data type); regressioanalyysi; 020901 industrial engineering & automation; distance–based regression; algoritmit; 0202 electrical engineering electronic engineering information engineering; ordinary least–squares; business.industry; Supervised learning; singular value decomposition; minimal learning machine; Multilateration; projektio; Randomized algorithm; koneoppiminen; machine learning; Scalability; Feedforward neural network; 020201 artificial intelligence & image processing; Artificial intelligence; approksimointi; business; computer
Machine Learning and Knowledge Extraction
researchProduct

Secure and efficient verification for data aggregation in wireless sensor networks

2017

The Internet of Things (IoT) concept is, and will be, one of the most interesting topics in the field of Information and Communications Technology. Covering a wide range of applications, wireless sensor networks (WSNs) can play an important role in IoT through seamless integration of thousands of sensors. The benefits of using WSNs in IoT include integrity, scalability, robustness, and ease of deployment. In WSNs, data aggregation is a well-known technique, which, on one hand, plays an essential role in energy preservation and, on the other hand, makes the network prone to different kinds of attacks. The detection of false data injection and impersonation attacks is one of the majo…

021110 strategic defence & security studies; Computer Networks and Communications; business.industry; Computer science; ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS; 0211 other engineering and technologies; Early detection; 020206 networking & telecommunications; 02 engineering and technology; Impersonation attack; Computer Science Applications; Data aggregator; Robustness (computer science); Information and Communications Technology; Software deployment; Scalability; 0202 electrical engineering electronic engineering information engineering; business; Wireless sensor network; Computer network
International Journal of Network Management
researchProduct

Deduplication Potential of HPC Applications’ Checkpoints

2016

HPC systems contain an increasing number of components, decreasing the mean time between failures. Checkpoint mechanisms help to overcome such failures for long-running applications. A viable solution to remove the resulting pressure from the I/O backends is to deduplicate the checkpoints. However, little is known about the potential to save I/Os for HPC applications by using deduplication within the checkpointing process. In this paper, we perform a broad study of the deduplication behavior of HPC application checkpointing and its impact on system design.

0301 basic medicine; 03 medical and health sciences; 030104 developmental biology; Computer science; Distributed computing; Scalability; Data_FILES; Redundancy (engineering); Data deduplication; Application checkpointing
2016 IEEE International Conference on Cluster Computing (CLUSTER)
researchProduct