Search results for "DATA"
showing 10 items of 12992 documents
WarpDrive: Massively Parallel Hashing on Multi-GPU Nodes
2018
Hash maps are among the most versatile data structures in computer science because of their compact data layout and expected constant time complexity for insertion and querying. However, associated memory access patterns during the probing phase are highly irregular resulting in strongly memory-bound implementations. Massively parallel accelerators such as CUDA-enabled GPUs may overcome this limitation by virtue of their fast video memory featuring almost one TB/s bandwidth in comparison to main memory modules of state-of-the-art CPUs with less than 100 GB/s. Unfortunately, the size of hash maps supported by existing single-GPU hashing implementations is restricted by the limited amount of …
Massively Parallel Huffman Decoding on GPUs
2018
Data compression is a fundamental building block in a wide range of applications. Besides its intended purpose of saving valuable storage on hard disks, compression can be utilized to increase the effective bandwidth to attached storage as realized by state-of-the-art file systems. In the foreseeable future, on-the-fly compression and decompression will gain utmost importance for the processing of data-intensive applications such as streamed Deep Learning tasks or Next Generation Sequencing pipelines, which establishes the need for fast parallel implementations. Huffman coding is an integral part of a number of compression methods. However, efficient parallel implementation of Huffman decompre…
Smart urbanism
2020
A characteristic of intelligent cities is urban development in which local and national governments regulate land use plans in order to efficiently plan and control urban and real estate growth. This article describes a use case in urban licensing for the government of Colombia, in which a digitalization system was designed and developed for the public administration to collect, store and analyze data in order to control urban licensing in 37 cities of Colombia. The system allows visual and statistical analysis of the urban growth of the cities. The results obtained by the system have made it possible to improve the control and legality of the construction licenses in Colombia and has co…
A segmentation algorithm for noisy images
2005
This paper presents a segmentation algorithm for gray-level images and addresses issues related to its performance on noisy images. It formulates an image segmentation problem as a partition of a weighted image neighborhood hypergraph. To overcome the computational difficulty of directly solving this problem, a multilevel hypergraph partitioning has been used. To evaluate the algorithm, we have studied how noise affects the performance of the algorithm. The alpha-stable noise is considered and its effects on the algorithm are studied. Key words: graph, hypergraph, neighborhood hypergraph, multilevel hypergraph partitioning, image segmentation and noise removal.
Rings for Privacy: an Architecture for Large Scale Privacy-Preserving Data Mining
2021
This article proposes a new architecture for privacy-preserving data mining based on Multi Party Computation (MPC) and secure sums. While traditional MPC approaches rely on a small number of aggregation peers replacing a centralized trusted entity, the current study puts forth a distributed solution that involves all data sources in the aggregation process, with the help of a single server for storing intermediate results. A large-scale scenario is examined and the possibility that data become inaccessible during the aggregation process is considered, a possibility that traditional schemes often neglect. Here, it is explicitly examined, as it might be provoked by intermittent network connec…
Image retrieval system for citizen services using penalized logistic regression models
2020
This paper describes a procedure to deal with large image collections obtained by smart city services based on interaction with citizens providing pictures. The semantic gap between the low-level image features and represented concepts and situations has been addressed using image retrieval techniques. A relevance feedback procedure is proposed for Content-Based Image Retrieval (CBIR) based on the modelling of user responses. One of the novelties of the proposal is that the feedback learning procedure can use the information that citizens themselves can provide when using these services. The proposed algorithm considers the probability of an image belonging to the set of those sought by the …
Combining congested-flow isolation and injection throttling in HPC interconnection networks
2011
Existing congestion control mechanisms in interconnects can be divided into two general approaches. One is to throttle traffic injection at the sources that contribute to congestion, and the other is to isolate the congested traffic in specially designated resources. These two approaches have different but non-overlapping weaknesses. In this paper we present in detail a method that combines injection throttling and congested-flow isolation. Through simulation studies we first demonstrate the respective flaws of injection throttling and of flow isolation. Thereafter we show that our combined method extracts the best of both approaches in the sense that it gives fast reaction to congesti…
5G IoT system for real-time psycho-acoustic soundscape monitoring in smart cities
2020
In Next-Generation Technologies, the monitoring of environmental noise nuisance in the Smart City should be as efficient as possible. 5G IoT systems offer a great opportunity to offload the node calculation, as they provide a number of new concepts for dynamic computing that previous technologies did not offer. In this case, a complete 5G IoT system for psycho-acoustic monitoring has been implemented using different options to offload the calculation of the parameters to different parts of the system. This offloading has been implemented by directly computing the metrics in the node (as a Raspberry Pi), and in an ESP32 device (FiPy) and by sampling the audio and sending it to the EDGE in the…
SWMapper: Scalable Read Mapper on SunWay TaihuLight
2020
With the rapid development of next-generation sequencing (NGS) technologies, high-throughput sequencing platforms continuously produce large amounts of short-read DNA data at low cost. Read mapping is a performance-critical task, being one of the first stages required for many different types of NGS analysis pipelines. We present SWMapper, a scalable and efficient read mapper for the Sunway TaihuLight supercomputer. A number of optimization techniques are proposed to achieve high performance on its heterogeneous architecture which are centered around a memory-efficient succinct hash index data structure including seed filtration, duplicate removal, dynamic scheduling, asynchronous data tra…
A Stochastic Routing Algorithm for Distributed IoT with Unreliable Wireless Links
2016
Punctual and reliable transmission of collected information is indispensable for many Internet of Things (IoT) applications. Such applications rely on IoT devices operating over wireless communication links which are intrinsically unreliable. Consequently, improving packet delivery success while reducing delivery delay is a challenging task for data transmission in the IoT. In this paper, we propose an improved distributed stochastic routing algorithm to increase packet delivery ratio and decrease delivery delay in IoT with unreliable communication links. We adopt the concept of an absorbing Markov chain to model the network and evaluate the expected delivery ratio and expected delivery delay …