Search results for "distributed computing"
showing 10 items of 87 documents
Sorted deduplication: How to process thousands of backup streams
2016
The requirements of deduplication systems have changed in the last years. Early deduplication systems had to process dozens to hundreds of backup streams at the same time while today they are able to process hundreds to thousands of them. Traditional approaches rely on stream-locality, which supports parallelism, but which easily leads to many non-contiguous disk accesses, as each stream competes with all other streams for the available resources. This paper presents a new exact deduplication approach designed for processing thousands of backup streams at the same time on the same fingerprint index. The underlying approach destroys the traditionally exploited temporal chunk locality and cre…
DelveFS - An Event-Driven Semantic File System for Object Stores
2020
Data-driven applications are becoming increasingly important in numerous industrial and scientific fields, growing the need for scalable data storage, such as object storage. Yet, many data-driven applications cannot use object interfaces directly and often have to rely on third-party file system connectors that support only a basic representation of objects as files in a flat namespace. With sometimes millions of objects per bucket, this simple organization is insufficient for users and applications who are usually only interested in a small subset of objects. These huge buckets are not only lacking basic semantic properties and structure, but they are also challenging to manage from a tec…
Improving Collective I/O Performance Using Non-volatile Memory Devices
2016
Collective I/O is a parallel I/O technique designed to deliver high performance data access to scientific applications running on high-end computing clusters. In collective I/O, write performance is highly dependent upon the storage system response time and limited by the slowest writer. The storage system response time in conjunction with the need for global synchronisation, required during every round of data exchange and write, severely impacts collective I/O performance. Future Exascale systems will have an increasing number of processor cores, while the number of storage servers will remain relatively small. Therefore, the storage system concurrency level will further increase, worseni…
Effects and Benefits of Node Sharing Strategies in HPC Batch Systems
2019
Processor manufacturers today scale performance by increasing the number of cores on each CPU. Unfortunately, not all HPC applications can efficiently saturate all cores of a single node, even if they successfully scale to thousands of nodes. For these applications, sharing nodes with other applications can help to stress different resources on the nodes to more efficiently use them. Previous work has shown that the performance impact of node sharing is very application dependent but very little work has studied its effects within batch systems and for complex parallel application mixes. Administrators therefore typically fear the complexity of running a batch system supporting node sharing…
Development of a low-cost IoT system to detect and locate lightning strikes
2020
Lightning strikes are violent natural phenomena and can incur substantial costs, especially when they occur in urban areas. Identifying the specific geographic area where they strike is critically important for emergency services, which can then work more effectively by covering the affected area intensively. To achieve this aim, this paper proposes the design, prototype and validation of a distributed network of Internet of Things (IoT) devices to enable detection and location of lightning strikes. The IoT devices are equipped with lightning detection capabilities and are synchronized with the other devices in the sensor network. All of them cooperate within a network th…
Privacy-Preserving Overgrid: Secure Data Collection for the Smart Grid
2020
In this paper, we present a privacy-preserving scheme for Overgrid, a fully distributed peer-to-peer (P2P) architecture designed to automatically control and implement distributed Demand Response (DR) schemes in a community of smart buildings with energy generation and storage capabilities. To monitor the power consumption of the buildings, while respecting the privacy of the users, we extend our previous Overgrid algorithms to provide privacy preserving data aggregation (PP-Overgrid). This new technique combines a distributed data aggregation scheme with the Secure Multi-Party Computation paradigm. First, we use the energy profiles of hundreds of buildings, classifying the amount of “…
On the collision property of chaotic iterations based post-treatments over cryptographic pseudorandom number generators
2018
There is no single proper mathematical definition of chaos; instead there are a large number of definitions, each of which describes chaos in a more or less general context. Taking this into account, it is clear why it is hard to design an algorithm that produces random numbers, a kind of algorithm that could have plenty of concrete applications. However, we must use a finite state machine (e.g. a laptop) to produce such a sequence of random numbers; thus it is convenient, for obvious reasons, to redefine those aimed sequences as pseudorandom. Problems also arise with floating point arithmetic if one wants to recover some real chaotic property (i.e. propertie…
Quadratically Tight Relations for Randomized Query Complexity
2020
In this work we investigate the problem of quadratically tightly approximating the randomized query complexity of Boolean functions $R(f)$. The certificate complexity $C(f)$ is such a complexity measure for the zero-error randomized query complexity $R_0(f)$: $C(f) \le R_0(f) \le C(f)^2$. In the first part of the paper we introduce a new complexity measure, expectational certificate complexity $EC(f)$, which is also a quadratically tight bound on $R_0(f)$: $EC(f) \le R_0(f) = O(EC(f)^2)$. For $R(f)$, we prove that $EC(f)^{2/3} \le R(f)$. We then prove that $EC(f) \le C(f) \le EC(f)^2$ and show that there is a quadratic separation between the two, thus $EC(f)$ gives a tighter upper bound for $R_0(f)$. The measure is also related to the fractional…
A new Scheme for RPL to handle Mobility in Wireless Sensor Networks
2017
Mobile wireless sensor networks (WSNs) are characterised by dynamic changes in the network topology leading to route breaks and disconnections. The IPv6 routing protocol for low power and lossy networks (RPL), which has become a standard, uses the Trickle timer algorithm to handle changes in the network topology. However, neither RPL nor Trickle timer are well adapted to mobility. This paper investigates the problem of supporting mobility when using RPL. It enhances RPL to fit with sensors' mobility by studying two cases. Firstly, it proposes to modify RPL in order to fit with a dynamic and hybrid topology in the context of medical applications. Secondly, it investigates a more general case…
Checkpointing Workflows for Fail-Stop Errors
2017
We consider the problem of orchestrating the execution of workflow applications structured as Directed Acyclic Graphs (DAGs) on parallel computing platforms that are subject to fail-stop failures. The objective is to minimize expected overall execution time, or makespan. A solution to this problem consists of a schedule of the workflow tasks on the available processors and of a decision of which application data to checkpoint to stable storage, so as to mitigate the impact of processor failures. For general DAGs this problem is hopelessly intractable. In fact, given a solution, computing its expected makespan is still a difficult problem. To address this challenge,…