Search results for "scalability"
showing 10 items of 221 documents
A hybrid system for malware detection on big data
2018
In recent years, the increasing diffusion of malicious software has encouraged the adoption of advanced machine learning algorithms to detect new threats in a timely manner. A cloud-based approach makes it possible to exploit the big data produced by client agents to train such algorithms but, on the other hand, poses severe challenges to their scalability and performance. We propose a hybrid cloud-based malware detection system in which static and dynamic analyses are combined in order to find a good trade-off between response time and detection accuracy. Our system performs a continuous learning process of its models, based on deep networks, by exploiting the growing amount of data provided by clients. The prel…
3-Dimensional Hydrodynamic Interaction of a Supernova Remnant Shock with an Isolated Cloud
2006
We report on a computational key-project in astrophysics. The project is aimed at studying the interaction of a supernova shock wave with interstellar clouds. We describe the numerical code used, namely FLASH, a multi-dimensional astrophysical hydrodynamics code for parallel computers developed at the FLASH center (The University of Chicago); our team collaborates with, and contributes to, the FLASH project. We discuss the resources required for the whole project, the I/O management, the performance and the scalability of the code on IBM/Sp4 at CINECA. Finally, we present a selection of results. © 2005 IEEE.
Information Processing Schemes Based on Monolayer Protected Metallic Nanoclusters
2011
Nanostructures are potentially useful as building blocks to complement future electronics because of their high versatility and packing densities. The fabrication and characterization of particular nanostructures and the use of new theoretical tools to describe their properties are receiving much attention. However, the integration of these individual systems into general schemes that could perform simple tasks is also necessary because modern electronics operation relies on the concerted action of many basic units. We review here new conceptual schemes that can allow information processing with ligand or monolayer protected metallic nanoclusters (MPCs) on the basis of the experimentally de…
Monitoring wireless sensor networks through logical deductive processes
2006
This paper proposes a distributed multi-agent architecture for wireless sensor network management, which exploits the dynamic reasoning capabilities of the Situation Calculus in order to emulate the reactive behavior of a human expert to fault situations. The information related to network events is generated by tunable agents installed on the network nodes and is collected by a logical entity for network management, where it is merged with general domain knowledge, with the aim of identifying the root causes of faults and deciding on reparative actions. The logical inference system has been devised to carry out automated isolation, diagnosis, and, whenever possible, repair of network anoma…
SWAPHI: Smith-Waterman Protein Database Search on Xeon Phi Coprocessors
2014
The maximal sensitivity of the Smith-Waterman (SW) algorithm has enabled its wide use in biological sequence database search. Unfortunately, the high sensitivity comes at the expense of quadratic time complexity, which makes the algorithm computationally demanding for big databases. In this paper, we present SWAPHI, the first parallelized algorithm employing Xeon Phi coprocessors to accelerate SW protein database search. SWAPHI is designed based on the scale-and-vectorize approach, i.e. it boosts alignment speed by effectively utilizing both the coarse-grained parallelism from the many co-processing cores (scale) and the fine-grained parallelism from the 512-bit wide single instruction, mul…
Mapping of BLASTP Algorithm onto GPU Clusters
2011
Searching protein sequence databases is a fundamental and often repeated task in computational biology and bioinformatics. However, the high computational cost and long runtime of many database scanning algorithms on sequential architectures heavily restrict their application to large-scale protein databases, such as GenBank. The continuing exponential growth of sequence databases and the high rate of newly generated queries further aggravate the situation and establish a strong requirement for time-efficient, scalable database searching algorithms. In this paper, we demonstrate how GPU clusters, powered by the Compute Unified Device Architecture (CUDA), OpenMP, and MPI parallel programmi…
Finding near-perfect parameters for hardware and code optimizations with automatic multi-objective design space explorations
2012
Summary: In the design process of computer systems or processor architectures, typically many different parameters are exposed to configure, tune, and optimize every component of a system. For evaluations and before production, it is desirable to know the best setting for all parameters. Processing speed is no longer the only objective that needs to be optimized; power consumption, area, and so on have become very important. Thus, the best configurations have to be found with respect to multiple objectives. In this article, we use a multi-objective design space exploration tool called Framework for Automatic Design Space Exploration (FADSE) to automatically find near-optimal configurations in …
A parallel and sensitive software tool for methylation analysis on multicore platforms
2015
Motivation: DNA methylation analysis suffers from very long processing times, as the advent of Next-Generation Sequencers has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently either with the size of the dataset or with the length of the reads to be analyzed. As it is expected that the sequencers will provide longer and longer reads in the near future, efficient and scalable methylation software should be developed. Results: We present a new software tool, called HPG-Methyl, which efficiently maps bis…
Long read alignment based on maximal exact match seeds
2012
Motivation: The explosive growth of next-generation sequencing datasets poses a challenge to the mapping of reads to reference genomes in terms of alignment quality and execution speed. With the continuing progress of high-throughput sequencing technologies, read length is constantly increasing and many existing aligners are becoming inefficient as generated reads grow larger. Results: We present CUSHAW2, a parallelized, accurate, and memory-efficient long read aligner. Our aligner is based on the seed-and-extend approach and uses maximal exact matches as seeds to find gapped alignments. We have evaluated and compared CUSHAW2 to the three other long read aligners BWA-SW, Bowtie2 an…
Aspects Concerning SVM Method’s Scalability
2008
In recent years the quantity of text documents has been increasing continually, and automatic document classification has become an important challenge. In text document classification, the training step is essential to obtaining a good classifier. The quality of learning depends on the dimension of the training data. When working with huge training data sets, problems arise because the training time increases exponentially. In this paper we present a method that allows working with huge data sets in the training step without exponentially increasing the training time and without significantly decreasing the classification accuracy.