Search results for " data structure."
showing 8 items of 88 documents
Adaptive reference-free compression of sequence quality scores
2014
Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing the vast datasets that are now routinely produced. Relatively little attention has been paid to compressing the quality scores that are assigned to each sequence, even though these scores may be harder to compress than the sequences themselves. By aggregating a set of reads into a compressed index, we find that the majority of bases can be predicted from the sequence of bases that are adjacent to them and hence are likely to be less informative for variant calling or other applications. The quality scores for such bases are aggressively compressed, leaving a relatively small number at full reso…
Repetitiveness Measures based on String Attractors and Burrows-Wheeler Transform: Properties and Applications
2023
$O(n^2 \log n)$ Time On-line Construction of Two-Dimensional Suffix Trees
2007
The two-dimensional suffix tree of an n × n square matrix A is a compacted trie that represents all square submatrices of A [11]. For the off-line case, i.e., A is given in advance to the algorithm, it is known how to build it in optimal time, for any type of alphabet size [11], [18]. Motivated by applications in Image Compression [22], Giancarlo and Guaiana [14] considered the on-line version of the two-dimensional suffix tree and presented an $O(n^2 \log^2 n)$-time algorithm, which we refer to as GG. That algorithm is a nontrivial generalization of Ukkonen’s on-line algorithm for standard suffix trees [23]. The main contribution in this paper is an $O(\log n)$ factor improvement in the time comple…
Decremental 2- and 3-connectivity on planar graphs
1996
We study the problem of maintaining the 2-edge-, 2-vertex-, and 3-edge-connected components of a dynamic planar graph subject to edge deletions. The 2-edge-connected components can be maintained in a total of $O(n \log n)$ time under any sequence of at most $O(n)$ deletions. This gives $O(\log n)$ amortized time per deletion. The 2-vertex- and 3-edge-connected components can be maintained in a total of $O(n \log^2 n)$ time. This gives $O(\log^2 n)$ amortized time per deletion. The space required by all our data structures is $O(n)$. All our time bounds improve previous bounds.
Visual Tools in Virtual Reality: Complex Environment
1994
In this paper, we analyse an integrated system able to merge graphical and vision techniques in order to improve the virtual space environment. Virtual space is characterized by Dynamic Visual Icons and Virtual Reality, giving rise to a heterogeneous environment. Essentially, we propose a fusion technique between visual icon and virtual space, where their integration is supported by Visual Icon Grammar (VIG) working on Dynamic Icon and Visual World. VIG allows testing the actions of the Dynamic Icon on the active Visual World metaphor at the time "t", and the different range of transactions between user and VW (visual query, view and browse of under-world, ...). Moreover, the user can define, modify and rem…
Lightweight LCP construction for next-generation sequencing datasets
2012
The advent of "next-generation" DNA sequencing (NGS) technologies has meant that collections of hundreds of millions of DNA sequences are now commonplace in bioinformatics. Knowing the longest common prefix array (LCP) of such a collection would facilitate the rapid computation of maximal exact matches, shortest unique substrings and shortest absent words. CPU-efficient algorithms for computing the LCP of a string have been described in the literature, but require the presence in RAM of large data structures. This prevents such methods from being feasible for NGS datasets. In this paper we propose the first lightweight method that simultaneously computes, via sequential scans, the LCP and B…
A Repository for Multirelational Dynamic Networks
2012
Nowadays, the WWW contains a number of social media sites, which are growing rapidly. One of the main features of social media sites is that they allow their users to create and modify the contents of the site through the offered WWW interfaces. Such contents are referred to as user generated contents and their type varies from site to site. Social media sites can be modeled as constantly evolving multirelational directed graphs. In this paper we discuss persistent data structures for such graphs, and present and analyze queries performed against the structures. We also estimate the space requirements of the proposed data structures, and compare them with the naive "store each complete snapshot…
A Generic Architecture for a Social Network Monitoring and Analysis System
2011
This paper describes the architecture and a partial implementation of a system designed for the monitoring and analysis of communities at social media sites. The main contribution of the paper is a novel system architecture that facilitates long-term monitoring of diverse social networks existing and emerging at various social media sites. It consists of three main modules: the crawler, the repository and the analyzer. The first module can be adapted to crawl different sites based on an ontology describing the structure of the site. The repository stores the crawled and analyzed persistent data using efficient data structures. It can be implemented using special purpose graph databases and/or …