Search results for "data compression"
Showing 9 of 99 documents
The Alternating BWT: an algorithmic perspective
2020
The Burrows-Wheeler Transform (BWT) is a word transformation introduced in 1994 for data compression. It has become a fundamental tool for designing self-indexing data structures, with important applications in several areas of science and engineering. The Alternating Burrows-Wheeler Transform (ABWT) is another transformation, recently introduced in Gessel et al. (2012) [21] and studied in the field of Combinatorics on Words. It is analogous to the BWT, except that it uses an alternating lexicographical order instead of the usual one. Building on results in Giancarlo et al. (2018) [23], where we have shown that the BWT and ABWT are part of a larger class of reversible transformations, …
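As a reference point for the transformation discussed above, here is a minimal sketch of the plain BWT in Python (not the ABWT, whose alternating lexicographic order is the paper's subject); the '$' sentinel is an assumption added so the toy transform is uniquely reversible.

```python
# Minimal BWT sketch: sort all cyclic rotations of the input (with a sentinel
# appended) and emit the last column. The ABWT would replace this plain
# lexicographic sort with an alternating lexicographic order.
def bwt(text: str, sentinel: str = "$") -> str:
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)

print(bwt("banana"))  # annb$aa
```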
The Myriad Virtues of Suffix Trees
2006
Wavelet Trees were introduced in [Grossi, Gupta and Vitter, SODA ’03] and have been rapidly recognized as a very flexible tool for the design of compressed full-text indexes and data compressors. Although several papers have investigated the beauty and usefulness of this data structure in the full-text indexing scenario, its impact on data compression has not been fully explored. In this paper we provide a complete theoretical analysis of a wide class of compression algorithms based on Wavelet Trees. We also show how to improve their asymptotic performance by introducing a novel framework, called Generalized Wavelet Trees, that aims for the best combination of binary compressors (lik…
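For readers unfamiliar with the structure, a minimal wavelet-tree sketch in Python (plain halving of the alphabet, not the Generalized Wavelet Trees proposed in the paper): each node stores a bitmap routing lower-half symbols left and upper-half symbols right, which is enough to answer rank queries.

```python
# Minimal wavelet tree supporting rank(c, i): the number of occurrences of
# symbol c among the first i symbols of the sequence. Queried symbols are
# assumed to belong to the alphabet.
class WaveletNode:
    def __init__(self, seq, alphabet):
        self.alphabet = alphabet
        if len(alphabet) <= 1:
            self.bits = None                     # leaf: a single symbol
            return
        mid = len(alphabet) // 2
        lower = set(alphabet[:mid])
        self.bits = [0 if c in lower else 1 for c in seq]
        self.left = WaveletNode([c for c in seq if c in lower], alphabet[:mid])
        self.right = WaveletNode([c for c in seq if c not in lower], alphabet[mid:])

    def rank(self, c, i):
        if self.bits is None:
            return i
        mid = len(self.alphabet) // 2
        ones = sum(self.bits[:i])                # naive popcount; real trees use rank-ready bitvectors
        if c in self.alphabet[:mid]:
            return self.left.rank(c, i - ones)
        return self.right.rank(c, ones)

seq = "abracadabra"
wt = WaveletNode(seq, sorted(set(seq)))
print(wt.rank("a", 8))  # 4 occurrences of 'a' in "abracada"
```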
Comparative study of multi-2D, Full 3D and hybrid strategies for multi/hyperspectral image compression
2009
In this paper, we investigate appropriate strategies for multi/hyperspectral image compression. In particular, we compare the classic multi-2D compression strategy and two different implementations of 3D strategies (Full 3D and hybrid). All strategies are combined with a PCA decorrelation stage to optimize performance. For multi-2D and hybrid strategies, we propose a weighted version of PCA. Finally, for consistent evaluation, we propose a larger comparison framework than the conventionally used PSNR. The results are significant and show the weaknesses and strengths of each strategy.
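A hedged sketch of the spectral decorrelation stage in Python, assuming a cube laid out as (bands, rows, cols) and plain (unweighted) PCA; the weighted PCA proposed for the multi-2D and hybrid strategies is not reproduced here.

```python
import numpy as np

# Decorrelate the spectral dimension with PCA so that each resulting component
# image can be handed to a 2D (or 3D) coder.
def pca_decorrelate(cube: np.ndarray):
    bands, rows, cols = cube.shape
    X = cube.reshape(bands, -1).astype(np.float64)   # one row per spectral band
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean
    cov = Xc @ Xc.T / Xc.shape[1]                    # band-by-band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]                # strongest components first
    components = (eigvecs[:, order].T @ Xc).reshape(bands, rows, cols)
    return components, eigvecs[:, order], mean

cube = np.random.rand(16, 64, 64)                    # toy 16-band image
components, basis, mean = pca_decorrelate(cube)
print(components.shape)                              # (16, 64, 64)
```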
Maximum likelihood difference scaling of image quality in compression-degraded images.
2007
Lossy image compression techniques allow arbitrarily high compression rates, but at the price of poor image quality. We applied maximum likelihood difference scaling to evaluate the image quality of nine images, each compressed via vector quantization to ten different levels, within two different color spaces, RGB and CIE 1976 L*a*b*. In L*a*b* space, images could be compressed on average by 32% more than in RGB space, with little additional loss in quality. Further compression led to marked perceptual changes. Our approach permits a rapid, direct measurement of the consequences of image compression for human observers.
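A rough illustration of the kind of block vector quantization that produces such compression levels, assuming grayscale 4x4 blocks and a plain k-means codebook; the study's actual coder and its color-space pipeline are not reproduced here.

```python
import numpy as np

# Cluster 4x4 image blocks with k-means and represent each block by the index of
# its nearest codeword; smaller codebooks mean higher compression and lower quality.
def vq_compress(img: np.ndarray, block: int = 4, codebook_size: int = 64, iters: int = 10):
    h, w = img.shape
    blocks = (img[:h - h % block, :w - w % block]
              .reshape(h // block, block, w // block, block)
              .transpose(0, 2, 1, 3)
              .reshape(-1, block * block)
              .astype(np.float64))
    rng = np.random.default_rng(0)
    codebook = blocks[rng.choice(len(blocks), codebook_size, replace=False)]
    for _ in range(iters):                                         # plain k-means
        dist = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for k in range(codebook_size):
            if (labels == k).any():
                codebook[k] = blocks[labels == k].mean(0)
    return labels, codebook

img = np.random.rand(64, 64) * 255
labels, codebook = vq_compress(img, codebook_size=32)
print(labels.shape, codebook.shape)      # one index per block, 32 codewords of 16 pixels
```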
Massively Parallel Huffman Decoding on GPUs
2018
Data compression is a fundamental building block in a wide range of applications. Besides its intended purpose of saving valuable storage on hard disks, compression can be utilized to increase the effective bandwidth to attached storage, as realized by state-of-the-art file systems. In the foreseeable future, on-the-fly compression and decompression will gain utmost importance for the processing of data-intensive applications such as streamed Deep Learning tasks or Next Generation Sequencing pipelines, which establishes the need for fast parallel implementations. Huffman coding is an integral part of a number of compression methods. However, efficient parallel implementation of Huffman decompre…
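For context, a plain sequential Huffman coder in Python; the paper's contribution, decoding the resulting variable-length bitstream in parallel on a GPU, is the hard part and is not shown here.

```python
import heapq
from collections import Counter

# Build a Huffman code from symbol frequencies by repeatedly merging the two
# least frequent subtrees; each merge prefixes '0'/'1' to the codes below it.
def huffman_code(text: str) -> dict:
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]

codes = huffman_code("abracadabra")
encoded = "".join(codes[c] for c in "abracadabra")
print(codes, len(encoded), "bits")
```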
Adaptive reference-free compression of sequence quality scores
2014
Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing the vast datasets that are now routinely produced. Relatively little attention has been paid to compressing the quality scores that are assigned to each sequence, even though these scores may be harder to compress than the sequences themselves. By aggregating a set of reads into a compressed index, we find that the majority of bases can be predicted from the sequence of bases that are adjacent to them and hence are likely to be less informative for variant calling or other applications. The quality scores for such bases are aggressively compressed, leaving a relatively small number at full reso…
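A toy illustration of the underlying idea, assuming a simple k-mer context model rather than the paper's compressed index: positions whose base is fully determined by the preceding k-mer across the read set are treated as uninformative and their quality scores are collapsed to a single coarse symbol.

```python
from collections import defaultdict

def smooth_qualities(reads, quals, k=3, coarse="#"):
    following = defaultdict(set)                      # k-mer -> set of observed next bases
    for read in reads:
        for i in range(len(read) - k):
            following[read[i:i + k]].add(read[i + k])
    smoothed = []
    for read, q in zip(reads, quals):
        out = list(q)
        for i in range(k, len(read)):
            if len(following[read[i - k:i]]) == 1:    # base predictable from its context
                out[i] = coarse                        # aggressive: keep only a coarse value
        smoothed.append("".join(out))
    return smoothed

reads = ["ACGTACGT", "ACGTACGA"]
quals = ["IIIIIIII", "IIIIIIII"]
print(smooth_qualities(reads, quals))                  # scores at predictable positions collapsed
```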
Machine learning at the interface of structural health monitoring and non-destructive evaluation
2020
While both non-destructive evaluation (NDE) and structural health monitoring (SHM) share the objective of damage detection and identification in structures, they are distinct in many respects. This paper will discuss the differences and commonalities and consider ultrasonic/guided-wave inspection as a technology at the interface of the two methodologies. It will discuss how data-based/machine learning analysis provides a powerful approach to ultrasonic NDE/SHM in terms of the available algorithms, and more generally, how different techniques can accommodate the very substantial quantities of data that are provided by modern monitoring campaigns. Several machine learning methods will be illu…
Perceptual adaptive insensitivity for support vector machine image coding.
2005
Support vector machine (SVM) learning has recently been proposed for image compression in the frequency domain using a constant epsilon-insensitivity zone by Robinson and Kecman. However, according to the statistical properties of natural images and the properties of human perception, a constant insensitivity makes sense in the spatial domain but it is certainly not a good option in the frequency domain. In fact, in their approach they made a fixed low-pass assumption, since the number of discrete cosine transform (DCT) coefficients used in training was limited. This paper extends the work of Robinson and Kecman by proposing the use of adaptive insensitivity SVMs [2] for image coding u…
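A rough sketch of the idea of frequency-adaptive insensitivity, not the authors' exact formulation: the DCT coefficients of a block are divided by a crude perceptual weight before fitting an epsilon-SVR, so a constant epsilon in the weighted domain behaves like a frequency-dependent one; in a real coder only the support vectors would be stored. The CSF-like weight and all SVR parameters below are placeholders.

```python
import numpy as np
from scipy.fft import dctn
from sklearn.svm import SVR

def svr_code_block(block: np.ndarray, eps: float = 0.5):
    coeffs = dctn(block, norm="ortho").ravel()
    u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
    weight = 1.0 / (1.0 + u.ravel() + v.ravel())       # crude stand-in for a CSF weighting
    idx = np.arange(coeffs.size).reshape(-1, 1)        # regress coefficient value vs. its index
    svr = SVR(kernel="rbf", C=100.0, epsilon=eps, gamma=0.05)
    svr.fit(idx, coeffs * weight)
    approx = svr.predict(idx) / weight
    return svr.support_.size, approx.reshape(block.shape)

block = np.random.rand(8, 8) * 255
n_sv, approx = svr_code_block(block)
print(n_sv, "support vectors kept out of 64 coefficients")
```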
An extension of the Burrows-Wheeler Transform and applications to sequence comparison and data compression
2005
We introduce a generalization of the Burrows-Wheeler Transform (BWT) that can be applied to a multiset of words. The extended transformation, denoted by E, is reversible but, unlike the BWT, it is also surjective. The E transformation allows us to define a distance between two sequences, which we apply here to the problem of whole mitochondrial genome phylogeny. Moreover, we give some considerations on compressing a set of words by using the E transformation as a preprocessing step.
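A simplified sketch of the extended transformation on a multiset of words, assuming rotations are compared in omega-order by repeating them a sufficient number of times; reversibility bookkeeping and the careful treatment of non-primitive words are omitted.

```python
# Sort all rotations (conjugates) of all words as if each were repeated forever
# (omega-order) and emit the last character of each sorted rotation.
def ebwt(words):
    maxlen = max(len(w) for w in words)
    rotations = []
    for w in words:
        for i in range(len(w)):
            r = w[i:] + w[:i]
            key = r * (2 * maxlen // len(r) + 2)   # long enough to settle omega-order comparisons
            rotations.append((key, r))
    rotations.sort(key=lambda t: t[0])
    return "".join(r[-1] for _, r in rotations)

print(ebwt(["ACG", "TGACC"]))
```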