6533b830fe1ef96bd1296e68
RESEARCH PRODUCT
Estimating the length of minimal spanning trees in compression of files
Olli NevalainenJarmo Ernvallsubject
Discrete mathematicsSpanning treeComputer Networks and CommunicationsApplied MathematicsShortest-path treeMinimum spanning treeConnected dominating setCombinatoricsComputational MathematicsGraph (abstract data type)Undirected graphSoftwareMathematicsofComputing_DISCRETEMATHEMATICSMathematicsMinimum degree spanning treedescription
Compression of a formatted file by a minimal spanning tree (MST) is studied. Here the records of the file are considered as the nodes of a weighted undirected graph. Each record pair is connected in the graph and the corresponding arc is weighted by the sum of field lengths of those fields which differ in the two records. The actual compression is made by constructing an MST of the graph and by storing it in an economic way to preserve the information of the file. The length of the MST is a useful measure in the estimation of the power of the compression. In the paper we study upper bounds of this length, especially in the case where the field lengths of the different fields may vary. The upper bounds are derived by analyzing the so-called Gray-code sequences of the records. These sequences may be considered as spanning paths of the graph and their lengths give upper bounds of the length of the MST. In the study we show how a short spanning path can be constructed in this way. The results are also experimentally tested.
year | journal | country | edition | language |
---|---|---|---|---|
1984-03-01 | BIT |