Search results for "Indexing"
showing 10 items of 94 documents
A robust blind 3-D mesh watermarking based on wavelet transform for copyright protection
2019
Nowadays, three-dimensional meshes are used extensively in several applications, such as industrial, medical, computer-aided design (CAD) and entertainment, due to the improved processing capability of computers and the development of network infrastructure. Unfortunately, like digital images and videos, 3-D meshes can be easily modified, duplicated and redistributed by unauthorized users. Digital watermarking emerged as a way to address this problem. In this paper, we propose a blind robust watermarking scheme for three-dimensional semiregular meshes for copyright protection. The watermark is embedded by modifying the norm of the wavelet coefficient vectors associated with th…
Novel Results on the Number of Runs of the Burrows-Wheeler-Transform
2021
The Burrows-Wheeler-Transform (BWT), a reversible string transformation, is one of the fundamental components of many current data structures in string processing. It is central in data compression, as well as in efficient query algorithms for sequence data, such as webpages, genomic and other biological sequences, or indeed any textual data. The BWT lends itself well to compression because its number of equal-letter-runs (usually referred to as $r$) is often considerably lower than that of the original string; in particular, it is well suited for strings with many repeated factors. In fact, much attention has been paid to the $r$ parameter as a measure of repetitiveness, especially to evalua…
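To make the $r$ parameter from the abstract above concrete, here is a minimal Python sketch that builds the BWT of a short string and counts its equal-letter runs. It is an illustration under stated assumptions, not the paper's method: the function names `bwt` and `count_runs` are hypothetical, the `$` end-of-string sentinel is an assumed convention, and the sorted-rotations construction is the naive textbook approach, suitable only for small inputs.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort all rotations of text + sentinel, take the last column.

    The sentinel is an assumed convention (it must not occur in the input
    and must sort before every other character used).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_runs(s: str) -> int:
    """Count maximal runs of equal letters, e.g. 'aabbba' has 3 runs."""
    if not s:
        return 0
    # A new run starts at every position whose letter differs from the previous one.
    return 1 + sum(1 for a, b in zip(s, s[1:]) if a != b)


if __name__ == "__main__":
    word = "banana"
    transformed = bwt(word)
    print(transformed, count_runs(transformed), count_runs(word + "$"))
```

For a repetitive input such as `banana`, the transformed string groups equal letters together, so its run count is lower than that of the original string with the sentinel appended.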
Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform
2012
Motivation: The Burrows-Wheeler transform (BWT) is the foundation of many algorithms for compression and indexing of text data, but the cost of computing the BWT of very large string collections has prevented these techniques from being widely applied to the large sets of sequences often encountered as the outcome of DNA sequencing experiments. In previous work, we presented a novel algorithm that allows the BWT of human genome scale data to be computed on very moderate hardware, thus enabling us to investigate the BWT as a tool for the compression of such datasets. Results: We first used simulated reads to explore the relationship between the level of compression and the error rate, the leng…
Languages with mismatches and an application to approximate indexing
2005
In this paper we describe a factorial language, denoted by L(S, k, r), that contains all words that occur in a string S up to k mismatches every r symbols. Then we give some combinatorial properties of a parameter, called repetition index and denoted by R(S, k, r), defined as the smallest integer h ≥ 1 such that all strings of this length occur at most in a unique position of the text S up to k mismatches every r symbols. We prove that R(S, k, r) is a non-increasing function of r and a non-decreasing function of k and that the equation r = R(S, k, r) admits a unique solution. The repetition index plays an important role in the construction of an indexing data structure based on a trie that rep…
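The definition of the repetition index in the abstract above can be illustrated for the simplest case only. The sketch below assumes k = 0 (no mismatches allowed), in which case the parameter r plays no role and the index reduces to the smallest length h at which every factor of the text occurs at most once. The function name `repetition_index_exact` is hypothetical, and the brute-force search is not the paper's construction.

```python
def repetition_index_exact(text: str) -> int:
    """Smallest h >= 1 such that every length-h factor of text occurs at most once.

    Assumption: this covers only the exact-matching case (k = 0) of the
    repetition index; the general case with mismatches is not handled.
    """
    if not text:
        raise ValueError("text must be non-empty")
    n = len(text)
    for h in range(1, n + 1):
        seen = set()
        unique = True
        for i in range(n - h + 1):
            factor = text[i:i + h]
            if factor in seen:
                unique = False  # this factor occurs at two positions
                break
            seen.add(factor)
        if unique:
            return h
    return n  # not reached: the full string always occurs exactly once


if __name__ == "__main__":
    print(repetition_index_exact("banana"))
```

In `banana`, the factor `ana` occurs at two positions, so lengths 1 to 3 all fail and the sketch returns 4.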
Fauna Europaea: Hymenoptera - Apocrita (excl. Ichneumonoidea)
2015
Fauna Europaea provides a public web-service with an index of scientific names (including important synonyms) of all living European land and freshwater animals, their geographical distribution at country level (up to the Urals, excluding the Caucasus region), and some additional information. The Fauna Europaea project covers about 230,000 taxonomic names, including 130,000 accepted species and 14,000 accepted subspecies. This represents a huge effort by more than 400 contributing specialists throughout Europe and is a unique (standard) reference suitable for many users in science, government, industry, nature conservation and education. Hymenoptera is one of the four largest orders of inse…
PESI - a taxonomic backbone for Europe
2015
Reliable taxonomy underpins communication in all of biology, not least nature conservation and sustainable use of ecosystem resources. The flexibility of taxonomic interpretations, however, presents a serious challenge for end-users of taxonomic concepts. Users need standardised and continuously harmonised taxonomic reference systems, as well as high-quality and complete taxonomic data sets, but these are generally lacking for non-specialists. The solution is in dynamic, expertly curated web-based taxonomic tools. The Pan-European Species-directories Infrastructure (PESI) worked to solve this key issue by providing a taxonomic e-infrastructure for Europe. It strengthened the relevant social (…
Fauna Europaea: Helminths (Animal Parasitic)
2014
The Laotian Rock Rat Laonastes aenigmamus Jenkins, Kilpatrick, Robinson & Timmins, 2005 was originally discovered in Lao People's Democratic Republic in 2005. This species has been recognized as the sole surviving member of the otherwise extinct rodent family Diatomyidae. Laonastes aenigmamus was initially reported only in limestone forests of Khammouane Province, Central Lao. A second population was recently discovered in Phong Nha Ke Bang National Park (PNKB NP), Quang Binh Province, Central Vietnam in 2011. The confirmed distribution range of L. aenigmamus in Vietnam is very small, approximately 150 km², covering low karst mountains in five communes of Minh Hoa District, Quang Binh Provi…
Fauna Europaea: Diptera – Brachycera
2015
Fauna Europaea provides a public web-service with an index of scientific names (including important synonyms) of all extant multicellular European terrestrial and freshwater animals and their geographical distribution at the level of countries and major islands (east of the Urals and excluding the Caucasus region). The Fauna Europaea project comprises about 230,000 taxonomic names, including 130,000 accepted species and 14,000 accepted subspecies, which is much more than the originally projected number of 100,000 species. Fauna Europaea represents a huge effort by more than 400 contributing taxonomic specialists throughout Europe and is a unique (standard) reference suitable for many user c…
Sorted deduplication: How to process thousands of backup streams
2016
The requirements of deduplication systems have changed in recent years. Early deduplication systems had to process dozens to hundreds of backup streams at the same time, while today they are able to process hundreds to thousands of them. Traditional approaches rely on stream-locality, which supports parallelism, but which easily leads to many non-contiguous disk accesses, as each stream competes with all other streams for the available resources. This paper presents a new exact deduplication approach designed for processing thousands of backup streams at the same time on the same fingerprint index. The underlying approach destroys the traditionally exploited temporal chunk locality and cre…
A two-armed bandit collective for hierarchical examplar based mining of frequent itemsets with applications to intrusion detection
2014
Published version of a chapter in the book: Transactions on Computational Collective Intelligence XIV. Also available from the publisher at: http://dx.doi.org/10.1007/978-3-662-44509-9_1 In this paper we address the above problem by posing frequent item-set mining as a collection of interrelated two-armed bandit problems. We seek to find itemsets that frequently appear as subsets in a stream of itemsets, with the frequency being constrained to support granularity requirements. Starting from a randomly or manually selected examplar itemset, a collective of Tsetlin automata based two-armed bandit players - one automaton for each item in the examplar - learns which items should be included in …