Search results for "indexing"
showing 10 items of 88 documents
From Nerode's congruence to Suffix Automata with mismatches
2009
In this paper we focus on the minimal deterministic finite automaton Sk that recognizes the set of suffixes of a word w up to k errors. As a first result, we give a characterization of Nerode's right-invariant congruence associated with Sk. This result generalizes the classical characterization described in [A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M. Chen, J. Seiferas, The smallest automaton recognizing the subwords of a text, Theoretical Computer Science, 40, 1985, 31–55]. As a second result, we present an algorithm that uses Sk to efficiently accept the language of all suffixes of w up to k errors in every window of size r of a text, where r is the…
Comparing DNA sequence collections by direct comparison of compressed text indexes
2012
Popular sequence alignment tools such as BWA convert a reference genome to an indexing data structure based on the Burrows-Wheeler Transform (BWT), from which matches to individual query sequences can be rapidly determined. However, the utility of also indexing the query sequences themselves remains relatively unexplored. Here we show that an all-against-all comparison of two sequence collections can be computed from the BWT of each collection with the BWTs held entirely in external memory, i.e. on disk and not in RAM. As an application of this technique, we show that BWTs of transcriptomic and genomic reads can be compared to obtain reference-free predictions of splice junctions that have h…
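The BWT-based matching that the abstract attributes to tools such as BWA can be illustrated with a minimal sketch. This is not the paper's external-memory, all-against-all method; it only shows how a BWT supports counting matches of a query via backward search. The function names `bwt` and `count_matches`, the `$` sentinel, and the naive rotation sort are assumptions made for illustration.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler Transform: sort all rotations, take last column.

    Assumes the sentinel does not occur in the text and sorts before
    every other character (true for "$" versus ASCII letters).
    """
    assert sentinel not in text
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def count_matches(bwt_str, pattern):
    """Count occurrences of a non-empty pattern using backward search."""
    # first[c] = number of characters in the text strictly smaller than c
    counts = {}
    for ch in bwt_str:
        counts[ch] = counts.get(ch, 0) + 1
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]

    def occ(ch, i):
        # Occurrences of ch in bwt_str[:i]; a linear scan kept for clarity.
        # Real indexes precompute sampled rank tables instead.
        return bwt_str[:i].count(ch)

    lo, hi = 0, len(bwt_str)  # current range of sorted rotations
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ(ch, lo)
        hi = first[ch] + occ(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo
```

For example, `count_matches(bwt("GATTACAGATTACA"), "TTA")` counts how often the query `TTA` occurs in the indexed sequence without scanning the sequence itself.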
La iconografía en la era digital: hacia una heurística para el estudio del contenido de las imágenes medievales [Iconography in the digital age: towards a heuristic for studying the content of medieval images]
2014
This article invites reflection on the validity of iconography as a method for describing the themes represented in medieval artworks. Here, greater emphasis is placed on investigating its epistemological implications and the biases that result from the process of transforming images into words. The main objective is to bring to light the so-called 'semantic gap', a kind of barrier that prevents a non-lexical medium, such as the visual, from being represented verbally in a satisfactory way and without loss. After a brief survey of Greek thought, with particular interest in ekphrasis, it is suggested that the apparent balance between the capacities…
Indización y uso de los Descriptores MeSH en Hospitalización a Domicilio [Indexing and use of MeSH Descriptors in Home Hospitalization]
2017
Objective: To analyze the use of Descriptors, as Major Topic, in the indexing of articles on Home Hospitalization in the MEDLINE database. Method: Descriptive cross-sectional study of the indexing records collected in the MEDLINE database (via PubMed) up to 2016. The term used as the main descriptor for the search was "Home Care Services, Hospital-Based". The sampling method was simple randomization without replacement, based on the total number of references obtained (sample size 386). Results: Significant differences were observed in the use of the Descriptors associated with home hospitalization. The compar…
A simple and efficient face detection algorithm for video database applications
2000
The objective of this work is to provide a simple and yet efficient tool to detect human faces in video sequences. This information can be very useful for many applications such as video indexing and video browsing. In particular the paper focuses on the significant improvements made to our face detection algorithm presented by Albiol, Bouman and Delp (see IEEE Int. Conference on Image Processing, Kobe, Japan, 1999). Specifically, a novel approach to retrieve skin-like homogeneous regions is presented, which is later used to retrieve face images. Good results have been obtained for a large variety of video sequences. Peer Reviewed
A Novel Web Service for Mammography Images Indexing
2013
The medical community needs to extract precise information from a large amount of data. These data are a collection of different types, such as text documents, images and video. Currently, medical technology does not provide an intelligent methodology for retrieving and classifying such documents based on their content. In this work, structured radiological reports are analysed together with the corresponding mammographic images. The presented system is composed of an Indexing Engine and a Searching Engine, based on innovative methods for IR (Information Retrieval). The proposed work is useful for physicians as a diagnosis support system, for students as a learning support system, and finally, …
A Two-layer Partitioning for Non-point Spatial Data
2021
Non-point spatial objects (e.g., polygons, linestrings, etc.) are ubiquitous and their effective management is always timely. We study the problem of indexing non-point objects in memory. We propose a secondary partitioning technique for space-oriented partitioning indices (e.g., grids), which improves their performance significantly by avoiding the generation and elimination of duplicate results. Our approach is novel and of high impact, as (i) it is extremely easy to implement and (ii) it can be used by any space-partitioning index. We show how our approach can be used to boost the performance of spatial range queries. We also show how we can avoid performing the expensive refinement s…
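The abstract does not spell out the partitioning scheme, so the following is only one plausible reading, sketched under stated assumptions. Each rectangle is replicated into every grid tile it overlaps, and within a tile it is filed under a class recording whether it starts inside that tile in x and in y. A range query then skips whole classes that would otherwise yield duplicates. The class `TwoLayerGrid`, its methods, and the `(starts_in_x, starts_in_y)` class key are hypothetical names, not taken from the paper.

```python
from collections import defaultdict


class TwoLayerGrid:
    """Uniform grid over rectangles (x1, y1, x2, y2) with per-tile classes."""

    def __init__(self, cell):
        self.cell = cell
        # tiles[(i, j)][(starts_in_x, starts_in_y)] -> list of (id, rect)
        self.tiles = defaultdict(lambda: defaultdict(list))

    def _span(self, lo, hi):
        # Indices of all tiles overlapped by the interval [lo, hi]
        return range(int(lo // self.cell), int(hi // self.cell) + 1)

    def insert(self, oid, rect):
        x1, y1, x2, y2 = rect
        i0, j0 = int(x1 // self.cell), int(y1 // self.cell)
        for i in self._span(x1, x2):
            for j in self._span(y1, y2):
                # Secondary partition: does the rectangle start in this tile?
                cls = (i == i0, j == j0)
                self.tiles[(i, j)][cls].append((oid, rect))

    def query(self, q):
        qx1, qy1, qx2, qy2 = q
        qi0, qj0 = int(qx1 // self.cell), int(qy1 // self.cell)
        out = []
        for i in self._span(qx1, qx2):
            for j in self._span(qy1, qy2):
                tile = self.tiles.get((i, j))
                if tile is None:
                    continue
                for (sx, sy), objs in tile.items():
                    # A rectangle that starts before this tile is reported
                    # here only if the query itself starts in this tile's
                    # column (for x) or row (for y); otherwise an earlier
                    # tile reports it, so the whole class is skipped.
                    if (not sx and i != qi0) or (not sy and j != qj0):
                        continue
                    for oid, (x1, y1, x2, y2) in objs:
                        if x1 <= qx2 and qx1 <= x2 and y1 <= qy2 and qy1 <= y2:
                            out.append(oid)
        return out
```

In this sketch each qualifying rectangle is reported exactly once, in the tile holding the lower-left corner of its intersection with the query, so no duplicate-elimination pass is needed.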
A text based indexing system for mammographic image retrieval and classification
2014
In modern medical systems, a huge amount of text, words, images and videos is produced and stored in ad hoc databases. The medical community needs to extract precise information from that large amount of data. Currently, ICT approaches do not provide a methodology for content-based medical image retrieval and classification. On the other hand, from the Internet of Things (IoT) perspective, ICT medical data can be produced by several devices. The data produced exhibit all the features and constraints of Big Data. The IoT guidelines place at the center of the system new smart software that manages and transforms Big Data into a new, understandable form. This paper describes a text based indexing sy…
A Novel Approach to Improve the Accuracy of Web Retrieval
2010
General purpose search engines take a very simple view of text documents: they treat them as bags of words. As a result, the semantics of documents are lost after indexing. In this paper, we introduce a novel approach to improve the accuracy of Web retrieval. We use WordNet and the WordNet SenseRelate All Words Software as the main tools to preserve the semantics of the sentences of documents and user queries. Nouns and verbs in WordNet are organized in tree hierarchies. Word meanings are represented by numbers that reference nodes in the semantic tree. The meaning of each word in the sentence is calculated when the sentence is analyzed. The goal is to put each nou…
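The loss of semantics under the bag-of-words view can be seen in a minimal sketch. It does not reproduce the paper's WordNet-based approach; it only shows a plain inverted index built from bags of words, in which two sentences with opposite meanings become indistinguishable. The function names and the whitespace tokenizer are assumptions made for illustration.

```python
from collections import Counter


def bag_of_words(text):
    """Reduce a text to term counts; word order is discarded."""
    return Counter(text.lower().split())


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in bag_of_words(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND query)."""
    terms = list(bag_of_words(query))
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {"d1": "dog bites man", "d2": "man bites dog"}
index = build_inverted_index(docs)
```

Here `bag_of_words(docs["d1"])` equals `bag_of_words(docs["d2"])`, and `search(index, "man bites dog")` returns both documents, although only one of them matches the meaning of the query.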
Semantic retrieval: an approach to representing, searching and summarising text documents
2011
Nowadays, the internet is the major source of information for millions of people. There are many search tools available on the net, but finding appropriate text information is still difficult. The retrieval efficiency of current systems cannot be significantly improved, because the ‘bag of words’ interpretation loses the semantics of texts. We applied the functional approach to represent English text documents. It takes semantic relations between words into account when indexing documents and allows ordinary English sentences to be used as queries to a search engine. The proposed retrieval mechanisms return only highly relevant documents. They make it possible to generate content-aware summarie…