Search results for " indexing"

showing 10 items of 88 documents

From Nerode's congruence to Suffix Automata with mismatches

2009

In this paper we focus on the minimal deterministic finite automaton S_k that recognizes the set of suffixes of a word w up to k errors. As a first result, we give a characterization of Nerode's right-invariant congruence associated with S_k. This result generalizes the classical characterization described in [A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M. Chen, J. Seiferas, The smallest automaton recognizing the subwords of a text, Theoretical Computer Science, 40, 1985, 31–55]. As a second result, we present an algorithm that uses S_k to efficiently accept the language of all suffixes of w up to k errors in every window of size r of a text, where r is the…

General Computer Science; Open problem; [INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]; 0102 computer and information sciences; 02 engineering and technology; String searching algorithm; 01 natural sciences; Theoretical Computer Science; Combinatorics; Deterministic automaton; Suffix automata; 0202 electrical engineering electronic engineering information engineering; Combinatorics on words; Indexing; Languages with mismatches; Approximate string matching; Mathematics; Discrete mathematics; Settore INF/01 - Informatica; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Prefix; Deterministic finite automaton; 010201 computation theory & mathematics; Suffix automaton; 020201 artificial intelligence & image processing; Suffix; Computer Science::Formal Languages and Automata Theory; Computer Science(all)
researchProduct

Comparing DNA sequence collections by direct comparison of compressed text indexes

2012

Popular sequence alignment tools such as BWA convert a reference genome to an indexing data structure based on the Burrows-Wheeler Transform (BWT), from which matches to individual query sequences can be rapidly determined. However, the utility of also indexing the query sequences themselves remains relatively unexplored. Here we show that an all-against-all comparison of two sequence collections can be computed from the BWT of each collection with the BWTs held entirely in external memory, i.e. on disk and not in RAM. As an application of this technique, we show that BWTs of transcriptomic and genomic reads can be compared to obtain reference-free predictions of splice junctions that have h…

Genomics (q-bio.GN); Sequence; Computer science; business.industry; Search engine indexing; Sequence alignment; Pattern recognition; Construct (python library); Data structure; Burrows-Wheeler Transform; Splice junctions; External memory; FOS: Biological sciences; Code (cryptography); Quantitative Biology - Genomics; Artificial intelligence; business; Auxiliary memory; Reference genome
researchProduct

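La iconografía en la era digital: hacia una heurística para el estudio del contenido de las imágenes medievales [Iconography in the digital age: towards a heuristic for the study of the content of medieval images]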

2014

This article invites reflection on the validity of iconography as a method for describing the subjects represented in medieval works of art. Here, greater emphasis is placed on investigating its epistemological implications and the biases that result from the process of transforming images into words. The main aim is to bring to the surface the so-called 'semantic gap', a kind of barrier that prevents a non-lexical medium, such as the visual, from being represented verbally in a satisfactory way and without loss. After a brief survey of Greek thought, with particular attention to ekphrasis, it is suggested that the apparent balance between the capacities…

History; Medieval art; UNESCO::CIENCIAS DE LAS ARTES Y LAS LETRAS; Literature and Literary Theory; Panorama; Filologías; Representation (arts); Otras filologías modernas; Iconography; Middle Ages; image-text relationship; ekphrasis; digital collections; iconographic indexing; thesauri; Visual arts; Controlled vocabulary; :CIENCIAS DE LAS ARTES Y LAS LETRAS [UNESCO]; Rhetorical question; Rivalry; Semantic gap; Digital humanities; Magnificat Cultura i Literatura Medievals
researchProduct

Indización y uso de los Descriptores MeSH en Hospitalización a Domicilio [Indexing and use of MeSH Descriptors in Home Hospitalization]

2017

Objective: To analyze the use of Descriptors, as Major Topic, in the indexing of articles on home hospitalization in the MEDLINE database. Method: Cross-sectional descriptive study of the indexing records collected in the MEDLINE database (via PubMed) up to 2016. The term used as the main descriptor for the search was "Home Care Services, Hospital-Based". The sampling method was simple randomization without replacement, based on the total number of references obtained (sample size 386). Results: Significant differences were observed in the use of the Descriptors associated with home hospitalization. The compar…

Home hospitalization; Telemedicine; Information retrieval; Geography; Sample size determination; Search engine indexing; MeSH Descriptors; Subject (documents); Medline database; Cartography; Term (time); Hospital a Domicilio
researchProduct

A simple and efficient face detection algorithm for video database applications

2000

The objective of this work is to provide a simple and yet efficient tool to detect human faces in video sequences. This information can be very useful for many applications such as video indexing and video browsing. In particular the paper focuses on the significant improvements made to our face detection algorithm presented by Albiol, Bouman and Delp (see IEEE Int. Conference on Image Processing, Kobe, Japan, 1999). Specifically, a novel approach to retrieve skin-like homogeneous regions is presented, which is later used to retrieve face images. Good results have been obtained for a large variety of video sequences. Peer Reviewed

Image segmentation; Object detection; business.industry; Computer science; Image processing; :Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]; Telecomunicació; Image sequences; Database indexing; Video tracking; Telecommunication; Video databases; Video browsing; Computer vision; Artificial intelligence; Image retrieval; Face detection; business; Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)
researchProduct

A Novel Web Service for Mammography Images Indexing

2013

The medical community needs to extract precise information from a large amount of data. These data are of different types, such as text documents, images and video. Currently, medical technology does not provide an intelligent methodology for retrieving and classifying such documents based on their content. In this work, structured radiological reports are analysed together with the corresponding mammographic images. The presented system is composed of an Indexing Engine and a Searching Engine, based on innovative methods for IR (Information Retrieval). The proposed work is useful for physicians as a diagnosis support system, for students as a learning support system, and finally, …

Indexing Engine; Information retrieval; medicine.diagnostic_test; Computer science; Searching Engine; Structured Report; Search engine indexing; Health technology; computer.software_genre; Medical Application; Teaching Files; Search engine; medicine; Mammography; User interface; Web service; Image retrieval; computer; Diagnosi; 2013 27th International Conference on Advanced Information Networking and Applications Workshops
researchProduct

A Two-layer Partitioning for Non-point Spatial Data

2021

Non-point spatial objects (e.g., polygons, linestrings, etc.) are ubiquitous and their effective management is always timely. We study the problem of indexing non-point objects in memory. We propose a secondary partitioning technique for space-oriented partitioning indices (e.g., grids), which improves their performance significantly by avoiding the generation and elimination of duplicate results. Our approach is novel and of high impact, as (i) it is extremely easy to implement and (ii) it can be used by any space-partitioning index. We show how our approach can be used to boost the performance of spatial range queries. We also show how we can avoid performing the expensive refinement s…

Information engineering; Distributed database; Range query (data structures); Computer science; Search engine indexing; Scalability; Two layer; Point (geometry); Data mining; computer.software_genre; Spatial analysis; computer
researchProduct

A text based indexing system for mammographic image retrieval and classification

2014

In modern medical systems, huge amounts of text, words, images and videos are produced and stored in ad hoc databases. The medical community needs to extract precise information from that large amount of data. Currently, ICT approaches do not provide a methodology for content-based retrieval and classification of medical images. On the other hand, from the Internet of Things (IoT) perspective, medical ICT data can be produced by several devices. The data produced comply with all Big Data features and constraints. The IoT guidelines put at the center of the system new smart software to manage Big Data and transform it into a new, understandable form. This paper describes a text based indexing sy…

Information retrieval; Computer Networks and Communications; Computer science; business.industry; Search engine indexing; Big data; computer.software_genre; DICOM; Search engine; Medical images indexing and classification; Hardware and Architecture; Medical documents indexing and classification; Data mining; Medical diagnosis; business; Classifier (UML); computer; Software; Future Generation Computer Systems
researchProduct

A Novel Approach to Improve the Accuracy of Web Retrieval

2010

General-purpose search engines take a very simple view of text documents: they treat them as bags of words. As a result, the semantics of documents are lost after indexing. In this paper, we introduce a novel approach to improve the accuracy of Web retrieval. We use WordNet and the WordNet SenseRelate All Words software as the main tools to preserve the semantics of the sentences of documents and user queries. Nouns and verbs in WordNet are organized in tree hierarchies. Word meanings are represented by numbers that reference nodes in the semantic tree. The meaning of each word in a sentence is calculated when the sentence is analyzed. The goal is to put each nou…

Information retrieval; Concept search; Computer science; business.industry; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; Search engine indexing; Word processing; WordNet; computer.software_genre; Semantics; ComputingMethodologies_ARTIFICIALINTELLIGENCE; Tree (data structure); Noun; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Artificial intelligence; business; computer; Natural language processing; Sentence; 2010 5th International Conference on Future Information Technology
researchProduct

Semantic retrieval: an approach to representing, searching and summarising text documents

2011

Nowadays, the internet is the major source of information for millions of people. There are many search tools available on the net, but finding appropriate text information is still difficult. The retrieval efficiency of the presently used systems cannot be significantly improved: the ‘bag of words’ interpretation causes the semantics of texts to be lost. We applied the functional approach to represent English text documents. It allows semantic relations between words to be taken into account when indexing documents, and ordinary English sentences to be used as queries to a search engine. The proposed retrieval mechanisms return only highly relevant documents. They make it possible to generate content-aware summarie…

Information retrieval; Concept search; business.industry; Computer science; Search engine indexing; Semantic search; Functional approach; Word search; Semantics; computer.software_genre; Bag-of-words model; Visual Word; Artificial intelligence; business; computer; Natural language processing; International Journal of Information Technology, Communications and Convergence
researchProduct