Search results for " retrieval."

showing 10 items of 1102 documents

File system scalability with highly decentralized metadata on independent storage devices

2016

This paper discusses using hard drives that integrate a key-value interface and network access in the drive hardware itself (the Kinetic storage platform) to supply file system functionality in a large-scale environment. A serverless system architecture is proposed that takes advantage of higher-level functionality to handle metadata on the drives themselves. Skipping path component traversal during the lookup operation is the key technique discussed in this paper to avoid performance degradation with highly decentralized metadata. Scalability implications are reviewed based on a FUSE file system implementation. Peer Reviewed

Information storage and retrieval systems; Computer science; Distributed computing; Interface (computing); Key-value storages; 02 engineering and technology; computer.software_genre; Object storages; Lookups; Computer cluster; Server; Data_FILES; 0202 electrical engineering electronic engineering information engineering; :Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC]; Virtual storage; File system; Metadata; File organization; Scalability; 020206 networking & telecommunications; 020207 software engineering; File systems; Object storage; Metadata; Kinetics; Informació -- Sistemes d'emmagatzematge i recuperació; Scalability; Grid computing; Systems architecture; computer; Cluster computing
researchProduct

GekkoFS — A Temporary Burst Buffer File System for HPC Applications

2020

Many scientific fields increasingly use high-performance computing (HPC) to process and analyze massive amounts of experimental data while storage systems in today’s HPC environments have to cope with new access patterns. These patterns include many metadata operations, small I/O requests, or randomized file I/O, while general-purpose parallel file systems have been optimized for sequential shared access to large files. Burst buffer file systems create a separate file system that applications can use to store temporary data. They aggregate node-local storage available within the compute nodes or use dedicated SSD clusters and offer a peak bandwidth higher than that of the backend parallel f…

Information storage and retrieval systems; POSIX; File system; Burst buffers; Computer science; Process (computing); computer.software_genre; Distributed file systems; Computer Science Applications; Theoretical Computer Science; Metadata; Informació -- Sistemes d'emmagatzematge i recuperació; Computational Theory and Mathematics; Hardware and Architecture; POSIX; HPC; :Informàtica::Sistemes d'informació::Emmagatzematge i recuperació de la informació [Àrees temàtiques de la UPC]; Scalability; Operating system; Bandwidth (computing); High performance computing; Isolation (database systems); Càlcul intensiu (Informàtica); computer; Software; Journal of Computer Science and Technology
researchProduct

Multimedia Retrieval by Means of Merge of Results from Textual and Content Based Retrieval Subsystems

2010

The main goal of this paper is to present our experiments in the ImageCLEF 2009 Campaign (photo retrieval task). In 2008 we showed empirically that Text-based Image Retrieval (TBIR) methods outperform Content-based Image Retrieval (CBIR) methods in the "quality" of results, so this time we developed several experiments in which the CBIR helps the TBIR. The main improvement to the TBIR system [6] is the named-entity sub-module. In the case of the CBIR system [3], the number of low-level features has been increased from the 68 components used at ImageCLEF 2008 to 114 components, and only the Mahalanobis distance has been used. We propose an ad-hoc management of the topics delivered, and the generation of XML struct…

Informática; Mahalanobis distance; Telecomunicaciones; Information retrieval; computer.internet_protocol; Computer science; Search engine indexing; 02 engineering and technology; Content-based image retrieval; 01 natural sciences; Data retrieval; Human–computer information retrieval; 0103 physical sciences; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Visual Word; 010306 general physics; Image retrieval; computer; XML
researchProduct

Some Results Using Different Approaches to Merge Visual and Text-Based Features in CLEF’08 Photo Collection

2009

This paper describes the participation of the MIRACLE team in the ImageCLEF Photographic Retrieval task of CLEF 2008. We succeeded in submitting 41 runs. The results obtained from text-based retrieval are better than those from content-based retrieval, as in previous experiments in the MIRACLE team campaigns [5, 6] using different software. Our main aim was to experiment with several merging approaches to fuse text-based and content-based retrieval results. We improved on the text-based baseline when applying one of the three merging algorithms, although visual results are lower than textual ones.

Informática; Telecomunicaciones; Information retrieval; Computer science; business.industry; Search engine indexing; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; 020206 networking & telecommunications; 02 engineering and technology; Clef; Software; Human–computer information retrieval; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Visual Word; Document retrieval; business; Image retrieval; Merge (version control)
researchProduct

Cover Feature: Research Data in Chemistry – Results of the first NFDI4Chem Community Survey (Z. Anorg. Allg. Chem. 23‐24/2020)

2020

Inorganic Chemistry; Information retrieval; Chemistry; Feature (computer vision); Cover (algebra); Chemistry (relationship); Community survey; Research data; Zeitschrift für anorganische und allgemeine Chemie
researchProduct

Institutionalism, cultural institutions and cultural policy in the Nordic countries

2010

In this article our aim is to analyse theoretically the questions: (1) what is the relevance of the institutional approach in research on cultural policy and cultural institutions, and (2) how do the ...

Institutional approach; Political economy; Political science; Institutionalism; General Earth and Planetary Sciences; Relevance (information retrieval); New institutionalism; Cultural institution; Social science; General Environmental Science; Path dependence; Cultural policy; Nordisk kulturpolitisk tidsskrift
researchProduct

BUCC Shared Task: Cross-Language Document Similarity

2015

We summarise the organisation and results of the first shared task aimed at detecting the most similar texts in a large multilingual collection. The dataset of the shared task was based on Wikipedia dumps with interlanguage links, with further filtering to ensure comparability of the paired articles. The eleven system runs we received have been evaluated using the TREC evaluation metrics.

1 Task description

Parallel corpora of original texts with their translations have provided the basis for multilingual NLP applications since the beginning of the 1990s. The relative scarcity of such resources led to greater attention to comparable (=less parallel) resources to mine information about possible translations…

Interlanguage; Document similarity; Information retrieval; Computer science; business.industry; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; Artificial intelligence; computer.software_genre; business; computer; Natural language processing; Task (project management); Proceedings of the Eighth Workshop on Building and Using Comparable Corpora
researchProduct

RDF* Graph Database as Interlingua for the TextWorld Challenge

2019

This paper briefly describes the top-scoring submission to the First TextWorld Problems: A Reinforcement and Language Learning Challenge. To alleviate the partial observability problem characteristic of the TextWorld games, we split the Agent into two independent components, Observer and Actor, which communicate only via the Interlingua of the RDF* graph database. The RDF* graph database serves as the “world model” memory, incrementally updated by the Observer via FrameNet-informed Natural Language Understanding techniques, and is used by the Actor for the efficient exploration and planning of the game Action sequences. We find that the deep-learning approach works best for the Observer componen…

Interlingua; Information retrieval; Graph database; Computer science; Backtracking; business.industry; Deep learning; Natural language understanding; computer.file_format; computer.software_genre; language.human_language; language; Reinforcement learning; Artificial intelligence; RDF; FrameNet; business; computer; 2019 IEEE Conference on Games (CoG)
researchProduct

Multi-data models translations in interoperable information systems

1996

Interoperation of heterogeneous and autonomous information systems has traditionally been hampered by semantic differences in their data models. In this paper, we address the problem by defining a methodology called TIME, which is based on an extensible meta model. Its key features are: a set of meta-types which can be used to represent the syntax and the semantics of data modeling concepts, a knowledge base of transformation rules that map a meta-type into other meta-types, and an inference engine which uses the transformation rules to translate schema from source to target models. The extensibility of the meta-model is achieved by organizing the meta-types into a generalization hierarchy …

Interoperation; Information retrieval; Knowledge base; business.industry; Computer science; Interoperability; Relational model; Information system; Inference engine; business; Data modeling; Metamodeling
researchProduct

Reducing the Human Effort in Text Line Segmentation for Historical Documents

2021

Labeling the layout in historical documents to prepare training data for machine learning techniques is an arduous task that requires great human effort. A draft of the layout can be obtained by using a document layout analysis (DLA) system, and the user can then correct it with less effort than labeling from scratch. In this paper we study an iterative process in which the user only supervises and corrects the given draft for the pages automatically selected by the DLA system, with the aim of reducing the required human effort. The results obtained show that similar DLA quality can be achieved by reducing the number of pages that the user has to annotate and that the accumulated h…

Iterative and incremental development; Training set; Information retrieval; Computer science; media_common.quotation_subject; Quality (business); Segmentation; Line (text file); Document layout analysis; Historical document; media_common; Task (project management)
researchProduct