6533b856fe1ef96bd12b3145

RESEARCH PRODUCT

Metadata-Oriented Language Model in Translingual Retrieval of Digital Data

Jolanta Mizera-Pietraszko, Jolanta Tancula

subject

Multilingual Digital Libraries; Information retrieval; Computer science; business.industry; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; Language Modelling; computer.software_genre; Query language; Cross-Language Information Retrieval; Query expansion; Universal Networking Language; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Data control language; Language model; Artificial intelligence; Document retrieval; business; computer; Natural language processing; Cross-language information retrieval; RDF query language; computer.programming_language

description

Translingual retrieval processes a query in a source language to retrieve digital document content in a target language. When browsing digital catalogues, the probability of retrieving a full-text document in a language other than the query language is close to zero. The cause is not only the library collection itself but, above all, the problem of matching the index terms with the query keywords, which are assumed to be translation equivalents of each other. In addition, hardly any digital library system incorporates a translation component. As a result, such matching is largely coincidental. Our approach to the translingual document retrieval problem is to build a metadata language model that, based on a digital document, computes a word sequence that ranks the document collection by the probability of generating a particular query keyword alignment.
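The ranking idea in the description can be illustrated with a standard query-likelihood language model. This is a minimal sketch, not the authors' model: it assumes the query keywords have already been translated into the language of the metadata, it uses plain whitespace tokenization, and it swaps in Jelinek-Mercer smoothing as a common textbook choice. The function name `rank_by_query_likelihood`, the parameter `lam`, and the sample records are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real index terms.
    return text.lower().split()


def rank_by_query_likelihood(query, docs, lam=0.5):
    """Rank documents (id -> metadata text) by smoothed log P(query | document).

    Assumption: 0 < lam < 1, mixing document and collection term statistics
    (Jelinek-Mercer smoothing), which is not taken from the source.
    """
    doc_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}

    # Collection-wide term counts serve as the background model.
    collection = Counter()
    for counts in doc_counts.values():
        collection.update(counts)
    collection_len = sum(collection.values())

    scores = {}
    for doc_id, counts in doc_counts.items():
        doc_len = sum(counts.values())
        log_p = 0.0
        for term in tokenize(query):
            if collection[term] == 0:
                # A term unseen in every document cannot discriminate; skip it.
                continue
            p_doc = counts[term] / doc_len if doc_len else 0.0
            p_coll = collection[term] / collection_len
            log_p += math.log(lam * p_doc + (1 - lam) * p_coll)
        scores[doc_id] = log_p

    # Highest log-likelihood first.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical metadata records, for illustration only.
sample_docs = {
    "d1": "digital library metadata retrieval",
    "d2": "language model translation",
    "d3": "cooking recipes",
}
ranking = rank_by_query_likelihood("metadata retrieval", sample_docs)
```

Smoothing against the whole collection keeps a document from receiving zero probability merely because one query keyword is missing from its metadata, which matters when index terms and translated keywords only partly overlap.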

10.1109/icdim.2015.7381891
http://ieeexplore.ieee.org/document/7381891/