Search results for "Trie"

showing 10 items of 4468 documents

Wordnet and semidiscrete decomposition for sub-symbolic representation of words

2009

A methodology for sub-symbolic semantic encoding of words is presented. The methodology uses the standard, semantically highly-structured WordNet lexical database and the SemiDiscrete matrix Decomposition to obtain a vector representation with low memory requirements in a semantic n-space. The application of the proposed algorithm over all the WordNet words would lead to a useful tool for the sub-symbolic processing of texts.

Information retrieval; Computer science; business.industry; WordNet; Decomposition (computer science); Artificial intelligence; Representation (mathematics); computer.software_genre; business; Lexical database; computer; Natural language processing; Matrix decomposition
researchProduct

Publish By Example

2008

We propose an approach for producing database publishing programs by example. The main idea is to interactively build an example document, representative of the program output. The system infers from this document, without ambiguity, the publishing program. The end-user does not need to know a programming language, a query language or the database schema.

Information retrieval; Computer science; business.industry; computer.internet_protocol; Relational database; media_common.quotation_subject; Database schema; InformationSystems_DATABASEMANAGEMENT; Ambiguity; Query language; Information engineering; Need to know; business; Publication; computer; XML; media_common

2008 Eighth International Conference on Web Engineering
researchProduct

Towards semantic-based RSS merging

2009

Merging information can be of key importance in several XML-based applications. For instance, merging RSS news from different sources and providers can benefit end-users (journalists, economists, etc.) in various scenarios. In this work, we address this issue and mainly explore the relatedness relationships between RSS entities/elements. To validate our approach, we also provide a set of experimental tests showing satisfactory results. © 2009 Springer-Verlag Berlin Heidelberg

Information retrieval; Computer science; computer.internet_protocol; RSS; INF/01 - INFORMATICA; ComputerApplications_COMPUTERSINOTHERSYSTEMS; computer.file_format; Set (abstract data type); Semantic similarity; Artificial Intelligence; Key (cryptography); Document Object Model; computer; XML
researchProduct

Combining content extraction heuristics

2008

The main text content of an HTML document on the WWW is typically surrounded by additional content, such as navigation menus, advertisements, link lists or design elements. Content Extraction (CE) is the task of identifying and extracting the main content. Ongoing research has spawned several CE heuristics of varying quality. However, so far only the Crunch framework combines several heuristics to improve its overall CE performance. Since Crunch, though, many new algorithms have been formulated. The CombinE system is designed to test, evaluate and optimise combinations of CE heuristics. Its aim is to develop CE systems which yield better and more reliable extracts of the main content of a web …

Information retrieval; Computer science; media_common.quotation_subject; Design elements and principles; computer.software_genre; Crunch; Task (project management); Content extraction; Quality (business); Data mining; Heuristics; Web document; computer; media_common

Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
researchProduct

A Novel Approach to Improve the Accuracy of Web Retrieval

2010

General-purpose search engines take a very simple view of text documents: they treat them as bags of words. As a result, the semantics of documents is lost after indexing. In this paper, we introduce a novel approach to improve the accuracy of Web retrieval. We use WordNet and the WordNet SenseRelate All Words software as the main tools to preserve the semantics of the sentences of documents and user queries. Nouns and verbs in WordNet are organized in tree hierarchies. Word meanings are represented by numbers that refer to nodes in the semantic tree. The meaning of each word in a sentence is calculated when the sentence is analyzed. The goal is to put each nou…

Information retrieval; Concept search; Computer science; business.industry; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; Search engine indexing; Word processing; WordNet; computer.software_genre; Semantics; ComputingMethodologies_ARTIFICIALINTELLIGENCE; Tree (data structure); Noun; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Artificial intelligence; business; computer; Natural language processing; Sentence

2010 5th International Conference on Future Information Technology
researchProduct
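The bag-of-words limitation mentioned in the abstract above can be illustrated with a minimal sketch. The tokenizer and the two sample sentences are assumptions made for illustration only; they are not taken from the paper.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text, split on whitespace, and count tokens.

    Word order is discarded, which is exactly what loses the semantics.
    """
    return Counter(text.lower().split())


# Two hypothetical sentences with opposite meanings.
a = "the dog bites the man"
b = "the man bites the dog"

# Both sentences map to the same bag, so an index built on bags of words
# cannot tell them apart.
same_bag = bag_of_words(a) == bag_of_words(b)
print(same_bag)
```

Any representation that keeps word order or word relations (as the abstract proposes with WordNet-based semantic trees) would distinguish the two sentences.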

Extracting Semantic Knowledge from Unstructured Text Using Embedded Controlled Language

2016

Nowadays, most of the data on the Web is still in the form of unstructured text. Knowledge extraction from unstructured text is highly desirable but extremely challenging due to the inherent ambiguity of natural language. In this article, we present an architecture for an information extraction system based on the concept of Embedded Controlled Language, which allows formal semantic knowledge to be extracted from an unstructured text corpus. Moreover, the presented approach has the potential to support multilingual input and output.

Information retrieval; Concept search; Noisy text analytics; business.industry; Computer science; Text simplification; 010401 analytical chemistry; Text graph; 02 engineering and technology; computer.software_genre; 01 natural sciences; language.human_language; 0104 chemical sciences; Information extraction; Controlled natural language; Knowledge extraction; Explicit semantic analysis; 0202 electrical engineering electronic engineering information engineering; language; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Natural language processing

2016 IEEE Tenth International Conference on Semantic Computing (ICSC)
researchProduct

Semantic retrieval: an approach to representing, searching and summarising text documents

2011

Nowadays, the internet is the major source of information for millions of people. There are many search tools available on the net, but finding appropriate text information is still difficult. The retrieval efficiency of the systems presently in use cannot be significantly improved: the ‘bag of words’ interpretation loses the semantics of texts. We applied the functional approach to represent English text documents. It allows semantic relations between words to be taken into account when indexing documents, and ordinary English sentences to be used as queries to a search engine. The proposed retrieval mechanisms return only highly relevant documents. They make it possible to generate content-aware summarie…

Information retrieval; Concept search; business.industry; Computer science; Search engine indexing; Semantic search; Functional approach; Word search; Semantics; computer.software_genre; Bag-of-words model; Visual Word; Artificial intelligence; business; computer; Natural language processing

International Journal of Information Technology, Communications and Convergence
researchProduct

Automatic building of a visual interface for content-based multiresolution retrieval of paleontology images

2001

In this article we present research work in the field of content-based image retrieval in large databases applied to the paleontology image database of the Université de Bourgogne, Dijon, France, called “TRANS’TYFIPAL.” Our indexing method is based on multiresolution decomposition of database images using wavelets. For each family of paleontology images we try to find a model image that represents it. The K-means automatic classification algorithm divides the space of parameters into several clusters. A model image for each cluster is computed from the wavelet transform of each image of the cluster. Then a search tree is built to offer users a graphic interface for retrieving images. So …

Information retrieval; Contextual image classification; Computer science; business.industry; Search engine indexing; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 020206 networking & telecommunications; Image processing; 02 engineering and technology; Content-based image retrieval; Atomic and Molecular Physics and Optics; Search tree; Computer Science Applications; Paleontology; Automatic image annotation; [INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV]; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Computer vision; Visual Word; Artificial intelligence; Electrical and Electronic Engineering; business; Image retrieval; ComputingMilieux_MISCELLANEOUS
researchProduct
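The clustering step named in the abstract above (K-means dividing a parameter space into clusters, with one representative per cluster) can be sketched as follows. This is a plain Lloyd-style K-means written for illustration, not the authors' implementation: the 2-D feature vectors are hypothetical stand-ins for wavelet coefficients, and using the cluster centroid as the "model" is an assumption.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Basic K-means on tuples of floats with caller-supplied start centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for i, members in enumerate(clusters):
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid
        centroids = updated
    return centroids, clusters


# Hypothetical feature vectors forming two well-separated groups.
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Assumed initialisation: one starting centroid taken from each group.
models, groups = kmeans(features, [features[0], features[3]])
print(models)
```

Each entry of `models` plays the role of a per-cluster representative; in the setting the abstract describes, that representative would be derived from the wavelet transforms of the cluster's images instead.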

Heuristic Method to Improve Systematic Collection of Terminology

2016

In this paper, we propose an experimental tool for the analysis and graphical representation of glossaries. The original heuristic algorithms and analysis methods incorporated into the tool proved useful for improving the quality of glossaries. The tool was used to analyse the ISTQB Standard Glossary of Terms Used in Software Testing. The paper describes instances of problems found in the ISTQB glossary related to its consistency, completeness, and correctness.

Information retrieval; Correctness; Glossary; Computer science; Heuristic; Concept map; computer.software_genre; Terminology; Consistency (database systems); Completeness (order theory); Data mining; Representation (mathematics); GeneralLiterature_REFERENCE(e.g.dictionariesencyclopediasglossaries); computer
researchProduct

Combining OWL ontologies using E-Connections

2006

The standardization of the Web Ontology Language (OWL) leaves (at least) two crucial issues for Web-based ontologies unsatisfactorily resolved, namely how to represent and reason with multiple distinct but linked ontologies, and how to enable effective knowledge reuse and sharing on the Semantic Web. In this paper, we present a solution to these fundamental problems based on E-Connections. We aim to use E-Connections to provide modelers with suitable means for developing Web ontologies in a modular way and to provide an alternative to the owl:imports construct. With this motivation, we present in this paper a syntactic and semantic extension of the Web Ontology Language that covers E-Conn…

Information retrieval; Database; Computer Networks and Communications; business.industry; Semantic Web Rule Language; computer.internet_protocol; Computer science; Web Ontology Language; Ontology (information science); computer.software_genre; Social Semantic Web; OWL-S; Human-Computer Interaction; Upper ontology; Semantic Web Stack; business; computer; Semantic Web; Software; computer.programming_language

Journal of Web Semantics
researchProduct