Search results for "Information Retrieval"
showing 10 items of 924 documents
Schema theory: A new approach?
1987
Building Semantic Trees from XML Documents
2016
The distributed nature of the Web, as a decentralized system exchanging information between heterogeneous sources, has underlined the need to manage interoperability, i.e., the ability to automatically interpret information in Web documents exchanged between different sources, necessary for efficient information management and search applications. In this context, XML was introduced as a data representation standard that simplifies the tasks of interoperation and integration among heterogeneous data sources, allowing data to be represented in (semi-)structured documents consisting of hierarchically nested elements and atomic attributes. However, while XML was shown most …
A novel XML document structure comparison framework based-on sub-tree commonalities and label semantics
2012
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficient…
Building Ontologies from XML Data Sources
2009
In this paper, we present a tool called X2OWL that aims at building an OWL ontology from an XML data source. This method is based on XML schema to automatically generate the ontology structure, as well as a set of mapping bridges. The presented method also includes a refinement step that makes it possible to clean the mapping bridges and, if needed, to restructure the generated ontology.
Two Methods for Schema Design for Intelligent XML Documents in Organizations
2007
The XML markup language provides means for incorporating semantics, i.e. the “meaning” of logical content parts residing within documents. Therefore it has become the lingua franca for the Semantic Web, e-Business applications and enterprise application integration. In order to realize novel, intelligent XML-based document applications in organizations, schemas defining the domain-oriented semantics are needed. So far, the potential of XML has not been fully utilized in organizational documents, due to the lack of XML support in common and inexpensive office software. Due to the arrival of XML support on common software such as Microsoft Office 2007 and Open Office 2.0 organizations need knowledge a…
Aspects on XML Document Content Reuse in Organizations
2007
Designing the reuse of information residing in documents is more complex than for information in databases. Document content is designed for humans and organized with regard to communicational purposes for organizational work. In addition, content organization within documents is affected by the requirements of multichannel publishing and layout design for content presentation. Efficient content reuse in organizational documents requires that the ways the content is created and stored within and across documents and other content resources, such as databases, should be identified. XML provides technological means for document content reuse. The designers of XML document production need to b…
Ontology-based integration of XML data: Schematic marks as a bridge between syntax and semantic level
2007
This paper presents an ontology-based approach to integrating XML data. The approach is composed of two pillars, the first of which is based on formal language and XML grammar analysis. The second pillar is based on ontology and domain ontology analysis. The keystone of this architecture, which creates a bridge between the two pillars, is the concept of schematic marks introduced in this paper. These schematic marks make it possible to establish the link between the syntactic level and the semantic level for our integration framework.
XML document-grammar comparison: related problems and applications
2011
DOI: 10.2478/s13537-011-0005-1. XML document comparison is becoming an ever more popular research issue due to the increasingly abundant use of XML. Likewise, a growing interest fosters the development of XML grammar matching and comparison, due to the proliferation of heterogeneous XML data sources, particularly on the Web. Nonetheless, the process of comparing XML documents with XML grammars, i.e., XML document and grammar similarity evaluation, has not yet received the attention it deserves. In this paper, we provide an overview of existing research related to XML document/grammar comparison, presenting the background and discussing the various techniques related to th…
An overview on XML similarity: Background, current trends and future directions
2009
In recent years, XML has been established as a major means for information management, and has been broadly utilized for complex data representation (e.g. multimedia objects). Owing to the unparalleled growth in the use of the XML standard, developing efficient techniques for comparing XML-based documents has become essential in the database and information retrieval communities. In this paper, we provide an overview of XML similarity/comparison by presenting existing research related to XML similarity. We also detail the possible applications of XML comparison processes in various fields, ranging over data warehousing, data integration, classification/clustering and XML querying, and discuss some…
Extensible User-Based XML Grammar Matching
2009
XML grammar matching has attracted considerable interest recently due to the growing number of heterogeneous XML documents on the web and the increasing need to integrate, and consequently search and retrieve, XML data originating from different data sources. In this paper, we provide an approach for automatic XML grammar matching and comparison aiming to minimize the amount of user effort required to perform the match task. We propose an open framework based on the concept of tree edit distance, integrating different matching criteria so as to capture XML grammar element semantic and syntactic similarities, cardinality and alternativeness constraints, as well as data-ty…