Search results for "XML"
showing 10 items of 34 documents
XML document-grammar comparison: related problems and applications
2011
DOI: 10.2478/s13537-011-0005-1. XML document comparison is becoming an ever more popular research issue due to the increasingly abundant use of XML. Likewise, a growing interest fosters the development of XML grammar matching and comparison, due to the proliferation of heterogeneous XML data sources, particularly on the Web. Nonetheless, the process of comparing XML documents with XML grammars, i.e., XML document and grammar similarity evaluation, has not yet received the attention it deserves. In this paper, we provide an overview of existing research related to XML document/grammar comparison, presenting the background and discussing the various techniques related to th…
An overview on XML similarity: Background, current trends and future directions
2009
In recent years, XML has been established as a major means for information management, and has been broadly utilized for complex data representation (e.g. multimedia objects). Owing to the unparalleled growth in use of the XML standard, developing efficient techniques for comparing XML-based documents has become essential in the database and information retrieval communities. In this paper, we provide an overview of XML similarity/comparison by presenting existing research related to XML similarity. We also detail the possible applications of XML comparison processes in various fields, ranging over data warehousing, data integration, classification/clustering and XML querying, and discuss some…
Extensible User-Based XML Grammar Matching
2009
XML grammar matching has attracted considerable interest recently due to the growing number of heterogeneous XML documents on the web and the increasing need to integrate, and consequently search and retrieve, XML data originating from different data sources. In this paper, we provide an approach for automatic XML grammar matching and comparison aiming to minimize the amount of user effort required to perform the match task. We propose an open framework based on the concept of tree edit distance, integrating different matching criteria so as to capture XML grammar element semantic and syntactic similarities, cardinality and alternativeness constraints, as well as data-ty…
Transforming XML documents to OWL ontologies: A survey
2015
The aims of XML data conversion to ontologies are the indexing, integration and enrichment of existing ontologies with knowledge acquired from these sources. The contribution of this paper consists in providing a classification of the approaches used for the conversion of XML documents into OWL ontologies. This classification underlines the usage profile of each conversion method, providing a clear description of the advantages and drawbacks belonging to each method. Hence, this paper focuses on two main processes, which are ontology enrichment and ontology population using XML data. Ontology enrichment is related to the schema of the ontology (TBox), and ontology population is related to …
XCDL: an XML-oriented visual composition definition language
2010
XML data flow has reached beyond the world of computer science and has spread to other areas such as data communication, e-commerce and instant messaging. Therefore, manipulating this data by non-expert programmers is becoming imperative. On the one hand, Mashups emerged a few years ago, providing users with visual tools for web data manipulation but not necessarily XML-specific. Mashups have been leaning towards functional composition but no formal languages have yet been defined. On the other hand, visual languages for XML have been emerging since the standardization of XML, mostly relying on querying XML data for extraction or structure transformations. These…
Introduction to the Enterprise Content Management and XML Minitrack
2005
Content management in contemporary enterprises concerns a variety of information resources: documents in different forms, databases, and metadata such as ontologies, annotations, and indexes. XML and the web are important technologies used to support both resource integration and distribution.
Learning from the Past: The Women Writers Project and Thirty Years of Humanities Text Encoding
2017
In recent years, intensified attention in the humanities has been paid to data: to data modeling, data visualization, “big data”. The Women Writers Project has dedicated significant effort over the past thirty years to creating what Christoph Schöch calls “smart clean data”: a moderate-sized collection of early modern women’s writing, carefully transcribed and corrected, with detailed digital text encoding that has evolved in response to research and changing standards for text representation. But that data—whether considered as a publication through Women Writers Online, or as a proof of the viability of text encoding approaches like those expressed in the Text Encoding Initiative (TEI) Gu…
Transcribing the "Estoria de Espanna" using crowdsourcing: Strategies and aspirations
2015
This paper examines the specific strategies for recruitment and retention of volunteer transcribers in use in two collaborative transcription projects: Transcribe Bentham (University College, London) and the Estoria de Espanna Digital Project (University of Birmingham). The aim of the paper is to review the strategies used by Transcribe Bentham, a more mature crowdsourced electronic transcription project, with a view to informing the strategies put into place in the Estoria project, which has started transcribing using crowdsourcing more recently. The paper discusses the difficulties faced by crowdsourced electronic transcription projects and how these have been and are being resolved in th…
"Tea for two": the Archive of the Italian Latinity of the Middle Ages meets the CLARIN infrastructure
2020
This paper aims to show how integrating the Archive of the Italian Latinity of the Middle Ages (ALIM) into the ILC4CLARIN repository can provide mutual benefits. Making ALIM available to a large community of scholars and researchers, on the one hand, represents the first step to reduce the lack of resources for Medieval Latin in CLARIN and, on the other hand, constitutes an unprecedented contribution not only to linguistic investigations, but also to studies of the culture and science at the basis of Western European society. The paper describes the adopted approach aiming to keep intact the structure of the archive and its metadata, which are both accurately mirrored into the IL…
"Lavorare" il dato linguistico: prospettive e limiti. Alcune considerazioni dall'esperienza dell'Atlante Linguistico della Sicilia (ALS)
2018
The present work focuses on the relationship between linguistic research and the use of new technologies for linguistic data processing and analysis. Starting from the experience of the Atlante Linguistico della Sicilia (ALS), this paper describes an XML schema, based on the theory of trasferenza (transference) by Regis (2013), for the annotation and analysis of the data from the onomasiological questions of the ALS sociovariational questionnaire. Moreover, this modest case study tries to make clear the pros and cons of technological devices in linguistic research.