AUTHOR
Frank Wm. Tompa
Why Use XML?
Since its inception a decade ago, XML has become a standard technology for software engineers; all Web browsers are able to parse and display XML documents, and huge XML data resources are available on the Internet. Many of the documents are in XHTML, but other XML applications are quite common as well. XML has also become increasingly common as a format for files on local disks. This success would not have been possible without collaborative efforts throughout the Web community. Such world-wide collaborative development has included standards, software applications, and case implementations that can serve as models when developing new solutions. In this chapter we consider what ki…
Data-Centric and Multimedia Components
The content of XML documents is often primarily plain text, interspersed with various headers and perhaps some lists and tables. However, there are many applications for which the content of documents is not primarily narrative in nature, but instead includes (portions of) data records that are subject to storage and computational manipulation. The latter documents are sometimes referred to as data-centric or record-like, and they rely extensively on precise descriptions of the forms of data that can appear. In this chapter we first introduce the data type definition capabilities in XML Schema. We then consider the types of data very common in traditional databases: numeric data, dates, and…
Grammars++ for modelling information in text
Grammars provide a convenient means to describe the set of valid instances in a text database. Flexibility in choosing a grammar can be exploited to provide information modelling capability by designing productions in the grammar to represent entities and relationships of interest to database applications. Additional constraints can be specified by attaching predicates to selected nonterminals in the grammar. When used for database definition, grammars can provide the functionality that users have come to expect of database schemas. Extended grammars can also be used to specify database manipulation, including query, update, view definition, and index specification.
Requirements for XML document database systems
The shift from SGML to XML has created new demands for managing structured documents. Many XML documents will be transient representations for the purpose of data exchange between different types of applications, but there will also be a need for effective means to manage persistent XML data as a database. In this paper we explore requirements for an XML database management system. The purpose of the paper is not to suggest a single type of system covering all necessary features. Instead the purpose is to initiate discussion of the requirements arising from document collections, to offer a context in which to evaluate current and future solutions, and to encourage the development of proper …
Adopting XML for Large-Scale Information
This book has presented many different ways to encode information in XML format and the purposes for doing so. In this concluding chapter we consider problems related to managing XML information assets and the methods available to address those problems. Approaches for persistently storing XML data can be divided into file storage and database storage, and the research community has been especially active in designing new solutions for XML databases. However, adoption of XML often means massive migration procedures from some legacy data into the XML format; examples of migration cases are given. While describing the problems related to adopting XML, we give examples of the kinds of data fo…