Designing Distributed Applications with XML, ASP, IE5, LDAP and MSMQ


Chapter 5 Metadata

We've taken an informal approach to structure in our XML so far. We've assumed an implicit DTD and enforced it through carefully written code. In fact, we allowed a violation of an implicit DTD when the error didn't change the meaning of the document. You may be wondering by now if this is such a good idea. After all, haven't we said that the teams working on our applications are only loosely connected? Shouldn't there be some way for them to learn how a specific vocabulary is put together?

In fact, we're going to see that validation is but one use for "data about data". We'll introduce metadata, see how we use it in XML today, and look ahead to its future. We'll see that these future uses of metadata can be more important to networked applications than validation. Indeed, these uses will enable us to build applications with fewer assumptions than we can today. In this chapter, you will learn the following:

  • What is metadata?
  • How is metadata used today?
  • A brief overview of W3C metadata proposals, including RDF, MCF, XML Data and DCD
  • How one metadata implementation can be used to dynamically generate HTML input forms

The W3C has a number of proposals before it dealing with metadata in one form or another. All are on or beyond the cutting edge in terms of use in production applications. For that reason alone, we must approach this chapter as an experiment. The support for metadata in parsers is also spotty. Nevertheless, we need to see how far we can take metadata in order to see why we should care about it. Once we've seen what we can do with metadata, we can begin to organize our applications so they may readily support whatever metadata proposals are ultimately adopted.

What is Metadata?

We obey some structural rules whenever we write an XML document. The XML specification itself provides a syntax, which must be followed in order for the result to be considered well-formed XML. In addition, the vocabulary in which the document is written imposes more rules. It tells us the names of the allowed tags and the attributes of those tags. It tells us the structure of our documents by telling us what each element may contain. Metadata's role is to tell us the rules of a vocabulary. A well-written vocabulary mirrors the application domain. A vocabulary about banking will inherently teach a layman something about the nature of banking. The syntactic rules of XML and the rules of any particular vocabulary are thus data about data. Philosophers long ago coined the term metadata to describe this. The rules themselves convey no information expressed in the vocabulary, but they tell us what may be written in it.
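To make the two layers of rules concrete, here is a minimal sketch in Python (not the MSXML tools used elsewhere in this book). It assumes a hypothetical banking vocabulary of our own invention: the element names `account`, `owner`, and `balance`, and the `VOCABULARY` table, are illustrative only. The first function checks the syntactic rules of XML; the second checks the vocabulary rules, which are held as plain data about data.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary rules (our metadata): for each element name,
# the child elements and attributes it is allowed to carry.
VOCABULARY = {
    "account": {"children": {"owner", "balance"}, "attributes": {"id"}},
    "owner": {"children": set(), "attributes": set()},
    "balance": {"children": set(), "attributes": {"currency"}},
}


def is_well_formed(text):
    """Return True if the text obeys the syntactic rules of XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def vocabulary_errors(text):
    """Return a list of violations of the vocabulary rules above."""
    errors = []
    for element in ET.fromstring(text).iter():
        rules = VOCABULARY.get(element.tag)
        if rules is None:
            errors.append(f"unknown element <{element.tag}>")
            continue
        for name in element.attrib:
            if name not in rules["attributes"]:
                errors.append(f"attribute '{name}' not allowed on <{element.tag}>")
        for child in element:
            if child.tag not in rules["children"]:
                errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
    return errors


good = '<account id="1"><owner>Pat</owner><balance currency="USD">100</balance></account>'
bad = '<account id="1"><owner>Pat</owner><fee>3</fee></account>'
broken = "<account><owner></account>"

print(is_well_formed(good), vocabulary_errors(good))
print(is_well_formed(bad), vocabulary_errors(bad))
print(is_well_formed(broken))
```

The `bad` document is well-formed yet breaks the vocabulary, while `broken` fails the XML syntax itself. A real metadata language expresses far more than this table does (ordering, cardinality, data types), but the division of labor is the same.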

Obviously, metadata is important if we want to validate an XML document. We took an introductory look at DTDs in Chapter 3. Our chosen parser, MSXML, became a validating parser with version 5.0 of Internet Explorer, and we can use this feature to avoid syntactic errors in the data we exchange. Simply adopt a resolution in your organization that all documents must be valid, provide DTDs, and turn on the validation feature of MSXML. Apart from a small performance penalty, however, we won't see much change in our applications merely by enforcing the validity of our documents.

Metadata tells us about our vocabularies, so we should be able to use it to discover how new vocabularies work. While that is a utopian ideal, we'll see that we can make shrewd use of metadata in the service of our third principle: services will be provided as self-describing data. XML documents become truly self-describing when vocabulary metadata is available.

How Metadata is Used Today

The only metadata mechanism in the XML 1.0 Recommendation is the Document Type Definition (DTD). DTDs give us much of what we would like to know regarding a document. They completely specify the structure of XML documents. Elements and their attributes are declared, optional items are noted, and so forth. DTDs are the only formally approved mechanism for validating XML documents. They suffer from one great flaw, however. DTDs are written using a syntax other than XML. You can't use the XML DOM to parse and traverse a DTD. Obviously, it isn't impossible to write a parser for handling DTDs, as every validating XML parser must include one. It is simply annoying and inconvenient. As a result, there is great interest in replacing today's DTDs with an XML vocabulary for describing metadata.
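A short sketch illustrates the inconvenience. It uses Python's standard `xml.dom.minidom` rather than MSXML, and the `account` document is a hypothetical example of our own. The document content is reachable as a tree of nodes, but the DTD in the internal subset comes back only as a string of non-XML text with no child nodes to traverse.

```python
from xml.dom import minidom

# Hypothetical document carrying its DTD in the internal subset.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE account [
  <!ELEMENT account (owner, balance)>
  <!ELEMENT owner (#PCDATA)>
  <!ELEMENT balance (#PCDATA)>
]>
<account><owner>Pat</owner><balance>100.00</balance></account>"""

dom = minidom.parseString(DOCUMENT)

# The document itself is a tree we can walk node by node.
element_children = [
    node.tagName
    for node in dom.documentElement.childNodes
    if node.nodeType == node.ELEMENT_NODE
]

# The DTD is exposed only as raw text on the DocumentType node.
doctype = dom.doctype
dtd_text = doctype.internalSubset
dtd_child_count = len(doctype.childNodes)

print("Root element:", doctype.name)
print("Children found via the DOM:", element_children)
print("DTD nodes available to traverse:", dtd_child_count)
print("DTD as plain text:")
print(dtd_text)
```

To learn from this DTD that `account` contains `owner` followed by `balance`, we would have to write our own parser for the declaration syntax. Were the same rules written in an XML vocabulary, the ordinary DOM calls used on the document would serve for the metadata as well.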

W3C Proposals

Metadata is a broad topic within the W3C and is managed by the W3C Metadata Activity. The W3C's interest in metadata extends to more than just XML. One of the earliest efforts was PICS, the Platform for Internet Content Selection, an initiative to build a mechanism for applying rating labels to Web sites. Obviously, a rating scheme must be able to describe content to some degree, so PICS came to be a way to create general rating systems. Yet PICS was modest in scope; it only attempted to describe what could be encoded in HTML pages. Restricting the scope this way certainly simplifies the task, but it makes PICS unsuitable for use as a general-purpose metadata language.

PICS also inspired other efforts. The broadest is the W3C's Resource Description Framework (RDF). More recently, the XML community has advanced more specialized proposals such as XML Data and the Document Content Description (DCD) for XML. Another activity, XML Namespaces, is not precisely a metadata activity, but as we shall see it can provide us with interesting information. Since XML Namespaces is already a W3C Recommendation and is simpler than the proper metadata activities, let's begin there.

W3C documents are prone to frequent changes in status. You can find a summary of current status at http://www.w3.org/TR/.

©1998 Wrox Press Limited, US and UK.
