Designing Distributed Applications with XML, ASP, IE5, LDAP and MSMQ


Namespaces

We encountered XML namespaces briefly in Chapter 3. If you recall, we established that namespaces are a means of naming some vocabulary for the purpose of reusing elements it contains in another vocabulary. If someone has published an excellent vocabulary for describing demographic information and we are working on a vocabulary for an advertising application, we might wish to reuse the demographics tags within our own vocabulary. We should have a means of pointing back to some description of the vocabulary, both for the purposes of attribution and for maintaining a link to the authoritative source for the vocabulary. At a minimum, we want a way to identify a particular tag name usage as being defined in the demographics vocabulary. This prevents confusion when the same tag name is used by multiple vocabularies. If an element is marked as belonging to a particular vocabulary, the meaning should be unambiguous.

Declarations

Let's review the syntax for XML namespaces. We declare a namespace as follows:

<tagname xmlns[:name]="URI">

The namespace applies to the named tag and its contents. If we are going to deal with a number of tags from the same namespace, it is convenient to declare the namespace at the highest possible level. When the declaration omits the prefix, tags that are not qualified are assumed to belong to that namespace. Note that the URI need not refer to a DTD or other online definition. While that is useful, it is not a requirement. The URI must simply provide a unique designator for the namespace.

With that in mind, here are some valid namespace declarations:

<People xmlns:mynames="http://www.myserver.com/mynames/schemal">

<Things xmlns:stuff="urn:someschema-things.com-things">

<Concepts xmlns:pr="urn:astronomical-schema:pulsars">

You may not be familiar with the prefix urn. It stands for Uniform Resource Name and is a specific kind of Uniform Resource Identifier (URI). Unlike a URL, which is another specific type of URI, a URN just provides a name. Presumably, the name is widely understood. In the examples above, the stuff and pr prefixes will be used to denote namespaces, but we have not provided any way for the curious reader to find out more about what they mean. Hopefully, they are as familiar to the recipient of the declarations as HTML and other universally recognized namespaces.

If we want to use a number of namespaces liberally throughout a document, we should declare them early and provide a namespace prefix with which to qualify individual element and attribute names. If we want to have an XML document whose root element is <TRANSACTION> and which borrows names from the BANK and FINANCE namespaces, we should declare both namespaces in the root element:

<TRANSACTION xmlns:bank="urn:financial-schema-BANK"
             xmlns:fin="urn:financial-schema-FINANCE">
   ... some usage of the namespaces here ...
</TRANSACTION>

Using Namespaces to Qualify Names

Tags use namespace declarations in one of two ways. They explicitly use the namespace if the tag name is qualified by the prefix specified in the declaration. Our <TRANSACTION> example declared two namespace prefixes, bank and fin. Extending our <TRANSACTION> example:

<TRANSACTION xmlns:bank="urn:financial-schema-BANK"
             xmlns:fin="urn:financial-schema-FINANCE">
   <bank:institution>Shaky Finance Corp.</bank:institution>
   <fin:instrument>certificate of deposit</fin:instrument>
</TRANSACTION>

The institution element comes from the BANK namespace, while the instrument element comes from the FINANCE namespace.

Alternatively, an element name is implicitly qualified by the default namespace declaration in whose scope it appears. If we declare a namespace without a prefix in some element, any element in its scope that is not otherwise qualified is assumed to belong to the declared namespace. (Unprefixed attribute names are not affected by a default namespace declaration.) Suppose we changed our <TRANSACTION> example just a bit:

<TRANSACTION xmlns="urn:financial-schema-BANK"
             xmlns:fin="urn:financial-schema-FINANCE">
   <institution>Shaky Finance Corp.</institution>
   <fin:instrument>certificate of deposit</fin:instrument>
</TRANSACTION>

Note we've omitted a namespace prefix for the BANK namespace. The institution element has no prefix, but it is implicitly assumed to come from the BANK namespace because it is contained within the scope of that declaration.
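The two ways of qualifying an element name can be sketched in a few lines of plain JavaScript. This is only an illustration of the lookup rule, not part of MSXML: the function name resolveElementNamespace and the inScope table are hypothetical, and the empty-string key is an assumed convention for holding the default (unprefixed) declaration.

```javascript
// Resolve an element name to a namespace URI using the declarations in scope.
// `declarations` maps prefix -> URI; the empty-string key holds the default
// namespace (the xmlns="..." form). All names here are illustrative.
function resolveElementNamespace(qualifiedName, declarations) {
  const colon = qualifiedName.indexOf(":");
  const prefix = colon === -1 ? "" : qualifiedName.slice(0, colon);
  // An undeclared prefix (or a missing default namespace) resolves to null.
  return Object.prototype.hasOwnProperty.call(declarations, prefix)
    ? declarations[prefix]
    : null;
}

// Declarations taken from the modified <TRANSACTION> example.
const inScope = {
  "": "urn:financial-schema-BANK",
  fin: "urn:financial-schema-FINANCE",
};

console.log(resolveElementNamespace("institution", inScope));      // urn:financial-schema-BANK
console.log(resolveElementNamespace("fin:instrument", inScope));   // urn:financial-schema-FINANCE
console.log(resolveElementNamespace("bank:institution", inScope)); // null
```

The last call returns null because the bank prefix is no longer declared once the BANK namespace becomes the default.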

Searching for Namespace Declarations

Namespaces are only intended to uniquely name elements and attributes. Enumerating the namespaces used within a document can tell us something about the meaning of the document, however. The fact that a namespace is used within a document indicates that some meaningful term has been borrowed from another vocabulary. If we can identify the namespaces referenced in a document, we will see what domains contributed to its meaning.

The first task is to enumerate all the namespace declarations within a given XML document. We could certainly traverse the entire DOM parse tree and examine each attribute found for the substring xmlns. Fortunately, we don't have to go to that much trouble. MSXML supports the Extensible Stylesheet Language (XSL) draft, and XSL includes a powerful pattern matching language. We can use this to enumerate all the nodes matching a particular pattern: in our case, every attribute that declares a namespace.

This topic is well over the cutting edge. Not only is XSL a work in progress, but the XSL pattern matching support in MSXML includes some extensions that have been submitted to the W3C as a note regarding a query language for XML. The syntax that follows will certainly change and is meant to indicate one way a query language can be used to help us in our search for metadata.

Consider the following fragment of an XML document. We have included two namespace declarations within the root element.

<?xml version="1.0"?>

<TOP xmlns:i="urn:myschema-first" xmlns:ii="urn:myschema-second">

...

</TOP>

We can use the selectNodes method of the node object to apply an XSL pattern string to the document and receive an enumeration of all attributes matching the pattern. Assuming we've created an instance of MSXML in the variable parser and loaded the document above successfully, we can make the following call:



rNSDeclarations =
      parser.documentElement.selectNodes("//@*[nodeName() >= 'xmlns']");

The selectNodes method takes a string conforming to the rules of the XSL pattern matching syntax. Generally speaking, the string above breaks down into a search scope, an indication of what we are searching for, and a filter.

The scope is denoted by //, meaning the entire document starting from the root. A single slash / would denote the root itself, while ./ would indicate the current context. The context is the point from which we start the search. We are searching from the root element, but we want to look at the entire document.

The symbols @* mean any attribute. The filter has to limit the search, because we cannot know the attribute names in advance: a declaration may introduce any namespace prefix, so the full attribute name varies from document to document.

Everything within the square brackets is the filter. nodeName() is a built-in function of MSXML that is evaluated at runtime and gives us the name of the attribute. The constraint >= 'xmlns' is an ordering comparison: it selects any attribute whose name sorts at or after the string xmlns. That includes every name beginning with xmlns, so it matches declarations that define a prefix as well as those that do not. It can also admit unrelated names that happen to sort later, such as one beginning with z.
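The effect of an ordering comparison on names can be seen with ordinary JavaScript strings. This sketch assumes that the pattern's >= orders names roughly the way JavaScript compares strings (character by character); that is an assumption about MSXML, not something the draft guarantees. The attribute names are made up for illustration.

```javascript
// Attribute names as they might appear in a document (illustrative only).
const attributeNames = ["id", "xmit:more", "xmlns", "xmlns:i", "xmlns:ii", "zone"];

// Stage 1: the ordering test, analogous to [nodeName() >= 'xmlns'].
const atOrAfter = attributeNames.filter((name) => name >= "xmlns");

// Stage 2: keep only genuine namespace declarations.
const declarations = atOrAfter.filter(
  (name) => name === "xmlns" || name.startsWith("xmlns:")
);

console.log(atOrAfter);    // [ 'xmlns', 'xmlns:i', 'xmlns:ii', 'zone' ]
console.log(declarations); // [ 'xmlns', 'xmlns:i', 'xmlns:ii' ]
```

The first stage keeps every declaration but lets zone through as well, which is why a second, exact test on the name is a sensible precaution whenever a document may carry attributes that sort after xmlns.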

If we apply the selection call to the sample XML document, we find we have an enumeration of two items. These will be node objects, so we can get their text property (this is equivalent to childNodes.item(0).nodeValue). Calling it on each of our two namespace declarations yields:

urn:myschema-first

urn:myschema-second

If we called for the xml property of the node object instead of the text property, we would get the entire attribute declaration, e.g., xmlns:ii="urn:myschema-second".
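Where MSXML is not available, the same two values can be pulled out of the document text directly. The following is a rough sketch only: listNamespaceDeclarations is a hypothetical helper, and a regular expression is not an XML parser, so it would also pick up text that merely looks like a declaration inside a comment or CDATA section.

```javascript
// Scan XML source text for namespace declarations without building a DOM.
// Simplification: pattern matching on raw text, not real XML parsing.
function listNamespaceDeclarations(xmlText) {
  const pattern = /\sxmlns(?::([A-Za-z_][\w.-]*))?\s*=\s*(["'])(.*?)\2/g;
  const found = [];
  let match;
  while ((match = pattern.exec(xmlText)) !== null) {
    // match[1] is the declared prefix (absent for a default namespace),
    // match[3] is the namespace URI between the quotes.
    found.push({ prefix: match[1] || "", uri: match[3] });
  }
  return found;
}

// The sample document used above.
const sample =
  '<?xml version="1.0"?>\n' +
  '<TOP xmlns:i="urn:myschema-first" xmlns:ii="urn:myschema-second">\n' +
  "</TOP>";

for (const decl of listNamespaceDeclarations(sample)) {
  console.log(decl.prefix + " -> " + decl.uri);
}
// i -> urn:myschema-first
// ii -> urn:myschema-second
```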

Other Namespace Support in MSXML

The DOM support in MSXML allows us to look at the parts of an element or attribute name. The nodeName property gives us the entire qualified name, prefix gives us the namespace prefix, if any, and baseName yields the unqualified name. For the element <ii:More>, the results are:

Property    Result
nodeName    ii:More
prefix      ii
baseName    More
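The relationship between the three properties is simple enough to mimic in plain JavaScript. The helper below, splitQualifiedName, is hypothetical and only mirrors the property names for illustration; it does not call MSXML.

```javascript
// Split a qualified name into the three parts the DOM properties expose.
function splitQualifiedName(nodeName) {
  const colon = nodeName.indexOf(":");
  if (colon === -1) {
    // No prefix: the whole name is the unqualified name.
    return { nodeName: nodeName, prefix: "", baseName: nodeName };
  }
  return {
    nodeName: nodeName,
    prefix: nodeName.slice(0, colon),
    baseName: nodeName.slice(colon + 1),
  };
}

console.log(splitQualifiedName("ii:More"));
// { nodeName: 'ii:More', prefix: 'ii', baseName: 'More' }

console.log(splitQualifiedName("xmlns:ii"));
// { nodeName: 'xmlns:ii', prefix: 'xmlns', baseName: 'ii' }
```

The second call shows why a namespace declaration is useful as a search key: its unqualified part is exactly the prefix that qualified names from that namespace will carry.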

Enumerating Namespace Usage

Now we'll put this all together to analyze an XML document for foreign namespace usage. We want to list all the namespace declarations in a document, together with the qualified elements and attributes taken from them and used in the document.

The following code comes from the sample file NSTest.html. All code samples are available for download from our Web site at http://www.wrox.com and this example can be run from our site at http://webdev.wrox.co.uk/books/2270/.

The selectNodes call we saw before gives us a list of declarations:

rNSDeclarations =
      parser.documentElement.selectNodes("//@*[nodeName() >= 'xmlns']");

Now we have a collection of namespace declarations. We want to iterate through the collection and perform searches for qualified elements and attributes. We'll continue to use the selectNodes call and search the entire document. We obtain a collection of qualified elements with this call:

rQualElements = parser.documentElement.selectNodes("//*[nodeName() >= '" +
            declaration.baseName + ":']");

Note we've dropped the @ character from the search pattern. The unqualified * character indicates that we are looking for elements. We use the baseName of the declaration together with a colon to give us the qualifying prefix. Recall that the declaration has the prefix xmlns, and a baseName consisting of the prefix to use to qualify names from this namespace.

There's a problem, however. Since the XSL pattern matching syntax doesn't allow us to use wildcards in our nodeName selection, we have to rely on the >= comparison, and so we may get some element names that aren't from this namespace. For example, if our prefix is aa, then a qualified node zz:XYZ will match the search, because zz:XYZ sorts after aa:. The collection we obtain from the search is guaranteed to include all the names for which we are searching, but may include other names as well. Consequently, before we list any element names, we have to test the element's prefix against the declaration's baseName.

for (nj = 0; nj < rQualElements.length; nj++)
{
   element = rQualElements(nj);
   // keep only elements whose prefix really is the declared one
   if (element.prefix == declaration.baseName)
   {
      ListLine(results, element.baseName, "blue", tabsize, linesize);
      linesize += 4;
   }
}

We use a similar approach to obtain the qualified attribute names:

rQualAttributes = parser.documentElement.selectNodes("//@*[nodeName() >= '"
      + declaration.baseName + ":']");

// some formatting code here

for (nk = 0; nk < rQualAttributes.length; nk++)
{
   attribute = rQualAttributes(nk);
   // keep only attributes whose prefix really is the declared one
   if (attribute.prefix == declaration.baseName)
   {
      ListLine(results, attribute.baseName, "red", tabsize, linesize);
      linesize += 4;
   }
}

We'll use the following XML for our test document:

<?xml version="1.0"?>
<OUTER
   xmlns:pe="urn:schema-process-engineering"
   xmlns:xmit="urn:myschema-transmission">
   <xmit:INSIDE xmit:more="extra">Filler</xmit:INSIDE>
   <pe:OPERATINGPOINT>
      128.4
      <pe:UNITS>deg F</pe:UNITS>
      <xmit:PADDING>xxxyyy</xmit:PADDING>
   </pe:OPERATINGPOINT>
   <MIDDLE>
      <xmit:STUFFING>zzz</xmit:STUFFING>
      <pe:SETPOINT>
         129
         <pe:UNITS>deg F</pe:UNITS>
      </pe:SETPOINT>
      <pe:LIMIT>
         250
         <pe:UNITS>deg F</pe:UNITS>
      </pe:LIMIT>
   </MIDDLE>
</OUTER>

Here's the result when we run NSTest.html:

We can clearly see the schema-process-engineering namespace is used heavily. A human reader might be able to make something of the names, particularly if the namespace creator used names descriptive of a particular problem domain.
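As a rough cross-check on that observation, the qualified element names in the test document can be tallied per prefix with a few lines of plain JavaScript. This is a sketch under stated assumptions, not the NSTest.html code: countQualifiedElements is a hypothetical helper, it matches start tags with a regular expression rather than a parser, and it ignores qualified attributes such as xmit:more.

```javascript
// The test document from above, condensed so each array entry is one line.
const testDocument = [
  '<OUTER xmlns:pe="urn:schema-process-engineering"',
  '       xmlns:xmit="urn:myschema-transmission">',
  '  <xmit:INSIDE xmit:more="extra">Filler</xmit:INSIDE>',
  "  <pe:OPERATINGPOINT>128.4",
  "    <pe:UNITS>deg F</pe:UNITS>",
  "    <xmit:PADDING>xxxyyy</xmit:PADDING>",
  "  </pe:OPERATINGPOINT>",
  "  <MIDDLE>",
  "    <xmit:STUFFING>zzz</xmit:STUFFING>",
  "    <pe:SETPOINT>129<pe:UNITS>deg F</pe:UNITS></pe:SETPOINT>",
  "    <pe:LIMIT>250<pe:UNITS>deg F</pe:UNITS></pe:LIMIT>",
  "  </MIDDLE>",
  "</OUTER>",
].join("\n");

// Count start tags per namespace prefix. End tags begin with "</" and so
// never match; unprefixed tags such as <MIDDLE> are skipped.
function countQualifiedElements(xmlText) {
  const counts = {};
  const startTag = /<([A-Za-z_][\w.-]*):[A-Za-z_][\w.-]*/g;
  let match;
  while ((match = startTag.exec(xmlText)) !== null) {
    counts[match[1]] = (counts[match[1]] || 0) + 1;
  }
  return counts;
}

console.log(countQualifiedElements(testDocument)); // { xmit: 3, pe: 6 }
```

Six of the nine qualified elements carry the pe prefix, which is bound to urn:schema-process-engineering.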

An automated agent might be given a list of namespace URIs in which a user is interested. Given that and a usage listing such as we produced above, the agent could assign a relevance priority to each document it encounters. Namespaces alone don't give much metadata, but they do give us a clue to what a document might be talking about. Here's the complete listing for our enumeration function:

function OnEnumNS()
{
   // readyState value 4 means loading has completed; defined here in case
   // the hosting page does not already define it
   var COMPLETED = 4;
   var parser = new ActiveXObject("microsoft.XMLDOM");
   var results = document.all("concordance");
   var rNSDeclarations, rQualElements, rQualAttributes;
   var declaration, element, attribute, ni, nj, nk;

   // layout state for the output lines
   linesize = 4;
   tabsize = 0;

   results.innerHTML = "";

   if (parser != null)
   {
      parser.async = false;
      parser.load("namespace.xml");
      if (parser.readyState == COMPLETED && parser.parseError.errorCode == 0)
      {
         // namespace declarations
         rNSDeclarations =
            parser.documentElement.selectNodes("//@*[nodeName() >= 'xmlns']");

         for (ni = 0; ni < rNSDeclarations.length; ni++)
         {
            declaration = rNSDeclarations(ni);
            ListLine(results, declaration.text, "black", tabsize, linesize);
            linesize += 4;

            // elements qualified with this declaration's prefix
            rQualElements =
               parser.documentElement.selectNodes("//*[nodeName() >= '"
                  + declaration.baseName + ":']");

            if (rQualElements.length > 0)
               tabsize += 12;
            for (nj = 0; nj < rQualElements.length; nj++)
            {
               element = rQualElements(nj);
               if (element.prefix == declaration.baseName)
               {
                  ListLine(results, element.baseName, "blue", tabsize, linesize);
                  linesize += 4;
               }
            }
            if (rQualElements.length > 0)
               tabsize -= 12;

            // attributes qualified with this declaration's prefix
            rQualAttributes =
               parser.documentElement.selectNodes("//@*[nodeName() >= '"
                  + declaration.baseName + ":']");

            if (rQualAttributes.length > 0)
               tabsize += 12;
            for (nk = 0; nk < rQualAttributes.length; nk++)
            {
               attribute = rQualAttributes(nk);
               if (attribute.prefix == declaration.baseName)
               {
                  ListLine(results, attribute.baseName, "red", tabsize, linesize);
                  linesize += 4;
               }
            }
            if (rQualAttributes.length > 0)
               tabsize -= 12;
         }
      }
      else
         alert("Parser error:" + parser.parseError.reason);
   }
}
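The automated agent mentioned earlier could turn a usage listing into a priority in many ways. The sketch below shows one deliberately simple possibility: sum the number of qualified names drawn from the namespaces a user cares about. The scoring rule, the function names, and the document summaries are all hypothetical, chosen only to make the idea concrete.

```javascript
// Hypothetical usage summaries: namespace URI -> count of qualified names.
const documents = [
  {
    name: "plant-report.xml",
    usage: {
      "urn:schema-process-engineering": 6,
      "urn:myschema-transmission": 4,
    },
  },
  { name: "invoice.xml", usage: { "urn:financial-schema-BANK": 2 } },
  { name: "memo.xml", usage: {} },
];

// Score one document: total usage of the namespaces the user listed.
function relevanceScore(usage, interests) {
  return interests.reduce((total, uri) => total + (usage[uri] || 0), 0);
}

// Rank documents from most to least relevant.
function rankByRelevance(docs, interests) {
  return docs
    .map((doc) => ({
      name: doc.name,
      score: relevanceScore(doc.usage, interests),
    }))
    .sort((a, b) => b.score - a.score);
}

const ranked = rankByRelevance(documents, ["urn:schema-process-engineering"]);
for (const entry of ranked) {
  console.log(entry.name + ": " + entry.score);
}
```

A real agent would probably weight namespaces differently or normalize by document size; the point is only that a per-namespace tally is enough input for a first-pass ranking.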

Next we move on from namespaces to turn our attention to true metadata.


©1998 Wrox Press Limited, US and UK.


© Wattle Software 1998-2019. All rights reserved.