Using XML Queries and Transformations
Now we have an easy and platform-independent method of describing XML data,
validating its type as we wish and modifying and reading it programmatically.
So we basically have a transportable miniature database. No surprise then that
when you start to work with it, you'll feel the need for a query mechanism.
Using the DOM, you can get to each and every node in your document, but it can
get tiresome, maneuvering through the hierarchies of children to find that
single node you are interested in.
What we would like to have is an XML version of SQL. We
would like to say "Get me all nodes of
type X that have descendants of type Y". Many initiatives in
this direction have been started up. There were some working groups specifying
only a query language, but query mechanisms were also part of the drafts under
development for transformation (XSLT) and linking technologies (XPointer). Then
the W3C joined efforts with some of the working groups to specify XPath.
XPath is a simple syntax for selecting a subset of
the nodes in a document. It now has recommendation status and is used in both
the XSLT and XPointer standards, as we'll see later in this chapter.
Later in this chapter you will see how important XPath is
when transforming one document type into another, but
first we will look at using XPath as a pure querying tool. The
initial release of IE5 included a basic implementation of XPath (then called XQL).
Once XPath and XSLT gained recommendation status,
Microsoft promised to deliver a fully compliant implementation of XPath and
XSLT soon, and in January 2000 Microsoft shipped a developer's preview of the
MSXML library. In the appendices for XPath and XSLT, you can find exactly which
features are supported in which releases.
We will work with the full version of XPath in this chapter.
If you want to program for the MSXML library that originally came with IE5
(because you cannot upgrade every installation to the newer version), you are
restricted to a subset of XPath. We will indicate what can be used in the
earlier IE5 versions in a separate section.
Be aware that several more powerful XML query
languages are still under development. These include a syntax called XQL, which has firm
support from IBM, and an initiative from the W3C called XML Query, which is still in the
first stages of specification. At the moment, XPath is the only one that has reached
recommendation status, and it looks like it will be a long time before any other does.
This chapter will cover:
XPath for querying a document
XSLT for transforming a document
Styling a document with Cascading Style Sheets
Styling a document by using transformations (XSLT)
XPath Query Syntax
Before we get into the syntax of
an XPath query, we have to discuss the concept of a context node. In XPath, a query is not automatically done over the
whole of the content, but always has a starting point or context node. This can
be any node in the node tree that constitutes the document. From this
"fixed point" you can issue queries like "give me all
your children". This kind of query only
makes sense if there is a starting point defined. This starting point
may be the root node, of course, which would query the entire document.
This may seem a bit abstract now,
but just remember: an XPath query is done from a certain starting point in the
document.
Have a look at the following XPath query:
descendant::TABLE
This query would translate to plain English as: "Get the TABLE
elements from all descendants (children, children's children, etc.) of the
context node". The first part of this query, descendant,
is called the axis of the query. The second part, TABLE, is called the node test.
The axis is the searching direction; if a node along the specified axis
conforms to the node test, it is included in the result set. These patterns can
be very complex and can have subqueries in them. We will look at that later.
First, we will list all the axes that can be used in a query:
child
All direct children of the context node. Excludes attribute and namespace nodes.
descendant
All children, children's children, etc. of the context node.
parent
The direct parent (and only the direct
parent) of the context node (if any).
ancestor
All ancestors of the context node. Always
includes the root node (unless the root node is the context node).
following-sibling
All siblings of the context node that appear later in the document.
preceding-sibling
All siblings of the
context node that appear earlier in the document.
following
All nodes in the document that come after the context node (in document
order), excluding its descendants.
preceding
All nodes in the document that come
before the context node (in document order), excluding its ancestors.
attribute
Contains the attributes of the context node.
namespace
Contains the namespace nodes of the
context node. This includes an entry for the default namespace and the
implicitly declared XML namespace.
self
Only the context node itself.
descendant-or-self
All descendants and the context node itself.
ancestor-or-self
All ancestors and
the context node itself.
The ancestor, descendant, following, preceding and self
axes partition the document. This means that these five axes together contain all nodes of the tree
(except attributes and namespaces), but do not overlap. An
ancestor is therefore not on the preceding axis, and a descendant
is not on the following
axis, as illustrated in the following diagram:
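The partition property can also be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module and a small sample document invented for this illustration. ElementTree only models element nodes and has no object for the document root, so the helper functions (all hypothetical names) emulate the five axes over elements only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; ElementTree models element nodes only.
root = ET.fromstring("<a><b><c/><d/></b><e><f/></e></a>")

order = list(root.iter())  # all elements in document order
parent_of = {child: parent for parent in order for child in parent}

def ancestor(node):
    """Walk up the parent chain to the root element."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result

def descendant(node):
    """Everything below the node, without the node itself."""
    return list(node.iter())[1:]

def following(node):
    """Later in document order, minus the node's own descendants."""
    below = descendant(node)
    start = order.index(node) + 1
    return [n for n in order[start:] if n not in below]

def preceding(node):
    """Earlier in document order, minus the node's ancestors."""
    above = ancestor(node)
    return [n for n in order[:order.index(node)] if n not in above]

context = root.find("b/d")  # the d element is our context node
axes = {
    "self": [context],
    "ancestor": ancestor(context),
    "descendant": descendant(context),
    "following": following(context),
    "preceding": preceding(context),
}
tags = {name: [n.tag for n in nodes] for name, nodes in axes.items()}
print(tags)
```

Every element of the sample document ends up on exactly one of the five axes, which is what "partition" means here.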
Different Node Tests
The sample we showed before used a literal name (TABLE)
as a node test. This is only one of the ways to specify what a selected node
should look like. Other valid values are:
text() which is true for all text nodes.
* which is true for any node of the principal node type. Every
axis has its own principal node type: for most axes the principal node
type is 'element', but for the attribute axis it is 'attribute' and for the
namespace axis it is 'namespace'.
comment() which is true for all comment nodes.
processing-instruction() which is true for all processing instruction nodes.
node() which is true for all nodes.
These node type tests take no arguments. Only processing-instruction()
can be passed a literal; if an argument is passed, the node test is only true
for a processing instruction that has a name equal to the argument.
The following are examples of XPath queries using different axes and node tests. This selects all descendant
elements of the context node:
descendant::*
This selects the name attribute of the context node:
attribute::name
This selects the parent node of the context node:
parent::node()
This selects all namespaces that are valid in the context node:
namespace::*
This means that it includes the default namespace, the xml
namespace, any namespaces that are declared in the context node, and any
namespaces declared in ancestors of the context node that have not been
overruled by declarations in their children. The overruling of a namespace
happens when one element declares a prefix for a certain URI and a descendant declares
a namespace with the same prefix, but with another URI. In this case, the first
declaration is overridden and becomes invisible to the element with the second
declaration and to its descendants.
Finally, this query selects all comment
nodes that are direct children of the context node:
child::comment()
Building a Path
Several of the XPath expressions we have seen up until now can
be appended to each other to form a longer expression. This is done in a way
similar to building a full directory path from several directory names: by
separating them with forward slashes. The first expression in the path is
evaluated in the original context; the result set from this expression forms
the context for the next. Each of the nodes in the result set is used as
context for the expression that follows and all the results of each query are
combined into one result set at the end. This works as follows. This command
selects the parent
element of all name elements along the descendant
axis of our context node:
descendant::name/parent::*
This selects all text nodes from paragraph
elements that are children of chapter elements that are children of
elements that are children of our context node:
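The step-by-step evaluation described above (each result set becomes the context for the next step, and all partial results are combined) can be sketched in a few lines. This is only an illustration, assuming Python's standard xml.etree.ElementTree module; the evaluate() helper and the sample document are hypothetical, and the helper handles only child-axis steps with literal names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
book = ET.fromstring(
    "<book>"
    "<chapter><paragraph>one</paragraph><paragraph>two</paragraph></chapter>"
    "<chapter><paragraph>three</paragraph></chapter>"
    "</book>"
)

def evaluate(context, steps):
    """Evaluate a path of child-axis steps, one step at a time."""
    nodes = [context]
    for name in steps:
        next_nodes = []
        for node in nodes:  # every node in the result set acts as context
            next_nodes.extend(child for child in node if child.tag == name)
        nodes = next_nodes  # the combined results feed the next step
    return nodes

manual = evaluate(book, ["chapter", "paragraph"])
builtin = book.findall("chapter/paragraph")  # the library's own evaluation
print([p.text for p in manual])
```

Both approaches return the same three paragraph elements, in document order.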
Absolute vs. Relative Paths
Just as with directory paths, we can make the XPath expression absolute
by prefixing a slash. This sets the expression context to the document root.
This is not the root element (compare with the documentElement
property of the DOMDocument object in the DOM), but the parent of the
root element (compare with the DOMDocument object itself). This
example would select all attributes on the root element:
/child::*/attribute::*
However, the next example would select nothing, because the
document root cannot carry attributes:
/attribute::*
An XPath that consists of only a slash (/)
always refers to the document root.
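The difference between the document root and the root element is easy to see in any DOM implementation. The sketch below assumes Python's standard xml.dom.minidom module rather than MSXML, and a sample document invented for this illustration; the property names are the standard DOM ones.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, invented for this illustration.
doc = parseString("<root id='1' lang='en'><child/></root>")
root_element = doc.documentElement

# The document node is the parent of the root element, not the element itself.
print(doc.nodeName)
print(root_element.parentNode is doc)

# The document node carries no attributes; the root element does.
print(doc.attributes)
print(sorted(root_element.attributes.keys()))
```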
Abbreviated Form / IE5 Compatible Form
The abbreviated notation of XPath is intended to keep the queries
shorter. But the most important reason to learn the shorthand notation is that
it used to be the full notation (according to the working draft) at the moment
that the first release of IE5 hit the shops. In fact, it wasn't even called
XPath back then, but was part of the XSL specification, which was later split
up in three parts. (More on that later in this chapter.) That's why IE5
implements only the shorthand syntax of XPath. (In January 2000, Microsoft
released a preview of the newer version of the library, which will support the
full XPath specification.) The main rules for the
abbreviated syntax are:
The child axis is the default axis
The attribute axis can be abbreviated to the prefix @
The self axis can be abbreviated to .
The parent axis can be abbreviated to ..
The descendant axis can be abbreviated to // (strictly speaking, // is short for /descendant-or-self::node()/)
So these XPath expressions are valid in the IE5 implementation. The first returns
all attributes of TABLE elements in the whole document:
//TABLE/@*
While this returns all text nodes that are children of PARAGRAPH
elements that are children of CHAPTER elements that are children of
the context element:
CHAPTER/PARAGRAPH/text()
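The abbreviated syntax is also what many lightweight XPath implementations accept. As an illustration, the sketch below assumes Python's standard xml.etree.ElementTree module, which supports only a small part of the abbreviated syntax; the sample document is invented. ElementTree has no attribute or text nodes (they are reached through get() and the text property), so only element-selecting paths are shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
book = ET.fromstring(
    "<BOOK>"
    "<CHAPTER title='Intro'><PARAGRAPH>Hello</PARAGRAPH><TABLE id='t1'/></CHAPTER>"
    "<CHAPTER><TABLE id='t2'/></CHAPTER>"
    "</BOOK>"
)

paragraphs = book.findall("CHAPTER/PARAGRAPH")  # no axis given: child axis
tables = book.findall(".//TABLE")               # // searches all descendants
titled = book.findall("CHAPTER[@title]")        # @ is the attribute axis
table_parents = book.findall(".//TABLE/..")     # .. is the parent axis
itself = book.findall(".")                      # . is the self axis

print([t.get("id") for t in tables])
print([p.text for p in paragraphs])
```

Note the leading dot in ".//TABLE": ElementTree evaluates paths relative to an element and does not accept absolute paths on one.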
Now we have seen most of the basic elements of building XPaths.
There is only one more to discuss: predicates. Predicates are a way to
select a subset from a result set in an XPath (or part of an XPath). An XPath
with a predicate looks like this:
axis::node-test[predicate-expression]
The axis and node test we have already seen. Now the predicate
expression gets appended in square brackets. Basically, what the predicate does
is place a filter on the result set. For each node in the set, the XPath
processor will test the predicate expression.
The Expression is True/False
If the expression evaluates to true, the node remains in the result set; if it evaluates to false, the node is
removed. The predicate can contain special XPath functions (we will see those
later, although we already met with text(), comment()
etc), numeric values and XPath expressions. This XPath expression would return
the first two child elements named chapter from the context node:
child::chapter[position() < 3]
The position() function returns the
position of the context node in its set. The set is the result of the node test.
For the first node in the set, position() will return 1, for the
second 2, etc. The expression position() < 3
evaluates to true only for the first and second chapter elements found.
The Expression Returns a Number
If the expression evaluates to a numerical value n, it is
only true for the nth node. If the value is 2, only the second node in
the set will remain in the set; the rest will be removed. The next example will
return only the first chapter element found among the children of the context
node:
child::chapter[1]
The number can also be the result of a calculation. The last()
function returns the number of nodes in the result set of the current context
node. Using this numeric value we can select the last chapter:
child::chapter[last()]
The Expression Returns a Node Set
If the result of the expression is a node set, the context node
is included if there are nodes in the node set. The context node is deleted if
the returned node set is empty. The expression can itself be an XPath
expression (with axes, node tests and predicates). The inner XPath is evaluated
with the outer XPath result as
its context. This is a powerful concept; it allows us to make sub-querying
constructions. The next example selects only those chapter
elements that have para elements among their children:
child::chapter[child::para]
The outer XPath expression selects all chapter elements from the
children of the context node. Then, taking each of these chapter elements as
context, it tries to select para elements from their children.
The chapters that have an empty set of results are removed from the result set
of the outer XPath expression.
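The positional and node-set predicates described above can be tried out with very little code. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module and an invented sample document. ElementTree accepts numeric predicates, last() and child-name predicates, but not the position() function, so the "first two chapters" case is emulated with a list slice.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
book = ET.fromstring(
    "<book>"
    "<chapter id='c1'><para/></chapter>"
    "<chapter id='c2'/>"
    "<chapter id='c3'><para/><para/></chapter>"
    "</book>"
)

def ids(nodes):
    return [n.get("id") for n in nodes]

first = ids(book.findall("chapter[1]"))           # predicate is a number
last = ids(book.findall("chapter[last()]"))       # number computed by last()
with_para = ids(book.findall("chapter[para]"))    # predicate is a node set
first_two = ids(book.findall("chapter")[:2])      # emulates position() < 3

print(first, last, with_para, first_two)
```

The chapter without para children (c2) drops out of the node-set predicate result because its inner result set is empty.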
This query selects all message elements that are descendants
of the context node and have an ID attribute. Note that the results
of this query are the message elements, not ID attributes:
descendant::message[attribute::ID]
Here a node, the attribute confidentiality,
is compared with a literal string value:
In these cases, XPath compares the string value of the node
with the literal string value. If they are identical, the expression is true.
If the literal is numerical, the string value of the node is converted to a
numerical value and then compared. If a node set is compared with a literal
value, the expression is true if one of the elements in the set is identical to
the literal value. If two node sets are compared, the result is true if any one
node from the first can be matched with any one node from the second.
So in the example above, the predicate is true if the
context node has a confidentiality attribute with the value 'secret'.
Only then is the message selected.
Note that with this form of comparing, these two expressions
are not identical:
descendant::*[attribute::* = 'Teun']
descendant::*[not(attribute::* != 'Teun')]
The first query selects all descendants that have an
attribute with value 'Teun'. The second one selects only descendants with all attributes set to 'Teun'
(!= means 'not equal to'). If you don't immediately understand this, try to figure
out when this query evaluates to true:
descendant::*[attribute::* != 'Teun']
It selects all descendants that have an attribute that does not have the value 'Teun'. The reverse
of this is selecting all descendants that have no attribute with a value other than
'Teun', which is identical to selecting only descendants that have
all attributes set to value 'Teun'. In expressions like these, you can use the following operators:
!=
Not equal to.
<, <=, >, >=
Less than, less than or equal to, greater than,
greater than or equal to.
and, or
Logical and, or.
+, -, *
Addition, subtraction, multiplication. Because - can be part of a valid name and * can be
used to indicate an arbitrary name, you have to make sure
they cannot be interpreted wrongly by leaving white space before the operator.
div
Division (floating point).
mod
Integer remainder of a division.
|
Union of two node sets (creates a new node
set holding all elements in the two node sets).
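The comparison rule for node sets explained earlier (a comparison is true as soon as one node in the set satisfies it) is the reason why "equal" and "not not-equal" differ. The sketch below emulates that rule in plain Python; the function names and sample values are hypothetical, and the lists stand in for the string values of an element's attributes.

```python
def any_equal(node_values, literal):
    """Node set = literal: true if at least one node equals the literal."""
    return any(value == literal for value in node_values)

def any_not_equal(node_values, literal):
    """Node set != literal: true if at least one node differs from it."""
    return any(value != literal for value in node_values)

mixed = ["Teun", "Wrox"]    # one attribute matches, one does not
uniform = ["Teun", "Teun"]  # every attribute matches
empty = []                  # an element without attributes

for values in (mixed, uniform, empty):
    has_teun = any_equal(values, "Teun")
    all_teun = not any_not_equal(values, "Teun")
    print(values, has_teun, all_teun)
```

One consequence worth noticing: an element without any attributes fails the first test but passes the second, because an empty set contains no attribute that differs from 'Teun'.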
The filtered result set returned by an XPath expression with
a predicate expression can be further filtered by appending another predicate
to it. This example selects the fifth employee that has a function
child element with the value 'manager'. We first select all employee
nodes along the descendant axis, and then filter them with the [function='manager']
predicate. From this filtered result set, we again filter only the fifth
element with the predicate [5]:
descendant::employee[function='manager'][5]
The following example looks very much the same, but selects
the fifth employee
element, and only if it has a child element of type function
with the value 'manager'. Otherwise it will return an empty node set:
descendant::employee[5][function='manager']
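The order of chained predicates matters because each predicate filters the result of the previous one. The following sketch emulates both orders in plain Python; the employee records and helper names are invented for this illustration.

```python
# Hypothetical employee records standing in for employee elements.
employees = [
    {"name": "Ann", "function": "manager"},
    {"name": "Bob", "function": "clerk"},
    {"name": "Cas", "function": "manager"},
    {"name": "Dee", "function": "manager"},
    {"name": "Eli", "function": "clerk"},
    {"name": "Fay", "function": "manager"},
    {"name": "Gus", "function": "manager"},
]

def is_manager(employee):
    return employee["function"] == "manager"

def nth(nodes, n):
    """Emulates a numeric predicate [n]; XPath positions start at 1."""
    return nodes[n - 1:n]

# Filter on function first, then take the fifth of what remains.
fifth_manager = nth([e for e in employees if is_manager(e)], 5)

# Take the fifth employee first, then keep it only if it is a manager.
fifth_if_manager = [e for e in nth(employees, 5) if is_manager(e)]

print([e["name"] for e in fifth_manager])
print([e["name"] for e in fifth_if_manager])
```

With this data the first order finds the fifth manager (Gus), while the second returns an empty result because the fifth employee (Eli) is a clerk.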
As we have already seen, in the writing of predicates, functions that
perform complex operations are very handy, if not absolutely necessary. Some of
them we have already seen in some of the samples presented. We will show some
important functions here, but all other built-in functions specified by the
XPath recommendation are listed in Appendix C.
Node Set Functions
The last() function returns the index
number of the last node in the
context. For example this command selects the chapter
elements (along the child axis, which is the default axis in the shorthand
notation) that have exactly 5 paragraph children:
chapter[paragraph[last() = 5]]
The position() function returns the position of the current context node in the current result set.
For example this command selects the chapter children that have a fifth paragraph:
chapter[paragraph[position() = 5]]
Note that here we create a predicate to filter the results
of the outer expression, and this predicate uses an XPath expression that also
has a predicate. This recursive use of XPath expressions in predicates is a
powerful way to create sub-queries.
The count() function returns the number of nodes in the node set passed to it. This seems identical to the
last() function, but it isn't; the context it works on is different. It can be used to
do more or less the same things, but the syntax is different. This example
selects the chapters
with exactly five paragraph children (identical to the example for the last() function):
chapter[count(paragraph) = 5]
Whereas this selects the chapters with five or more paragraph
children (identical to the example for the position() function):
chapter[count(paragraph) >= 5]
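The equivalence between counting children and testing for a fifth child can be checked on a small document. This is a sketch assuming Python's standard xml.etree.ElementTree module, which has no count() function, so len() over a child query stands in for it; the sample document is generated for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: chapters c1, c2, c3 with 3, 5 and 7 paragraphs.
book = ET.fromstring(
    "<book>"
    + "".join(
        "<chapter id='c%d'>%s</chapter>" % (number, "<paragraph/>" * size)
        for number, size in [(1, 3), (2, 5), (3, 7)]
    )
    + "</book>"
)
chapters = book.findall("chapter")

def paragraph_count(chapter):
    """Stands in for count(paragraph) with the chapter as context."""
    return len(chapter.findall("paragraph"))

exactly_five = [c.get("id") for c in chapters if paragraph_count(c) == 5]
five_or_more = [c.get("id") for c in chapters if paragraph_count(c) >= 5]
# "Has a fifth paragraph" is the same condition as "five or more".
has_fifth = [c.get("id") for c in chapters if c.find("paragraph[5]") is not None]

print(exactly_five, five_or_more, has_fifth)
```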
The id() function returns nodes that have the specified ID attribute. If the object passed to
the function is a node set, each of the elements is converted to its string
value. The function then returns all elements in the document that have one of
the ID values in the set.
If the passed object is anything else, the query parser
tries to convert it to a string and returns the element from the document that
has this string for an ID. This can, by definition, be only one element. For example:
id(//book[@publisher = 'WROX']/@authors)
This query returns all nodes that have an ID that matches
the content of the authors attributes on books that have their publisher
attribute set to 'WROX'. This kind of query can be extremely powerful.
However, they demand that the document is validated against a schema or DTD,
because without validation, the processor cannot know which attributes are IDs.
For doing things like this with invalidated documents, see the section on using
If your application has to act only on information in a specific
namespace (this is in fact very probable as soon as you are building real
applications), you will love the namespace-uri() function. It returns
a string containing the URI of the namespace of the passed node set. Normally,
the node set you pass will only contain one node. In fact, if you pass a node
set containing multiple nodes, the function will use the first node in the set.
So, if the node you pass is an element of type mydata:chapter,
the function will look for the declaration of the mydata
namespace and will return the value of the URI used there.
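To illustrate what selecting by namespace URI means in practice, here is a minimal sketch assuming Python's standard xml.etree.ElementTree module. ElementTree stores the resolved URI in each tag as "{uri}localname", so the hypothetical namespace_uri() helper below only has to split it off. The URI urn:example:mydata and the sample document are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document and namespace URI.
doc = ET.fromstring(
    '<root xmlns:mydata="urn:example:mydata">'
    "<mydata:chapter/><chapter/><mydata:note/>"
    "</root>"
)

def namespace_uri(element):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag
    return tag[1:tag.index("}")] if tag.startswith("{") else ""

def local_name(element):
    """Return the tag name without its namespace part."""
    return element.tag.rsplit("}", 1)[-1]

selected = [e for e in doc.iter() if namespace_uri(e) == "urn:example:mydata"]
print([local_name(e) for e in selected])
```

The comparison is made on the URI, not on the prefix, so a document that binds a different prefix to the same URI would give the same result.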
This next query will return all elements in the specified
For the handling of strings, several functions are included. We will not get into these in very much
depth. Most are what you would expect from string handling functions. They
cover concatenation, comparing and manipulating strings, and selecting a
substring from a string. We will show just a few functions here; refer to
Appendix C for the complete list.
The string() function converts the passed object to a string. This may be a
Boolean value that is converted to 'true', or a number value converted to its
string value (i.e. the number 3 would be converted to the string
"3"). If a node set is passed, the first node in the set is used.
The starts-with() function checks whether the first string starts with the second string. The function returns true if
so, otherwise false. For example, this query returns all employee
elements that have a last-name attribute that starts with an 'A':
Note the use of the shorthand notation @last-name for attribute::last-name.
translate(string, string, string)
The translate function takes a string and, character-by-character, translates characters which match the
second string into the corresponding characters in the third string. This is
the only way to convert from lower to upper case in XPath. That would look like
this (with extra white space added for readability). This code would translate
the employee last names to upper case and then select those employees whose
last names begin with A.
If the second string has more characters than the third
string, these extra characters will be removed from the first string. If the
third string has more characters than the second string, the extra characters
are ignored.
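The character-by-character behaviour of translate() is easiest to follow in code. The function below is a hypothetical Python emulation written for this illustration, not part of any XPath library; it follows the rules described above, and when a character occurs more than once in the second string, the first occurrence decides.

```python
def xpath_translate(source, from_chars, to_chars):
    """Emulate XPath translate(source, from_chars, to_chars)."""
    mapping = {}
    for index, char in enumerate(from_chars):
        if char in mapping:
            continue  # the first occurrence of a character decides
        # Characters without a counterpart in to_chars are removed (None).
        mapping[char] = to_chars[index] if index < len(to_chars) else None

    result = []
    for char in source:
        if char not in mapping:
            result.append(char)           # not mentioned: copied unchanged
        elif mapping[char] is not None:
            result.append(mapping[char])  # replaced by its counterpart
    return "".join(result)

LOWER = "abcdefghijklmnopqrstuvwxyz"
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

print(xpath_translate("anderson", LOWER, UPPER))
print(xpath_translate("--aaa--", "abc-", "ABC"))
print(xpath_translate("abc", "a", "XYZ"))
```

The second call removes the hyphens because the second string is longer than the third; the third call shows that surplus characters in the third string have no effect.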
As for strings, a set of functions is available for number
handling, but we will not list them all here. They are available in Appendix C.
We will show a few of the most important and instructive examples.
The number() function converts any passed value to a number.
Its behaviour depends on the type of the passed parameter. Some possible cases:
If a string is passed, the value of the string is
converted to the mathematical value that it displays (following the IEEE 754
standard).
If a Boolean value is passed, true is converted to 1,
false to 0.
If a node set is passed, it is first converted to a
string (as if using the string() function). Then the string
is converted to a number.
The number function has no support for language-specific
formats. The string value passed in should be of a language neutral format.
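The conversion rules for strings and Booleans can be emulated as follows. This is a sketch in plain Python with a hypothetical function name; it assumes the XPath 1.0 number format (optional minus sign, digits, optional decimal point) and returns NaN for anything else, which is also what happens to language-specific notations such as a decimal comma. Node sets are left out, since they are first converted to a string.

```python
import math
import re

# XPath 1.0 number format: optional minus, digits with an optional fraction.
_NUMBER = re.compile(r"\s*-?([0-9]+(\.[0-9]*)?|\.[0-9]+)\s*")

def xpath_number(value):
    """Emulate XPath number() for Booleans, numbers and strings."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    text = str(value)
    # Anything that is not a plain decimal number becomes NaN.
    return float(text) if _NUMBER.fullmatch(text) else math.nan

print(xpath_number("12.5"))
print(xpath_number(True))
print(xpath_number("1,5"))
```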
The sum() function returns the sum of the numerical values of all passed nodes. The numerical
value is the result of the conversion of their string values. For example, this
query selects the industry elements that have customer
elements as children, whose totalturnover attributes sum to an
amount larger than 1 million:
industry[sum(customer/@totalturnover) > 1000000]
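The same selection can be emulated outside an XPath processor. The sketch below assumes Python's standard xml.etree.ElementTree module, which has no sum() function, so the summing is done in Python; the sample data and the turnover() helper are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this illustration.
market = ET.fromstring(
    "<market>"
    "<industry name='steel'>"
    "<customer totalturnover='700000'/><customer totalturnover='400000'/>"
    "</industry>"
    "<industry name='paper'>"
    "<customer totalturnover='300000'/>"
    "</industry>"
    "</market>"
)

def turnover(industry):
    """Sum the totalturnover attributes of the industry's customer children."""
    customers = industry.findall("customer[@totalturnover]")
    return sum(float(c.get("totalturnover")) for c in customers)

big = [i.get("name") for i in market.findall("industry") if turnover(i) > 1000000]
print(big)
```

As in XPath, each attribute's string value is converted to a number before the values are added up.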
The round() function is a typical number function. It rounds a floating point value to the nearest
integer value. Other ways of making an integer from a floating point value are floor()
and ceiling().
The functions that handle Boolean values are not very special. The only really useful one is the not()
function, which converts a Boolean value to its opposite. Other than that,
there are the true() and false()
functions that always return true and false respectively, and the lang()
function that can be used to check the language of the content (if this is
indicated with the xml:lang attribute).
IE5 implements a subset of XPath. If you are developing for the MSXML
objects in the initial IE5 release, you have to know what features are
implemented in XPath and which are not. Microsoft has committed to implementing
the full standard in all later versions. It is unclear what backward compatibility
will exist with syntax elements that are not part of the W3C recommendation.
Here we will show the differences between the IE5 implementation and the W3C
recommendation.
IE5 knows only the abbreviated syntax for axis and node test. You cannot use the
syntax with a double colon. This limits the number of axes, because not all
XPath axes have an abbreviated form (for example, namespace).
Not all of the built-in functions of XPath are supported in
IE5. The most notable difference is the last() function which is called end()
in IE5. Also, many functions are not supported at all. Here is a
full list of the supported functions in IE5:
attribute() returns all attribute
nodes of the context node.
cdata() returns all CDATA nodes that are children of the
context node.
comment() returns all comment
nodes that are children of the context node.
date() casts a value to date format.
element() returns all elements that
are children of the context node.
end() synonymous to last()
in the XPath recommendation.
index() returns the index number of
the node within its parent.
node() returns all nodes (except
attributes and the root node) that are children of the context node.
nodeName() returns the tag name
(includes namespace prefix).
nodeType() returns a number indicating
the node type.
number() casts values to number format.
pi() returns all processing
instruction nodes that are children of the context node.
text() returns all nodes that
represent a text value and are children of the context node. This includes both
text nodes and CDATA nodes.
textnode() returns all text
nodes that are children of the context node.
value() returns the value of an
element or attribute.
In the source code download, you
will find a small Visual Basic application, called XPathTester.vbp, which allows you both to practice writing XPath
queries and to test their performance.
The Query tester frame can be used once an XML document is loaded. After loading such a
document (take a big one like Macbeth.xml),
you can see the structure of the document in the tree view control to the left.
If you select a node and type an XPath expression in the Query
text box, you can execute this query, using the selected node as your context
node. All matching nodes are listed in the list box. If you click a list item,
the underlying XML source is shown in the text box on the right. Note that the
number of seconds needed for performing the query is shown directly under the
list box. Use this application to practice writing queries. Notice how more
specific queries have a better performance than very general ones. Also,
queries that specify the structural relations of elements are much faster than
queries specifying the text content of elements and attributes.