Giving Style to XML
XML documents contain only data, so there is a clear
need for ways to display that content. This
is commonly referred to as 'styling the content'. At the time of writing, there
are two W3C standard stylesheet languages: CSS
(Cascading Style Sheets) and XSLT. Both can be used to assign a particular look to
specific element types in an XML document.
Until now we have seen how XSLT can be used to transform XML
documents from one format to another. The original goal for specifying XSLT was
its use as a styling mechanism. But before we try to use XSLT to transform to
HTML documents, we'll first have a look at the Cascading Stylesheets.
Using CSS in HTML
You have probably seen CSS before in an HTML context. It is a syntax for specifying the appearance of elements
in an HTML document in a structured way. It allows one stylesheet to be
associated with many content documents, centralizing the common layout in
one place. The 'Cascading' part of the name refers to the way style rules from
several sources are combined: a property set globally can be overridden locally,
and many properties are inherited from parent elements in the document. CSS
strikes a fine balance between centralization and developer flexibility.
The current W3C recommendation is CSS version 2. The
most important differences between versions 1 and 2 are that support for other
media types (such as print) was included and that more complex selectors were
introduced (a selector in CSS2 can be compared with the match
attribute in an XSLT template). CSS properties on elements can specify:
Font size, family, color, variant (for example small caps), style (for example italic)
Color, background color, background image (including tiling and positioning)
Line, word and letter spacing
Alignment, underlining, overlining
The margins, borders, etc of boxes (boxes are TABLE elements, but also P and BODY elements)
List styles (square bullets, etc), display (not displayed, as block, inline)
Very detailed positioning and units (inches, cm, pixels, points)
Some simple uses of CSS in HTML are shown here before we head on to
styling XML.
<P style="text-align:center;text-decoration:underline">This is text</P>
The above code would be displayed like this:
This is text
The same would be accomplished by inserting this code at the
beginning of the HTML document:
<STYLE>
P {
text-align:center;
text-decoration:underline;
}
</STYLE>
Or by associating the HTML document with an external
stylesheet by doing this:
<LINK href="mystyle.css"
rel="stylesheet" type="text/css">
This assumes that a file called mystyle.css is present in the same directory,
with this content:
P {
text-align:center;
text-decoration:underline;
}
There is a lot more to using CSS from HTML, especially when
you programmatically change the styles during display. We will not cover the
use of classes and more complex concepts in CSS here.
Using CSS in XML
Using CSS to style an XML document is very simple if you know how it
works with HTML. The stylesheet is referenced differently, using the
xml-stylesheet processing instruction instead of the LINK
element. Inline stylesheets are not possible. Let's look at an example:
<?xml version="1.0"
encoding="iso-8859-1"?>
<?xml-stylesheet
type="text/css" href="article.css"?>
<Article>
<Authors>
<Author>James Britt</Author>
<Author>Teun Duynstee</Author>
</Authors>
<Title>A cool article</Title>
<Intro>An introductory text here ... </Intro>
<Body>The body text of the article comes here ... </Body>
<Related>
<Item type="URL"
loc="http://www.asptoday.com/art2">Some other article</Item>
<Item type="local" loc="2"/>
</Related>
</Article>
This example refers to a cascading stylesheet with a
relative URL. The type attribute contains a MIME type indicating the kind
of stylesheet. Using a different MIME type, we can also use this syntax to
associate an XSLT stylesheet with the document – more on that later.
If we leave the article.css stylesheet document empty,
all text nodes will be displayed inline, as one run of text flowing over the whole page. What we want is
for the title to appear larger and everything to be aligned, more like how we would
expect an article to look, just like this:

The most important thing to realize is that the elements in our document have no style
whatsoever. In HTML, the P element (paragraph) has some
properties set by default. For example, the P element and the H1 to H6
elements all have their display property set to 'block'.
This indicates that the element requires its own line in the document. In XML,
the CSS processor assumes nothing. So let's start styling the title of the
article:
It should appear on its own line
It should be a little larger than the rest of the text
It should be centered and underlined
We want to use a sans-serif font
Converting this into a CSS statement, we would get:
Title {
display:block;
text-align:center;
text-decoration:underline;
font-size:14pt;
font-family:helvetica
}
Doing this for all elements in the article document, we
could come up with a stylesheet like this:
Body {
color:black;
display:block;
width:80%;
margin-left:20%;
}
Intro {
color:black;
font-weight:bold;
display:block;
line-height:150%;
width:80%;
margin-left:20%;
}
Author {
text-align:right;
font-size:8pt;
display:block;
font-style:italic;
}
Title {
display:block;
text-align:center;
text-decoration:underline;
font-size:14pt;
font-family:helvetica
}
Related {
display: none;
}
The good thing about CSS is that it is a standard and it works. The
bad thing is that it is rather limited in areas other than the visible style,
for example:
Reordering and sorting of elements is not possible
Generation of text is hard. It can be done using the :before and :after pseudo-elements, but for anything more than the most basic additions it is too difficult, and besides, these are not implemented in most browsers.
Adding functionality, such as creating a link from certain content elements, is not possible
Some documents are suitable for styling this way. Their
content is already in reading order and they don't need much extra
functionality beyond the formatting of the content. Often, though, the data in XML
documents needs a more rigorous form of styling. In these cases, XSLT can be
used.
Good points of CSS include:
Many web developers are familiar with the language
Good performance
Using XSLT for Adding Style
We have seen quite a lot of XSLT in this chapter. We
saw that it is a language for converting one XML-based document into another.
HTML looks very much like an XML-based syntax, except that its rules are less
strict than XML's well-formedness rules. If we use XSLT to transform XML to
an HTML page, the result must always be well-formed XML. This means that you cannot
create just any kind of HTML from XML with XSLT, but for any valid HTML
document it is possible to create a well-formed equivalent that looks the same and can
be generated from XML. So if you want text displayed as:
Text Text Text Text
The HTML you would normally use would look like this:
Text <B>text <I>text</B>
text</I>
However, to be well-formed XML, it would have to be
rewritten as:
Text <B>text
<I>text</I></B><I> text</I>
Recently, the W3C specified XHTML.
XHTML is the same as HTML, but must always be a well-formed XML document. DTDs for XHTML have been
published. You can find the specification, including the associated DTDs, at www.w3.org/TR/xhtml1/. Any XHTML
document can be generated from a source using XSLT. However, be careful – if
you use these DTDs to validate your XHTML document, you must write the HTML element names in
lowercase. In most of the examples in this book, we use uppercase HTML elements
(I think this makes the stylesheet elements and the literal HTML elements
easier to distinguish). So, the examples do not generate valid XHTML.
As you may remember, the XSLT output element allowed us to
choose the method html for outputting HTML instead of XML. If you do this,
you can be sure that XML specifics such as processing instructions and closed
empty elements like <BR/> will not confuse HTML browsers. But if you do
so, you must also be aware that your output is no longer well-formed XML
and therefore not valid XHTML either.
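As a reminder of what this looks like, here is a minimal sketch of a stylesheet that selects the html output method. The literal text and the use of BR are made up for illustration and are not taken from any stylesheet in this chapter.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Ask the processor to serialize the result as HTML rather than XML -->
  <xsl:output method="html"/>
  <xsl:template match="/">
    <HTML><BODY>
      <!-- BR is written as an empty element so the stylesheet itself stays well-formed -->
      First line<BR/>Second line
    </BODY></HTML>
  </xsl:template>
</xsl:stylesheet>
```

With the html method, a conforming processor should write the empty element as <BR> rather than <BR/>. XSLT 1.0 processors also default to the html method when no output element is present and the document element of the result is named html (in any mix of upper and lower case) and is in no namespace.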
So basically anything that can be shown in a browser can be
the styled representation of an XML document using XSLT, and this was actually
one of the main purposes of developing XSLT in the first place. Using it for
transforming into formats other than HTML was only added later. Since we have already
seen a lot of XSLT in this chapter, we will just have a look at some examples
and common techniques.
Styling the Article
We'll take the same source documents that we used to show the use of CSS on
XML. The documents contain the text of an article and include some references
to both remote and other local articles. A sample article looks like this:
<?xml version="1.0"
encoding="iso-8859-1"?>
<Article>
<Authors>
<Author>James Britt</Author>
<Author>Teun Duynstee</Author>
</Authors>
<Title>A cool article</Title>
<Intro>An introductory text here ... </Intro>
<Body>The body text of the article comes here ... </Body>
<Related>
<Item type="URL" loc="http://www.asptoday.com/art2">Some
other article</Item>
<Item type="local" loc="2"/>
</Related>
</Article>
Now, we want to use XSLT to go beyond the styling that CSS
made possible. We will include a link for each of the related articles. The
remote references we will display with the title in our source document, but
for the local references, we will look up the details of those referred
articles and include that information in our styled document.
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template
match="Article">
<HTML><BODY>
<xsl:apply-templates
select="Title"/>
<xsl:apply-templates
select="Intro"/>
<xsl:apply-templates
select="Body"/>
<xsl:apply-templates
select="Authors"/>
<xsl:apply-templates
select="Related"/>
</BODY></HTML>
</xsl:template>
<xsl:template
match="Title"><H1><xsl:apply-templates/></H1></xsl:template>
<xsl:template match="Intro">
<p
style="width:80%;font-weight:bold"><xsl:apply-templates/></p>
</xsl:template>
<xsl:template match="Body">
<p
style="width:80%"><xsl:apply-templates/></p>
</xsl:template>
<xsl:template
match="Authors">
<p>Author(s): <xsl:apply-templates
select="Author"/></p>
</xsl:template>
<xsl:template
match="Author">
<xsl:apply-templates/>
<xsl:if
test="position() != last()">, </xsl:if>
</xsl:template>
<xsl:template
match="Related">
<p>Related items:<br/><xsl:apply-templates/></p>
</xsl:template>
<xsl:template match="Item">
<xsl:if test="@type='URL'">
<a href="{@loc}"><xsl:value-of
select="."/></a>
</xsl:if>
<xsl:if test="@type='local'">
<a href="art{@loc}.xml">
<xsl:value-of select="document(concat('art', @loc,
'.xml'))/Article/Title"/>
</a>(
<xsl:apply-templates select=
"document(concat('art', @loc,
'.xml'))/Article/Authors/Author"/>
)
</xsl:if>
<br/>
</xsl:template>
</xsl:stylesheet>
Except for the part that generates the HTML for the related
articles, everything in the above stylesheet is fairly simple.
The template that matches the document element, Article,
outputs the items in the required order, starting with the title and
placing the authors and related articles at the end. The job of specifying how
each of these items should look is delegated to other templates. The templates
for the Title, Intro and Body
elements do nothing special. They just output a bit of extra formatting code
around the content.
The Author template is a bit more
interesting. It is designed to create a comma-separated list in the output.
This is done by placing an xsl:if element around the literal comma. The
test attribute checks whether the current author is the last one in the
current node set. If it is, the comma is not generated.
Then we have the related articles. The template for Item
generates a link to these articles. There are two kinds of article references.
Some are external, referring to some URL on the web. These have a text node as
their content. This is the title of the article, which we want to show as link
text. The other type is a local reference. It refers to other articles in the
same directory, written in the same XML format as this one. These articles have
a title in them and we know where to find the title when we need it, so there
is no need for storing the titles of these related articles along with the
reference.
The Item template really consists of two
parts, one for remote links and one for local links. The remote one is simple.
It generates the HTML code for a link. The value of the href
attribute is an attribute value template, {@loc}, which the processor replaces
with the value of the loc attribute. The content of the Item
element becomes the content of the A element in HTML.
The local references are more complicated. How are we going
to get hold of the title from these other files in the same directory? By using
the document() function! This is demonstrated twice, once from the value-of
element, and once from the apply-templates element. In the first case,
the processor just opens the other document, finds the Title
element in the indicated spot and outputs its content to the destination document. The
second case (fetching a list of authors from the referenced document) uses the apply-templates
element. Instead of passing a node set of local nodes to the processor to let
it find appropriate templates for them, in this case we hand it a set of nodes from
another document. The processor does exactly the same. It uses the Author
template that was already used for creating a comma-separated list of authors
of the article itself, but it is now used to create an author list for the
referenced document.
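Since both calls to document() build the same path, the local branch can also be written with a variable. The following is only a sketch of that variation, meant to take the place of the second xsl:if inside the Item template; the variable name article is an assumption made for this example.

```xml
<xsl:if test="@type='local'">
  <!-- Hold the Article element of the referenced document (name chosen for this sketch) -->
  <xsl:variable name="article"
                select="document(concat('art', @loc, '.xml'))/Article"/>
  <a href="art{@loc}.xml"><xsl:value-of select="$article/Title"/></a>
  <!-- The Author template is reused for the authors of the other document -->
  (<xsl:apply-templates select="$article/Authors/Author"/>)
</xsl:if>
```

The output is the same as before, apart from whitespace. The variable only keeps the path to the other document in one place, which makes the two uses easier to read and to change together.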
Creating Internal Links on Shakespeare
The next example is an XSLT stylesheet that styles
the play of Macbeth (and other plays in the same format). Apart from creating a
readable layout and highlighting the stage directions, we want to create a sort
of navigational structure that allows the reader to jump from the beginning of
an act to the beginning of the next or previous act. To do this, we will have
to introduce internal links in the HTML document.
An example XML document has the following structure:
<?xml version="1.0"?>
<!DOCTYPE PLAY SYSTEM
"play.dtd">
<PLAY>
<TITLE>The Tragedy of
Macbeth</TITLE>
<PERSONAE>
<TITLE>Dramatis Personae</TITLE>
<PERSONA>DUNCAN, king of
Scotland.</PERSONA>
<PGROUP>
<PERSONA>MALCOLM</PERSONA>
<PERSONA>DONALBAIN</PERSONA>
<GRPDESCR>his sons.</GRPDESCR>
</PGROUP>
...
</PERSONAE>
<ACT><TITLE>ACT I</TITLE>
<SCENE>
<TITLE>SCENE I. A desert
place.</TITLE>
<STAGEDIR>Thunder and lightning. Enter three Witches</STAGEDIR>
<SPEECH>
<SPEAKER>First Witch</SPEAKER>
<LINE>When shall we three meet again</LINE>
<LINE>In thunder, lightning, or in rain?</LINE>
</SPEECH>
</SCENE>
</ACT>
</PLAY>
A SPEECH element always has a SPEAKER
child element and at least one LINE
child element. SPEECH and STAGEDIR
elements are children of SCENE elements; SCENE
elements are children of ACT elements, which are children of
the root PLAY element. Phew!
The corresponding stylesheet is called play.xsl
and is part of the code download. It is rather big, and so has not been listed
in full. Most of this stylesheet is dedicated to the visible layout of several of the
content elements; however, some of the templates deserve a closer look. For
example, the SPEECH template creates the speech in a table with the speaker's
name on the first line:
<xsl:template
match="SPEECH">
<TR><TD style="font-weight:bold">
<xsl:apply-templates select="SPEAKER"/>
</TD><TD><xsl:value-of select="LINE[1]"/></TD></TR>
<xsl:apply-templates select="LINE"/>
</xsl:template>
...
<xsl:template match="LINE">
<xsl:if test="position() > 1">
<TR><TD></TD><TD>
<xsl:value-of select="."/>
</TD></TR>
</xsl:if>
</xsl:template>
Note how the SPEECH template creates a row in the table, holding both the SPEAKER and the first LINE
element. To prevent the line from showing up twice, the LINE template only creates output for LINE elements that have a position higher than 1.
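The same effect can be reached by filtering in the select expression instead of testing inside the LINE template. The following is a sketch of that alternative, not the approach play.xsl takes. It assumes the SPEECH structure shown above and that LINE elements are only processed from the SPEECH template.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="SPEECH">
    <TR>
      <TD style="font-weight:bold"><xsl:apply-templates select="SPEAKER"/></TD>
      <!-- The first line shares a row with the speaker's name -->
      <TD><xsl:value-of select="LINE[1]"/></TD>
    </TR>
    <!-- Only the second and later lines are handed to the LINE template -->
    <xsl:apply-templates select="LINE[position() > 1]"/>
  </xsl:template>
  <xsl:template match="LINE">
    <!-- No position test here: the SPEECH template never selects the first line -->
    <TR><TD></TD><TD><xsl:value-of select="."/></TD></TR>
  </xsl:template>
</xsl:stylesheet>
```

Which form to prefer is a matter of taste. The version with the test keeps the decision inside the LINE template, while this one keeps the LINE template free of any knowledge about how it is called.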
The most interesting part of the
stylesheet is the part that generates internal links forward and
backward. We will have a look at the template for ACT elements, one part at a time. First let's look at the part
that generates the act title as an internal link target:
<A>
<xsl:attribute name="name">
<xsl:value-of select="generate-id()"/>
</xsl:attribute>
<xsl:value-of select="TITLE"/>
</A>
This fragment creates an A element, with the content of the TITLE child element contained within. On this element, a name attribute is generated. The value of this attribute is
determined by calling the generate-id() function
without a parameter. This causes the function to use the context node
(the ACT element) to generate a unique identifier. What the
identifier looks like is processor-specific.
The links forward and backward are
also generated using the generate-id()
function, but now by passing the next or previous ACT element to it:
<xsl:if test="following-sibling::ACT">
<A>
<xsl:attribute name="href">
<xsl:text>#</xsl:text>
<xsl:value-of
select="generate-id(following-sibling::ACT[1])"/>
</xsl:attribute>
&gt;
</A>
</xsl:if>
Using the if element, we make sure that the 'next' link is only generated if there is an ACT element to link to. If so, an A element is generated, bearing an href attribute with an internal link. The # is hardcoded, but the rest of the string is generated by
the generate-id() function. We use the following-sibling axis, constrained by the ACT element name, and select the first node from the resulting
set. This node is the next ACT
element. Later in the transformation, this same node will be used to generate
the full text of the next act and to create a link target at the spot of the act
title. Because generate-id() always returns the same identifier for the same node
during one transformation, the target name of the next act is the same
string as the one used in the href
attribute of this link.
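The backward link is not listed here. A sketch of how it could look, mirroring the forward link with the preceding-sibling axis, is given below. The link text and the exact markup are assumptions, and the version in play.xsl may differ.

```xml
<xsl:if test="preceding-sibling::ACT">
  <A>
    <xsl:attribute name="href">
      <xsl:text>#</xsl:text>
      <!-- preceding-sibling is a reverse axis, so [1] is the nearest earlier ACT -->
      <xsl:value-of select="generate-id(preceding-sibling::ACT[1])"/>
    </xsl:attribute>
    &lt;
  </A>
</xsl:if>
```

On a reverse axis, positions are counted backwards from the context node, so ACT[1] selects the act immediately before the current one rather than the first act of the play.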
A third technique worth noting
is the use of a parameter in the transformation of a PERSONA element in the PERSONAE
part. A PERSONA can appear as a direct child of the PERSONAE element or inside a PGROUP. If a PERSONA is
inside a PGROUP, we want the name to
appear indented. We use the same template for all PERSONA elements, but when it is called from within a PGROUP, we pass the parameter indented="yes" to it.
<xsl:template
match="PERSONA">
<xsl:param name="indented">no</xsl:param>
<TR><TD>
<xsl:if test="$indented = 'yes'">
<xsl:attribute
name="style">padding-left:20px</xsl:attribute>
</xsl:if>
<xsl:value-of select="."/>
</TD></TR>
</xsl:template>
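The calling side is not listed here. A minimal sketch of how a PGROUP template might pass the parameter, assuming it only needs to process its PERSONA children, is shown below. Handling of GRPDESCR is left out, and the template in play.xsl may differ.

```xml
<xsl:template match="PGROUP">
  <!-- Override the default value 'no' of the indented parameter -->
  <xsl:apply-templates select="PERSONA">
    <xsl:with-param name="indented" select="'yes'"/>
  </xsl:apply-templates>
</xsl:template>
```

Note the extra quotes inside the select attribute. Written as select="yes", the expression would be read as a path selecting child elements named yes, not as the string 'yes'. PERSONA elements that are processed without the parameter fall back to the default value no and are not indented.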
Client Side XSLT Styling
There is one more way to style XML documents with an XSLT stylesheet: the transformation can be done by
the browser application. In this scenario, the web server sends a raw XML document (without
layout) to the client, containing a processing instruction
that tells the browser which stylesheet to use. This processing instruction
uses the same syntax we saw used for attaching a Cascading Stylesheet to an XML
document. An example of this is as follows:
<?xml version="1.0" ?>
<?xml-stylesheet
href="transformation.xsl" type="text/xsl"?>
<CONTENT>
</CONTENT>
A web browser that supports XSLT will download the referenced
stylesheet (transformation.xsl)
and transform the XML document with that stylesheet before showing it to the
user. This technique can move a large part of the processing load from the
server to the client machine.
If you have little control over the browser application used
(as in most Internet scenarios), you will have to check on the server whether the
user's browser supports XSLT and, if it doesn't, transform the content to HTML on the
server.