Attributes
In addition to tags and elements, XML documents can also include attributes. Attributes are simple name/value pairs associated with an element. They are attached to the start-tag, as shown below, but not to the end-tag:
<name nickname='Shiny John'>
  <first>John</first>
  <middle>Fitzgerald Johansen</middle>
  <last>Doe</last>
</name>
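To see how an application gets at an attribute, as opposed to a child element, here is a minimal sketch. It assumes Python's standard-library xml.etree.ElementTree parser, which is not the parser used elsewhere in this chapter; any XML parser exposes the same information in its own way.

```python
import xml.etree.ElementTree as ET

doc = """<name nickname='Shiny John'>
  <first>John</first>
  <middle>Fitzgerald Johansen</middle>
  <last>Doe</last>
</name>"""

name = ET.fromstring(doc)

# Attributes of an element come back as plain name/value pairs.
print(name.attrib)             # {'nickname': 'Shiny John'}
print(name.get("nickname"))    # Shiny John

# Child elements are reached differently, by searching the tree.
print(name.findtext("first"))  # John
```

Note that the attribute is reported once, on the element itself; the end-tag contributes nothing.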
Attributes must have values (even if that value is just an empty string, like "") and those values must be in quotes. So the following, which is part of a common HTML tag, is not legal in XML:
<input checked>
and neither is this:
<input checked=true>
Either single quotes or double quotes are fine, but they have to match. For example, to make this well-formed XML, you can use one of these:
<input checked='true'>
<input checked="true">
but you can't use:
<input checked='true">
Because either single or double quotes are allowed, it's easy to include quote characters in your attribute values, like "John's nickname" or 'I said "hi" to him'. You just have to be careful not to accidentally close your attribute, like 'John's nickname'; if an XML parser sees an attribute value like this, it will think you're closing the value at the second single quote, and will raise an error when it sees the "s" which comes right after it.
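The quoting rules above are easy to check mechanically. The following is a rough sketch, again assuming Python's xml.etree.ElementTree as the parser; is_well_formed is a hypothetical helper written for this illustration, and each sample is given an end-tag so that it forms a complete document.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it raises an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

samples = [
    "<input checked='true'></input>",                 # single quotes
    '<input checked="true"></input>',                 # double quotes
    '<input checked=""></input>',                     # empty string is still a value
    "<input checked></input>",                        # no value at all
    "<input checked=true></input>",                   # value not in quotes
    "<input checked='true\"></input>",                # quotes don't match
    "<name nickname=\"John's nickname\"></name>",     # ' inside "..."
    "<name nickname='I said \"hi\" to him'></name>",  # " inside '...'
    "<name nickname='John's nickname'></name>",       # ' closes the value early
]

for sample in samples:
    verdict = "ok " if is_well_formed(sample) else "BAD"
    print(verdict, sample)
```

Only the samples that follow the rules (a quoted value, with matching quotes that don't appear inside the value) should be reported as ok.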
The same rules apply to naming attributes as apply to naming elements: names are case sensitive, can't start with "xml", and so on. Also, you can't have more than one attribute with the same name on an element. So if we create an XML document like this:
<bad att="1" att="2"></bad>
we will get an error in IE5, because the att attribute appears twice on the same element.
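The same duplicate-attribute check can be reproduced outside the browser. This sketch assumes Python's xml.etree.ElementTree parser as a stand-in for IE5; the wording of its message will differ from the browser's, but the document is rejected for the same reason.

```python
import xml.etree.ElementTree as ET

try:
    ET.fromstring('<bad att="1" att="2"></bad>')
    duplicate_rejected = False
except ET.ParseError as err:
    # The parser refuses the document rather than picking one of the values.
    duplicate_rejected = True
    print("Not well-formed:", err)

print("Rejected:", duplicate_rejected)
```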
Try It Out – Adding Attributes to Al's CD
For all the information we recorded about our CD in our earlier Try It Out, we forgot to include the CD's serial number and the length of the disc. Let's add some attributes, so that our hypothetical CD Player application can easily find this information.
1. Open the cd.xml file you created earlier, and resave it to your hard drive as cd2.xml.
2. With our new-found attributes knowledge, add two attributes to the <CD> element,
like this:
<CD serial=B6B41B
    disc-length='36:55'>
  <artist>"Weird Al" Yankovic</artist>
  <title>Dare to be Stupid</title>
  <genre>parody</genre>
  <date-released>1990</date-released>
  <song>
    <title>Like A Surgeon</title>
    <length>
      <minutes>3</minutes>
      <seconds>33</seconds>
    </length>
    <parody>
      <title>Like A Virgin</title>
      <artist>Madonna</artist>
    </parody>
  </song>
  <song>
    <title>Dare to be Stupid</title>
    <length>
      <minutes>3</minutes>
      <seconds>25</seconds>
    </length>
    <parody></parody>
  </song>
</CD>
3. If you typed in exactly what's written above, then when you display it in IE5 you should see an error message rather than the document.
4. Now edit the first attribute, like this:
<CD serial='B6B41B'
disc-length='36:55'>
5. Re-save the file, and view it in IE5. This time the browser should display the XML document instead of an error.
How It Works
Using attributes, we added some information about the CD's serial number and length to our document:
<CD serial=B6B41B
disc-length='36:55'>
When the XML parser got to the "=" character after the serial attribute, it expected an opening quotation mark, but instead it found a B. This is an error, and it caused the parser to stop and raise the error to the user.
So we put the value of our serial attribute in quotes:
<CD serial='B6B41B'
disc-length='36:55'>
and this time the browser displayed our XML correctly.
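The before-and-after behaviour can be sketched in a few lines. This assumes Python's xml.etree.ElementTree parser rather than IE5, and first_error is a hypothetical helper written for this illustration; only the start-tag and end-tag of the CD element are kept to keep the sample short.

```python
import xml.etree.ElementTree as ET

broken = """<CD serial=B6B41B
    disc-length='36:55'>
</CD>"""

# The fix from step 4: put the serial value in quotes.
fixed = broken.replace("serial=B6B41B", "serial='B6B41B'")

def first_error(xml_text):
    """Return the (line, column) where parsing failed, or None if it succeeded."""
    try:
        ET.fromstring(xml_text)
        return None
    except ET.ParseError as err:
        return err.position

print("broken:", first_error(broken))  # a position inside the start-tag
print("fixed: ", first_error(fixed))   # None

cd = ET.fromstring(fixed)
print(cd.get("serial"), cd.get("disc-length"))  # B6B41B 36:55
```

As in the browser, the parser stops at the first problem it meets; it does not try to guess what the unquoted value was meant to be.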
The information we added might be useful, for example, in the CD Player application we considered earlier. We could write our CD Player to use the serial number of a CD to load any settings the user has previously saved (such as a custom play list).
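As a rough illustration of that idea, the sketch below looks up saved settings by serial number. Everything about the CD Player here is hypothetical: the saved_settings store, its contents, and the load_settings function are invented for this example, and Python's xml.etree.ElementTree is assumed as the parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical store of user settings, keyed by CD serial number.
saved_settings = {
    "B6B41B": {"playlist": ["Dare to be Stupid", "Like A Surgeon"]},
}

def load_settings(cd_xml):
    """Read the serial attribute from a <CD> document and return any saved settings."""
    cd = ET.fromstring(cd_xml)
    serial = cd.get("serial")  # None if the attribute is missing
    return saved_settings.get(serial, {})

print(load_settings("<CD serial='B6B41B' disc-length='36:55'></CD>"))
print(load_settings("<CD serial='UNKNOWN' disc-length='40:00'></CD>"))  # {}
```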
Why Use Attributes?
There have been many debates in the XML community about whether attributes are really necessary, and if so, where they should be used. Here are some of the main points in that debate:
Attributes Can Provide Metadata that May Not be Relevant to Most Applications Dealing with Our XML
For example, if we know that some applications may care about a CD's serial number, but most won't, it may make sense to make it an attribute. This logically separates the data most applications will need from the data that most applications won't need.
In reality, there is no such thing as "pure metadata"; all information is "data" to some application. Think about HTML: you could break the information in HTML into two types of data, the data to be shown to a human, and the data to be used by the web browser to format the human-readable data. From one standpoint, the data used to format the data would be metadata, but to the browser or the person writing the HTML, the metadata is the data. Therefore, attributes can make sense when we're separating one type of information from another.
What Do Attributes Buy Me that Elements Don't?
Can't elements do anything attributes can do? In other words, on the face of it there's really no difference between:
<name nickname='Shiny John'></name>
and:
<name>
  <nickname>Shiny John</nickname>
</name>
So why bother to pollute the language with two ways of doing the same thing?
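From an application's point of view, the two forms above carry the same information and differ only in how it is retrieved. A minimal sketch, assuming Python's xml.etree.ElementTree as the parser:

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring("<name nickname='Shiny John'></name>")
as_element = ET.fromstring("<name><nickname>Shiny John</nickname></name>")

# Attribute form: ask the element for the value by attribute name.
print(as_attribute.get("nickname"))     # Shiny John

# Element form: find the child element and read its text.
print(as_element.findtext("nickname"))  # Shiny John
```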
The main reason that XML was invented was that SGML could do some great things, but it was far too difficult to use without a fully-fledged SGML expert on hand. So one concept behind XML is a simpler, kinder, gentler SGML. For this reason, many people don't like attributes, because they feel attributes add complexity to the language that isn't needed.
On the other hand, some people find attributes easier to use; for example, they don't require nesting and you don't have to worry about crossed tags.
Why Use Elements, if Attributes Take Up So Much Less Space?
Wouldn't it save bandwidth to use attributes instead? For example, if we were to rewrite our <name> document to use only attributes, it might look like this:
<name nickname='Shiny John'
first='John' middle='Fitzgerald Johansen' last='Doe'></name>
This takes up much less space than our earlier code using elements.
However, in systems where size is really an issue, it turns out that simple compression techniques would work much better than trying to optimize the XML. And because of the way compression works, you end up with almost the same file sizes regardless of whether attributes or elements are used.
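One way to test this claim on your own data is to compress both forms and compare. The sketch below makes several assumptions: it uses Python's zlib module as the "simple compression technique", it invents 1000 synthetic name records (attribute_form, element_form, and sizes are hypothetical helpers), and the numbers it prints will vary with the data, so treat it as an experiment rather than a proof.

```python
import zlib

def attribute_form(i):
    """One record with everything stored in attributes."""
    return (f"<name nickname='Nick{i}' first='First{i}' "
            f"middle='Middle{i}' last='Last{i}'></name>")

def element_form(i):
    """The same record with the name parts stored as child elements."""
    return (f"<name nickname='Nick{i}'><first>First{i}</first>"
            f"<middle>Middle{i}</middle><last>Last{i}</last></name>")

def sizes(make_record, count=1000):
    """Return (raw size, compressed size) in bytes for a document of count records."""
    body = "".join(make_record(i) for i in range(count))
    raw = f"<names>{body}</names>".encode("utf-8")
    return len(raw), len(zlib.compress(raw))

attr_raw, attr_zip = sizes(attribute_form)
elem_raw, elem_zip = sizes(element_form)

print("attributes: raw", attr_raw, "bytes, compressed", attr_zip, "bytes")
print("elements:   raw", elem_raw, "bytes, compressed", elem_zip, "bytes")
```

The repeated tag and attribute names are exactly the kind of redundancy a compressor removes, which is why the gap between the two forms tends to narrow after compression.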
Besides, when you try to optimize XML this way, you lose many of the benefits XML offers, such as readability and descriptive tag names. And there are cases where using elements allows more flexibility and scope for extension. For example, if we decided that <first> needed additional metadata in the future, it would be much simpler to modify our code if we'd used elements rather than attributes.
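To make the extension point concrete: an element can later be given attributes (or children) of its own, whereas an attribute value is just a string with nowhere to hang extra information. In this sketch the preferred attribute is hypothetical, invented for the example, and Python's xml.etree.ElementTree is assumed as the parser.

```python
import xml.etree.ElementTree as ET

name = ET.fromstring("<name><first>John</first><last>Doe</last></name>")

# Later requirement: record extra information about the first name.
first = name.find("first")
first.set("preferred", "true")  # hypothetical metadata attribute

print(ET.tostring(name, encoding="unicode"))
# <name><first preferred="true">John</first><last>Doe</last></name>

# Code written before the change still reads the value the same way.
print(name.findtext("first"))   # John
```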
Why Use Attributes when Elements Look So Much Better? I Mean, Why Use Elements when Attributes Look So Much Better?
People have different opinions as to whether attributes or child elements "look better". In this case, it comes down to a matter of personal preference and style.
In fact, much of the attributes versus elements debate comes from personal preference. Many, but not all, of the arguments boil down to "I like one better than the other". But since XML has both elements and attributes, and neither one is going to go away, you're free to use both. Choose whichever works best for your application, whichever looks better to you, or whichever you're most comfortable with.