Designing Distributed Applications with XML, ASP, IE5, LDAP and MSMQ
XML Schema
This technology preview implements the core feature set of
the XML DCD submission while using the element names of XML Data. The two
notes are similar in structure but differ in some keywords and other
details. Note that the inheritance and subclassing features described in an
appendix of the DCD note are not included in the preview implementation.
Generally speaking, XML Schema drops the relations, aliases, inheritance, and
complex types defined in XML Data.
At the time of writing, we are referencing the DCD note of
31 July 1998 and the XML Data Note of 5 January 1998.
The preview supports the following element types:
| Element | Description |
| Schema | Root element of a schema definition |
| ElementType | Defines a class of elements |
| AttributeType | Defines a class of attributes |
| datatype | Specifies the type of an ElementType or AttributeType element |
| element | Names a declared element class whose instances may appear in instances of the element class defined by an ElementType element |
| attribute | Names a declared attribute class, defined by an AttributeType declaration, whose instances may appear on instances of the enclosing element class |
| description | Provides textual documentation for ElementType and AttributeType elements |
| group | Defines a collection class |
We will now explore XML Schema by building a simple schema
definition for documents describing employees. The schema is deliberately
simple and is equivalent to the following DTD:
<!ELEMENT EMPLOYEE (NAME, HIREDATE, MANAGER, DEPARTMENT)>
<!ATTLIST EMPLOYEE employment-category (full | part | contract) #REQUIRED>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT HIREDATE (#PCDATA)>
<!ELEMENT MANAGER (#PCDATA)>
<!ELEMENT DEPARTMENT (#PCDATA)>
Here's a sample document in this vocabulary:
<?xml version="1.0"?>
<EMPLOYEE xmlns="http://myserver/HR/employeeschema.xml" employment-category="full">
  <NAME>Phineas Armbuster</NAME>
  <HIREDATE>1990-01-02</HIREDATE>
  <MANAGER>B. G. Trouble</MANAGER>
  <DEPARTMENT>Finance_Collections</DEPARTMENT>
</EMPLOYEE>
Defining a Schema
The Schema element is the
root of any schema definition. It is used to declare the name of the schema and
any namespaces required in the schema. It will typically declare the namespaces
for schemas and datatypes. Thus, for our example, we have:
<Schema name="EMPLOYEE" xmlns="urn:schemas-microsoft-com:xml-data"
xmlns:dt="urn:schemas-microsoft-com:datatypes">
Elements
Element types are declared using the ElementType
element. This element has one required attribute and four optional attributes:
| Attribute | Description |
| name | required; name of the element |
| content | optional; an enumeration of empty, textOnly (i.e., PCDATA), eltOnly (i.e., elements), and mixed (i.e., elements and PCDATA) |
| dt:type | optional; data type of the element |
| model | optional; an enumeration of open (can contain content not defined in the schema) and closed (can only include content defined in the schema) |
| order | optional; an enumeration of one (i.e., only one of a set of options, analogous to the | (OR) operator in DTDs), seq (i.e., the specified elements must appear in the specified order), and many (i.e., permits any of the named elements to appear zero or more times in any order) |
Take the definition of the NAME
element:
<ElementType name="NAME"
content="textOnly"/>
We've defined an element NAME
that contains text, i.e., PCDATA. Let's look at a
more challenging example, the EMPLOYEE
element.
<ElementType name="EMPLOYEE" content="eltOnly" model="closed" order="seq">
  <attribute type="employment-category"/>
  <element type="NAME"/>
  <element type="HIREDATE"/>
  <element type="MANAGER"/>
  <element type="DEPARTMENT"/>
</ElementType>
We've said EMPLOYEE
can only contain other elements (content="eltOnly")
and only those elements we've specified in our schema (model="closed").
Moreover, the elements must all appear in the order listed (order="seq"). From there, we go on
to provide a list of the attributes (in this case, employment-category)
and elements (NAME, HIREDATE, MANAGER,
and DEPARTMENT) that are contained in an EMPLOYEE element.
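To see how another order value reads in practice, here is a minimal sketch of an element type that permits exactly one of two children. CONTACT, PHONE, and EMAIL are hypothetical names that are not part of the employee vocabulary, and the fragment assumes the preview treats order="one" as described in the table above.

```xml
<Schema name="CONTACT-SKETCH"
        xmlns="urn:schemas-microsoft-com:xml-data">
  <!-- Hypothetical leaf elements, declared as text just like NAME -->
  <ElementType name="PHONE" content="textOnly"/>
  <ElementType name="EMAIL" content="textOnly"/>

  <!-- order="one": a CONTACT holds either a PHONE or an EMAIL, not both.
       The DTD analogue is <!ELEMENT CONTACT (PHONE | EMAIL)> -->
  <ElementType name="CONTACT" content="eltOnly" model="closed" order="one">
    <element type="PHONE"/>
    <element type="EMAIL"/>
  </ElementType>
</Schema>
```

Under this declaration a CONTACT with a single PHONE child would be acceptable, while a CONTACT carrying both children would not.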
Attributes
The EMPLOYEE definition
included an attribute element. Clearly, we have to be able to declare
attributes in a manner similar to the way we define elements. Not surprisingly,
the AttributeType element exists to do just that.
This element has the following attributes:
| Attribute | Description |
| name | required; name of the attribute being defined |
| dt:type | optional; data type of the attribute |
| dt:values | optional; when dt:type has the value enumeration, this attribute provides the permissible values |
| default | optional; default value for the attribute. If dt:type appears, the value of this attribute must be legal for that type. |
| required | optional; either of the enumerated values yes or no. Denotes whether the attribute is required to appear on an element. |
The definition for the employment-category
attribute looks like this:
<AttributeType name="employment-category"
required="yes" dt:type="enumeration"
dt:values="full part contract"/>
We can see that this attribute must appear on any
element that uses it, such as EMPLOYEE. It is an
enumerated type with the permissible values full, part,
and contract.
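The default attribute from the table can be sketched in the same way. The variant below is an assumption made for illustration and is not part of the employee schema: it makes the attribute optional and supplies full as the value to assume when the attribute is omitted. The schema name DEFAULT-SKETCH is likewise hypothetical.

```xml
<Schema name="DEFAULT-SKETCH"
        xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <!-- Illustrative variant: an optional attribute with a default value.
       The default must be one of the values listed in dt:values. -->
  <AttributeType name="employment-category"
                 required="no"
                 dt:type="enumeration"
                 dt:values="full part contract"
                 default="full"/>
</Schema>
```

With a declaration like this, an EMPLOYEE element written without the attribute would be treated as a full-time employee, provided the preview applies defaults as the table describes.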
The Complete EMPLOYEE
At this point, it is worthwhile to present the entire schema
for our simple document type. We've managed to do a few things here that we
couldn't do in the DTD version of this document model. We pulled in foreign
namespaces to allow us to talk about schemas and data types. We've strongly
typed our elements and attributes, making life somewhat simpler for our
application programmers. The content and model information was elevated to an
explicit statement through the use of attributes. In the DTD, you had to parse
the line:
<!ELEMENT EMPLOYEE (NAME, HIREDATE, MANAGER, DEPARTMENT)>
to realize the EMPLOYEE
element has sequential order and can contain only elements. Here, we've said so
explicitly. Moreover, the model attribute lets us open up our model if our
applications require us to do so, whereas DTDs are closed by definition.
<Schema name="EMPLOYEE"
        xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <ElementType name="NAME" content="textOnly"/>
  <ElementType name="HIREDATE" dt:type="date" content="textOnly"/>
  <ElementType name="MANAGER" content="textOnly"/>
  <ElementType name="DEPARTMENT" content="textOnly"/>
  <AttributeType name="employment-category" required="yes" dt:type="enumeration"
                 dt:values="full part contract"/>
  <ElementType name="EMPLOYEE" content="eltOnly" model="closed" order="seq">
    <description>Simple element for describing employees</description>
    <attribute type="employment-category"/>
    <element type="NAME"/>
    <element type="HIREDATE"/>
    <element type="MANAGER"/>
    <element type="DEPARTMENT"/>
  </ElementType>
</Schema>
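A counterexample can make these constraints concrete. The document below is a sketch of an instance that should be rejected under the schema above, assuming the preview enforces required, order, and model as described. The SALARY element is hypothetical and appears only to show undeclared content.

```xml
<?xml version="1.0"?>
<!-- Not valid against the EMPLOYEE schema, for three reasons noted inline -->
<!-- 1. The required employment-category attribute is missing -->
<EMPLOYEE xmlns="http://myserver/HR/employeeschema.xml">
  <NAME>Phineas Armbuster</NAME>
  <!-- 2. MANAGER appears before HIREDATE, which violates order="seq" -->
  <MANAGER>B. G. Trouble</MANAGER>
  <HIREDATE>1990-01-02</HIREDATE>
  <DEPARTMENT>Finance_Collections</DEPARTMENT>
  <!-- 3. SALARY is not declared, which violates model="closed" -->
  <SALARY>50000</SALARY>
</EMPLOYEE>
```

Only the third problem would go away if the schema declared model="open"; the missing attribute and the ordering error would remain.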
Other XML Schema Elements
Our little example omitted two schema elements supported in
the technology preview: datatype and group. The datatype
element is an extension of the dt:type and
dt:values attributes for ElementType and AttributeType
elements. The datatype element allows
us to specify not only the type of an element or attribute, but also minimum
and maximum values. It has the following attributes:
| Attribute | Description |
| dt:type | optional; specifies the type of the element or attribute |
| dt:values | optional; when dt:type has the value enumeration, this attribute allows us to specify the permissible values |
| dt:max | optional; maximum value inclusive of the given value |
| dt:maxExclusive | optional; maximum value exclusive of the given value |
| dt:min | optional; minimum value inclusive of the given value |
| dt:minExclusive | optional; minimum value exclusive of the given value |
| dt:maxlength | optional; allows us to limit the length of certain data types |
Let's apply this to HIREDATE.
Suppose our company came into existence on July 20, 1969 and will be disbanded
when the founder retires on December 31, 1999 — no Y2K worries for us then! The
definition becomes:
<ElementType name="HIREDATE" content="textOnly">
  <datatype dt:type="date" dt:min="1969-07-20" dt:max="1999-12-31"/>
</ElementType>
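Two further uses of the datatype element follow from the table and are sketched here. The first moves the enumeration of employment-category into a nested datatype element, and the second swaps the inclusive minimum for an exclusive one. Both assume the preview accepts these forms as the table describes, and the schema name DATATYPE-SKETCH is hypothetical.

```xml
<Schema name="DATATYPE-SKETCH"
        xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <!-- Enumeration expressed through a nested datatype element
       instead of dt:type and dt:values on AttributeType itself -->
  <AttributeType name="employment-category" required="yes">
    <datatype dt:type="enumeration" dt:values="full part contract"/>
  </AttributeType>

  <!-- Exclusive lower bound: hire dates must fall after 1969-07-20,
       so the founding day itself is excluded; the upper bound stays inclusive -->
  <ElementType name="HIREDATE" content="textOnly">
    <datatype dt:type="date" dt:minExclusive="1969-07-20" dt:max="1999-12-31"/>
  </ElementType>
</Schema>
```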
The group element organizes
content into a sequence. It specifies which elements appear, how often, and in
what sequence. The permissible attributes are:
| Attribute | Description |
| maxoccurs | optional; one of the enumerated values 1 or * (at most one, or many, occurrences) |
| minoccurs | optional; one of the enumerated values 0 or 1 (a minimum of zero or one) |
| order | required; one of the enumerated values one, seq, many |
The attributes minoccurs and maxoccurs specify the minimum and maximum
number of times the group can occur. The order attribute specifies the sequence and content of the group.
The value one means exactly one of the elements of the
group may occur. This is like the | (OR) operator in DTDs. The value seq
means the elements in the group all appear, and appear in the
specified order. The value many means
that any of the elements may appear (or not) and in any order.
If we wanted to modify our EMPLOYEE
element so that an employee could belong to multiple departments with a
supervisor in each, for example because the employee belongs to several teams, we would say:
<ElementType name="EMPLOYEE" content="eltOnly" model="closed" order="seq">
  <description>Simple element for describing employees</description>
  <attribute type="employment-category"/>
  <element type="NAME"/>
  <element type="HIREDATE"/>
  <group order="seq" minoccurs="1" maxoccurs="*">
    <element type="MANAGER"/>
    <element type="DEPARTMENT"/>
  </group>
</ElementType>
This would be a valid document under the revised schema:
<?xml version="1.0"?>
<EMPLOYEE xmlns="http://myserver/HR/employeeschema.xml" employment-category="full">
  <NAME>Phineas Armbuster</NAME>
  <HIREDATE>1990-01-02</HIREDATE>
  <MANAGER>B. G. Trouble</MANAGER>
  <DEPARTMENT>Finance_Collections</DEPARTMENT>
  <MANAGER>G. Marconi</MANAGER>
  <DEPARTMENT>Engineering</DEPARTMENT>
</EMPLOYEE>
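The other combinations of order, minoccurs, and maxoccurs follow the same pattern. As a minimal sketch, the fragment below declares an optional choice between two children. WORKER, OFFICE, and REMOTE-SITE are hypothetical names introduced only for this illustration, and the comment maps each occurrence pair to its DTD operator.

```xml
<Schema name="GROUP-SKETCH"
        xmlns="urn:schemas-microsoft-com:xml-data">
  <ElementType name="NAME" content="textOnly"/>
  <!-- OFFICE and REMOTE-SITE are hypothetical elements -->
  <ElementType name="OFFICE" content="textOnly"/>
  <ElementType name="REMOTE-SITE" content="textOnly"/>

  <!-- minoccurs and maxoccurs pairs with their DTD analogues:
       1 and 1 : no operator      0 and 1 : ?
       1 and * : +                0 and * : *            -->
  <ElementType name="WORKER" content="eltOnly" model="closed" order="seq">
    <element type="NAME"/>
    <!-- Optional choice, like (OFFICE | REMOTE-SITE)? in a DTD -->
    <group order="one" minoccurs="0" maxoccurs="1">
      <element type="OFFICE"/>
      <element type="REMOTE-SITE"/>
    </group>
  </ElementType>
</Schema>
```

By the same mapping, the revised EMPLOYEE declaration above corresponds to the DTD content model (NAME, HIREDATE, (MANAGER, DEPARTMENT)+).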
 
©1998 Wrox Press Limited, US and UK.