Before an XML document is actually transformed by a style sheet, it is first parsed into a tree-like structure. The actual transformation is performed on this tree.
The format of an XSL tree is specified partly by the XSLT standard and partly by the XPath standard, and it is similar to the tree structure defined by the DOM standard.
A node in an XSL tree has the form:
There are seven types of nodes:
Root
Element
Text
Attribute
Comment
Processing Instruction (PI)
Namespace
We can regard nodes as objects and types as classes. Here's a UML diagram showing the "is-a" and "has-a" relationships between these classes:
Every node has a name of type String:
The name of a root, comment, or text node is the empty string.
The name of an element node is its tag name, qualified by its namespace URI.
The name of an attribute node is the attribute name.
The name of a namespace node is the prefix that the declaration binds to the namespace URI.
The name of a PI is its target.
Every node has a value of type String:
The value of a comment is the comment without its delimiters.
The value of a text node is the text.
The value of a PI is the data part of the PI, if any.
The value of an attribute is its attribute value.
The value of a namespace node is its namespace URI.
The value of an element or root node is the concatenation, in document order, of the values of all of its text and element children. (We denote this value by *.)
For example, the following XML file contains a style sheet processing instruction followed by its root element:
<?xml version = "1.0"?>
<?xml-stylesheet type = "text/xsl" href = "outline.xsl"?>
<examples xmlns:java = "lang">
   <!-- control structures -->
   <java:example number = "1.0">
      <description>
         The while loop.
      </description>
      <![CDATA[
         while(CONDITION) STATEMENT
      ]]>
   </java:example>
</examples>
Here is the corresponding XSL tree:
Note 1: The namespace declaration of the root element (i.e., <examples>, not to be confused with the root node) is considered to be a namespace node, not an attribute node.
Note 2: The value of a PI of the form:
<?print Hello?>
would be Hello, while its name would be print. (The target must follow <? immediately, with no whitespace in between.)
Note 3: The <?xml ... ?> declaration is not in the tree. Although it looks like a processing instruction, it is not considered to be a PI node.
Note 4: Not all of the information that's in the original XML document is in the tree. For example, the fact that one of the text nodes came from a CDATA section does not show up in the tree. Only core information items are guaranteed to appear (as opposed to non-core and lexical items; see the XML Information Set).
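Notes 1 and 4 can be checked with a short script. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree. That library builds its own element tree rather than an XSL tree, so it only approximates the model described here: by default it discards comments and processing instructions, and it writes a qualified element name as {URI}local. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="outline.xsl"?>
<examples xmlns:java="lang">
   <!-- control structures -->
   <java:example number="1.0">
      <description>
         The while loop.
      </description>
      <![CDATA[
         while(CONDITION) STATEMENT
      ]]>
   </java:example>
</examples>
"""

root = ET.fromstring(DOC)   # the root *element*, <examples>
example = root[0]           # <java:example>; the default parser drops the comment

# Note 1: the xmlns:java declaration is not reported as an attribute.
print(root.tag, root.attrib)

# The element name is qualified by the namespace URI, not by the prefix.
print(example.tag, example.attrib)

# Note 4: the CDATA section shows up as ordinary text.
value = "".join(example.itertext())
print(" ".join(value.split()))
```

The last line normalizes whitespace only for display; the value itself keeps the line breaks and indentation of the source file.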
Recall that an XSLT processor evaluates XPath expressions relative to the context supplied by the source tree.
value = processor.eval(exp, context);
There are three types of expressions:
<Expression> ::= <Operation> | <Location> | <Primary>
Operations involve standard infix binary and prefix unary operators (+, and, =, -, etc.)
Primary expressions are literals (numbers, strings), variable references, function calls, etc.
The value of an XPath expression might be a number, Boolean, string, or node set. For example, assume pi is a variable defined to be 3.1416. Then the values of the following expressions:
$pi * 5
$pi < 5 and true()
$pi != 5 or not(nuts) and ($pi + 3) = -2
are 15.708, true, and true, respectively.
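Note 1: nuts is not a string literal (that would be written 'nuts'). It is a relative location path that selects the child elements named nuts. If it selects nothing, the empty node set is equated with false.
Note 2: For the same reason, true must be written as a function call, true(). A bare true would be read as a location path, not as a Boolean.
Note 3: Inside a style sheet attribute we can't write "$pi < 5" literally, because < must be escaped in XML attribute values. We write "$pi &lt; 5" instead.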
Note 4: Expressions may contain calls to standard functions.
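The conversion rules behind these notes can be modeled outside XPath. The following Python sketch is not an XPath evaluator; it is a hand translation of the three expressions. The helper xpath_boolean is hypothetical and mimics XPath's boolean() conversion, and the location path nuts is assumed to select nothing, which is modeled here as an empty list.

```python
import math

def xpath_boolean(value):
    """Mimic XPath 1.0 boolean() (hypothetical helper, for illustration only)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0 and not math.isnan(value)   # 0 and NaN are false
    if isinstance(value, str):
        return len(value) > 0                          # only "" is false
    return len(value) > 0                              # node set: false when empty

pi = 3.1416
nuts = []   # assumption: no child element named "nuts" exists

e1 = pi * 5
e2 = pi < 5 and True
# XPath, like Python, binds "and" more tightly than "or".
e3 = pi != 5 or ((not xpath_boolean(nuts)) and (pi + 3) == -2)
print(round(e1, 4), e2, e3)

# The string literal 'nuts' and the (empty) node set nuts convert differently.
print(xpath_boolean("nuts"), xpath_boolean(nuts))
```

The last line shows why the distinction matters: a non-empty string converts to true, while an empty node set converts to false.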
The value of a location is a source tree node set.
A location is a sequence of steps:
/STEP/STEP/STEP/etc
A location describes a set of nodes in the source tree. The simplest type of location describes a path from the root node.
For example, assume the source document has the form:
<A>
   <B> <C prop = "p1"> c1 </C> <C prop = "p2"> c2 </C> </B>
   <B> <D> d1 </D> <C prop = "p3"> c3 </C> </B>
</A>
Here's a template we can use to evaluate XPath expressions:
<xsl:template match = "/">
   <xsl:variable name = "path" select = "/" />
   <html>
      <head> <title> XSL Tests </title> </head>
      <body>
         <xsl:for-each select = "$path">
            Name = <xsl:value-of select = "name(.)" /> <br />
            Value = <xsl:value-of select = "." /> <br /> <br />
         </xsl:for-each>
      </body>
   </html>
</xsl:template>
Here is the output produced when path = "/"
Name =
Value = c1 c2 d1 c3
The root node has no name, and its value is the concatenation of the values of all of its children.
Here is the output produced when path = "/A"
Name = A
Value = c1 c2 d1 c3
Here is the output produced when path = "/A/B"
Name = B
Value = c1 c2
Name = B
Value = d1 c3
Here's the output when path = "/A/B/C"
Name = C
Value = c1
Name = C
Value = c2
Name = C
Value = c3
We can also include attributes in our path. For example, here's the output when path = "/A/B/C/@prop"
Name = prop
Value = p1
Name = prop
Value = p2
Name = prop
Value = p3
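The same paths can be tried without an XSLT processor. Here is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree, which supports only a small subset of XPath. Its paths are relative to an element, so /A/B/C is written as B/C starting from the A element, and it has no attribute nodes, so the @prop step becomes a lookup on each C. The helper value is illustrative and normalizes whitespace the way a browser would when displaying the output.

```python
import xml.etree.ElementTree as ET

DOC = """<A>
   <B> <C prop="p1"> c1 </C> <C prop="p2"> c2 </C> </B>
   <B> <D> d1 </D> <C prop="p3"> c3 </C> </B>
</A>"""

def value(node):
    """All descendant text of an element, whitespace-normalized for display."""
    return " ".join("".join(node.itertext()).split())

a = ET.fromstring(DOC)   # the element selected by /A

paths = [("/A", [a]), ("/A/B", a.findall("B")), ("/A/B/C", a.findall("B/C"))]
for label, nodes in paths:
    print("path =", label)
    for node in nodes:
        print("   Name =", node.tag, "  Value =", value(node))

# /A/B/C/@prop: read the attribute from each selected C element.
props = [c.get("prop") for c in a.findall("B/C")]
print("path = /A/B/C/@prop", props)
```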
Predicates can be used to filter out unwanted nodes. Only nodes that pass the test specified by the predicate will be included.
Here's the output when path = "/A/B[2]"
Name = B
Value = d1 c3
When path = "/A/B/C[@prop = 'p3']"
Name = C
Value = c3
Predicates may contain function calls. Here's the output when path = "/A/B/C[contains(., 'c1')]"
Name = C
Value = c1
Note that the call to contains is needed because the value of the node contains whitespace characters: it is " c1 ", not "c1", so the test . = 'c1' would select nothing.
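The three predicates can be sketched in Python as well, again assuming the standard library's xml.etree.ElementTree. Its path language accepts positional and attribute predicates but not function calls such as contains, so that test is written as an ordinary Python filter. The helper text is illustrative.

```python
import xml.etree.ElementTree as ET

DOC = """<A>
   <B> <C prop="p1"> c1 </C> <C prop="p2"> c2 </C> </B>
   <B> <D> d1 </D> <C prop="p3"> c3 </C> </B>
</A>"""

def text(node):
    """Raw string value of an element: all descendant text, whitespace kept."""
    return "".join(node.itertext())

a = ET.fromstring(DOC)

second_b = a.findall("B[2]")              # /A/B[2]
p3 = a.findall("B/C[@prop='p3']")         # /A/B/C[@prop = 'p3']

# /A/B/C[contains(., 'c1')]: filter in Python instead of in the path.
with_c1 = [c for c in a.findall("B/C") if "c1" in text(c)]

# An exact comparison selects nothing: the value is " c1 ", not "c1".
exact = [c for c in a.findall("B/C") if text(c) == "c1"]

print(len(second_b), [c.get("prop") for c in p3])
print([c.get("prop") for c in with_c1], exact)
```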
If we modify our template so that it matches B nodes, then we can use relative path expressions. These are paths that don't begin with a slash.
<xsl:template match = "B">
   <xsl:variable name = "path" select = "C" />
   <html>
      <head> <title> XSL Tests </title> </head>
      <body>
         <xsl:for-each select = "$path">
            Name = <xsl:value-of select = "name(.)" /> <br />
            Value = <xsl:value-of select = "." /> <br /> <br />
         </xsl:for-each>
      </body>
   </html>
</xsl:template>
Here's the output produced when path = "C"
Name = C
Value = c1
Name = C
Value = c2
Name = C
Value = c3
We can use . and .. in path expressions. Here's the output when path = ".."
Name = A
Value = c1 c2 d1 c3
Name = A
Value = c1 c2 d1 c3
A got printed twice because the pattern "B" matched two nodes, and ".." was evaluated once for each of them.
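The role of the current node can be sketched in Python too. This is a rough model, assuming the standard library's xml.etree.ElementTree: each B element plays the part of the current node, the relative path C becomes a findall call on that element, and ".." is looked up in a child-to-parent map, because ElementTree elements do not record their parents. The names parent, selected_c and selected_up are illustrative.

```python
import xml.etree.ElementTree as ET

DOC = """<A>
   <B> <C prop="p1"> c1 </C> <C prop="p2"> c2 </C> </B>
   <B> <D> d1 </D> <C prop="p3"> c3 </C> </B>
</A>"""

a = ET.fromstring(DOC)

# Map every element to its parent so that ".." can be answered.
parent = {child: p for p in a.iter() for child in p}

selected_c = []    # results of the relative path "C"
selected_up = []   # results of the relative path ".."
for b in a.findall("B"):          # the template is applied once per B
    selected_c.extend(c.text.strip() for c in b.findall("C"))
    selected_up.append(parent[b].tag)

print(selected_c)
print(selected_up)
```

Because the loop body runs once for each B, the parent A appears twice in selected_up, just as it does in the style sheet output.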
The format of a step is:
AXIS::TEST[PRED]
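The paths used so far are abbreviations of this form: a bare name uses the child axis, @ stands for the attribute axis, . for self::node() and .. for parent::node(). As a rough illustration, the following Python sketch rewrites a single abbreviated step in the full form. The function expand_step and its regular expression are hypothetical, handle only simple steps, and leave the predicate text unchanged.

```python
import re

# Optional "axis::" prefix, then the node test, then an optional [predicate].
STEP = re.compile(r"^(?:(?P<axis>[a-z-]+)::)?(?P<test>[^\[\]]+)(?P<pred>\[.*\])?$")

def expand_step(step):
    """Rewrite one abbreviated XPath step as AXIS::TEST[PRED] (simple steps only)."""
    if step == ".":
        return "self::node()"
    if step == "..":
        return "parent::node()"
    match = STEP.match(step)
    if match is None:
        raise ValueError("unsupported step: " + step)
    axis = match.group("axis")
    test = match.group("test")
    pred = match.group("pred") or ""
    if axis is None:
        if test.startswith("@"):
            axis, test = "attribute", test[1:]   # @prop -> attribute::prop
        else:
            axis = "child"                       # the default axis
    return axis + "::" + test + pred

for step in ["C", "@prop", "B[2]", "..", "ancestor::*"]:
    print(step, "->", expand_step(step))
```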
For the remainder of the section we will use the following XML file:
<?xml version = "1.0"?>
<A>
   <B> <B1> B1 </B1> <B2> B2 </B2> </B>
   <X prop1 = "p" prop2 = "q">
      <C prop1 = "p1"> <C1> C1 </C1> <C2> C2 </C2> </C>
      <D prop2 = "p2"> <D1> D1 </D1> <D2> D2 </D2> </D>
   </X>
   <E> <E1> E1 </E1> <E2> E2 </E2> </E>
</A>
Here is a sketch of the tree, showing the current node:
Our style sheet makes the shaded node, X, our current node, then examines various axes:
<xsl:template match = "/">
   <html>
      <head> <title> XSL Tests </title> </head>
      <body>
         <xsl:apply-templates select = "A/X" />
      </body>
   </html>
</xsl:template>
<xsl:template match = "X">
   <xsl:for-each select = "self::*">
      Name = <xsl:value-of select = "name(.)" /> <br />
      Value = <xsl:value-of select = "." /> <br /> <br />
   </xsl:for-each>
</xsl:template>
Here are the results for each axis, obtained by substituting the axis for self in the for-each select above:
child
Name = C
Value = C1 C2
Name = D
Value = D1 D2
self (.)
Name = X
Value = C1 C2 D1 D2
parent (..)
Name = A
Value = B1 B2 C1 C2 D1 D2 E1 E2
ancestor
Name = A
Value = B1 B2 C1 C2 D1 D2 E1 E2
ancestor-or-self
Name = A
Value = B1 B2 C1 C2 D1 D2 E1 E2
Name = X
Value = C1 C2 D1 D2
descendant-or-self (//)
Name = X
Value = C1 C2 D1 D2
Name = C
Value = C1 C2
Name = C1
Value = C1
Name = C2
Value = C2
Name = D
Value = D1 D2
Name = D1
Value = D1
Name = D2
Value = D2
descendant
Name = C
Value = C1 C2
Name = C1
Value = C1
Name = C2
Value = C2
Name = D
Value = D1 D2
Name = D1
Value = D1
Name = D2
Value = D2
attribute (@*)
Name = prop1
Value = p
Name = prop2
Value = q
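The axes can also be modeled directly on a parsed tree. The following Python sketch assumes the standard library's xml.etree.ElementTree, which has no axis support of its own, so each axis is computed by hand from the element X and a child-to-parent map. The helper ancestors and the dictionary axes are illustrative, and only element nodes are considered, which corresponds to the node test *.

```python
import xml.etree.ElementTree as ET

DOC = """<A>
   <B> <B1> B1 </B1> <B2> B2 </B2> </B>
   <X prop1="p" prop2="q">
      <C prop1="p1"> <C1> C1 </C1> <C2> C2 </C2> </C>
      <D prop2="p2"> <D1> D1 </D1> <D2> D2 </D2> </D>
   </X>
   <E> <E1> E1 </E1> <E2> E2 </E2> </E>
</A>"""

a = ET.fromstring(DOC)
parent = {child: p for p in a.iter() for child in p}
x = a.find("X")   # the current node

def ancestors(node):
    """Element ancestors of node, outermost first (document order)."""
    chain = []
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain[::-1]

axes = {
    "child": list(x),
    "self": [x],
    "parent": [parent[x]],
    "ancestor": ancestors(x),
    "ancestor-or-self": ancestors(x) + [x],
    "descendant-or-self": list(x.iter()),   # iter() starts with x itself
    "descendant": list(x.iter())[1:],
}

for axis, nodes in axes.items():
    print(axis, [n.tag for n in nodes])
print("attribute", dict(x.attrib))
```

The element names printed for each axis can be compared with the Name lines in the results above.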