XSLT stands for Extensible Stylesheet Language Transformations. It can be viewed as a language for writing programs (called style sheets) that translate input XML documents into various types of output documents.
Assume retailer A wants to provide customers with a web page called products.html that describes the products they can buy. A is supplied by wholesaler B. Every day A downloads the inventory of B as an XML file called inventoryB.xml. Of course inventoryB.xml contains more information than A's customers need to see (the wholesale prices, for example). Also, inventoryB.xml is formatted as an XML file, not an HTML file.
To solve this problem, A uses an XSLT style sheet that transforms inventoryB.xml into the desired products.html.
Wholesaler B is supplied by wholesaler C. Wholesaler C represents his inventory as a file called inventoryC.xml. Unfortunately, although B and C distribute the same type of products, the proprietary formats of their XML inventory languages are slightly different. B writes an XSLT style sheet to extract product descriptions from inventoryC.xml, convert them to the product descriptions used by B, and add these descriptions to inventoryB.xml.
Professor Smith's resume is stored in a file called resume.xml. When an HTTP request is received for the resume, Smith's server extracts the locale and language of the request. If, for example, the request is from
Professor Smith's web site also gives users the option of downloading his resume in PDF format. When the user selects this option, the server uses a style sheet to translate resume.xml into resume.pdf.
At Gangsta High student grades are stored in an XML file called grades.xml. An XSLT style sheet scans through grades.xml. Each time it encounters a GPA below 2.0, it fires an event. Various event handlers are notified when such an event is fired. These handlers can do things like send email to the student's parents advising them of the situation.
An XSLT program is executed by an XSLT processor. The processor parses the input document into a tree, then feeds the tree, together with the style sheet program, to the XSLT engine. The engine executes the style sheet program, which produces a result tree. A serializer then writes the result tree to an output document.
An XSLT style sheet is an XML document, too. In other words, an XSLT program is a tree that transforms trees into other trees!
<?xml version = "1.0" ?>
<xsl:stylesheet
   xmlns:xsl = "http://www.w3.org/1999/XSL/Transform"
   version = "1.0">
   <!-- top level elements go here -->
</xsl:stylesheet>
Note 1: Notice the XML declaration. Style sheets are XML documents. We could write meta style sheets that transform other style sheets.
Note 2: The root element tag can also be:
<xsl:transform ... >
...
</xsl:transform>
Note 3: All XSLT tags are part of the xsl namespace, which is mapped to the URI:
http://www.w3.org/1999/XSL/Transform
The URI:
http://www.w3.org/TR/WD-xsl
is used for WD-xsl, the Microsoft XSLT dialect.
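Note 4: The document() function, which lets a style sheet read additional input documents, is already part of XSLT 1.0, so the version number does not need to change. Version 1.1 was a working draft, never finalized, that added an xsl:document element for writing multiple output documents.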
Note 5: We can add additional namespace declarations.
<xsl:import href = "tools.xsl"/>
<xsl:include href = "utils.xsl"/>
Using import and include elements, we can modularize our style sheets. In this case our style sheet is a collection of XSL modules linked by include and import elements. The principal module is the starting point. Import elements must come before all other top-level elements, including include elements.
The include element simply copies the top-level elements of utils.xsl into the including module. The import element does the same for tools.xsl, but assigns them a lower "import" precedence, so a template in the importing module wins over an imported template that matches the same node.
The processor transforms a source tree into a result tree. The result tree is then written to a file. This process is called serialization. The result tree can be serialized as HTML, XML, or as text. The default is XML, unless the root element of the result tree is html, in which case the default is HTML.
<xsl:output method = "html"/>
<xsl:output method = "xml"/>
<xsl:output method = "text"/>
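The effect of the output method can be seen by serializing the same result tree three ways. The following is a minimal sketch, assuming the XSLT 1.0 processor bundled with the JDK; the class name OutputMethodDemo and the in-memory style sheet are hypothetical and only for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class OutputMethodDemo {
    // Hypothetical style sheet: always builds the result tree <p>a<br/>b</p>,
    // then serializes it with the requested method ("xml", "html" or "text").
    static String sheet(String method) {
        return "<xsl:stylesheet version='1.0'"
             + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + "<xsl:output method='" + method + "'"
             + " omit-xml-declaration='yes' indent='no'/>"
             + "<xsl:template match='/'><p>a<br/>b</p></xsl:template>"
             + "</xsl:stylesheet>";
    }

    // Runs the style sheet against a dummy document and returns the serialized text.
    static String serialize(String method) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(sheet(method))));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader("<doc/>")), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String method : new String[] {"xml", "html", "text"}) {
            System.out.println(method + ": " + serialize(method));
        }
    }
}
```

The xml method keeps the empty element as a self-closing tag, the html method writes it the HTML way without the slash, and the text method drops all markup and keeps only the text nodes.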
<xsl:variable name = "pi" select = "3.14"/>
<xsl:variable name = "area" select = "$pi * 5 * 5"/>
<xsl:param name = "title" select = "'Sales Report'"/>
Note 1: We dereference a parameter or variable by placing a $ in front of it.
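Note 2: String literals must be quoted, even when they appear inside a quoted attribute value. Without the inner quotes, the select expression would be read as a path into the source tree rather than as a string.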
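Both notes can be checked with a small experiment. This is a sketch under stated assumptions: the class ParamDemo and its render method are hypothetical, the style sheet lives in a string, and the JDK's bundled XSLT 1.0 processor is used. It shows a global parameter with a quoted default, the same parameter overridden from Java, and what happens when the inner quotes are left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ParamDemo {
    // selectExpr becomes the select attribute of the title parameter;
    // override (may be null) is passed in from Java with setParameter.
    static String render(String selectExpr, String override) {
        String sheet =
              "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:param name='title' select=\"" + selectExpr + "\"/>"
            + "<xsl:template match='/'><xsl:value-of select='$title'/></xsl:template>"
            + "</xsl:stylesheet>";
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(sheet)));
            if (override != null) {
                t.setParameter("title", override);
            }
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader("<doc/>")), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Quoted literal: the default value is the string itself.
        System.out.println("[" + render("'Sales Report'", null) + "]");
        // Unquoted: Report is read as a path to child elements, none exist here.
        System.out.println("[" + render("Report", null) + "]");
        // A value supplied by the caller replaces the default.
        System.out.println("[" + render("'Sales Report'", "Q3 Summary") + "]");
    }
}
```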
A template is a rule that describes how to generate a piece of the result tree. A template can either be explicitly called by name, or automatically called by the processor when its pattern matches the current node of the source tree:
<xsl:template match = "PATTERN" name = "NAME">
   <!-- local params declared here -->
   <!-- template body goes here -->
</xsl:template>
Instantiating a template body creates one or more sub-trees of the result tree. A template body contains literal text and elements that are written directly into the result tree, as well as XSLT instructions that must be executed. Here are some popular instructions:
<xsl:apply-templates select = "PATH" />
<xsl:call-template name = "cube">
   <xsl:with-param name = "arg" select = "3"/>
</xsl:call-template>
<xsl:if test = "TEST"> ... </xsl:if>
<xsl:choose>
   <xsl:when test = "TEST1"> ... </xsl:when>
   <xsl:when test = "TEST2"> ... </xsl:when>
   <xsl:when test = "TEST3"> ... </xsl:when>
   <xsl:otherwise> ... </xsl:otherwise>
</xsl:choose>
<xsl:for-each select = "PATH"> ... </xsl:for-each>
<xsl:value-of select = "EXP"/>
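Several of these instructions can be combined in one small style sheet. The sketch below is illustrative only: the grades document, the student element, its name and gpa attributes, and the class InstructionDemo are all assumptions, and the named template cube is one possible body for the call shown above. It is run with the JDK's bundled XSLT 1.0 processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class InstructionDemo {
    // Hypothetical input document.
    static final String XML =
          "<grades>"
        + "<student name='Ann' gpa='3.5'/>"
        + "<student name='Bob' gpa='1.8'/>"
        + "</grades>";

    // for-each visits every student, choose picks a label from the GPA,
    // call-template invokes the named template cube with arg = 3.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        +   "<xsl:for-each select='grades/student'>"
        +     "<xsl:value-of select='@name'/>"
        +     "<xsl:text>:</xsl:text>"
        +     "<xsl:choose>"
        +       "<xsl:when test='@gpa &lt; 2.0'>warn</xsl:when>"
        +       "<xsl:otherwise>ok</xsl:otherwise>"
        +     "</xsl:choose>"
        +     "<xsl:text>;</xsl:text>"
        +   "</xsl:for-each>"
        +   "<xsl:text>cube=</xsl:text>"
        +   "<xsl:call-template name='cube'>"
        +     "<xsl:with-param name='arg' select='3'/>"
        +   "</xsl:call-template>"
        + "</xsl:template>"
        + "<xsl:template name='cube'>"
        +   "<xsl:param name='arg'/>"
        +   "<xsl:value-of select='$arg * $arg * $arg'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String report() {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(report()); // prints Ann:ok;Bob:warn;cube=27
    }
}
```

Note that the less-than sign in the test attribute has to be written as &lt; because the style sheet is itself an XML document.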
java org.apache.xalan.xslt.Process -in input.xml -xsl sheet.xsl -out output.html
<%@page import="javax.xml.transform.*,
                javax.xml.transform.stream.*,
                java.io.*" %>
<%
   // directory of this page, relative to the web application root
   String ls_path = request.getServletPath();
   ls_path = ls_path.substring(0, ls_path.indexOf("xslProcessor.jsp"));
   // the request parameters name the style sheet and the input document
   String styleSheet = request.getParameter("ss");
   String xmlFile = request.getParameter("doc");
   String ls_xml = application.getRealPath(ls_path + xmlFile);
   String ls_xsl = application.getRealPath(ls_path + styleSheet);
   String path = request.getParameter("path");
   StreamSource xml = new StreamSource(new File(ls_xml));
   StreamSource xsl = new StreamSource(new File(ls_xsl));
   // the result is written directly to the response
   StreamResult result = new StreamResult(out);
   TransformerFactory tFactory = TransformerFactory.newInstance();
   Transformer transformer = tFactory.newTransformer(xsl);
   // optional style sheet parameter
   if (path != null && !path.equals(""))
      transformer.setParameter("path", path);
   transformer.transform(xml, result);
%>
Here's an HTML form that provides users with a button. Clicking on the button will cause output.html to be downloaded to the user:
<form action = "xslProcessor.jsp">
   <input type = "hidden" name = "ss" value = "sheet.xsl" />
   <input type = "hidden" name = "doc" value = "input.xml" />
   <input type = "submit" value = "Transform" />
</form>
java com.icl.saxon.StyleSheet input.xml sheet.xsl > output.html
A transformer factory creates a transformer from a style sheet:
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(styleSheet);
A source is anything that can function as the source of a document or style sheet. This might be a file or the output of a parser (DOM or SAX). For example, assume our style sheet is contained in a file called test.xsl, and our document is contained in a file called books.xml, then:
Source styleSheet = new StreamSource(new File("test.xsl"));
Source document = new StreamSource(new File("books.xml"));
The output of a transformer will be a result object. We can use such an object to build an output document, a DOM tree, or a source of SAX events. Suppose our result will be an HTML tree which we would like to send to a web browser. Every Java Server Page (JSP) has an implicit out object that's connected to the requesting browser, so:
Result result = new StreamResult(out);
We are now ready to ask the transformer to transform:
transformer.transform(document, result);
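Put together, the steps look roughly like this. The sketch is self-contained, so it makes two substitutions that should be read as assumptions: the contents of test.xsl and books.xml are invented and held in strings instead of files, and the result is collected in a StringWriter instead of the JSP out object. The class name PipelineDemo is also hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PipelineDemo {
    // Stand-in for test.xsl: reports how many book elements the document has.
    static final String TEST_XSL =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        +   "<xsl:value-of select='count(books/book)'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Stand-in for books.xml.
    static final String BOOKS_XML =
        "<books><book>How to Draw</book><book>Guitar Made Easy</book></books>";

    static String transform() {
        try {
            // 1. A factory creates a transformer from a style sheet source.
            TransformerFactory factory = TransformerFactory.newInstance();
            Source styleSheet = new StreamSource(new StringReader(TEST_XSL));
            Transformer transformer = factory.newTransformer(styleSheet);

            // 2. The document to transform is another source.
            Source document = new StreamSource(new StringReader(BOOKS_XML));

            // 3. The result wraps whatever should receive the output.
            StringWriter out = new StringWriter();
            Result result = new StreamResult(out);

            // 4. Transform.
            transformer.transform(document, result);
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform()); // prints 2
    }
}
```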
But what does the transform method do?
class Transformer {
   void transform(Source src, Result res) {
      // 1. Find a template, t, that matches the root of src
      // 2. res = instantiate body of t
   }
   // etc.
}
If there are several templates that match the root, a conflict resolution policy is invoked to find the best fit.
If there are no candidates, a built-in template is used.
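Both rules can be observed directly. In the sketch below (class name MatchDemo and the tiny a/b/c document are assumptions for illustration, run with the JDK's bundled XSLT 1.0 processor), the first style sheet has no templates at all, so only built-in templates run. The second has two templates that both match the b element, and the more specific pattern wins.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class MatchDemo {
    static final String XML = "<a><b>x</b><c>y</c></a>";

    static final String HEAD =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>";

    // No templates: the built-in templates walk the tree and copy text nodes.
    static final String EMPTY_SHEET = HEAD + "</xsl:stylesheet>";

    // Two candidates for b: the pattern b is more specific than *, so it is chosen.
    static final String TWO_RULES = HEAD
        + "<xsl:template match='*'>"
        +   "<xsl:text>(</xsl:text><xsl:apply-templates/><xsl:text>)</xsl:text>"
        + "</xsl:template>"
        + "<xsl:template match='b'>[b]</xsl:template>"
        + "</xsl:stylesheet>";

    static String run(String xslt) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(EMPTY_SHEET)); // prints xy
        System.out.println(run(TWO_RULES));   // prints ([b](y))
    }
}
```

In the second run, a and c are handled by the * template, which wraps its children in parentheses, while b is handled by its own template and its text is never visited.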
The body of a template can be viewed as a tree consisting of two types of nodes, instructions and literals. (White space, comments, and processing instructions are ignored.)
This is almost the result tree; only the instruction nodes must be replaced by the sub-trees they produce when they are executed.
The body of the following template:
<xsl:template match="/">
   <html>
      <head><title> XSL Test </title></head>
      <body>
         <p> That's all folks! </p>
         <xsl:comment> for now </xsl:comment>
      </body>
   </html>
</xsl:template>
is an HTML tree that contains a single XSLT instruction:
<xsl:comment> for now </xsl:comment>
Executing this instruction produces a single HTML comment node:
<!-- for now -->
Here's the result of instantiating the template:
<html>
   <head><title> XSL Test </title></head>
   <body>
      <p> That's all folks! </p>
      <!-- for now -->
   </body>
</html>
A more interesting XSLT instruction is value-of:
<xsl:value-of select = "EXPRESSION" />
This instruction will be replaced by the value of EXPRESSION. For example, the instruction:
<xsl:value-of select="6 * 7"/>
will be replaced by the text node "42".
Of course expressions may contain variable references and calls to built-in functions. A variable declaration is an element of the form:
<xsl:variable name = "NAME" select = "EXPRESSION" />
This element may be placed inside a template (in which case the variable has template scope) or outside of all templates (in which case the variable has global scope). It's a little misleading to call this a variable declaration, because XSLT is a side-effect-free language, and therefore does not provide a way to update variables. Hence, what is being declared might be more properly called a constant.
A variable reference is the name of the variable prefaced by a "$" sign.
For example, the following declarations and template:
<xsl:variable name = "pi" select = "3.1416" />
<xsl:variable name = "rad" select = "10" />
<xsl:variable name = "rad2" select = "$rad * $rad" />
<xsl:template match="/">
   <html>
      <head><title> XSL Test </title> </head>
      <body>
         Approximate area =
         <xsl:value-of select = "round($rad2 * $pi)" />
      </body>
   </html>
</xsl:template>
produce the following HTML tree:
<html>
   <head><title> XSL Test </title></head>
   <body>
      Approximate area = 314
   </body>
</html>
Still more interesting is that EXPRESSION may refer to nodes in the source tree. In other words, the nodes produced by EXPRESSION may depend on and include source tree nodes. This is because the processor evaluates expressions relative to a context:
result = processor.eval(EXPRESSION, CONTEXT)
The context includes two variables:
currentNodeList = current set of source tree nodes being processed
currentNode = element of currentNodeList currently being processed
Initially, the current node list consists of a single node, the root of the source tree. This was determined by the match attribute of the template:
<xsl:template match="/">
Thus, template instructions can pull data from the source tree, manipulate it, then stick it in the result tree.
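The context can be made visible with the position() and last() functions, which report where the current node sits in the current node list. This is a sketch with an assumed three-entry phone book held in a string and a hypothetical class name ContextDemo; it uses the JDK's bundled XSLT 1.0 processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ContextDemo {
    static final String XML =
          "<phonebook>"
        + "<contact><name>Joe Smith</name></contact>"
        + "<contact><name>Bob Jones</name></contact>"
        + "<contact><name>Work</name></contact>"
        + "</phonebook>";

    // Inside for-each the current node list is the three contacts and the
    // current node is one contact, so the relative path name starts there.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        +   "<xsl:for-each select='phonebook/contact'>"
        +     "<xsl:value-of select='position()'/>"
        +     "<xsl:text>/</xsl:text>"
        +     "<xsl:value-of select='last()'/>"
        +     "<xsl:text>=</xsl:text>"
        +     "<xsl:value-of select='name'/>"
        +     "<xsl:text>;</xsl:text>"
        +   "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String run() {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 1/3=Joe Smith;2/3=Bob Jones;3/3=Work;
    }
}
```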
A phone book consists of one or more contacts. A contact consists of a name and phone number. Here's an example:
<phonebook>
   <contact>
      <name> Joe Smith </name>
      <phone areaCode = "510"> 553-2410 </phone>
   </contact>
   <contact>
      <name> Bob Jones </name>
      <phone areaCode = "415"> 222-3456 </phone>
   </contact>
   <contact>
      <name> Work </name>
      <phone areaCode = "415"> 345-2222 </phone>
   </contact>
</phonebook>
Our style sheet consists of a single template containing a single value-of instruction:
<xsl:template match="/">
   <html>
      <head><title> XSL Test </title> </head>
      <body>
         <xsl:value-of select="phonebook/contact/name"/>
      </body>
   </html>
</xsl:template>
To instantiate this template we must execute the value-of instruction. Its selected expression:
phonebook/contact/name
is construed as a relative path in the source tree. The value of this expression will be the set of all source tree nodes that can be reached by this path from the current (i.e., root) node of the source tree. In other words, its value is the set of names:
{ "Joe Smith", "Bob Jones", "Work" }
In this case, the XSLT processor replaces the value-of instruction in the template by the text of the first node of the set, in document order. The rest are ignored.
<html>
   <head> <title> XSL Test </title> </head>
   <body>
      Joe Smith
   </body>
</html>
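A quick way to explore how value-of treats different expressions is to plug them into a one-line style sheet. The helper below is a sketch only: the class SelectProbe and its select method are hypothetical names, the phone book is a copy of the one above without the padding spaces, and the JDK's bundled XSLT 1.0 processor is assumed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SelectProbe {
    static final String XML =
          "<phonebook>"
        + "<contact><name>Joe Smith</name><phone areaCode='510'>553-2410</phone></contact>"
        + "<contact><name>Bob Jones</name><phone areaCode='415'>222-3456</phone></contact>"
        + "<contact><name>Work</name><phone areaCode='415'>345-2222</phone></contact>"
        + "</phonebook>";

    // Evaluates <xsl:value-of select="expr"/> with the root as current node.
    // expr must not contain double quotes, since it is pasted into an attribute.
    static String select(String expr) {
        String sheet =
              "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/'>"
            +   "<xsl:value-of select=\"" + expr + "\"/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(sheet)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Three nodes are selected, but value-of keeps only the first.
        System.out.println(select("phonebook/contact/name"));
        // count() confirms that the path really selects three nodes.
        System.out.println(select("count(phonebook/contact/name)"));
        // A position predicate picks out a particular contact.
        System.out.println(select("phonebook/contact[2]/name"));
    }
}
```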
We could have selected the second (third) name by specifying position 2 (3) after the second step:
<xsl:value-of select="phonebook/contact[2]/name"/>
For example, instantiating the template:
<xsl:template match="/">
   <html>
      <head><title> XSL Test </title> </head>
      <body>
         <xsl:value-of select="phonebook/contact[1]/name"/> <br />
         <xsl:value-of select="phonebook/contact[2]/name"/> <br />
         <xsl:value-of select="phonebook/contact[3]/name"/> <br />
      </body>
   </html>
</xsl:template>
produces:
<html>
   <head><title> XSL Test </title></head>
   <body>
      Joe Smith <br>
      Bob Jones <br>
      Work <br>
   </body>
</html>
We can also access the attributes of source tree nodes by placing an attribute node (@ATTRIBUTE) as the last step. Instantiating the following template:
<xsl:template match="/">
   <html>
      <head><title> XSL Test </title> </head>
      <body>
         The phone number of
         <xsl:value-of select="phonebook/contact/name"/>
         is (
         <xsl:value-of select = "phonebook/contact/phone/@areaCode"/>
         ) <xsl:value-of select = "phonebook/contact/phone"/>
      </body>
   </html>
</xsl:template>
produces the tree:
<html>
   <head> <title> XSL Test </title> </head>
   <body>
      The phone number of Joe Smith is
      (510) 553-2410
   </body>
</html>
Of course a style sheet may contain many templates. The processor will select the "best" one and instantiate it. A template may contain apply-templates instructions:
<xsl:apply-templates select = "PATH" />
where PATH is a path-valued expression.
This instruction selects a set of nodes in the source tree, finds an appropriate template to match each one, then instantiates it. The instruction is replaced by the sequence of instantiated templates. The selected nodes are those reached by PATH from the current node; if the select attribute is omitted, all children of the current node are selected.
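The mechanism can be sketched on a cut-down inventory. Everything specific here is an assumption made for illustration: the three-item document, the text output format, and the class name ApplyDemo. The JDK's bundled XSLT 1.0 processor is used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ApplyDemo {
    static final String XML =
          "<inventory>"
        + "<book><title>How to Draw</title></book>"
        + "<cd><title>Trucker Tunes</title></cd>"
        + "<book><title>Guitar Made Easy</title></book>"
        + "</inventory>";

    // The inventory template decides the order of the sections; each
    // apply-templates hands the selected children to the matching template.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='inventory'>"
        +   "<xsl:text>Books:</xsl:text>"
        +   "<xsl:apply-templates select='book'/>"
        +   "<xsl:text>CDs:</xsl:text>"
        +   "<xsl:apply-templates select='cd'/>"
        + "</xsl:template>"
        + "<xsl:template match='book'>"
        +   "<xsl:value-of select='title'/><xsl:text>;</xsl:text>"
        + "</xsl:template>"
        + "<xsl:template match='cd'>"
        +   "<xsl:value-of select='title'/><xsl:text>;</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String run() {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints Books:How to Draw;Guitar Made Easy;CDs:Trucker Tunes;
        System.out.println(run());
    }
}
```

Although the cd sits between the two books in the source, the output groups the books first, because the order is driven by the apply-templates instructions rather than by document order.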
The following XML file contains a partial inventory of a book shop that sells books, CDs, and magazines. The structure of the file is non-uniform.
<?xml version = "1.0"?>
<inventory>
   <book>
      <title> How to Draw </title>
      <author> Thomas Kinkaid </author>
      <publisher> Prentice Hall </publisher>
   </book>
   <cd>
      <title> Guitar Made Easy </title>
      <artist> Eddy Van Halen </artist>
      <label> Colombia </label>
   </cd>
   <book>
      <title> Guitar Made Easy </title>
      <author> Eddy Van Halen </author>
      <publisher> Modern Library </publisher>
   </book>
   <magazine>
      <title> Time </title>
      <issue> May 2003 </issue>
   </magazine>
   <cd>
      <title> Trucker Tunes </title>
      <artist> Commander Cody </artist>
      <label> Vox </label>
   </cd>
</inventory>
Our style sheet contains six templates. None of them matches the root node, so the processor uses the built-in template, which searches for a template that matches the root's only child, the inventory node. The following template is selected and instantiated:
<xsl:template match="inventory">
   <xsl:message> Entering inventory template </xsl:message>
   <html><head><title> XSL Test </title> </head>
      <body>
         <h1> Book Shop Inventory </h1>
         <hr />
         <h2> Books </h2>
         <xsl:apply-templates select = "book"/>
         <hr />
         <h2> CDs </h2>
         <xsl:apply-templates select = "cd"/>
         <hr />
         <h2> Magazines </h2>
         <xsl:apply-templates select = "magazine"/>
         <hr />
      </body>
   </html>
   <xsl:message> Exiting inventory template </xsl:message>
</xsl:template>
This template contains three apply-templates instructions. The first instruction selects all children of the current inventory node that are books. The processor searches for a template that processes books:
<xsl:template match="book">
   <xsl:message> Entering book template </xsl:message>
   <xsl:number/>.
   Title: <xsl:value-of select = "title" /> <br />
   Author: <xsl:value-of select = "author" /> <br />
   Publisher: <xsl:value-of select = "publisher" /> <br /> <br />
   <xsl:message> Exiting book template </xsl:message>
</xsl:template>
This template contains two new instructions. The number instruction is replaced by the position of the current node among its siblings of the same name (1 for the first book, 2 for the second, and so on). The body of the message instruction is displayed by the processor, which is useful for debugging.
This template will be applied to every book child of the inventory. It extracts the title, author, and publisher of each one.
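The number instruction should not be confused with the position() function: the first counts siblings with the same name, the second counts places in the current node list. A sketch that prints both side by side, assuming a minimal inventory without titles, a hypothetical class name NumberDemo, and the JDK's bundled XSLT 1.0 processor:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class NumberDemo {
    // No white space between the children, so the inventory node has
    // exactly three child nodes.
    static final String XML = "<inventory><book/><cd/><book/></inventory>";

    // For each child: element name, # number among same-named siblings,
    // @ position in the list of all children being processed.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='book|cd'>"
        +   "<xsl:value-of select='name()'/>"
        +   "<xsl:text>#</xsl:text>"
        +   "<xsl:number/>"
        +   "<xsl:text>@</xsl:text>"
        +   "<xsl:value-of select='position()'/>"
        +   "<xsl:text>;</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String run() {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints book#1@1;cd#1@2;book#2@3;
    }
}
```

The second book is numbered 2 by the number instruction even though it is the third child, which is why the book and CD lists in the inventory example each start again at 1.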
The second apply-template instruction in the inventory template selects all CD nodes and searches for a CD processing template:
<xsl:template match="cd">
   <xsl:message> Entering cd template </xsl:message>
   <xsl:number/>.
   <xsl:apply-templates select = "title" />
   <xsl:apply-templates select = "artist" />
   <xsl:apply-templates select = "label" />
   <br />
   <xsl:message> Exiting cd template </xsl:message>
</xsl:template>
This template uses apply-templates to select the title, artist, and label children, and the processor searches for an appropriate template for each. Each of these templates calls apply-templates without a select attribute, which selects all children of the current node. Here the only children are text nodes, so the built-in template for text is used, which simply copies the text into the result tree.
<xsl:template match="artist">
   <xsl:message> Entering artist template </xsl:message>
   Artist: <xsl:apply-templates/>
   <br />
   <xsl:message> Exiting artist template </xsl:message>
</xsl:template>
<xsl:template match="label">
   <xsl:message> Entering label template </xsl:message>
   Label: <xsl:apply-templates/>
   <br />
   <xsl:message> Exiting label template </xsl:message>
</xsl:template>
<xsl:template match="title">
   <xsl:message> Entering title template </xsl:message>
   Title: <xsl:apply-templates/>
   <br />
   <xsl:message> Exiting title template </xsl:message>
</xsl:template>
Here are the messages displayed by the processor:
file:///C:/pearce/xslt/people/test4.xsl; Line 6; Column 17; Entering inventory template
file:///C:/pearce/xslt/people/test4.xsl; Line 27; Column 17; Entering book template
file:///C:/pearce/xslt/people/test4.xsl; Line 32; Column 20; Exiting book template
file:///C:/pearce/xslt/people/test4.xsl; Line 27; Column 17; Entering book template
file:///C:/pearce/xslt/people/test4.xsl; Line 32; Column 20; Exiting book template
file:///C:/pearce/xslt/people/test4.xsl; Line 36; Column 17; Entering cd template
file:///C:/pearce/xslt/people/test4.xsl; Line 58; Column 17; Entering title template
file:///C:/pearce/xslt/people/test4.xsl; Line 60; Column 17; Exiting title template
file:///C:/pearce/xslt/people/test4.xsl; Line 46; Column 17; Entering artist template
file:///C:/pearce/xslt/people/test4.xsl; Line 48; Column 17; Exiting artist template
file:///C:/pearce/xslt/people/test4.xsl; Line 52; Column 17; Entering label template
file:///C:/pearce/xslt/people/test4.xsl; Line 54; Column 17; Exiting label template
file:///C:/pearce/xslt/people/test4.xsl; Line 42; Column 17; Exiting cd template
file:///C:/pearce/xslt/people/test4.xsl; Line 36; Column 17; Entering cd template
file:///C:/pearce/xslt/people/test4.xsl; Line 58; Column 17; Entering title template
file:///C:/pearce/xslt/people/test4.xsl; Line 60; Column 17; Exiting title template
file:///C:/pearce/xslt/people/test4.xsl; Line 46; Column 17; Entering artist template
file:///C:/pearce/xslt/people/test4.xsl; Line 48; Column 17; Exiting artist template
file:///C:/pearce/xslt/people/test4.xsl; Line 52; Column 17; Entering label template
file:///C:/pearce/xslt/people/test4.xsl; Line 54; Column 17; Exiting label template
file:///C:/pearce/xslt/people/test4.xsl; Line 42; Column 17; Exiting cd template
file:///C:/pearce/xslt/people/test4.xsl; Line 58; Column 17; Entering title template
file:///C:/pearce/xslt/people/test4.xsl; Line 60; Column 17; Exiting title template
file:///C:/pearce/xslt/people/test4.xsl; Line 23; Column 17; Exiting inventory template
Here's the HTML output produced:
<html>
   <head><title> XSL Test </title></head>
   <body>
      <h1> Book Shop Inventory </h1> <hr>
      <h2> Books </h2>
      1. Title: How to Draw <br>
      Author: Thomas Kinkaid <br>
      Publisher: Prentice Hall <br>
      <br>
      2. Title: Guitar Made Easy <br>
      Author: Eddy Van Halen <br>
      Publisher: Modern Library <br>
      <br>
      <hr>
      <h2> CDs </h2>
      1. Title: Guitar Made Easy <br>
      Artist: Eddy Van Halen <br>
      Label: Colombia <br>
      <br>
      2. Title: Trucker Tunes <br>
      Artist: Commander Cody <br>
      Label: Vox <br>
      <br><hr>
      <h2> Magazines </h2>
      Title: Time <br>
      May 2003
      <hr>
   </body>
</html>