This page describes how to use Saxon XSLT Stylesheets, either from the command line, or from the Java API, or as an Applet in the browser.
The Java class net.sf.saxon.Transform has a main program that may be used to apply a given style sheet to a given source XML document. The form of command is:
java net.sf.saxon.Transform [options] source-document stylesheet [ params...]
The options must come first, then the two file names, then the params. The stylesheet is omitted if the -a option is present.
If you are not using any Java extension functions, you can use the simpler form of command:
java -jar dir/saxon7.jar [options] source-document stylesheet [ params ]
Note, however, that this does not work if you need to load user-written extension functions from the classpath.
The options are as follows (in any order):
-a | Use the xml-stylesheet processing instruction in the source document to identify the stylesheet to be used. The stylesheet argument should be omitted. |
-c | Indicates that the stylesheet argument identifies a compiled stylesheet rather than an XML source stylesheet. The stylesheet must have been previously compiled as described in Compiling a Stylesheet. |
-ds | -dt | Selects the implementation of the internal tree model. -dt selects the "tinytree" model (the default). -ds selects the traditional tree model. See Choosing a tree model below. |
-l | Switches line numbering on for the source document. Line numbers are accessible through the extension function saxon:line-number(), or from a trace listener. |
-m classname | Use the specified Emitter to process the output from xsl:message. The class must implement the net.sf.saxon.output.Emitter class. This interface is similar to a SAX ContentHandler: it takes a stream of events to generate output. In general the content of a message is an XML fragment. By default the standard XML emitter is used, configured to write to the standard error stream, and to include no XML declaration. Each message is output as a new document. |
-noext | Suppress calls on extension functions, other than system-supplied Saxon and EXSLT extension functions. This option is useful when loading an untrusted stylesheet, perhaps from a remote site using an http:// URL; it ensures that the stylesheet cannot call Java methods and thereby gain privileged access to resources on your machine. |
-o filename | Send output to named file. In the absence of this option, the results go to standard output. If the source argument identifies a directory, this option is mandatory and must also identify a directory; on completion it will contain one output file for each file in the source directory. You will also need to use this option (rather than sending the results to standard output) if the stylesheet writes secondary output files using the xsl:result-document instruction; the href attribute of this instruction is regarded as a relative URL, and is interpreted relative to the URL of the principal output destination. |
-r classname | Use the specified URIResolver to process all URIs. The URIResolver is a user-defined class that extends the net.sf.saxon.URIResolver class; its function is to take a URI supplied as a string and return a SAX InputSource. It is invoked to process URIs used in the document() function, in the xsl:include and xsl:import elements, and (if -u is also specified) to process the URIs of the source file and stylesheet file provided on the command line. |
-t | Display version and timing information to the standard error output. The output also traces the files that are read and written, and extension modules that are loaded. |
-T | Display stylesheet tracing information to the standard error output. This traces execution of each instruction in the stylesheet, so the output can be quite voluminous. Also switches line numbering on for the source document. |
-TJ | Switches on tracing of the binding of calls to external Java methods. This is useful when analyzing why Saxon fails to find a Java method to match an extension function call in the stylesheet, or why it chooses one method over another when several are available. |
-TL classname | Run the stylesheet using the specified TraceListener. The classname names a user-defined class, which must implement net.sf.saxon.trace.TraceListener |
-TP | Run the stylesheet using the TraceListener TimedTraceListener . This creates
an output file giving timings for each instruction executed. This output file can subsequently
be analyzed to give an execution time profile for the stylesheet.
See Performance Analysis below. |
-u | Indicates that the names of the source document and the style document are URLs; otherwise they are taken as filenames, unless they start with "http:" or "file:", in which case they are taken as URLs. |
-v | Requests XML validation of the source file and of any files read using the document() function. Requires an XML parser that supports validation. |
-w0, -w1, or -w2 | Indicates the policy for handling recoverable errors in the stylesheet: -w0 means recover silently, -w1 means recover after writing a warning message to the system error output, -w2 means signal the error and do not attempt recovery. (Note that this does not currently apply to all errors that the XSLT recommendation describes as recoverable.) The default is -w1. |
-x classname | Use the specified SAX parser for the source file and any files loaded using the document() function. The parser must be the fully-qualified class name of a Java class that implements the org.xml.sax.Parser or org.xml.sax.XMLReader interface. |
-y classname | Use the specified SAX parser for the stylesheet file, including any loaded using xsl:include or xsl:import. The parser must be the fully-qualified class name of a Java class that implements the org.xml.sax.Parser or org.xml.sax.XMLReader interface. |
-? | Display command syntax |
source-document | Identifies the source file or directory. Mandatory. If this is a directory, all the files in the directory will be processed individually. In this case the -o option is mandatory, and must also identify a directory, to contain the corresponding output files. A directory must be specified as a filename, not as a URL. The source-document can be specified as "-" to take the source from standard input. |
stylesheet | Identifies the stylesheet. Mandatory unless the -a option is used. If the -c option is used, this argument identifies a compiled stylesheet. The stylesheet argument can be specified as "-" to read the stylesheet from standard input. |
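The directory behaviour described for the source-document argument and the -o option can be approximated from Java using only the standard JAXP classes. The following is a minimal sketch, not Saxon's own implementation: the class name DirectoryBatch, the method names, and the choice to reuse each input file name for its output file are assumptions made for illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class DirectoryBatch {

    /** Transforms every regular file in sourceDir into a same-named file in outputDir. */
    static int transformAll(Templates templates, File sourceDir, File outputDir)
            throws IOException, TransformerException {
        File[] inputs = sourceDir.listFiles(File::isFile);
        if (inputs == null) {
            throw new IOException("Not a readable directory: " + sourceDir);
        }
        if (!outputDir.isDirectory() && !outputDir.mkdirs()) {
            throw new IOException("Cannot create output directory: " + outputDir);
        }
        for (File input : inputs) {
            // Assumption: the output file takes the input file's name.
            File output = new File(outputDir, input.getName());
            try (OutputStream os = new FileOutputStream(output)) {
                // A Templates object is reusable; each file gets a fresh Transformer.
                templates.newTransformer()
                         .transform(new StreamSource(input), new StreamResult(os));
            }
        }
        return inputs.length;
    }

    /** Runs the batch over two temporary files with an identity stylesheet. */
    static int demo() throws IOException, TransformerException {
        String identity =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:template match='@*|node()'>"
          + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
          + "</xsl:template></xsl:stylesheet>";
        Templates templates = TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(identity)));
        File in = Files.createTempDirectory("batch-in").toFile();
        File out = Files.createTempDirectory("batch-out").toFile();
        Files.write(new File(in, "a.xml").toPath(), "<a/>".getBytes(StandardCharsets.UTF_8));
        Files.write(new File(in, "b.xml").toPath(), "<b/>".getBytes(StandardCharsets.UTF_8));
        transformAll(templates, in, out);
        return out.list().length; // number of files produced
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo() + " output files written");
    }
}
```

Compiling the stylesheet once and creating a new Transformer per file keeps the per-file cost low, which is the main reason to batch in one JVM rather than launching the command once per file.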
A param takes the form name=value, name being the name of the parameter, and value the value of the parameter. These parameters are accessible within the stylesheet as normal variables, using the $name syntax, provided they are declared using a top-level xsl:param element. If there is no such declaration, the supplied parameter value is silently ignored. If the xsl:param element has an as attribute indicating the required type, then the string value supplied on the command line is cast to this type: this may result in an error, for example if an integer is required and the supplied value cannot be converted to an integer.
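When the transformation is driven from Java rather than from the command line, the counterpart of a name=value argument is the JAXP method Transformer.setParameter(). The sketch below uses a hypothetical stylesheet with one top-level xsl:param named greeting; the class name ParamDemo and the stylesheet text are illustrative assumptions, and the code runs against whichever XSLT processor JAXP locates.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class ParamDemo {

    // Hypothetical stylesheet: declares a top-level parameter with a default value.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:param name='greeting' select=\"'default'\"/>"
      + "<xsl:template match='/'><xsl:value-of select='$greeting'/></xsl:template>"
      + "</xsl:stylesheet>";

    /** Runs the stylesheet; a null argument leaves the parameter at its default. */
    static String run(String greeting) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        if (greeting != null) {
            // Plays the role of greeting=... on the command line.
            transformer.setParameter("greeting", greeting);
        }
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader("<doc/>")),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(run("hello")); // parameter supplied
        System.out.println(run(null));    // default from the xsl:param declaration
    }
}
```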
A param preceded by a leading exclamation mark (!) is interpreted as an output parameter. For example, !indent=yes requests indented output. This is equivalent to specifying the attribute indent="yes" on an xsl:output declaration in the stylesheet. An output parameter specified on the command line overrides one specified within the stylesheet.
A param preceded by a leading plus sign (+) is interpreted as a filename or directory. The content of the file is parsed as XML, and the resulting document node is passed to the stylesheet as the value of the parameter. If the parameter value is a directory, then all the immediately contained files are parsed as XML, and the resulting sequence of document nodes is passed as the value of the parameter. For example, +lookup=lookup.xml sets the value of the stylesheet parameter lookup to the document node at the root of the tree representing the parsed contents of the file lookup.xml.
Under Windows, and some other operating systems, it is possible to supply a value containing spaces by enclosing it in double quotes, for example name="John Smith". This is a feature of the operating system shell, not something Saxon does, so it may not work the same way under every operating system.
If the parameter name is in a non-null namespace, the parameter can be given a value using the syntax {uri}localname=value. Here uri is the namespace URI of the parameter's name, and localname is the local part of the name. This applies also to output parameters. For example, you can set the indentation level to 4 by using the parameter !{http://saxon.sf.net/}indent-spaces=4. For the extended set of output parameters supported by Saxon, see extensions.html.
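The {uri}localname notation is the same form that the standard Java class javax.xml.namespace.QName accepts in its valueOf() method, so the structure of such an argument can be illustrated without any Saxon classes. This is only a sketch of the notation, not Saxon's command-line parser: the class ClarkNameDemo and its methods are hypothetical, and a leading ! or + is assumed to have been stripped by the caller.

```java
import javax.xml.namespace.QName;

final class ClarkNameDemo {

    /** Splits "name=value" at the first '=' that follows any leading {uri} part. */
    static String[] splitParam(String arg) {
        // A namespace URI may itself contain '=', so skip past the closing brace first.
        int searchFrom = arg.startsWith("{") ? arg.indexOf('}') + 1 : 0;
        int eq = arg.indexOf('=', searchFrom);
        if (eq < 0) {
            throw new IllegalArgumentException("Expected name=value: " + arg);
        }
        return new String[] { arg.substring(0, eq), arg.substring(eq + 1) };
    }

    /** Namespace URI of the parameter name; empty for a no-namespace name. */
    static String namespaceOf(String arg) {
        return QName.valueOf(splitParam(arg)[0]).getNamespaceURI();
    }

    /** Local part of the parameter name. */
    static String localNameOf(String arg) {
        return QName.valueOf(splitParam(arg)[0]).getLocalPart();
    }

    public static void main(String[] args) {
        String arg = "{http://saxon.sf.net/}indent-spaces=4";
        System.out.println("namespace: " + namespaceOf(arg));
        System.out.println("local name: " + localNameOf(arg));
        System.out.println("value: " + splitParam(arg)[1]);
    }
}
```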
If the -a option is used, the name of the stylesheet is omitted. The source document must contain a <?xml-stylesheet?> processing instruction before the first element start tag; this processing instruction must have a pseudo-attribute href that identifies the relative or absolute URL of the stylesheet document, and a pseudo-attribute type whose value is text/xml, application/xml, or text/xsl. For example:
<?xml-stylesheet type="text/xsl" href="../style3.xsl" ?>
It is also possible to refer to a stylesheet embedded within the source document, provided it has an id attribute and the id attribute is declared in the DTD as being of type ID. For example:
<?xml-stylesheet type="text/xsl" href="#style1" ?>
<!DOCTYPE BOOKLIST SYSTEM "books.dtd" [
<!ATTLIST xsl:transform id ID #IMPLIED>
]>
<BOOKLIST>
...
<xsl:transform id="style1" version="1.0" xmlns:xsl="...">
...
</xsl:transform>
</BOOKLIST>
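From Java, the standard JAXP method TransformerFactory.getAssociatedStylesheet() performs the same lookup of the <?xml-stylesheet?> processing instruction. The sketch below covers only the external-file case (an href naming a separate stylesheet), not the embedded #id form; the class name, the temporary file names, and the document contents are assumptions made for illustration.

```java
import java.io.File;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class AssociatedStylesheetDemo {

    /** Transforms sourceFile with the stylesheet named in its xml-stylesheet PI. */
    static String transformUsingPI(File sourceFile) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        // null for media, title and charset means "no preference".
        Source style = factory.getAssociatedStylesheet(
            new StreamSource(sourceFile), null, null, null);
        if (style == null) {
            throw new IllegalStateException("No xml-stylesheet instruction in " + sourceFile);
        }
        Transformer transformer = factory.newTransformer(style);
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(sourceFile), new StreamResult(out));
        return out.toString();
    }

    /** Writes a hypothetical source document and stylesheet side by side, then runs them. */
    static String demo() throws Exception {
        File dir = Files.createTempDirectory("pi-demo").toFile();
        String style =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'><xsl:value-of select='/doc'/></xsl:template>"
          + "</xsl:stylesheet>";
        // The relative href is resolved against the location of the source document.
        String source =
            "<?xml-stylesheet type=\"text/xsl\" href=\"style.xsl\"?><doc>Hello</doc>";
        Files.write(new File(dir, "style.xsl").toPath(), style.getBytes(StandardCharsets.UTF_8));
        File sourceFile = new File(dir, "source.xml");
        Files.write(sourceFile.toPath(), source.getBytes(StandardCharsets.UTF_8));
        return transformUsingPI(sourceFile);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```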
Saxon (from release 7.3) allows a compiled stylesheet (a Templates object) to be saved to disk as a file. The transformation can then be run using this compiled stylesheet by using the -c option to the command java net.sf.saxon.Transform.
The actual compilation of the stylesheet can be achieved using the command java net.sf.saxon.Compile. The format of the command is:
java net.sf.saxon.Compile [options] stylesheet output [ params ]
The options available are a subset of the options for running a transformation, described above. The relevant options are -t (give progress messages), -u (stylesheet argument is a URI, not a filename), -r (specify a URIResolver for use at compile time), -y (specify an XML parser for parsing the source stylesheet).
You can use any file extension for the compiled stylesheet; I generally use .sxx (Saxon XSLT executable). The file actually contains the Java serialization of a data structure that is then used at run-time to drive the transformation process. As an alternative to using the net.sf.saxon.Compile command from the command line, you can use the Java serialization API directly, to write the net.sf.saxon.Templates object to an ObjectOutputStream.
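The serialization route can be sketched with the standard java.io classes. The helper below is written against java.io.Serializable rather than any Saxon class, so it assumes that the Templates object returned by the processor is serializable, as the text above states for Saxon; the class and method names are hypothetical, and the main method round-trips a placeholder string only to show the calls.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class CompiledStylesheetStore {

    /** Writes a serializable compiled stylesheet to the given file. */
    static void save(Serializable compiled, File target) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(target)))) {
            out.writeObject(compiled);
        }
    }

    /** Reads the object back; the caller casts it to javax.xml.transform.Templates. */
    static Object load(File source) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(source)))) {
            return in.readObject();
        }
    }

    /** Round-trips a placeholder value through a temporary file. */
    static Object roundTrip(Serializable value) throws IOException, ClassNotFoundException {
        File file = File.createTempFile("stylesheet", ".sxx");
        try {
            save(value, file);
            return load(file);
        } finally {
            file.delete();
        }
    }

    public static void main(String[] args) throws Exception {
        // A real caller would pass the Templates object instead of a string.
        System.out.println(roundTrip("placeholder for a Templates object"));
    }
}
```

Because the file is a Java serialization, it can only be read back by a JVM that has the same classes available, which fits the warning that the format is tied to a particular Saxon release.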
Stylesheet compilation is a little fragile at this release. It has proved difficult to test it comprehensively. One known restriction is that stylesheets containing saxon:collation declarations cannot be compiled (because they use Java classes that are not serializable). There may be other restrictions: please let me know if you find any.
The term "compile" is stretching a point. The executable that is produced does not contain machine instructions, or even interpreted Java bytecode. It contains instructions in the form of a data structure that Saxon itself can interpret. Note that the format of compiled stylesheets is unlikely to be stable from one Saxon release to the next.
Saxon provides two implementations of the internal tree data structure (or tree model). The tree model can be chosen by an option on the command line (-dt for the tiny tree, -ds for the standard tree) or from the Java API. The default is to use the tiny tree model. The choice should make no difference to the results of a transformation (except the order of attributes and namespace declarations); it affects only performance.
There is an exception to this with release 7.3: the so-called "standard" tree model (-ds) does not support type annotations. If you want to use the new features to annotate element and attribute nodes, use the "tiny tree" (-dt).
Generally speaking, the tiny tree model is faster to build but slower to navigate. It therefore performs better when you visit each node on the tree once or less. The standard tree model may perform better (sometimes very much better) when each node is visited many times, especially when you use the preceding or preceding-sibling axis.
The tiny tree model gives most benefit when you are processing a large document. It uses a lot less memory, so it can prevent thrashing when the size of document is such that the standard tree doesn't fit in real memory.
If in doubt, stick with the default.
Rather than using the interpreter from the command line, you may want to include it in your own application, perhaps one that enables it to be used within an applet or servlet. If you run the interpreter repeatedly, this will always be much faster than running it each time from a command line.
Saxon incorporates support for the JAXP 1.1 API, also known as TrAX. This is compatible with the API for invoking other XSLT processors such as Xalan and jd-xslt.
This API is described in the documentation provided with JDK 1.4. It is available online at http://java.sun.com/j2se/1.4/docs/api/ (look for the javax.xml.transform package).
More information and examples relating to the JAXP 1.1 API can be found in the TraxExamples.java example application found in the samples directory.
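As a minimal sketch of the JAXP pattern (not taken from TraxExamples.java), the code below compiles a stylesheet once into a Templates object and then applies it to a source document. It uses only javax.xml.transform classes, so it runs with whichever processor JAXP finds; the class name, the stylesheet, and the input document are illustrative assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class JaxpDemo {

    // Hypothetical stylesheet: outputs the number of item elements as text.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='count(//item)'/></xsl:template>"
      + "</xsl:stylesheet>";

    /** Compiles the stylesheet; the result can be reused for many transformations. */
    static Templates compile(String stylesheetText) throws TransformerException {
        return TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(stylesheetText)));
    }

    /** Applies a compiled stylesheet to an XML string and returns the serialized result. */
    static String apply(Templates templates, String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer(); // one per transformation
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        Templates templates = compile(STYLESHEET);
        System.out.println(apply(templates, "<list><item/><item/></list>"));
        System.out.println(apply(templates, "<list/>"));
    }
}
```

Reusing the Templates object across calls is what makes repeated transformations inside one application cheaper than repeated command-line runs.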
The types of object that can be supplied as stylesheet parameters are not defined in the JAXP specification: they are implementation-dependent. Saxon takes the Java object supplied, and converts it to an XPath value using the same rules as it applies for the return value from a Java extension function: for these rules, see Saxon Extensibility. If the resulting value is an atomic value, it is cast to the required type of the parameter as specified in the xsl:param declaration, using the XPath casting rules. If the value is non-atomic (for example, if it is a node, or a sequence of integers), then no conversion is attempted; instead, the value must match the required type as stated.
The JAXP TransformerFactory interface provides a configuration method setAttribute() for setting implementation-defined configuration parameters. The parameters supported by Saxon have names defined by constants in the class net.sf.saxon.FeatureKeys. The names of these properties and their meanings are described in the table below.
property | meaning |
ALLOW_EXTERNAL_FUNCTIONS | A Boolean: true if the stylesheet allows external functions to be called. Default is true. The setting false is recommended in an environment where untrusted stylesheets may be executed. Setting this value to false also disables user-defined extension elements, together with the writing of multiple output files, all of which carry similar security risks. |
TRACE_EXTERNAL_FUNCTIONS | A Boolean: true if the tracing of calls to external Java methods is required. Default is false. This switch is useful when analyzing why Saxon fails to find a Java method to match an extension function call in the stylesheet, or why it chooses one method over another when several are available. The trace output is sent to System.err. |
TIMING | A Boolean: true if basic timing information is to be output to the standard error output stream. |
TREE_MODEL | An Integer: Builder.STANDARD_TREE or Builder.TINY_TREE. Selects an implementation of the Saxon tree model. The default is Builder.TINY_TREE. |
TRACE_LISTENER | An instance of the class net.sf.saxon.trace.TraceListener. This object will be notified of significant events occurring during the transformation, for tracing or debugging purposes. |
LINE_NUMBERING | A Boolean. Indicates whether line numbers are to be maintained for the source document. This will not be possible if the source document is supplied as a DOM. The line numbers are accessible through the tracing interface, and also via the saxon:line-number() extension function. |
RECOVERY_POLICY | An Integer. Indicates how dynamic errors should be handled. One of the values (defined as constants in the Controller class) RECOVER_SILENTLY, RECOVER_WITH_WARNINGS, or DO_NOT_RECOVER. |
MESSAGE_EMITTER_CLASS | The full name of a class that implements the net.sf.saxon.output.Emitter interface; the class will be used to format the output of the xsl:message instruction. |
SOURCE_PARSER_CLASS | The full name of a class that implements the org.xml.sax.XMLReader interface; the class will be used to parse source documents (that is, the principal source document plus any secondary source documents read using the document() function) |
STYLE_PARSER_CLASS | The full name of a class that implements the org.xml.sax.XMLReader interface; the class will be used to parse stylesheet documents (that is, the principal stylesheet module plus any secondary stylesheet modules read using xsl:include or xsl:import) |
OUTPUT_URI_RESOLVER | An instance of the class net.sf.saxon.OutputURIResolver; this object will be used to resolve URIs of secondary result documents specified in the href attribute of the xsl:result-document instruction |
VALIDATION | A Boolean. Indicates whether the XML parser should be asked to validate source documents against their DTD. This applies to the initial source document and any source documents read using the document() function, unless handled by a user-written URIResolver. |
NAME_POOL | An instance of class net.sf.saxon.om.NamePool. Indicates that the supplied NamePool should be used as the target (run-time) NamePool by all stylesheets compiled (using newTemplates()) after this call on setAttribute. Normally a single system-allocated NamePool is used for all stylesheets compiled while the Java VM remains loaded; this attribute allows user control over the allocation of NamePools. Note that source trees used as input to a transformation must be built using the same NamePool that is used when the stylesheet is compiled: this will happen automatically if the input to a transformation is supplied as a SAXSource or StreamSource but it is under user control if you build the source tree yourself. |
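The call pattern for these attributes can be sketched with the standard JAXP factory alone. JAXP specifies that setAttribute() throws IllegalArgumentException when the implementation does not recognize the attribute name, so code that may run with different processors can guard the call. The helper name trySet and the placeholder attribute name below are hypothetical; with Saxon, the name passed would be one of the FeatureKeys constants listed in the table (for example FeatureKeys.TIMING with the value Boolean.TRUE).

```java
import javax.xml.transform.TransformerFactory;

final class FactoryAttributeDemo {

    /** Sets a factory attribute, reporting whether the processor accepted it. */
    static boolean trySet(TransformerFactory factory, String name, Object value) {
        try {
            factory.setAttribute(name, value);
            return true;
        } catch (IllegalArgumentException e) {
            // The processor in use does not recognize this attribute name.
            return false;
        }
    }

    /** Tries a deliberately made-up attribute name against the default factory. */
    static boolean demo() {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Placeholder name, not a real Saxon or JAXP attribute.
        return trySet(factory, "urn:example:unsupported-attribute", Boolean.TRUE);
    }

    public static void main(String[] args) {
        System.out.println("accepted: " + demo());
    }
}
```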
Saxon's implementation of the JAXP Transformer interface is the class net.sf.saxon.Controller. This provides a number of options beyond those available in the standard JAXP interface, for example the ability to set an output URI resolver for secondary output documents, and a method to set the initial mode before the transformation starts. You can access these methods by casting the Transformer to a Controller. The methods are described in the JavaDoc documentation supplied with the product.
When using the JAXP interface, you can set serialization properties using a java.util.Properties object. The names of the core XSLT 1.0 properties, such as method, encoding, and indent, are defined in the JAXP class javax.xml.transform.OutputKeys. Additional properties, including Saxon extensions and XSLT 2.0 extensions, have names defined by constants in the class net.sf.saxon.event.SaxonOutputKeys. The values of the properties are exactly as you would specify them in the xsl:output declaration.
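A minimal sketch of setting serialization properties through a java.util.Properties object follows. It uses only the core property names from javax.xml.transform.OutputKeys and an identity transformation, so it does not depend on Saxon; the class name and the sample document are assumptions for illustration. Saxon-specific properties would be added to the same Properties object under the names defined in SaxonOutputKeys.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class OutputPropertiesDemo {

    /** Copies the XML through an identity transformation using the given properties. */
    static String serialize(String xml, Properties props) throws TransformerException {
        // newTransformer() with no stylesheet gives an identity transformation.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperties(props);
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    /** Serializes a small document as XML without an XML declaration. */
    static String demo() throws TransformerException {
        Properties props = new Properties();
        // Values are written exactly as they would appear on xsl:output.
        props.setProperty(OutputKeys.METHOD, "xml");
        props.setProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        return serialize("<a>x</a>", props);
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(demo());
    }
}
```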
It is possible to run Saxon from an applet in the browser. This means that the transformation will run on the client machine, which saves resources on the server. It also means that it is possible for the user to navigate around the downloaded XML document, performing repeated transformations to see different pages of data, without ever returning to the server: this too reduces load on the server, as well as giving the user an improved response time once the XML data and the supporting Saxon software (saxon7.jar is 632Kb) have been downloaded.
Because Saxon now requires JDK 1.4, you must first ensure that the Sun Java plug-in is installed as the default Java VM for your browser.
To support transformations running in an applet, Saxon includes a support routine net.sf.saxon.XSLTProcessorApplet.class which can be used to drive the transformation. This routine is in fact borrowed from the class of the same name included in the Xalan-Java 2.0.0 product. It drives the transformation using TrAX interfaces, which means that the code works equally well with either product.
To use this routine, you first need to include an applet in your HTML page. This might take the form:
<applet
name="xslControl"
code="net.sf.saxon.XSLTProcessorApplet.class"
archive="saxon7.jar"
height="0"
width="0">
<param name="documentURL" value="source.xml"/>
<param name="styleURL" value="style.xsl"/>
</applet>
For production use it is better to use the HTML <OBJECT> element. This can be used to ensure that the right Java VM is used with the applet, preventing failures that will confuse the end user.
To run a transformation, all you have to do is to call the getHTMLText() method provided by the applet, for example
<body onLoad="target.innerHTML=document.xslControl.getHTMLText()">
<div id="target"></div>
</body>
You can specify the source document and stylesheet dynamically using the methods xslControl.setDocumentURL(string) or xslControl.setStyleURL(string). If you don't specify a stylesheet, Saxon will try to locate one using the <?xml-stylesheet?> processing instruction in the source document.
You can set parameters for the transformation (a single parameter only, unfortunately) by calling xslControl.setStylesheetParam("name", "value").
There are some sample applications using Saxon as an applet in the samples/applet folder. Make sure that the saxon7.jar file is in an accessible location before you try to use them. If you want to use them as written, there must be a copy of saxon7.jar in the directory containing the relevant HTML pages.
Saxon comes with a simple tool allowing profiling of the execution time in a stylesheet. To run this tool, first execute the transformation with the -TP option, which sends timed tracing information to the standard error output file:
java -jar dir/saxon7.jar -TP source stylesheet 2>profile.xml
Then run another transformation to create a profile report from the output of this tracing tool:
java -jar dir/saxon7.jar profile.xml timing-profile.xsl >profile.html
Finally, view the resulting profile.html file in your browser.
The output identifies instructions in the original stylesheet by their name, line number, and the last few characters of the URI of their module. For each instruction it gives the number of times the instruction was executed, the average time in milliseconds of each execution, and the total time. The table is sorted according to a weighting function that attempts to put the dominant instructions at the top. These will not necessarily be those with the greatest time, which tend to be instructions that were only executed once but remained active for the duration of the transformation.
Michael H. Kay
2 May 2003