Standards Conformance
SAXON XSLT implements the XSLT 1.0 and XPath 1.0 Recommendations from the World Wide Web Consortium, found at http://www.w3.org/TR/1999/REC-xslt-19991116 and http://www.w3.org/TR/1999/REC-xpath-19991116, which are referred to here collectively as "XSLT".
SAXON is 100% conformant to the mandatory requirements of these standards, subject to the limitations noted below.
SAXON also implements certain facilities defined in the draft XSLT 1.1 specification, but these are only available when the stylesheet specifies version="1.1", or when the relevant construct is within a literal result element that specifies xsl:version="1.1".
Note that the W3C XSL Working Group has announced that XSLT 1.1 will not be taken beyond
working draft status. Most of these facilities (with the exception of the Java and Javascript
language bindings) have been carried forward into XSLT 2.0, but in some cases with different syntax.
The XSLT 1.1 features supported are as follows:
SAXON will automatically convert a result tree fragment to a node-set when required, as defined in the draft XSLT 1.1 specification. (This feature is retained in XSLT 2.0, though in a more generalized form).
SAXON supports the <xsl:document> element defined in the draft XSLT 1.1 specification.
The implementation is incomplete: in particular (a) the href attribute is interpreted
as a filename relative to the current directory (not the base URI), and (b) the provisions
relating to use of
SAXON supports the use of xsl:script to define external Java functions. The archive attribute is available only with a JDK1.2 or later JVM. The rules for identifying a Java class and a method within the class are as defined in the working draft XSLT 1.1 specification. The Context object implements the XSLTContext interface defined in XSLT 1.1.
SAXON supports the use of xml:base to define the Base URI of a node, as defined in the XSLT 1.1 specifications, except in the case of the base URI of a processing instruction contained in an external entity. (This feature is supported regardless of the setting of [xsl:]version.)
SAXON also supports the synonyms saxon:script for xsl:script, and saxon:output for xsl:document; the synonyms can be used in a version 1.0 stylesheet, following the normal rules for extension elements. Conversion of a result tree fragment to a node-set can be achieved using the saxon:node-set() or exslt:node-set() extension functions, regardless of the version setting.
SAXON is dependent on the user-selected XML parser to ensure conformance with the XML 1.0 Recommendation and the XML Namespaces Recommendation.
SAXON implements the <?xml-stylesheet?> processing instruction as described in the W3C Recommendation Associating StyleSheets with XML Documents. The href pseudo-attribute must be a URI identifying an XML document containing a stylesheet, or a URI with a fragment identifier identifying an embedded stylesheet. The fragment must be the value of an ID attribute declared as such in the DTD.
For a more up-to-date list of limitations in this release see limitations.html.
SAXON supports language-dependent sorting and numbering only for English, but offers APIs that allow support of other languages via user-written additions.
The XSLT 1.0 specification defines the semantics of the format-number function in terms of the DecimalFormat class in Java JDK 1.1. In Saxon, the behavior follows whichever version of the JDK is actually being used. This means that extensions defined in later JDK releases will be available in Saxon even though they are errors according to the specification. In addition, Saxon's format-number requires all characters used in the decimal format string to be 16-bit characters; Unicode surrogate pairs are not supported.
The JDK 1.1 implementation of DecimalFormat included a "feature" that was not described in the JDK 1.1 documentation, but was subsequently made part of the specification: if a subpicture is included for negative numbers, the only part of the subpicture that is taken into account is the prefix and suffix. Because Saxon relies on the Java implementation, it incorporates this "feature", which has not been retained in the XSLT 2.0 specification of format-number.
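The negative-subpicture behavior can be observed directly in java.text.DecimalFormat, without going through XSLT. The following is a minimal sketch; the class name NegativeSubpictureDemo and the sample pictures are illustrative only, and the symbols are fixed to Locale.US so that the result does not depend on the default locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class: shows that only the prefix and suffix of the
// negative subpicture are used by java.text.DecimalFormat.
class NegativeSubpictureDemo {

    // Formats a value with the given picture, using US symbols
    // ('.' as decimal point, ',' as grouping separator).
    static String format(String picture, double value) {
        DecimalFormat df =
                new DecimalFormat(picture, DecimalFormatSymbols.getInstance(Locale.US));
        return df.format(value);
    }

    public static void main(String[] args) {
        // The negative subpicture "(#)" asks for no grouping and no decimals,
        // but only its prefix "(" and suffix ")" are taken into account.
        System.out.println(format("#,##0.00;(#)", -1234.5));        // (1,234.50)
        // Spelling the digits out in the negative subpicture changes nothing.
        System.out.println(format("#,##0.00;(#,##0.00)", -1234.5)); // (1,234.50)
    }
}
```

Both calls produce the same string, which is the effect a stylesheet author will see when passing such a picture to format-number.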
Another limitation imposed by the Java DecimalFormat class is that the characters used for digits, for the decimal point and grouping separator, and for other roles such as percent signs, must all be 2-byte characters, that is, Unicode characters with codepoints less than 65536.
An erratum to the XSLT 1.0 specification states that xsl:copy and xsl:copy-of may be used to add a namespace node to a newly constructed element. This works correctly in Saxon, provided that the name of the namespace node does not clash with the namespace prefix that Saxon has allocated to the element for serialization purposes. (In principle, this namespace prefix should not be present in the result tree, and should therefore be incapable of causing a conflict.) More details: see bug 637117.
Saxon's checking of the characters used in a QName is an approximation to the rules in the XML specification. Some rarely-used characters are permitted when they should be rejected, and there may also be a few cases where valid characters are rejected.
As permitted by the XSLT specification, Saxon imposes limits on the processing resources that can be used by a transformation. Most of these limits are implicit in the design of the NamePool:
These limits apply to all the documents sharing a single NamePool. The limits can be extended by using multiple NamePool instances.
The XSLT specification says that the documentation for an implementation should specify which URI schemes are supported. SAXON supports the URI scheme implemented by the Java java.net.URL class, with the optional addition of a fragment identifier, as described below. Additionally, SAXON allows the user to nominate a URIResolver class which can be used to implement any URI scheme the user wants.
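As an illustration of the URIResolver mechanism, here is a minimal sketch of a resolver for a made-up "mem:" scheme that serves documents from an in-memory map. The scheme name, the class name InMemoryResolver and the fetch() helper are assumptions invented for this example. Only the standard javax.xml.transform interfaces are used, so the code runs with whichever JAXP processor is on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Collections;
import java.util.Map;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical resolver: maps URIs of the form "mem:name" to XML text held in memory.
class InMemoryResolver implements URIResolver {

    private final Map<String, String> documents;

    InMemoryResolver(Map<String, String> documents) {
        this.documents = documents;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        if (!href.startsWith("mem:")) {
            return null; // not our scheme: the processor falls back to its default resolution
        }
        String xml = documents.get(href.substring("mem:".length()));
        if (xml == null) {
            throw new TransformerException("No in-memory document for " + href);
        }
        return new StreamSource(new StringReader(xml));
    }

    // Resolves href and serializes the resulting document with an identity transform.
    static String fetch(URIResolver resolver, String href) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(resolver.resolve(href, null), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        URIResolver resolver = new InMemoryResolver(
                Collections.singletonMap("greeting", "<greeting>hello</greeting>"));
        System.out.println(fetch(resolver, "mem:greeting"));
    }
}
```

In a real transformation the resolver would normally be registered through the standard JAXP calls TransformerFactory.setURIResolver() or Transformer.setURIResolver(), rather than called directly as in this sketch.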
The XSLT specification says that the documentation for an implementation should specify for which media types fragment identifiers are supported. The standard URI resolver supports access to XML documents only. A simple fragment identifier is allowed, consisting of the value of an ID attribute in the document. The effect is to return the subdocument rooted at the element with this identifier if there is one, or an empty document otherwise. For example, the URI mydoc.xml#aaa locates the XML document mydoc.xml, and if it contains an element <eeee id="aaa">, where id is an attribute of type ID, then the document retrieved is an XML document with this <eeee> element as its outermost (document) element.
The values of the vendor-specific system properties are:
xsl:version | 1.0 |
xsl:vendor | SAXON n.n.n from Michael Kay |
xsl:vendor-url | http://saxon.sf.net/ |
All three values are subject to change in future releases. Users wishing to test whether the processor is SAXON are advised to test whether the xsl:vendor system property starts with the string "SAXON".
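The recommended vendor test can be sketched from Java as follows. The class name VendorCheck and its helper methods are hypothetical. The stylesheet simply writes system-property('xsl:vendor') as text, and the value returned depends on which JAXP processor is actually loaded.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: asks the active XSLT processor for its xsl:vendor value.
class VendorCheck {

    // Stylesheet that outputs the xsl:vendor system property as plain text.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:value-of select=\"system-property('xsl:vendor')\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the stylesheet against a dummy document and returns the vendor string.
    static String vendor() {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader("<doc/>")), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // The test advised in the text: does the vendor string start with "SAXON"?
    static boolean isSaxon(String vendor) {
        return vendor.startsWith("SAXON");
    }

    public static void main(String[] args) {
        String vendor = vendor();
        System.out.println(vendor + " -> SAXON? " + isSaxon(vendor));
    }
}
```

Within a stylesheet, the equivalent test is starts-with(system-property('xsl:vendor'), 'SAXON').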
SAXON implements a number of extensions to standard XSLT, following the rules for extension functions and extension elements where appropriate. The extensions are documented in extensions.html. They are all implemented in accordance with the provisions in the standard for extensibility.
The following is the list of encodings recognized by the built-in AElfred parser (case-insensitive):
ISO-8859-1, 8859_1, ISO8859_1; US-ASCII, ASCII; UTF-8, UTF8; ISO-10646-UCS-2, UTF-16, UTF-16BE, UTF-16LE
The encodings available on output are the intersection of:
ascii, us-ascii, utf-8, utf8, utf-16, utf16, iso-8859-1, iso-8859-2, koi8-r, cp852, cp1250, windows-1250, cp1251, windows-1251 (again case-insensitive)
with whatever your Java VM supports.
If you select an encoding that the Java VM recognizes, but which is not in the above list, then the output will be written in the requested encoding, but all non-ASCII characters will be written as character references.
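The character-reference fallback can be seen with the standard JAXP serialization properties. The sketch below writes a document containing a non-ASCII character in US-ASCII, where that character has to be escaped. The class name OutputEncodingDemo is hypothetical, and the example does not show Saxon's specific handling of encodings outside the list above.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.Charset;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: serializes a document in a requested output encoding.
class OutputEncodingDemo {

    // Copies the document with an identity transform, writing bytes in the
    // requested encoding, then decodes those bytes so they can be printed.
    static String serialize(String xml, String encoding) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.ENCODING, encoding);
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(bytes));
            return new String(bytes.toByteArray(), Charset.forName(encoding));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u00e9 (e with acute accent) cannot be represented in US-ASCII,
        // so the serializer emits it as a character reference.
        System.out.println(serialize("<p>caf\u00e9</p>", "US-ASCII"));
    }
}
```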
Saxon can be used with any SAX-conformant XML parser. The extent of XML conformance depends entirely on the chosen parser.
The default parser is a version of Ælfred. There is one known non-conformance in the version of the AElfred parser provided with the Saxon product: it does not enforce the constraint that the contents of a general entity must be well-formed. Note, however, that this parser does not perform XML validation.
SAXON accepts input (both source document and stylesheet) from any standards-compliant DOM implementation.
SAXON allows the result tree to be attached to any Document or Element node of an existing DOM. Any DOM implementation can be used, provided it is mutable.
SAXON's internal tree structure (which is visible through the Java API, including the case where Java extension functions are called from XPath expressions) conforms with the minimal requirements of the DOM level 2 core Java language binding. This DOM interface is read-only, so all attempts to call updating methods throw an appropriate DOM exception. No optional features are implemented. The DOM interfaces to Saxon's tree structure do not reveal namespace nodes as attributes. This means it is not possible to get information about namespace declarations (except by calls such as getPrefix() and getNamespaceURI() on Element and Attr nodes).
If an extension function returns a DOM Node or NodeList, this must consist only of Nodes in a tree constructed using Saxon. Since Saxon's trees cannot be updated using DOM methods, this means that the nodes returned must either be nodes from the original source tree, or nodes from a tree constructed using Saxon's proprietary API. It is not possible to construct the tree using DOM methods such as createElement() and createAttribute().
Saxon implements the JAXP 1.1 API (including TrAX), as defined in JSR-63. Saxon implements the interfaces in the javax.xml.transform package in full, including support for SAX, DOM, and Stream input, and SAX, DOM, and Stream output.
There are restrictions in using transform() on a DOMSource when the node to be transformed is a node other than the root (i.e. the DOM Document node). These apply only if the supplied DOM is a third-party DOM, not if it is a Saxon-constructed tree. Specifically, if the start node is not the root then it must be an element; and it must not have an ancestor or preceding-sibling node, or an ancestor with a preceding-sibling node, that is an entity reference node or CDATA section node. In addition, the element must be part of a tree that is rooted at a Document node.
When an identity transform is carried out (that is, a transform that uses no Templates object), with the source being a DOMSource, then the entire DOM Document is copied, regardless of the start node wrapped by the DOMSource object. The specification in this area is not clear, but Saxon's behavior differs from other implementations.
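To check how a given processor treats the start node, an identity transform can be run on a DOMSource that wraps a non-root element. This is a minimal sketch; the class name DomSourceStartNodeDemo and the sample document are illustrative. The start node is an element in a tree rooted at a Document node, with no entity reference or CDATA section nodes. What is printed depends on the processor in use: according to the text above, Saxon copies the entire document, while other implementations may copy only the wrapped element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo: identity transform starting from a non-root DOM node.
class DomSourceStartNodeDemo {

    // Parses the XML into a DOM, wraps the first child of the document
    // element in a DOMSource, and serializes whatever the processor copies.
    static String copyFromFirstChild(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node start = doc.getDocumentElement().getFirstChild();
            // newTransformer() with no arguments gives an identity transform (no Templates object).
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(start), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(copyFromFirstChild("<root><child>text</child><other/></root>"));
    }
}
```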
Saxon also implements the javax.xml.parsers API. The SAX interfaces are implemented in full. The DOM interfaces are limited by the capabilities of the Saxon DOM, specifically the fact that it is read-only. Nevertheless, the DocumentBuilder may be used to construct a Saxon tree, or to obtain an empty Document node which can be supplied in a DOMResult to hold the result of a transformation.
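The pattern of obtaining an empty Document from a DocumentBuilder and passing it in a DOMResult can be sketched as follows. The class name DomResultDemo is hypothetical, and the code uses only the standard JAXP calls, so the Document is created by whichever DocumentBuilderFactory implementation is configured, not necessarily Saxon's.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

// Hypothetical demo: collects the result of a transformation in a DOM Document.
class DomResultDemo {

    // Creates an empty Document, then lets an identity transform populate it.
    static Document transformToDom(String xml) {
        try {
            Document target = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new StreamSource(new StringReader(xml)), new DOMResult(target));
            return target;
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document result = transformToDom("<a><b/></a>");
        System.out.println(result.getDocumentElement().getNodeName());
    }
}
```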
Where the XSLT specification requires that an error be signaled, Saxon produces an error message and terminates stylesheet execution. In the case of errors detected at compile time, it attempts to report as many errors as possible before terminating; in the case of run-time errors, it terminates after the first error.
Where the XSLT specification states that the processor may recover from an error, Saxon takes one of three actions as described in the table below. Either it signals the error and terminates execution, or it recovers silently from the error in the manner permitted by the specification, or it places the action under user control. In the last case there are three options: report the error and terminate, recover silently, or (the default) recover after writing a warning to the system error output stream. These actions can be modified by supplying a user-defined ErrorListener.
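A user-defined ErrorListener might look like the following minimal sketch. The class name CollectingErrorListener and its policy (record warnings, stop on errors) are illustrative assumptions. The sketch relies only on the JAXP contract that the processor may continue when a method returns normally, and that throwing the exception terminates the transformation. How Saxon routes each recoverable error to these methods is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.TransformerException;

// Hypothetical listener: keeps warnings for later inspection and
// terminates the transformation on any error.
class CollectingErrorListener implements ErrorListener {

    final List<String> warnings = new ArrayList<>();

    @Override
    public void warning(TransformerException e) {
        // Returning normally lets the processor carry on.
        warnings.add(e.getMessageAndLocation());
    }

    @Override
    public void error(TransformerException e) throws TransformerException {
        throw e; // rethrowing asks the processor to stop
    }

    @Override
    public void fatalError(TransformerException e) throws TransformerException {
        throw e;
    }
}
```

Such a listener is registered with the standard Transformer.setErrorListener() method before calling transform().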
Handling of individual recoverable errors is described in the table below.
Error | Action |
There is more than one template rule that matches a node, with the same import precedence and priority | User option |
There is more than one xsl:namespace-alias statement for a given prefix, with the same import precedence | Recover silently |
An element name defined using xsl:element is invalid | User option |
An attribute name defined using xsl:attribute is invalid | User option |
There are several attribute sets with the same import precedence that define the same named attribute | Recover silently |
A processing-instruction name defined using xsl:processing-instruction is invalid | User option |
A node other than a text node is written to the result tree while instantiating xsl:attribute, xsl:comment, or xsl:processing-instruction | User option |
Invalid characters are written to the content of a comment or processing instruction | User option |
An attribute node or namespace node is written directly to the root of a result tree fragment, or to any other node that is not an element node. | User option |
The document() function identifies a resource that cannot be retrieved | User option |
There are several xsl:output elements specifying the same attribute with the same import precedence | Recover silently |
disable-output-escaping is used for a text node while instantiating xsl:attribute, xsl:comment, or xsl:processing-instruction | Recover silently |
disable-output-escaping is used for a text node within a result tree fragment that is subsequently converted to a string or number | Recover silently |
disable-output-escaping is used for a text node containing a character that cannot be output using the target encoding | Recover silently |
Michael H. Kay
Saxonica Limited
22 June 2005