XSLT Patterns


Contents
Introduction
Pattern syntax

Introduction

This document gives an informal description of the syntax of XSLT patterns. For a formal specification, see the XSLT recommendation. Pattern syntax has not changed significantly in XSLT 2.0, except that any XPath 2.0 expression may now be used within a predicate.

Patterns define a condition that a node may or may not satisfy: a node either matches the pattern, or it does not. The syntax of patterns is a subset of that for Nodeset Expressions (defined in expressions.html). Formally, a node matches a pattern if it is a member of the node set selected by the corresponding expression, with either the node itself or one of its ancestors acting as the context node for evaluating the expression. For example, a TITLE node matches the pattern "TITLE" because it is a member of the node set selected by the expression "TITLE" when evaluated at its immediate parent node.
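The formal definition can be tried out directly. The following is a minimal sketch, not Saxon's implementation: it uses the XPath 1.0 engine bundled with the JDK (javax.xml.xpath), and the class and method names (PatternMatchSketch, parse, matches, parentOf) are hypothetical. It treats the pattern as an ordinary expression, so anchors such as key() are outside its scope.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PatternMatchSketch {

    // Hypothetical helper: parse an XML string into a DOM tree.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML", e);
        }
    }

    // DOM attributes have no parent node, so step to the owning element instead.
    private static Node parentOf(Node n) {
        return (n instanceof Attr) ? ((Attr) n).getOwnerElement() : n.getParentNode();
    }

    // True if 'node' is in the node set selected by 'pattern' when the
    // pattern is evaluated as an expression with the node itself, or one
    // of its ancestors, as the context node.
    public static boolean matches(Node node, String pattern) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            for (Node context = node; context != null; context = parentOf(context)) {
                NodeList selected =
                        (NodeList) xpath.evaluate(pattern, context, XPathConstants.NODESET);
                for (int i = 0; i < selected.getLength(); i++) {
                    if (selected.item(i).isSameNode(node)) {
                        return true;
                    }
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot evaluate: " + pattern, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<BOOK><TITLE>Saxon</TITLE><PARA>text</PARA></BOOK>");
        Node title = doc.getDocumentElement().getFirstChild();
        System.out.println(matches(title, "TITLE"));      // selected from the parent BOOK
        System.out.println(matches(title, "BOOK/TITLE")); // selected from the document node
        System.out.println(matches(title, "PARA"));       // selected from no context
    }
}
```

A real processor does not search the ancestors in this way; it tests the node against the pattern directly. The loop is only a literal reading of the definition.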

XSLT patterns may be used either in an XSLT stylesheet, or as a parameter to various Java interfaces in the Saxon API. The syntax is the same in both cases. In the Java interface, patterns are encapsulated by the net.sf.saxon.pattern.Pattern class, and are created by calling the static method Pattern.make().

In stylesheets, patterns are used primarily in the match attribute of the xsl:template element. They are also used in the count and from attributes of xsl:number, the match attribute of xsl:key, and the group-starting-with attribute of xsl:for-each-group. In Java applications, patterns are used when nominating a node handler using Controller.setHandler().
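As an illustration of patterns in the match attribute of xsl:template and the count attribute of xsl:number, here is a small sketch that runs an XSLT 1.0 stylesheet through the standard JAXP interface (javax.xml.transform). It uses whichever XSLT processor JAXP finds, rather than the Saxon-specific Pattern.make() or Controller.setHandler() calls, whose signatures are not given here. The class name, the stylesheet, and the input document are all invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class MatchAttributeSketch {

    // Two template rules, each selected by a pattern in its match attribute.
    // xsl:number uses a pattern in its count attribute to number the PARAs.
    public static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='SECTION/TITLE'>"
        + "[<xsl:value-of select='.'/>]"
        + "</xsl:template>"
        + "<xsl:template match='PARA'>"
        + "<xsl:number count='PARA'/>:<xsl:value-of select='.'/>;"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    public static final String INPUT =
        "<DOC><SECTION><TITLE>Intro</TITLE><PARA>one</PARA><PARA>two</PARA></SECTION></DOC>";

    // Applies the stylesheet to the input and returns the serialized result.
    public static String transform(String stylesheet, String input) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(input)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(STYLESHEET, INPUT)); // prints "[Intro]1:one;2:two;"
    }
}
```

DOC and SECTION have no template rule of their own, so the built-in rules simply pass processing down to their children, where the two patterns take effect.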

Pattern syntax

Saxon supports the full XSLT syntax for patterns. The rules below describe a simplified form of this syntax (for example, it omits the legal but useless pattern '@comment()'):


pattern          ::= path ( '|' path )*
path             ::= anchor? remainder? (Note 1)

anchor           ::= '/' | '//' | id | key
id               ::= 'id' '(' value ')'
key              ::= 'key' '(' literal ',' value ')'
value            ::= literal | variable-reference

remainder        ::= path-part ( sep path-part )* 
sep              ::= '/' | '//'
path-part        ::= node-test predicate*
node-test        ::= element-match | text-match | attribute-match | pi-match | node-match
element-match    ::= 'child::'? ( name | '*' ) 
text-match       ::= 'text' '(' ')' 
attribute-match  ::= ('attribute::' | '@') ( name | '*' ) 
pi-match         ::= 'processing-instruction' '(' literal? ')'
node-match       ::= 'node' '(' ')'

predicate        ::= '[' ( boolean-expression | numeric-expression ) ']'

Note 1: not all combinations are allowed. If the anchor is '//' then the remainder is mandatory.

The form of a literal is as defined in expressions; and a predicate is itself a boolean or numeric expression. As with predicates in expressions, a numeric predicate [P] is shorthand for the boolean predicate [position()=P].
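The equivalence between [P] and [position()=P] can be checked with a short sketch using the XPath 1.0 engine in the JDK (javax.xml.xpath). The class name, the select helper, and the sample document are assumptions made for the example. The expressions are written with a leading "//" so that they can be evaluated from the document node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NumericPredicateSketch {

    public static final String INPUT =
        "<DOC><SECTION><PARA>a</PARA><PARA>b</PARA></SECTION>"
      + "<SECTION><PARA>c</PARA></SECTION></DOC>";

    // Evaluates the expression against the document and returns the
    // string value of each selected node.
    public static List<String> select(String xml, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression,
                              new InputSource(new StringReader(xml)),
                              XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Both forms select the first PARA child of each SECTION.
        System.out.println(select(INPUT, "//SECTION/PARA[1]"));
        System.out.println(select(INPUT, "//SECTION/PARA[position()=1]"));
    }
}
```

Note that the position is counted among the PARA children of each SECTION separately, which is why the PARA in the second SECTION is selected as well.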

Informally, a pattern consists of either a single path or a sequence of paths separated by vertical bars. A node matches the pattern if it matches any one of the paths.

A path consists of a sequence of path-parts separated by either "/" or "//". There is an optional separator ("/" or "//") at the start: a leading "/" anchors the path at the root of the document, while a leading "//" has no effect on which nodes match and can be ignored. The last path-part may be an element-match, a text-match, an attribute-match, a pi-match, or a node-match; in practice, a path-part other than the last should be an element-match.

The axis syntax child:: and attribute:: may also be used in patterns, as described in the XSLT specification.

Examples of patterns:

Pattern                      Meaning
XXX                          Matches any element whose name (tag) is XXX
*                            Matches any element
XXX/YYY                      Matches any YYY element whose parent is an XXX
XXX//YYY                     Matches any YYY element that has an ancestor named XXX
/*/XXX                       Matches any XXX element that is a child of the top-level element in the document
*[@NAME]                     Matches any element with a NAME attribute
SECTION/PARA[1]              Matches any PARA element that is the first PARA child of a SECTION element
SECTION[TITLE="Contents"]    Matches any SECTION element that has a TITLE child element whose value is "Contents"
A/TITLE | B/TITLE | C/TITLE  Matches any TITLE element whose parent is an element named A, B, or C (Note that this cannot be written "(A|B|C)/TITLE", although that is a valid node-set expression.)
/BOOK//*                     Matches any element below the top-level element (but not the top-level element itself), provided the top-level element in the document is named "BOOK"
A/text()                     Matches any text node that is a child of an A element
A/@*                         Matches any attribute of an A element
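To see a union pattern and a text() pattern at work, the following sketch applies a small XSLT 1.0 stylesheet through the standard JAXP interface. It is only an illustration: the class name, stylesheet, and input document are invented, and the processor is whichever one JAXP locates, not necessarily Saxon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class UnionPatternSketch {

    // The first rule uses a union of three paths; the second rule
    // suppresses all text nodes that no other rule has consumed.
    public static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='A/TITLE | B/TITLE | C/TITLE'>"
        + "hit:<xsl:value-of select='.'/>;"
        + "</xsl:template>"
        + "<xsl:template match='text()'/>"
        + "</xsl:stylesheet>";

    // The TITLE inside D has a parent that is none of A, B, or C.
    public static final String INPUT =
        "<DOC><A><TITLE>a</TITLE></A><B><TITLE>b</TITLE></B><D><TITLE>d</TITLE></D></DOC>";

    // Applies the stylesheet to the input and returns the serialized result.
    public static String transform(String stylesheet, String input) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(input)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(STYLESHEET, INPUT)); // prints "hit:a;hit:b;"
    }
}
```

The TITLE under D matches neither rule, so the built-in rule processes its text child, which the text() rule then discards.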

Michael H. Kay
12 November 2002