This document is an informal guide to the syntax of XPath 2.0 expressions, which are used in Saxon both within XSLT stylesheets, and in the Java API. For formal specifications, see the XPath 2.0 specification, except where differences are noted here.

XPath expressions may be used either in an XSL stylesheet, or as a parameter to various Java
interfaces. The syntax is the same in both cases. In the Java interface, expressions are handled
using the `net.sf.saxon.xpath.XPathEvaluator` class, and are parsed using a call such as
`XPathEvaluator.createExpression("$a + $b")`. This returns an object of class
`net.sf.saxon.xpath.XPathExpression`, which provides two methods for evaluating the expression:
`evaluate()`, which returns the value of the expression, and `iterator()`, which allows iteration
over the items in the sequence returned by the expression. For further details of these methods,
see the API documentation.

An important change in XPath 2.0 is that all values are now considered as sequences. A sequence
consists of zero or more items; an item may be a node or a simple-value. Examples of simple-values
are integers, strings, booleans, and dates. A single value such as
a number is considered as a sequence of length 1. The empty sequence is written as `()`;
a singleton sequence may be written as `"a"` or `("a")`, and a general
sequence is written as `("a", "b", "c")`.

The node-sets of XPath 1.0 are replaced in XPath 2.0 by sequences of nodes. Path expressions will return node sequences whose nodes are in document order with no duplicates, but other kinds of expression may return sequences of nodes in any order, with duplicates permitted.

This page summarizes the syntactic constructs and operators provided in XPath 2.0. The functions provided in the function library are listed separately: see functions.html.

**String literals** are written as "London" or 'Paris'. In each case you can use the opposite
kind of quotation mark within the string: 'He said "Boo"', or "That's rubbish". In a stylesheet,
XPath expressions always appear within XML attributes, so it is usual to use one kind of delimiter for
the attribute and the other kind for the literal. Anything else can be written using XML character
entities. In XPath 2.0, string delimiters can be doubled within the string to represent the
delimiter itself: for example `<xsl:value-of select='"He said, ""Go!"""'/>`.

**Numeric constants** follow the Java rules for decimal literals: for example, `12` or `3.05`; a
negative number can be written as (say) `-93.7`, though technically the minus sign is not part of the
literal. (Also, note that you may need a space before the minus sign to avoid it being treated as
a hyphen within a preceding name.) The numeric literal is taken as a double-precision floating
point number if it uses scientific notation (e.g. `1.0e7`), as a fixed-point decimal
if it includes a full stop, or as an integer otherwise. Decimal values in Saxon have unlimited
precision; integers are limited to 64 bits. Note that a value such as `3.5` was
handled as a double-precision floating point number in XPath 1.0, but as a decimal number in
XPath 2.0: this may affect the precision of arithmetic results. Saxon implements decimal arithmetic
using the Java class `java.math.BigDecimal`.
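The classification rule above, and the practical difference between decimal and double arithmetic, can be sketched in plain Java. This is only an illustration: the class and method names are hypothetical, and it does not show how Saxon's parser actually works.

```java
import java.math.BigDecimal;

// Illustrative sketch only: names are hypothetical, not part of Saxon.
class NumericLiteralSketch {

    // Apply the rule from the text: scientific notation gives a double,
    // a full stop gives a decimal, anything else is an integer.
    static String typeOf(String literal) {
        if (literal.indexOf('e') >= 0 || literal.indexOf('E') >= 0) {
            return "double";
        }
        if (literal.indexOf('.') >= 0) {
            return "decimal";
        }
        return "integer";
    }

    // Decimal arithmetic (java.math.BigDecimal) adds 0.1 and 0.2 exactly.
    static boolean decimalSumIsExact() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }

    // Double arithmetic introduces a small binary rounding error.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    public static void main(String[] args) {
        System.out.println("12    -> " + typeOf("12"));
        System.out.println("3.05  -> " + typeOf("3.05"));
        System.out.println("1.0e7 -> " + typeOf("1.0e7"));
        System.out.println("decimal 0.1 + 0.2 = 0.3 ? " + decimalSumIsExact());
        System.out.println("double  0.1 + 0.2 = 0.3 ? " + doubleSumIsExact());
    }
}
```

The last two lines show why the change from double to decimal for a literal such as `3.5` can alter arithmetic results.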

There are no boolean constants as such: instead use the function calls `true()` and
`false()`.

Constants of other data types can be written using constructors, which look like function calls
but require a string literal as their argument. For example, `xs:float("10.7")` produces
a single-precision floating point number. Saxon implements constructors for many of the built-in data types
defined in XML Schema Part 2: for a full list see conformance.html.

An example for **date** and **dateTime** values:
you can write constants for these data types as `xs:date("2002-04-30")` or
`xs:dateTime("1966-07-31T15:00:00Z")`.

*The latest (November 2002) draft of XPath 2.0 allows the argument to a constructor to
contain whitespace, as determined by the whitespace facet for the target data type. This feature
is implemented in Saxon 7.4.*

The value of a variable (local or global variable, local or global parameter) may be referred to
using the construct `$`*name*, where *name* is the variable name.

The variable is always evaluated at the textual place where the expression containing it appears;
for example a variable used within an `xsl:attribute-set` must be in scope at the point where the
attribute-set is defined, not the point where it is used.

A variable may take a value of any data type, and in general it is not possible to determine its data type statically.

It is an error to refer to a variable that has not been declared.

Starting with XPath 2.0, variables (known as range variables) may be declared within
an XPath expression, not only using `xsl:variable` elements in the stylesheet. The
expressions that declare variables are the `for`, `some`, and `every` expressions.

*Saxon 7.4 does not allow two range variables within an expression to have the same name.*

A function call in XPath 2.0 takes the form `F ( arg1, arg2, ...)`. In general, the
function name is a QName. A library of core functions is defined in the XPath 2.0 and XSLT 2.0
specifications. For details of these functions, including notes on their implementation
in this Saxon release, see functions.html.
Additional functions are available (in a special namespace) as Saxon extensions:
these are listed in extensions.html. Further functions may be
implemented by the user, either as XSLT *stylesheet functions* (see xsl:function),
or as Java *extension functions* (see extensibility.html).

*In Saxon 7.4, the core function library is in no namespace; the functions are referenced
without using a namespace prefix.*

*Saxon 7.4 implements function calls using the XPath 2.0 function call rules.
Essentially, this means that the supplied value is not implicitly
cast to the required type unless (a) the supplied value is an untyped element or attribute node,
or (b) backwards compatibility mode is set (by setting version="1.0") and the required
type is string or number. In all other cases, casting must be done explicitly if required.*

The basic primitive for accessing a source document is the *axis step*. Axis steps
may be combined into path expressions using the path operators `/` and `//`,
and they may be filtered using filter expressions in the same way as the result of any other
expression.

An axis step has the basic form `axis :: node-test`, and selects nodes on a given axis
that satisfy the node-test. The axes available are:

| Axis | Nodes selected |
|---|---|
| ancestor | Selects ancestor nodes, starting with the parent of the current node and ending with the document node |
| ancestor-or-self | Selects the current node plus all ancestor nodes |
| attribute | Selects all attributes of the current node (if it is an element) |
| child | Selects the children of the current node, in document order |
| descendant | Selects the children of the current node and their children, recursively (in document order) |
| descendant-or-self | Selects the current node plus all descendant nodes |
| following | Selects the nodes that follow the current node in document order, other than its descendants |
| following-sibling | Selects all subsequent child nodes of the same parent node |
| parent | Selects the parent of the current node |
| preceding | Selects the nodes that precede the current node in document order, other than its ancestors |
| preceding-sibling | Selects all preceding child nodes of the same parent node |
| self | Selects the current node |

When the child axis is used, `child::` may be omitted, and when the attribute
axis is used, `attribute::` may be abbreviated to `@`. The expression
`parent::node()` may be shortened to `..`.

*The expression . is no longer synonymous with self::node(),
since it may now select items that are not nodes. If the context item is not a node, any use of a
path expression will raise an error.*

The node-test may be:

- a node name
- `prefix:*` to select nodes in a given namespace
- `*:localname` to select nodes with a given local name, regardless of namespace
- `text()` (to select text nodes)
- `node()` (to select any node)
- `processing-instruction()` (to select any processing instruction)
- `processing-instruction('literal')` to select processing instructions with the given name (target)
- `comment()` to select comment nodes
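The axis steps, abbreviations, and node-tests listed above that are common to XPath 1.0 and 2.0 can be tried out with the `javax.xml.xpath` API that ships with the JDK. Note that this is a stand-in chosen so the example runs without Saxon on the classpath: the JDK engine implements XPath 1.0 only, and the class name, helper method, and sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative sketch using the JDK's XPath 1.0 engine, not Saxon's API.
class AxisStepSketch {

    // Small sample document, written without whitespace between elements.
    static final String XML =
        "<book><chapter n='1'><title>One</title></chapter>"
      + "<chapter n='2'><title>Two</title><!-- draft --></chapter></book>";

    // Evaluate the expression against the sample document and count the nodes selected.
    static int count(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/child::book/child::chapter",         // full axis syntax
            "/book/chapter",                       // child:: omitted
            "/book/chapter/attribute::n",          // attribute axis
            "/book/chapter/@n",                    // abbreviated with @
            "/book/chapter/title/parent::node()",  // parent axis
            "/book/chapter/title/..",              // abbreviated with ..
            "/book/chapter/comment()",             // comment nodes only
            "/book/chapter/node()"                 // any child node
        };
        for (String e : expressions) {
            System.out.println(e + " selects " + count(e) + " node(s)");
        }
    }
}
```

Each abbreviated form selects the same number of nodes as its full-axis equivalent.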

*Saxon 7.4 allows the constructs @node(), @text(), etc, which were
allowed in XPath 1.0 but are not allowed in the current XPath 2.0 draft.*

In general an expression may be enclosed in parentheses without changing its meaning.

If parentheses are not used, operator precedence follows the sequence below, starting with the operators that bind most tightly. Within each group the operators are evaluated left-to-right.

| Operator | Meaning |
|---|---|
| `[]` | predicate |
| `/`, `//` | path operator |
| `cast as`, `treat as` | type conversion |
| `except`, `intersect` | set difference and intersection |
| `\|`, `union` | union operation on sets |
| unary `-` | unary minus |
| `*`, `div`, `idiv`, `mod` | multiply, divide, integer divide, modulo |
| `+`, `-` | plus, minus |
| `to` | range expression |
| `=`, `!=`, `is`, `isnot`, `<`, `<=`, `>`, `>=`, `eq`, `ne`, `lt`, `le`, `gt`, `ge` | comparisons |
| `instance of`, `castable as` | type tests |
| `if` | conditional expressions |
| `some`, `every` | quantified expressions |
| `for` | iteration (mapping) over a sequence |
| `and` | Boolean and |
| `or` | Boolean or |
| `,` (comma) | Sequence concatenation |

*The latest (November 2002) drafts of XPath 2.0 and XSLT 2.0 allow a, b, c as a top-level
expression. This is not yet implemented in Saxon 7.4. Saxon allows the comma operator only within
parentheses.*

The various operators are described, in this order, in the sections that follow.

The notation `E[P]` is used to select items from the sequence obtained by evaluating
`E`. If the predicate `P` is numeric, the predicate selects an item if its
position (counting from 1) is equal to `P`; otherwise, the *effective boolean value*
of `P` determines whether an item is selected or not. The effective boolean value of a sequence
is false if the sequence is empty, or if it contains a single item that is one of: the boolean value
false, the zero-length string, or a numeric zero or NaN value. Otherwise, the effective boolean
value is true.
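The effective boolean value rule just described can be written down as a small Java method. This is a sketch under assumptions: a sequence is modelled as a `java.util.List`, items are modelled as `Boolean`, `String`, or `Number` objects (anything else stands in for a node), and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative model of the effective boolean value rule; not Saxon code.
class EbvSketch {

    static boolean ebv(List<?> sequence) {
        if (sequence.isEmpty()) {
            return false;                      // empty sequence
        }
        if (sequence.size() > 1) {
            return true;                       // the exceptions apply to single items only
        }
        Object item = sequence.get(0);
        if (item instanceof Boolean) {
            return (Boolean) item;             // boolean false is false
        }
        if (item instanceof String) {
            return !((String) item).isEmpty(); // zero-length string is false
        }
        if (item instanceof Number) {
            double d = ((Number) item).doubleValue();
            return !(d == 0.0 || Double.isNaN(d)); // zero and NaN are false
        }
        return true;                           // any other single item, e.g. a node
    }

    public static void main(String[] args) {
        System.out.println(ebv(Collections.emptyList()));   // empty sequence
        System.out.println(ebv(Arrays.asList("")));         // zero-length string
        System.out.println(ebv(Arrays.asList(Double.NaN))); // NaN
        System.out.println(ebv(Arrays.asList("London")));   // non-empty string
        System.out.println(ebv(Arrays.asList(0, 0)));       // two items
    }
}
```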

In XPath 2.0, `E` may be any sequence; it is not restricted to a node sequence. Within
the predicate, the expression `.` (dot) refers to the context item, that is, the item
currently being tested. The XPath 1.0 concept of context node has thus been generalized: for example,
`.` can refer to a string or a number.

Generally the order of items in the result preserves the order of items in `E`. As a
special case, however, if `E` is a step using a reverse axis (e.g. `preceding-sibling`), the
position of nodes for the purpose of evaluating the predicate is in reverse document order, but the
result of the filter expression is in forwards document order.
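The reverse-axis rule is easiest to see with a numeric predicate. The sketch below again uses the JDK's `javax.xml.xpath` engine (XPath 1.0, where this rule already applied) rather than Saxon; the class name, helper method, and three-element sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Illustrative sketch using the JDK's XPath 1.0 engine, not Saxon's API.
class ReverseAxisSketch {

    // Three sibling elements in document order: a, b, c.
    static final String XML = "<r><a/><b/><c/></r>";

    // Evaluate the expression against the sample document and return its string value.
    static String eval(String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Within the step, positions count backwards from c: [1] is b, [2] is a.
        System.out.println(eval("name(/r/c/preceding-sibling::*[1])"));
        System.out.println(eval("name(/r/c/preceding-sibling::*[2])"));
        // Once the step is parenthesized, the predicate applies to its result,
        // which is in forwards document order, so [1] is a.
        System.out.println(eval("name((/r/c/preceding-sibling::*)[1])"));
    }
}
```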

A path expression is a sequence of steps separated by the `/` or `//` operator.
For example, `../@desc` selects the `desc` attribute of the parent of the context
node.

In XPath 2.0, path expressions have been generalized so that any expression can be used as an operand
of `/` (both on the left and the right), so long as its value is a sequence of nodes. For
example, it is possible to use a union expression (in parentheses) or a call to the `id()`
or `key()` functions. The right-hand operand is evaluated once for each node in the sequence
that results from evaluating the left-hand operand, with that node as the context item. In the result
of the path expression, nodes are sorted in document order, and duplicates are eliminated.

In practice, it only makes sense to use expressions on the right of `/` if they depend
on the context item. It is legal to write `$x/$y` provided both `$x` and
`$y` are sequences of nodes, but the result is exactly the same as writing `./$y`.

Note that the expressions `./$X` or `$X/.` can be used to remove duplicates
from `$X` and sort the results into document order. The same effect can be achieved by writing
`$X|()`.

The operator `//` is an abbreviation for `/descendant-or-self::node()/`.
An expression of the form `/E` is shorthand for `root(.)/E`, and the expression
`/` on its own is shorthand for `root(.)`.

The expression `cast as T (E)` converts the value of expression `E` to type
`T`. Since `T` must currently be a built-in schema-defined simple type, the
effect is exactly the same as using the constructor function `T (E)`.

*Saxon implements most of the conversions defined in the XPath 2.0 specifications, for the data
types that it supports, but the details of how the conversions are performed may vary. The
specification is still evolving in this area.*

The expression `treat as T (E)` is designed for environments that perform static type
checking. Saxon doesn't do static type checking, so this expression has very little use, except to
document an assertion that the expression `E` is of a particular type. A run-time failure
will be reported if the value of `E` is not of type `T`; no attempt is made
to convert the value to this type.

These operators are new in XPath 2.0.

The expression `E1 except E2` selects all nodes that are in `E1` unless
they are also in `E2`. Both expressions must return sequences of nodes. The results
are returned in document order. For example, `@* except @note` returns all attributes
except the `note` attribute.

The expression `E1 intersect E2` selects all nodes that are in both `E1` and
`E2`. Both expressions must return sequences of nodes. The results
are returned in document order. For example, `preceding::fig intersect ancestor::chapter//fig`
returns all preceding `fig` elements within the current chapter.

The `|` operator was available in XPath 1.0; the keyword `union` has been
added in XPath 2.0 as a synonym, because it is familiar to SQL users.

The expression `E1 union E2` selects all nodes that are in either `E1` or
`E2` or both. Both expressions must return sequences of nodes. The results
are returned in document order. For example, `/book/(chapter | appendix)/section` returns
all `section` elements within a `chapter` or `appendix` of the
selected `book` element.
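The behaviour of `except`, `intersect`, and `union` (duplicates removed, results in document order) can be modelled in Java with sorted sets. This is only a model under a stated assumption: each node is represented by an integer giving its position in document order. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustrative model: a node is represented by its document-order position.
class NodeSetOpsSketch {

    // E1 union E2: nodes in either operand, de-duplicated and sorted.
    static List<Integer> union(List<Integer> e1, List<Integer> e2) {
        TreeSet<Integer> result = new TreeSet<>(e1);
        result.addAll(e2);
        return new ArrayList<>(result);
    }

    // E1 intersect E2: nodes present in both operands.
    static List<Integer> intersect(List<Integer> e1, List<Integer> e2) {
        TreeSet<Integer> result = new TreeSet<>(e1);
        result.retainAll(e2);
        return new ArrayList<>(result);
    }

    // E1 except E2: nodes in E1 that are not also in E2.
    static List<Integer> except(List<Integer> e1, List<Integer> e2) {
        TreeSet<Integer> result = new TreeSet<>(e1);
        result.removeAll(e2);
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        List<Integer> e1 = Arrays.asList(3, 1, 3);  // unsorted, with a duplicate
        List<Integer> e2 = Arrays.asList(2, 3);
        System.out.println("union:     " + union(e1, e2));
        System.out.println("intersect: " + intersect(e1, e2));
        System.out.println("except:    " + except(e1, e2));
    }
}
```

Whatever order the operands arrive in, each result comes back sorted and free of duplicates, which mirrors the "document order" guarantee in the text.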

The unary minus operator changes the sign of a number. For example `-1` is minus one, and
`-0e0` is the double value negative zero.

Multiplication and division

The operator `*` multiplies two numbers. If the operands are of different types, one
of them is promoted to the type of the other (for example, an integer is promoted to a decimal, a
decimal to a double). The result is the same type as the operands after promotion.

The operator `div` divides two numbers. Dividing two integers produces a double; in other
cases the result is the same type as the operands, after promotion. In the case of decimal division,
the precision is the sum of the precisions of the two operands, plus six.

The operator `idiv` performs integer division. For example, the result of
`10 idiv 3` is `3`.

The `mod` operator returns the modulus (or remainder) after division. See the XPath 2.0
specification for details of the way that negative numbers are handled.

The operators `*` and `div` may also be used to multiply or divide
a duration by a number. For example, `fn:dayTimeDuration('PT12H') * 4` returns the duration
two days.
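For readers more at home in Java, the examples above have close Java analogues. The sketch below is an analogy, not Saxon's implementation: the class and method names are hypothetical, `java.time.Duration` stands in for the XPath duration type, and Java's own rules for negative operands of `/` and `%` are not claimed to match XPath's (the text defers to the specification for those).

```java
import java.time.Duration;

// Illustrative Java analogues of the XPath operators; not Saxon code.
class ArithmeticSketch {

    // "10 idiv 3": integer division discards the fractional part.
    static long idiv(long a, long b) {
        return a / b;
    }

    // "10 mod 3": remainder after division (non-negative operands only here).
    static long mod(long a, long b) {
        return a % b;
    }

    // "10 div 4": per the text, dividing two integers produces a double.
    static double div(long a, long b) {
        return (double) a / (double) b;
    }

    // Duration * number, as in the PT12H * 4 example.
    static Duration times(Duration d, long n) {
        return d.multipliedBy(n);
    }

    public static void main(String[] args) {
        System.out.println("10 idiv 3 = " + idiv(10, 3));
        System.out.println("10 mod 3  = " + mod(10, 3));
        System.out.println("10 div 4  = " + div(10, 4));
        System.out.println("PT12H * 4 = " + times(Duration.ofHours(12), 4));
    }
}
```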

The operators `+` and `-` perform addition and subtraction of numbers,
in the usual way. If the operands are of different types, one of them is promoted, and the result
is the same type as the operands after promotion. For example, adding two integers produces
an integer; adding an integer to a double produces a double.

Note that the `-` operator may need to be preceded by a space to prevent it being
parsed as part of the preceding name.

*XPath 2.0 also allows these operators to be used for adding durations to dates and times, but this
is not yet implemented in Saxon. However, Saxon 7.4 does allow durations to be added to (or subtracted from)
durations.*

The expression `E1 to E2` returns a sequence of integers. For example, `1 to 5`
returns the sequence `1, 2, 3, 4, 5`. This is useful in `for` expressions: for example,
the first five nodes of a node sequence can be processed by writing
`for $i in 1 to 5 return (//x)[$i]`.
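In Java terms, the range expression behaves like a closed integer range, and the `for $i in 1 to 5 return (//x)[$i]` idiom like a loop that picks items by 1-based position. The following sketch models that with lists; the class and method names are hypothetical, and a plain `List` stands in for the node sequence `//x`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Illustrative model of "E1 to E2" and of selecting the first five items.
class RangeSketch {

    // "E1 to E2": the integers from e1 to e2 inclusive (empty if e1 > e2).
    static List<Integer> to(int e1, int e2) {
        return IntStream.rangeClosed(e1, e2).boxed().collect(Collectors.toList());
    }

    // "for $i in 1 to 5 return (//x)[$i]", with x standing in for //x.
    static <T> List<T> firstFive(List<T> x) {
        List<T> result = new ArrayList<>();
        for (int i : to(1, 5)) {
            // A positional predicate beyond the end of the sequence selects nothing.
            if (i <= x.size()) {
                result.add(x.get(i - 1)); // XPath positions count from 1
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(to(1, 5));
        System.out.println(firstFive(Arrays.asList("a", "b", "c", "d", "e", "f", "g")));
        System.out.println(firstFive(Arrays.asList("a", "b", "c")));
    }
}
```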

The simplest comparison operators are `eq`, `ne`, `lt`, `le`, `gt`, and `ge`.
These compare two atomic values of the same type,
for example two integers, two dates, or two strings. In the case of strings, the default collation
is used (see saxon:collation). If the operands are
not atomic values, an error is raised.

The operators `=`, `!=`, `<`, `<=`, `>`, and `>=` can compare arbitrary sequences.
The result is true
if any pair of items from the two sequences has the specified relationship, for example
`$A = $B` is true if there is an item in `$A` that is equal to
some item in `$B`. If an argument is a node, Saxon currently uses its string
value in the comparison, not its typed value as required by the XPath 2.0 specification.
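The "any pair of items" rule can be modelled in Java as a nested loop over two lists. This is a sketch under assumptions: sequences are modelled as `java.util.List`, items are compared with `Object.equals`, and the class and method names are hypothetical. One consequence worth noticing is that `$A = $B` and `$A != $B` can both be true for the same operands.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative model of the existential comparison rule; not Saxon code.
class GeneralComparisonSketch {

    // "$A = $B": true if some item of a equals some item of b.
    static boolean generalEquals(List<?> a, List<?> b) {
        for (Object x : a) {
            for (Object y : b) {
                if (x.equals(y)) {
                    return true;
                }
            }
        }
        return false;
    }

    // "$A != $B": true if some item of a differs from some item of b.
    static boolean generalNotEquals(List<?> a, List<?> b) {
        for (Object x : a) {
            for (Object y : b) {
                if (!x.equals(y)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> cities = Arrays.asList("London", "Paris", "Tokyo");
        System.out.println(generalEquals(Arrays.asList("Paris"), cities));
        System.out.println(generalEquals(Arrays.asList("Rome"), cities));
        System.out.println(generalEquals(Collections.emptyList(), cities));
        // Both comparisons hold for (1, 2) and (2, 3).
        System.out.println(generalEquals(Arrays.asList(1, 2), Arrays.asList(2, 3)));
        System.out.println(generalNotEquals(Arrays.asList(1, 2), Arrays.asList(2, 3)));
    }
}
```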

*Saxon 7.4 implements the stricter rules of XPath 2.0 for type-checking the operands
of a comparison. Comparing a string to an integer is now an error: one of the values must
be explicitly cast to the type of the other. This is true even in backwards compatibility mode.
However, if one of the values is an untyped node, its value will be converted to the type of
the other operand; if both values are untyped, they will be compared as strings.*

The operators `is` and `isnot` test whether the operands represent the same
(identical) node. For example, `title[1] is *[@note][1]` is true if the first `title`
child is the first child element that has a `@note` attribute. If either operand is an
empty sequence the result is an empty sequence (which will usually be treated as false).

The operators `<<` and `>>` test whether one node precedes
or follows another in document order.

The expression `E instance of T` tests whether the value of expression `E`
is an instance of type `T`, or of a subtype of `T`. For example, `$p instance of attribute+` is
true if the value of `$p` is a sequence of one or more attribute nodes. It returns false if the
sequence is empty or if it contains an item that is not an attribute node. The detailed rules for
defining types, and for matching values against a type, are given in the XPath 2.0 specification.

Saxon 7.3 implements only a subset of this syntax. It allows testing of a value against any built-in
simple type defined in XML Schema, except that some of the types are not yet implemented: see
conformance.html. The type can also be a *node-kind* such as
`element`, `attribute`, etc; or it can be one of the keywords `item`
or `node`. The type can be optionally followed by the occurrence indicator `*`,
`+`, or `?`.

Saxon also allows testing of the type annotation of an element or attribute node using tests of the
form `element of type T`, `attribute of type T`. This is of limited value at this
release, however, since the only way a node can acquire a type annotation is (a) if the node is
part of a temporary tree created within the stylesheet itself, or (b) if the node is an attribute with
a DTD-based type, for example ID.

The expression `E castable as T` tests whether the expression `cast as T (E)`
would succeed. It is useful, for example, for testing whether a string contains a valid date before attempting
to cast it to a date. This is because XPath and XSLT currently provide no way of trapping the error if
the cast is attempted and fails.
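The "test before you cast" pattern looks like this in Java. It is an analogy only: `java.time.LocalDate` stands in for `xs:date` (it does not accept the optional timezone suffix that `xs:date` allows), Java can trap the failure internally where XPath cannot, and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Illustrative analogue of "E castable as xs:date"; not Saxon code.
class CastableSketch {

    // Returns true if the string would convert to a date without error.
    static boolean castableAsDate(String value) {
        try {
            LocalDate.parse(value);  // expects the ISO form yyyy-MM-dd
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // Analogue of: if ($s castable as xs:date) then xs:date($s) else ()
    static LocalDate dateOrNull(String value) {
        return castableAsDate(value) ? LocalDate.parse(value) : null;
    }

    public static void main(String[] args) {
        System.out.println(castableAsDate("2002-04-30"));
        System.out.println(castableAsDate("2002-02-30")); // no such day
        System.out.println(castableAsDate("30/04/2002")); // wrong lexical form
        System.out.println(dateOrNull("2002-04-30"));
    }
}
```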

XPath 2.0 allows a conditional expression of the form `if ( E1 ) then E2 else E3`.
For example, `if (@discount) then @discount else 0` returns the value of the `discount`
attribute if it is present, or zero otherwise.

The expression `some $x in E1 satisfies E2` returns true if there is an item in the
sequence `E1` for which the *effective boolean value* of `E2` is true.
Note that `E2` must use the range variable `$x` to refer to the item being
tested; it does not become the context item. For example, `some $x in @* satisfies $x eq ""`
is true if the context item is an element that has at least one zero-length attribute value.

Similarly, the expression `every $x in E1 satisfies E2` returns true if every item in the
sequence given by `E1` satisfies the condition.
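Java programmers may recognise these two expressions as `anyMatch` and `allMatch` on a stream. The sketch below models them that way; the class and method names are hypothetical, the sequence `E1` is modelled as a `List`, and `E2` as a `Predicate` whose parameter plays the part of the range variable `$x`. The behaviour on an empty sequence (`some` is false, `every` is true) follows the XPath 2.0 specification rather than anything stated in this guide.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Illustrative model of quantified expressions; not Saxon code.
class QuantifiedSketch {

    // "some $x in E1 satisfies E2"
    static <T> boolean some(List<T> e1, Predicate<T> e2) {
        return e1.stream().anyMatch(e2);
    }

    // "every $x in E1 satisfies E2"
    static <T> boolean every(List<T> e1, Predicate<T> e2) {
        return e1.stream().allMatch(e2);
    }

    public static void main(String[] args) {
        // Attribute values of a hypothetical element, one of them zero-length.
        List<String> attributeValues = Arrays.asList("x1", "", "draft");
        System.out.println(some(attributeValues, x -> x.isEmpty()));
        System.out.println(every(attributeValues, x -> x.isEmpty()));
        // On an empty sequence, "some" is false and "every" is true.
        List<String> none = Collections.emptyList();
        System.out.println(some(none, x -> x.isEmpty()));
        System.out.println(every(none, x -> x.isEmpty()));
    }
}
```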

The expression `for $x in E1 return E2` returns the sequence that results from evaluating
`E2` once for every item in the sequence `E1`. Note that `E2` must
use the range variable `$x` to refer to the item being
processed; it does not become the context item. For example:

```
sum(for $v in order-item return
$v/price * $v/quantity)
```

returns the total value of (price times quantity) for all the
selected `order-item` elements.
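The order-item example maps naturally onto a Java loop: one result per item, then a sum. The sketch below is a model only. The `OrderItem` class, its fields, and the sample figures are hypothetical, and `BigDecimal` is used for prices because the guide notes that Saxon implements decimal arithmetic with that class.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;

// Illustrative model of sum(for $v in order-item return $v/price * $v/quantity).
class ForExpressionSketch {

    // Hypothetical stand-in for an order-item element.
    static final class OrderItem {
        final BigDecimal price;
        final int quantity;

        OrderItem(String price, int quantity) {
            this.price = new BigDecimal(price);
            this.quantity = quantity;
        }
    }

    static BigDecimal total(List<OrderItem> orderItems) {
        BigDecimal sum = BigDecimal.ZERO;
        for (OrderItem v : orderItems) {  // for $v in order-item
            // return $v/price * $v/quantity, accumulated by sum()
            sum = sum.add(v.price.multiply(BigDecimal.valueOf(v.quantity)));
        }
        return sum;
    }

    public static void main(String[] args) {
        List<OrderItem> items = Arrays.asList(
            new OrderItem("2.50", 2),
            new OrderItem("1.25", 4));
        System.out.println("Total: " + total(items));
    }
}
```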

The expression `E1 and E2` returns true if the *effective boolean values* of
`E1` and `E2` are both true.

The expression `E1 or E2` returns true if the *effective boolean values* of
either or both of `E1` and `E2` are true.

The expression `E1 , E2` returns the sequence obtained by concatenating the sequences
`E1` and `E2`.

For example, `$x = ("London", "Paris", "Tokyo")` returns true if the value of `$x`
is one of the strings listed.

*Saxon 7.4 does not allow this operator to appear at the top level: the comma operator may only
appear inside a parenthesized expression.*

Michael H. Kay

14 February 2003