Source: XQuery 1.0: An XML Query Language

An extract from XQuery 1.0: An XML Query Language

Copyright © Dave Pawson 2009, Revision 1.2. Date 2009-01-25T08:58:03Z.



Introduction

Definition: data model. XQuery operates on the abstract, logical structure of an XML document, rather than its surface syntax. This logical structure, known as the data model, is defined in .

Definition: implementation defined. Implementation-defined indicates an aspect that may differ between implementations, but must be specified by the implementor for each particular implementation.

Definition: implementation dependent. Implementation-dependent indicates an aspect that may differ between implementations, is not specified by this or any W3C specification, and is not required to be specified by the implementor for any particular implementation.


Basics

Definition: value, sequence, item, atomic value, node. In the data model, a value is always a sequence. A sequence is an ordered collection of zero or more items. An item is either an atomic value or a node. An atomic value is a value in the value space of an atomic type, as defined in . A node is an instance of one of the node kinds defined in . Each node has a unique node identity, a typed value, and a string value. In addition, some nodes have a name. The typed value of a node is a sequence of zero or more atomic values. The string value of a node is a value of type xs:string. The name of a node is a value of type xs:QName.

Definition: singleton, empty sequence. A sequence containing exactly one item is called a singleton. An item is identical to a singleton sequence containing that item. Sequences are never nested: for example, combining the values 1, (2, 3), and ( ) into a single sequence results in the sequence (1, 2, 3). A sequence containing zero items is called an empty sequence.
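The flattening rule can be modelled outside XQuery. The following is a minimal Python sketch, not XQuery itself; the helper name sequence is an assumption made for illustration, and Python lists and tuples stand in for XQuery sequences.

```python
def sequence(*operands):
    """Concatenate operands into one flat list, so sequences never nest."""
    result = []
    for operand in operands:
        if isinstance(operand, (list, tuple)):
            # A nested sequence contributes its items, not itself.
            result.extend(sequence(*operand))
        else:
            # A single item behaves like a singleton sequence.
            result.append(operand)
    return result


print(sequence(1, (2, 3), ()))  # [1, 2, 3]
```

The empty tuple contributes nothing, which mirrors how an empty sequence disappears when combined with other values.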

Definition: XDM instance. The term XDM instance is used, synonymously with the term value, to denote an unconstrained sequence of nodes and/or atomic values in the data model.

Definition: QName, expanded QName. Names in XQuery are called QNames, and conform to the syntax in . Lexically, a QName consists of an optional namespace prefix and a local name. If the namespace prefix is present, it is separated from the local name by a colon. A lexical QName can be converted into an expanded QName by resolving its namespace prefix to a namespace URI, using the statically known namespaces. An expanded QName consists of an optional namespace URI and a local name. An expanded QName also retains its original namespace prefix (if any), to facilitate casting the expanded QName into a string. The namespace URI value is whitespace normalized according to the rules for the xs:anyURI type in . Two expanded QNames are equal if their namespace URIs are equal and their local names are equal (even if their namespace prefixes are not equal). Namespace URIs and local names are compared on a codepoint basis, without further normalization.
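As a rough illustration of the equality rule, the Python sketch below models an expanded QName whose prefix is kept but ignored in comparisons. The names ExpandedQName and expand, and the example namespace URI, are hypothetical. Treating an unprefixed name as being in no namespace is a simplification; in XQuery the applicable default namespace depends on where the name appears.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ExpandedQName:
    namespace_uri: Optional[str]
    local_name: str
    # Retained for casting back to a string, but excluded from equality.
    prefix: Optional[str] = field(default=None, compare=False)


def expand(lexical, namespaces):
    """Resolve a lexical QName against a prefix-to-URI mapping."""
    if ":" in lexical:
        prefix, local = lexical.split(":", 1)
        # An unbound prefix raises KeyError in this sketch.
        return ExpandedQName(namespaces[prefix], local, prefix)
    # Simplification: an unprefixed name is placed in no namespace.
    return ExpandedQName(None, lexical)


a = expand("a:item", {"a": "http://example.com/ns"})
b = expand("b:item", {"b": "http://example.com/ns"})
print(a == b)  # True
```

The two names compare equal because only the namespace URI and local name take part in the comparison, while each value still remembers its own prefix.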

Definition: in-scope namespaces. Element nodes have a property called in-scope namespaces. The in-scope namespaces property of an element node is a set of namespace bindings, each of which associates a namespace prefix with a URI, thus defining the set of namespace prefixes that are available for interpreting QNames within the scope of the element. For a given element, one namespace binding may have an empty prefix; the URI of this namespace binding is the default namespace within the scope of the element.

Definition: URI. Within this specification, the term URI refers to a Universal Resource Identifier as defined in and extended in with the new name IRI. The term URI has been retained in preference to IRI to avoid introducing new names for concepts such as "Base URI" that are defined or referenced across the whole family of XML specifications.

Definition: expression context. The expression context for a given expression consists of all the information that can affect the result of the expression. This information is organized into two categories called the static context and the dynamic context.

Definition: static context. The static context of an expression is the information that is available during static analysis of the expression, prior to its evaluation. This information can be used to decide whether the expression contains a static error. If analysis of an expression relies on some component of the static context that has not been assigned a value, a static error is raised.

Definition: XPath 1.0 compatibility mode. XPath 1.0 compatibility mode.

Definition: statically known namespaces. Statically known namespaces. This is a set of (prefix, URI) pairs that define all the namespaces that are known during static processing of a given expression. The URI value is whitespace normalized according to the rules for the xs:anyURI type in . Note the difference between in-scope namespaces, which is a dynamic property of an element node, and statically known namespaces, which is a static property of an expression.

Definition: default element/type namespace. Default element/type namespace. This is a namespace URI or "none". The namespace URI, if present, is used for any unprefixed QName appearing in a position where an element or type name is expected. The URI value is whitespace normalized according to the rules for the xs:anyURI type in .

Definition: default function namespace. Default function namespace. This is a namespace URI or "none". The namespace URI, if present, is used for any unprefixed QName appearing in a position where a function name is expected. The URI value is whitespace normalized according to the rules for the xs:anyURI type in .

Definition: in-scope schema definitions. In-scope schema definitions. This is a generic term for all the element declarations, attribute declarations, and schema type definitions that are in scope during processing of an expression. It includes the following three parts:

Definition: in-scope schema type. In-scope schema types. Each schema type definition is identified either by an expanded QName (for a named type) or by an implementation-dependent type identifier (for an anonymous type). The in-scope schema types include the predefined schema types described in .

Definition: in-scope element declarations. In-scope element declarations. Each element declaration is identified either by an expanded QName (for a top-level element declaration) or by an implementation-dependent element identifier (for a local element declaration). An element declaration includes information about the element's substitution group affiliation.

Definition: substitution group. Substitution groups are defined in Part 1, Section 2.2.2.2. Informally, the substitution group headed by a given element (called the head element) consists of the set of elements that can be substituted for the head element without affecting the outcome of schema validation.

Definition: in-scope attribute declarations. In-scope attribute declarations. Each attribute declaration is identified either by an expanded QName (for a top-level attribute declaration) or by an implementation-dependent attribute identifier (for a local attribute declaration).

Definition: in-scope variables. In-scope variables. This is a set of (expanded QName, type) pairs. It defines the set of variables that are available for reference within an expression. The expanded QName is the name of the variable, and the type is the static type of the variable.

Definition: context item static type. Context item static type. This component defines the static type of the context item within the scope of a given expression.

Definition: function signature. Function signatures. This component defines the set of functions that are available to be called from within an expression. Each function is uniquely identified by its expanded QName and its arity (number of parameters). In addition to the name and arity, each function signature specifies the static types of the function parameters and result.
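One way to picture this component is as a table keyed by name and arity. The Python sketch below is an illustration under that assumption; the names signatures and declare are hypothetical, and the two fn:substring entries are used only as a familiar example of one name with two arities.

```python
signatures = {}


def declare(namespace_uri, local_name, param_types, result_type):
    """Register a signature under (expanded QName, arity)."""
    key = (namespace_uri, local_name, len(param_types))
    if key in signatures:
        raise ValueError("a function with this name and arity already exists")
    signatures[key] = (tuple(param_types), result_type)


FN = "http://www.w3.org/2005/xpath-functions"
# Same name, different arity: two distinct functions.
declare(FN, "substring", ["xs:string?", "xs:double"], "xs:string")
declare(FN, "substring", ["xs:string?", "xs:double", "xs:double"], "xs:string")

print(len(signatures))  # 2
```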

Definition: statically known collations, collation. Statically known collations. This is an implementation-defined set of (URI, collation) pairs. It defines the names of the collations that are available for use in processing expressions. A collation is a specification of the manner in which strings and URIs are compared and, by extension, ordered. For a more complete definition of collation, see .

Definition: default collation. Default collation. This identifies one of the collations in statically known collations as the collation to be used by functions and operators for comparing and ordering values of type xs:string and xs:anyURI (and types derived from them) when no explicit collation is specified.

Definition: construction mode. Construction mode. The construction mode governs the behavior of element and document node constructors. If construction mode is preserve, the type of a constructed element node is xs:anyType, and all attribute and element nodes copied during node construction retain their original types. If construction mode is strip, the type of a constructed element node is xs:untyped; all element nodes copied during node construction receive the type xs:untyped, and all attribute nodes copied during node construction receive the type xs:untypedAtomic.

Definition: ordering mode. Ordering mode. Ordering mode, which has the value ordered or unordered, affects the ordering of the result sequence returned by certain path expressions, union, intersect, and except expressions, and FLWOR expressions that have no order by clause. Details are provided in the descriptions of these expressions.

Definition: default order for empty sequences. Default order for empty sequences. This component controls the processing of empty sequences and NaN values as ordering keys in an order by clause in a FLWOR expression, as described in . Its value may be greatest or least.
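The effect of the two settings on an ascending sort can be sketched in Python. This is a model only: None stands in for the empty sequence, the helper name order_key is hypothetical, and the placement of NaN next to the empty sequence reflects my reading of the order by rules rather than anything stated in this extract.

```python
import math


def order_key(value, empty="least"):
    """Sort key for ascending order; None models the empty sequence."""
    sign = -1 if empty == "least" else 1
    if value is None:
        # The empty sequence sorts outside every other value.
        return (2 * sign, 0.0)
    if isinstance(value, float) and math.isnan(value):
        # NaN sorts between the empty sequence and ordinary values.
        return (1 * sign, 0.0)
    return (0, value)


keys = [3.0, None, float("nan"), 1.0]
least = sorted(keys, key=lambda v: order_key(v, "least"))
greatest = sorted(keys, key=lambda v: order_key(v, "greatest"))
print(least)     # [None, nan, 1.0, 3.0]
print(greatest)  # [1.0, 3.0, nan, None]
```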

Definition: boundary-space policy. Boundary-space policy. This component controls the processing of boundary whitespace by direct element constructors, as described in . Its value may be preserve or strip.

Definition: copy-namespaces mode. Copy-namespaces mode. This component controls the namespace bindings that are assigned when an existing element node is copied by an element constructor, as described in . Its value consists of two parts: preserve or no-preserve, and inherit or no-inherit.

Definition: base URI. Base URI. This is an absolute URI, used when necessary in the resolution of relative URIs (for example, by the fn:resolve-uri function.) The URI value is whitespace normalized according to the rules for the xs:anyURI type in .
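Relative URI resolution against a base URI works the same way in most environments. As an analogy only, and not a model of fn:resolve-uri itself, Python's standard urljoin shows the idea; the base URI below is a made-up example.

```python
from urllib.parse import urljoin

base_uri = "http://example.com/catalog/books.xml"  # hypothetical base URI

print(urljoin(base_uri, "authors.xml"))   # http://example.com/catalog/authors.xml
print(urljoin(base_uri, "../index.xml"))  # http://example.com/index.xml
```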

Definition: statically known documents. Statically known documents. This is a mapping from strings onto types. The string represents the absolute URI of a resource that is potentially available using the fn:doc function. The type is the static type of a call to fn:doc with the given URI as its literal argument. If the argument to fn:doc is a string literal that is not present in statically known documents, then the static type of fn:doc is document-node()?.

Definition: statically known collections. Statically known collections. This is a mapping from strings onto types. The string represents the absolute URI of a resource that is potentially available using the fn:collection function. The type is the type of the sequence of nodes that would result from calling the fn:collection function with this URI as its argument. If the argument to fn:collection is a string literal that is not present in statically known collections, then the static type of fn:collection is node()*.

Definition: statically known default collection type. Statically known default collection type. This is the type of the sequence of nodes that would result from calling the fn:collection function with no arguments. Unless initialized to some other value by an implementation, the value of statically known default collection type is node()*.

Definition: dynamic context. The dynamic context of an expression is defined as information that is available at the time the expression is evaluated. If evaluation of an expression relies on some part of the dynamic context that has not been assigned a value, a dynamic error is raised.

Definition: focus. The first three components of the dynamic context (context item, context position, and context size) are called the focus of the expression. The focus enables the processor to keep track of which items are being processed by the expression.

Definition: context item, context node. The context item is the item currently being processed. An item is either an atomic value or a node. When the context item is a node, it can also be referred to as the context node. The context item is returned by an expression consisting of a single dot (.). When an expression E1/E2 or E1[E2] is evaluated, each item in the sequence obtained by evaluating E1 becomes the context item in the inner focus for an evaluation of E2.

Definition: context position. The context position is the position of the context item within the sequence of items currently being processed. It changes whenever the context item changes. When the focus is defined, the value of the context position is an integer greater than zero. The context position is returned by the expression fn:position(). When an expression E1/E2 or E1[E2] is evaluated, the context position in the inner focus for an evaluation of E2 is the position of the context item in the sequence obtained by evaluating E1. The position of the first item in a sequence is always 1 (one). The context position is always less than or equal to the context size.

Definition: context size. The context size is the number of items in the sequence of items currently being processed. Its value is always an integer greater than zero. The context size is returned by the expression fn:last(). When an expression E1/E2 or E1[E2] is evaluated, the context size in the inner focus for an evaluation of E2 is the number of items in the sequence obtained by evaluating E1.
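The three focus components can be pictured as a triple that is recomputed for every item of E1. The Python sketch below models E1[E2] for a boolean-valued E2 only; the names inner_focus and filter_by are hypothetical.

```python
def inner_focus(items):
    """Yield (context item, context position, context size) for each item."""
    size = len(items)
    for position, item in enumerate(items, start=1):
        # Positions start at 1 and never exceed the context size.
        yield item, position, size


def filter_by(items, predicate):
    """Model E1[E2]: keep the items for which E2 holds in the inner focus."""
    return [item for item, pos, size in inner_focus(items)
            if predicate(item, pos, size)]


letters = ["a", "b", "c", "d"]
# Comparable to $letters[position() = last()]
last_item = filter_by(letters, lambda item, pos, size: pos == size)
print(last_item)  # ['d']
```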

Definition: variable values. Variable values. This is a set of (expanded QName, value) pairs. It contains the same expanded QNames as the in-scope variables in the static context for the expression. The expanded QName is the name of the variable and the value is the dynamic value of the variable, which includes its dynamic type.

Definition: function implementation. Function implementations. Each function in function signatures has a function implementation that enables the function to map instances of its parameter types into an instance of its result type.

Definition: current dateTime. Current dateTime. This information represents an implementation-dependent point in time during the processing of a query, and includes an explicit timezone. It can be retrieved by the fn:current-dateTime function. If invoked multiple times during the execution of a query, this function always returns the same result.

Definition: implicit timezone. Implicit timezone. This is the timezone to be used when a date, time, or dateTime value that does not have a timezone is used in a comparison or arithmetic operation. The implicit timezone is an implementation-defined value of type xs:dayTimeDuration. See for the range of legal values of a timezone.
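A small Python sketch can show how an implicit timezone makes a timezone-less value comparable with one that carries a timezone. The helper name with_implicit_timezone and the offset of minus five hours are assumptions for illustration; a real implementation chooses its own value.

```python
from datetime import datetime, timedelta, timezone


def with_implicit_timezone(value, implicit_offset):
    """Attach the implicit timezone only when the value has none."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone(implicit_offset))
    return value


implicit = timedelta(hours=-5)  # hypothetical implementation-defined value
a = with_implicit_timezone(datetime(2009, 1, 25, 8, 0), implicit)
b = datetime(2009, 1, 25, 13, 0, tzinfo=timezone.utc)
print(a == b)  # True
```

Both values denote the same instant once the first is interpreted at the implicit offset, so the comparison succeeds.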

Definition: available documents. Available documents. This is a mapping of strings onto document nodes. The string represents the absolute URI of a resource. The document node is the root of a tree that represents that resource using the data model. The document node is returned by the fn:doc function when applied to that URI. The set of available documents is not limited to the set of statically known documents, and it may be empty.

Definition: available collections. Available collections. This is a mapping of strings onto sequences of nodes. The string represents the absolute URI of a resource. The sequence of nodes represents the result of the fn:collection function when that URI is supplied as the argument. The set of available collections is not limited to the set of statically known collections, and it may be empty.

Definition: default collection. Default collection. This is the sequence of nodes that would result from calling the fn:collection function with no arguments. The value of default collection may be initialized by the implementation.

Definition: type annotation. Each element node and attribute node in an XDM instance has a type annotation (referred to in as its type-name property.) The type annotation of a node is a schema type that describes the relationship between the string value of the node and its typed value. If the XDM instance was derived from a validated XML document as described in , the type annotations of the element and attribute nodes are derived from schema validation. XQuery does not provide a way to directly access the type annotation of an element or attribute node.

Definition: static analysis phase. The static analysis phase depends on the expression itself and on the static context. The static analysis phase does not depend on input data (other than schemas).

Definition: static type. Each expression is then assigned a static type (step SQ6). The static type of an expression is a type such that, when the expression is evaluated, the resulting value will always conform to the static type. If the Static Typing Feature is supported, the static types of various expressions are inferred according to the rules described in . If the Static Typing Feature is not supported, the static types that are assigned are implementation-dependent.

Definition: dynamic evaluation phase. The dynamic evaluation phase is the phase during which the value of an expression is computed. It occurs after completion of the static analysis phase.

Definition: dynamic type. A dynamic type is associated with each value as it is computed. The dynamic type of a value may be more specific than the static type of the expression that computed it (for example, the static type of an expression might be xs:integer*, denoting a sequence of zero or more integers, but at evaluation time its value may have the dynamic type xs:integer, denoting exactly one integer.)

Definition: serialization. Serialization is the process of converting an XDM instance into a sequence of octets (step DM4 in Figure 1.) The general framework for serialization is described in .

Definition: data model schema. Some of the consistency constraints use the term data model schema. For a given node in an XDM instance, the data model schema is defined as the schema from which the type annotation of that node was derived. For a node that was constructed by some process other than schema validation, the data model schema consists simply of the schema type definition that is represented by the type annotation of the node.

Definition: static error. A static error is an error that must be detected during the static analysis phase. A syntax error is an example of a static error.

Definition: dynamic error. A dynamic error is an error that must be detected during the dynamic evaluation phase and may be detected during the static analysis phase. Numeric overflow is an example of a dynamic error.

Definition: type error. A type error may be raised during the static analysis phase or the dynamic evaluation phase. During the static analysis phase, a type error occurs when the static type of an expression does not match the expected type of the context in which the expression occurs. During the dynamic evaluation phase, a type error occurs when the dynamic type of a value does not match the expected type of the context in which the value occurs.

Definition: warning. In addition to static errors, dynamic errors, and type errors, an XQuery implementation may raise warnings, either during the static analysis phase or the dynamic evaluation phase. The circumstances in which warnings are raised, and the ways in which warnings are handled, are implementation-defined.

Definition: error value. In addition to its identifying QName, a dynamic error may also carry a descriptive string and one or more additional values called error values. An implementation may provide a mechanism whereby an application-defined error handler can process error values and produce diagnostic messages.

Definition: reverse document order. An ordering called document order is defined among all the nodes accessible during processing of a given query, which may consist of one or more trees (documents or fragments). Document order is defined in , and its definition is repeated here for convenience. The node ordering that is the reverse of document order is called reverse document order.

Definition: document order, stable. Document order is a total ordering, although the relative order of some nodes is implementation-dependent. Informally, document order is the order in which nodes appear in the XML serialization of a document. Document order is stable, which means that the relative order of two nodes will not change during the processing of a given query, even if this order is implementation-dependent.
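For element nodes within a single tree, document order matches a depth-first, parent-before-child walk. The Python sketch below uses the standard ElementTree module to show this; it covers elements only, since ElementTree does not represent attributes or text as separate nodes.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<a><b><c/></b><d/></a>")

# iter() walks the tree in the order the start tags appear in the serialization.
document_order = [element.tag for element in doc.iter()]
print(document_order)                  # ['a', 'b', 'c', 'd']
print(list(reversed(document_order)))  # ['d', 'c', 'b', 'a']
```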

Definition: atomization. The semantics of some XQuery operators depend on a process called atomization. Atomization is applied to a value when the value is used in a context in which a sequence of atomic values is required. The result of atomization is either a sequence of atomic values or a type error [err:FOTY0012]. Atomization of a sequence is defined as the result of invoking the fn:data function on the sequence, as defined in .

Definition: effective boolean value. Under certain circumstances (listed below), it is necessary to find the effective boolean value of a value. The effective boolean value of a value is defined as the result of applying the fn:boolean function to the value, as defined in .
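The rules applied by fn:boolean are not reproduced in this extract. The Python sketch below models them as I understand them from the functions and operators specification, so treat it as an approximation: Python lists stand in for sequences, ElementTree elements for nodes, and str, int, float and bool for the corresponding atomic types.

```python
import math
import xml.etree.ElementTree as ET


def effective_boolean_value(seq):
    """Approximate fn:boolean over a Python list used as a sequence."""
    if len(seq) == 0:
        return False                      # empty sequence
    first = seq[0]
    if isinstance(first, ET.Element):
        return True                       # first item is a node
    if len(seq) == 1:
        if isinstance(first, bool):       # checked before int: bool is an int
            return first
        if isinstance(first, str):
            return len(first) > 0         # zero-length string is false
        if isinstance(first, (int, float)):
            is_nan = isinstance(first, float) and math.isnan(first)
            return not (first == 0 or is_nan)
    # Anything else has no effective boolean value (err:FORG0006 in XQuery).
    raise TypeError("no effective boolean value")


print(effective_boolean_value([]))         # False
print(effective_boolean_value(["false"]))  # True
```

The second call returns True because a non-empty string is true regardless of its content.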

Definition: sequence type. A sequence type is a type that can be expressed using the SequenceType syntax. Sequence types are used whenever it is necessary to refer to a type in an XQuery expression. The term sequence type suggests that this syntax is used to describe the type of an XQuery value, which is always a sequence.

Definition: schema type. A schema type is a type that is (or could be) defined using the facilities of (including the built-in types of ). A schema type can be used as a type annotation on an element or attribute node (unless it is a non-instantiable type such as xs:NOTATION or xs:anyAtomicType, in which case its derived types can be so used). Every schema type is either a complex type or a simple type; simple types are further subdivided into list types, union types, and atomic types (see for definitions and explanations of these terms.)

Definition: typed value, string value. Every node has a typed value and a string value. The typed value of a node is a sequence of atomic values and can be extracted by applying the fn:data function to the node. The string value of a node is a string and can be extracted by applying the fn:string function to the node. Definitions of fn:data and fn:string can be found in .
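For an element node, the string value is the concatenation of its descendant text in document order. The following Python sketch approximates that with ElementTree; the helper name string_value and the sample element are hypothetical, and typed values are not modelled because they depend on type annotations.

```python
import xml.etree.ElementTree as ET


def string_value(element):
    """Join all descendant text of an element in document order."""
    return "".join(element.itertext())


price = ET.fromstring("<price>12<unit>.50</unit></price>")
print(string_value(price))  # 12.50
```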

Definition: SequenceType matching. During evaluation of an expression, it is sometimes necessary to determine whether a value with a known dynamic type "matches" an expected sequence type. This process is known as SequenceType matching. For example, an instance of expression returns true if the dynamic type of a given value matches a given sequence type, or false if it does not.

Definition: subtype substitution. The use of a value whose dynamic type is derived from an expected type is known as subtype substitution. Subtype substitution does not change the actual type of a value. For example, if an xs:integer value is used where an xs:decimal value is expected, the value retains its type as xs:integer.


Expressions

Definition: primary expression. Primary expressions are the basic primitives of the language. They include literals, variable references, context item expressions, and function calls. A primary expression may also be created by enclosing any expression in parentheses, which is sometimes helpful in controlling the precedence of operators.

Definition: literal. A literal is a direct syntactic representation of an atomic value. XQuery supports two kinds of literals: numeric literals and string literals.

Definition: predefined entity reference. A string literal may contain a predefined entity reference. A predefined entity reference is a short sequence of characters, beginning with an ampersand, that represents a single character that might otherwise have syntactic significance. Each predefined entity reference is replaced by the character it represents when the string literal is processed. The predefined entity references recognized by XQuery are as follows: &lt; (<), &gt; (>), &amp; (&), &quot; ("), and &apos; (').

Definition: character reference. A string literal may also contain a character reference. A character reference is an XML-style reference to a character, identified by its decimal or hexadecimal code point. For example, the Euro symbol (€) can be represented by the character reference &#8364;. Character references are normatively defined in Section 4.1 of the XML specification (it is implementation-defined whether the rules in or apply.) A static error is raised if a character reference does not identify a valid character in the version of XML that is in use.
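How a processor might replace predefined entity references and character references inside a string literal can be sketched in Python. This is an illustration only: the names PREDEFINED and expand_references are hypothetical, and the sketch does not check that a code point is a valid XML character.

```python
import re

PREDEFINED = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "apos": "'"}
_REFERENCE = re.compile(r"&(?:#x([0-9A-Fa-f]+)|#([0-9]+)|(lt|gt|amp|quot|apos));")


def expand_references(literal):
    """Replace each reference once, in a single left-to-right pass."""
    def replace(match):
        hex_digits, dec_digits, name = match.groups()
        if hex_digits:
            return chr(int(hex_digits, 16))   # hexadecimal code point
        if dec_digits:
            return chr(int(dec_digits))       # decimal code point
        return PREDEFINED[name]
    return _REFERENCE.sub(replace, literal)


print(expand_references("Ben &amp; Jerry&apos;s"))  # Ben & Jerry's
print(expand_references("&#8364; and &#x20AC;"))    # € and €
```

A single pass matters: the text produced by a replacement is not scanned again, so &amp;#8364; yields the literal characters &#8364; rather than a Euro symbol.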

Definition: variable reference. A variable reference is a QName preceded by a $-sign. Two variable references are equivalent if their local names are the same and their namespace prefixes are bound to the same namespace URI in the statically known namespaces. An unprefixed variable reference is in no namespace.

Definition: built-in function. The built-in functions supported by XQuery are defined in .

Definition: path expression. A path expression can be used to locate nodes within trees. A path expression consists of a series of one or more steps, separated by "/" or "//", and optionally beginning with "/" or "//". An initial "/" or "//" is an abbreviation for one or more initial steps that are implicitly added to the beginning of the path expression, as described below.

Definition: step. A step is a part of a path expression that generates a sequence of items and then filters the sequence by zero or more predicates. The value of the step consists of those items that satisfy the predicates, working from left to right. A step may be either an axis step or a filter expression. Filter expressions are described in .

Definition: axis step. An axis step returns a sequence of nodes that are reachable from the context node via a specified axis. Such a step has two parts: an axis, which defines the "direction of movement" for the step, and a node test, which selects nodes based on their kind, name, and/or type annotation. If the context item is a node, an axis step returns a sequence of zero or more nodes; otherwise, a type error is raised. An axis step may be either a forward step or a reverse step, followed by zero or more predicates.

Definition: principal node kind. Every axis has a principal node kind. If an axis can contain elements, then the principal node kind is element; otherwise, it is the kind of nodes that the axis can contain. Thus, for the attribute axis the principal node kind is attribute, and for every other axis it is element.

Definition: node test. A node test is a condition that must be true for each node selected by a step. The condition may be based on the kind of the node (element, attribute, text, document, comment, or processing instruction), the name of the node, or (in the case of element, attribute, and document nodes), the type annotation of the node.

Definition: name test. A node test that consists only of a QName or a Wildcard is called a name test. A name test is true if and only if the kind of the node is the principal node kind for the step axis and the expanded QName of the node is equal (as defined by the eq operator) to the expanded QName specified by the name test. For example, child::para selects the para element children of the context node; if the context node has no para children, it selects an empty set of nodes. attribute::abc:href selects the attribute of the context node with the QName abc:href; if the context node has no such attribute, it selects an empty set of nodes.
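The two name tests in the example have close counterparts in Python's ElementTree, which supports a small subset of path syntax. The sketch below is an analogy rather than XQuery; the sample document and the namespace URI bound to abc are made up.

```python
import xml.etree.ElementTree as ET

section = ET.fromstring(
    '<section xmlns:abc="http://example.com/abc" abc:href="intro.html">'
    "<para>one</para><note/><para>two</para></section>"
)

# Comparable to child::para: only element children named para are selected.
paras = section.findall("para")
print([p.text for p in paras])    # ['one', 'two']

# No title children, so the result is empty rather than an error.
print(section.findall("title"))   # []

# Comparable to attribute::abc:href: the name is matched by URI and local name.
href = section.get("{http://example.com/abc}href")
print(href)                       # intro.html
```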

Definition: kind test. An alternative form of a node test called a kind test can select nodes based on their kind, name, and type annotation. The syntax and semantics of a kind test are described in and . When a kind test is used in a node test, only those nodes on the designated axis that match the kind test are selected. Examples of kind tests that might be used in path expressions: node() matches any node; text() matches any text node; comment() matches any comment node; element() matches any element node; attribute() matches any attribute node; document-node() matches any document node.

Definition: predicate. A predicate consists of an expression, called a predicate expression, enclosed in square brackets. A predicate serves to filter a sequence, retaining some items and discarding others. In the case of multiple adjacent predicates, the predicates are applied from left to right, and the result of applying each predicate serves as the input sequence for the following predicate.
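Because each predicate filters the output of the one before it, positions are renumbered between predicates and the order of predicates matters. A minimal Python model, with the hypothetical helper apply_predicates and boolean predicates only:

```python
def apply_predicates(items, predicates):
    """Apply predicates left to right; each sees the previous result."""
    for predicate in predicates:
        size = len(items)
        # Positions are renumbered from 1 for every predicate.
        items = [item for pos, item in enumerate(items, start=1)
                 if predicate(item, pos, size)]
    return items


def is_even(item, pos, size):
    return item % 2 == 0


def first_two(item, pos, size):
    return pos <= 2


numbers = list(range(1, 11))
# Comparable to (1 to 10)[. mod 2 = 0][position() <= 2]
print(apply_predicates(numbers, [is_even, first_two]))  # [2, 4]
# Comparable to (1 to 10)[position() <= 2][. mod 2 = 0]
print(apply_predicates(numbers, [first_two, is_even]))  # [2]
```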

Definition: comma operator. One way to construct a sequence is by using the comma operator, which evaluates each of its operands and concatenates the resulting sequences, in order, into a single result sequence. Empty parentheses can be used to denote an empty sequence.

Definition: filter expression. A filter expression consists simply of a primary expression followed by zero or more predicates. The result of the filter expression consists of the items returned by the primary expression, filtered by applying each predicate in turn, working from left to right. If no predicates are specified, the result is simply the result of the primary expression. The ordering of the items returned by a filter expression is the same as their order in the result of the primary expression. Context positions are assigned to items based on their ordinal position in the result sequence. The first context position is 1.

Definition: direct element constructor. An element constructor creates an element node. A direct element constructor is a form of element constructor in which the name of the constructed element is a constant. Direct element constructors are based on standard XML notation. For example, the following expression is a direct element constructor that creates a book element containing an attribute and some nested elements:

Definition: namespace declaration attribute. A namespace declaration attribute is used inside a direct element constructor. Its purpose is to bind a namespace prefix or to set the default element/type namespace for the constructed element node, including its attributes. Syntactically, a namespace declaration attribute has the form of an attribute with namespace prefix xmlns, or with name xmlns and no namespace prefix. The value of a namespace declaration attribute must be a URILiteral; otherwise a static error is raised. All the namespace declaration attributes of a given element must have distinct names. Each namespace declaration attribute is processed as follows:

Definition: boundary whitespace. Boundary whitespace is a sequence of consecutive whitespace characters within the content of a direct element constructor, that is delimited at each end either by the start or end of the content, or by a DirectConstructor, or by an EnclosedExpr. For this purpose, characters generated by character references such as &#x20; or by CdataSections are not considered to be whitespace characters.

Definition: name expression. For those kinds of nodes that have names (element, attribute, and processing instruction nodes), the keyword that specifies the node kind is followed by the name of the node to be created. This name may be specified either as a QName or as an expression enclosed in braces. When an expression is used to specify the name of a constructed node, that expression is called the name expression of the constructor.

Definition: content expression. The final part of a computed constructor is an expression enclosed in braces, called the content expression of the constructor, that generates the content of the node.

Definition: computed element constructor. A computed element constructor creates an element node, allowing both the name and the content of the node to be computed.

Definition: binding sequence. The simplest example of a for clause contains one variable and an associated expression. The value of the expression associated with a variable in a for clause is called the binding sequence for that variable. The for clause iterates over the items in the binding sequence, binding the variable to each item in turn. If ordering mode is ordered, the resulting sequence of variable bindings is ordered according to the order of values in the binding sequence; otherwise the ordering of the variable bindings is implementation-dependent.

Definition: effective case. Each case clause specifies a SequenceType followed by a return expression. The effective case in a typeswitch expression is the first case clause such that the value of the operand expression matches the SequenceType in the case clause, using the rules of SequenceType matching. The value of the typeswitch expression is the value of the return expression in the effective case. If the value of the operand expression does not match any SequenceType named in a case clause, the value of the typeswitch expression is the value of the return expression in the default clause.
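The first-match rule for the effective case can be illustrated with a small Python model. It is a sketch under simplifying assumptions: SequenceType matching is replaced by an arbitrary matcher function on a single Python value, and the names typeswitch and cases are invented for the example.

```python
from typing import Any, Callable, Sequence, Tuple

# Assumed model: a case pairs a matcher (standing in for SequenceType
# matching) with a function that plays the role of the return expression.
Case = Tuple[Callable[[Any], bool], Callable[[Any], Any]]


def typeswitch(operand: Any,
               cases: Sequence[Case],
               default: Callable[[Any], Any]) -> Any:
    for matches, return_expr in cases:
        if matches(operand):
            # The first matching case clause is the effective case.
            return return_expr(operand)
    # No case matched, so the default clause supplies the value.
    return default(operand)


cases = [
    (lambda v: isinstance(v, (int, float)), lambda v: "number"),
    (lambda v: isinstance(v, int), lambda v: "integer"),
    (lambda v: isinstance(v, str), lambda v: "string"),
]
print(typeswitch(5, cases, lambda v: "other"))  # number
```

The value 5 would also match the second case, but the earlier clause wins because only the first matching clause is the effective case.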

Definition: constructor function. The constructor function for a given type is used to convert instances of other atomic types into the given type. The semantics of the constructor function call T($arg) are defined to be equivalent to the expression (($arg) cast as T?).

Definition: extension expression. An extension expression is an expression whose semantics are implementation-defined. Typically a particular extension will be recognized by some implementations and not by others. The syntax is designed so that extension expressions can be successfully parsed by all implementations, and so that fallback behavior can be defined for implementations that do not recognize a particular extension.

Definition: pragma. An extension expression consists of one or more pragmas, followed by an expression enclosed in curly braces. A pragma is denoted by the delimiters (# and #), and consists of an identifying QName followed by implementation-defined content. The content of a pragma may consist of any string of characters that does not contain the ending delimiter #). The QName of a pragma must resolve to a namespace URI and local name, using the statically known namespaces.
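As an illustration of the pragma syntax, the following Python sketch splits a pragma into its QName and content. It assumes a simplified, ASCII-oriented approximation of NCName and does not resolve the prefix against any namespaces; parse_pragma and the sample pragma text are hypothetical.

```python
import re
from typing import Tuple

# Simplified approximation of an NCName (an assumption, not the XML rule).
NCNAME = r"[A-Za-z_][\w.\-]*"

PRAGMA = re.compile(
    r"\(#"                                    # opening delimiter
    r"[ \t\r\n]*"                             # optional whitespace
    r"(?P<qname>(?:%s:)?%s)"                  # QName, with optional prefix
    r"(?:[ \t\r\n]+(?P<content>(?:(?!#\)).)*))?"  # content without '#)'
    r"#\)"                                    # closing delimiter
    % (NCNAME, NCNAME),
    re.DOTALL,
)


def parse_pragma(text: str) -> Tuple[str, str]:
    """Return (qname, content) for a single pragma, or raise ValueError."""
    match = PRAGMA.fullmatch(text)
    if match is None:
        raise ValueError("not a well-formed pragma: %r" % text)
    return match.group("qname"), match.group("content") or ""


print(parse_pragma("(# ex:hint index=books #)"))
```

The content is returned exactly as written, including any trailing whitespace before the closing delimiter, since its interpretation is implementation-defined.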


Modules and Prologs

Definition: module. A query can be assembled from one or more fragments called modules. A module is a fragment of XQuery code that conforms to the Module grammar and can independently undergo the static analysis phase. Each module is either a main module or a library module.

Definition: main module. A main module consists of a Prolog followed by a Query Body. A query has exactly one main module. In a main module, the Query Body can be evaluated, and its value is the result of the query.

Definition: library module. A module that does not contain a Query Body is called a library module. A library module consists of a module declaration followed by a Prolog. A library module cannot be evaluated directly; instead, it provides function and variable declarations that can be imported into other modules.

Definition: Prolog. A Prolog is a series of declarations and imports that define the processing environment for the module that contains the Prolog. Each declaration or import is followed by a semicolon. A Prolog is organized into two parts.

Definition: setter target namespace. The first part of the Prolog consists of setters, imports, namespace declarations, and default namespace declarations. Setters are declarations that set the value of some property that affects query processing, such as construction mode, ordering mode, or default collation. Namespace declarations and default namespace declarations affect the interpretation of QNames within the query. Imports are used to import definitions from schemas and modules. Each imported schema or module is identified by its target namespace, which is the namespace of the objects (such as elements or functions) that are defined by the schema or module.

Definition: query body. The Query Body, if present, consists of an expression that defines the result of the query. A module can be evaluated only if it has a Query Body.

Definition: version declaration. Any module may contain a version declaration. If present, the version declaration occurs at the beginning of the module and identifies the applicable XQuery syntax and semantics for the module. The version number "1.0" indicates a requirement that the module must be processed by an implementation that supports XQuery Version 1.0. If the version declaration is not present, the version is presumed to be "1.0". An XQuery implementation must raise a static error when processing a module labeled with a version that the implementation does not support. It is the intent of the XQuery working group to give later versions of this specification numbers other than "1.0", but this intent does not indicate a commitment to produce any future versions of XQuery, nor if any are produced, to use any particular numbering scheme.

Definition: encoding declaration. If present, a version declaration may optionally include an encoding declaration. The value of the string literal following the keyword encoding is an encoding name, and must conform to the definition of EncName specified in [XML 1.0]. The purpose of an encoding declaration is to allow the writer of a query to provide a string that indicates how the query is encoded, such as "UTF-8", "UTF-16", or "US-ASCII". Since the encoding of a query may change as the query moves from one environment to another, there can be no guarantee that the encoding declaration is correct.

Definition: module declaration. A module declaration serves to identify a module as a library module. A module declaration begins with the keyword module and contains a namespace prefix and a URILiteral. The URILiteral must be of nonzero length. The URILiteral identifies the target namespace of the library module, which is the namespace for all variables and functions exported by the library module. The name of every variable and function declared in a library module must have a namespace URI that is the same as the target namespace of the module; otherwise a static error is raised. In the statically known namespaces of the library module, the namespace prefix specified in the module declaration is bound to the module's target namespace.

Definition: boundary-space declaration. A boundary-space declaration sets the boundary-space policy in the static context, overriding any implementation-defined default. Boundary-space policy controls whether boundary whitespace is preserved by element constructors during processing of the query. If boundary-space policy is preserve, boundary whitespace is preserved. If boundary-space policy is strip, boundary whitespace is stripped (deleted).

Definition: default collation declaration. A default collation declaration sets the value of the default collation in the static context, overriding any implementation-defined default. The default collation is the collation that is used by functions and operators that require a collation if no other collation is specified. For example, the gt operator on strings is defined by a call to the fn:compare function, which takes an optional collation parameter. Since the gt operator does not specify a collation, the fn:compare function implements gt by using the default collation.

Definition: base URI declaration. A base URI declaration specifies the base URI property of the static context. The base URI property is used when resolving relative URIs within a module. For example, the fn:doc function resolves a relative URI using the base URI of the calling module.

Definition: construction declaration. A construction declaration sets the construction mode in the static context, overriding any implementation-defined default. The construction mode governs the behavior of element and document node constructors. If construction mode is preserve, the type of a constructed element node is xs:anyType, and all attribute and element nodes copied during node construction retain their original types. If construction mode is strip, the type of a constructed element node is xs:untyped; all element nodes copied during node construction receive the type xs:untyped, and all attribute nodes copied during node construction receive the type xs:untypedAtomic.

Definition: ordering mode declaration. An ordering mode declaration sets the ordering mode in the static context, overriding any implementation-defined default. This ordering mode applies to all expressions in a module (including both the Prolog and the Query Body, if any), unless overridden by an ordered or unordered expression.

Definition: empty order declaration. An empty order declaration sets the default order for empty sequences in the static context, overriding any implementation-defined default. This declaration controls the processing of empty sequences and NaN values as ordering keys in an order by clause in a FLWOR expression. An individual order by clause may override the default order for empty sequences by specifying empty greatest or empty least.
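The effect of the two settings on an ascending sort can be sketched in Python. This is an illustrative model only: None stands in for an empty-sequence ordering key, and the placement of NaN between the empty sequence and all other values reflects my reading of the order by rules rather than anything stated in this extract. The function name sort_keys is invented.

```python
import math
from typing import List, Optional


def sort_keys(keys: List[Optional[float]],
              empty: str = "least") -> List[Optional[float]]:
    """Sort ascending; None stands in for an empty-sequence ordering key."""
    if empty not in ("least", "greatest"):
        raise ValueError("empty must be 'least' or 'greatest'")

    def rank(key: Optional[float]):
        if key is None:
            group = 0      # empty sequence
        elif math.isnan(key):
            group = 1      # NaN sits next to the empty sequence (assumption)
        else:
            group = 2      # every other value
        if empty == "greatest":
            group = 2 - group  # mirror the groups: other < NaN < empty
        value = key if key is not None and not math.isnan(key) else 0.0
        return (group, value)

    return sorted(keys, key=rank)


print(sort_keys([3.0, None, float("nan"), 1.0], empty="least"))
print(sort_keys([3.0, None, float("nan"), 1.0], empty="greatest"))
```

An individual order by clause that specifies empty greatest or empty least corresponds to passing the empty argument explicitly instead of relying on the default.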

Definition: copy-namespaces declaration. A copy-namespaces declaration sets the value of copy-namespaces mode in the static context, overriding any implementation-defined default. Copy-namespaces mode controls the namespace bindings that are assigned when an existing element node is copied by an element constructor or document constructor.

Definition: schema import. A schema import imports the element declarations, attribute declarations, and type definitions from a schema into the in-scope schema definitions. The schema to be imported is identified by its target namespace. The schema import may bind a namespace prefix to the target namespace of the imported schema, or may declare that target namespace to be the default element/type namespace. The schema import may also provide optional hints for locating the schema.

Definition: module import. A module import imports the function declarations and variable declarations from one or more library modules into the function signatures and in-scope variables of the importing module. Each module import names a target namespace and imports an implementation-defined set of modules that share this target namespace. The module import may bind a namespace prefix to the target namespace, and it may provide optional hints for locating the modules to be imported.

Definition: module directly depends. A module M1 directly depends on another module M2 (different from M1) if a variable or function declared in M1 depends on a variable or function declared in M2. It is a static error to import a module M1 if there exists a sequence of modules M1 ... Mi ... M1 such that each module directly depends on the next module in the sequence (informally, if M1 depends on itself through some chain of module dependencies).
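The circularity condition amounts to asking whether a module can reach itself in the graph of direct dependencies. A minimal Python sketch, assuming the graph is already available as a dictionary from module name to the set of modules it directly depends on (the function name and module names are hypothetical):

```python
from typing import Dict, Set


def depends_on_itself(module: str,
                      directly_depends: Dict[str, Set[str]]) -> bool:
    """True if `module` reaches itself through a chain of direct dependencies."""
    seen: Set[str] = set()
    stack = list(directly_depends.get(module, ()))
    while stack:
        current = stack.pop()
        if current == module:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(directly_depends.get(current, ()))
    return False


graph = {
    "M1": {"M2"},
    "M2": {"M3"},
    "M3": {"M1"},   # closes the cycle M1 -> M2 -> M3 -> M1
    "M4": {"M1"},   # uses the cycle but is not part of it
}
print(depends_on_itself("M1", graph))  # True
print(depends_on_itself("M4", graph))  # False
```

Importing M1 would be the static error in this model, while M4 merely depends on modules that form a cycle.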

Definition: namespace declaration. A namespace declaration declares a namespace prefix and associates it with a namespace URI, adding the (prefix, URI) pair to the set of statically known namespaces. The namespace declaration is in scope throughout the query in which it is declared, unless it is overridden by a namespace declaration attribute in a direct element constructor.

Definition: initializing expression. If a variable declaration includes an expression, the expression is called an initializing expression. The initializing expression for a given variable must be evaluated before the evaluation of any expression that references the variable. The static context for an initializing expression includes all functions that are declared or imported anywhere in the Prolog, but it includes only those variables and namespaces that are declared or imported earlier in the Prolog than the variable that is being initialized.

Definition: variable depends. A variable $x depends on a variable $y or a function f2 if a reference to $y or f2 appears in the initializing expression of $x, or if there exists a variable $z or a function f3 such that $x depends on $z or f3 and $z or f3 depends on $y or f2.

Definition: function depends. A function f1 depends on a variable $y or a function f2 if a reference to $y or f2 appears in the body of f1, or if there exists a variable $z or a function f3 such that f1 depends on $z or f3 and $z or f3 depends on $y or f2.
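Both "depends" definitions describe the transitive closure of direct references. As a sketch, assuming a hypothetical table that maps each variable or function name to the names referenced in its initializing expression or body:

```python
from typing import Dict, Set


def all_dependencies(name: str, references: Dict[str, Set[str]]) -> Set[str]:
    """Collect everything `name` depends on, directly or through a chain."""
    result: Set[str] = set()
    pending = list(references.get(name, ()))
    while pending:
        current = pending.pop()
        if current not in result:
            result.add(current)
            # Anything that `current` references is an indirect dependency.
            pending.extend(references.get(current, ()))
    return result


# Hypothetical declarations: $x is initialized by calling local:f,
# whose body refers to $z, whose initializing expression refers to $y.
references = {
    "$x": {"local:f"},
    "local:f": {"$z"},
    "$z": {"$y"},
}
print(sorted(all_dependencies("$x", references)))
```

Here $x depends on $y even though no reference to $y appears in the initializing expression of $x.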

Definition: user-defined function. A function declaration specifies whether a function is user-defined or external. For a user-defined function, the function declaration includes an expression called the function body that defines how the result of the function is computed from its parameters. The static context for a function body includes all functions that are declared or imported anywhere in the Prolog, but it includes only those variables and namespaces that are declared or imported earlier in the Prolog than the function that is being defined.

Definition: external function. External functions are functions that are implemented outside the query environment. For example, an XQuery implementation might provide a set of external functions in addition to the core function library described in [XQuery 1.0 and XPath 2.0 Functions and Operators]. External functions are identified by the keyword external. The purpose of a function declaration for an external function is to declare the datatypes of the function parameters and result, for use in type checking of the query that contains or imports the function declaration.

Definition: option declaration. An option declaration declares an option that affects the behavior of a particular implementation. Each option consists of an identifying QName and a StringLiteral.


Conformance

Definition: must may should. This section defines the conformance criteria for an XQuery processor. In this section, the following terms are used to indicate the requirement levels defined in [RFC 2119]. MUST means that the item is an absolute requirement of the specification. MAY means that an item is truly optional. SHOULD means that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

Definition: schema import feature. The Schema Import Feature permits the query Prolog to contain a schema import.

Definition: schema validation feature. The Schema Validation Feature permits a query to contain a validate expression.

Definition: static typing feature. The Static Typing Feature provides support for the static semantics defined in [XQuery 1.0 and XPath 2.0 Formal Semantics], and requires implementations to detect and report type errors during the static analysis phase.

Definition: static typing extension. A conforming implementation that implements the Static Typing Feature MAY also provide one or more static typing extensions. A static typing extension is an implementation-defined type inference rule that infers a more precise static type than that inferred by the type inference rules in [XQuery 1.0 and XPath 2.0 Formal Semantics]. See that specification for a formal definition of the constraints on static typing extensions.

Definition: optional axis. The following axes are designated as optional axes: ancestor, ancestor-or-self, following, following-sibling, preceding, and preceding-sibling.

Definition: Full Axis Feature. A conforming XQuery implementation that supports the Full Axis Feature MUST support all the optional axes.
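A small Python sketch of how a processor might relate the optional axes to the Full Axis Feature. The function name and the idea of collecting the axes used by a query into a set are assumptions for illustration; for simplicity the model treats a processor without the feature as supporting none of the optional axes, and how a real implementation reports an unsupported axis is not modelled.

```python
from typing import Iterable, Set

# The axes designated as optional in the definition above.
OPTIONAL_AXES = frozenset({
    "ancestor", "ancestor-or-self", "following",
    "following-sibling", "preceding", "preceding-sibling",
})


def axes_requiring_full_axis_feature(axes_used: Iterable[str],
                                     full_axis_feature: bool) -> Set[str]:
    """Return the axes in use that this (modelled) processor does not support."""
    if full_axis_feature:
        # With the feature, every optional axis must be supported.
        return set()
    return set(axes_used) & OPTIONAL_AXES


print(sorted(axes_requiring_full_axis_feature(
    ["child", "ancestor", "descendant", "preceding-sibling"],
    full_axis_feature=False)))
```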

Definition: module feature. A conforming XQuery implementation that supports the Module Feature allows a query Prolog to contain a Module Import and allows library modules to be created.

Definition: serialization feature. A conforming XQuery implementation that supports the Serialization Feature MUST provide means for serializing the result of a query.


XQuery Grammar

[74] AbbrevForwardStep = "@"? NodeTest

[77] AbbrevReverseStep = ".."

[50] AdditiveExpr = MultiplicativeExpr ( ("+" | "-") MultiplicativeExpr )*

[47] AndExpr = ComparisonExpr ( "and" ComparisonExpr )*

[124] AnyKindTest = "node" "(" ")"

[100] AposAttrValueContent = AposAttrContentChar | CommonContent

[122] AtomicType = QName

[130] AttribNameOrWildcard = AttributeName | "*"

[132] AttributeDeclaration = AttributeName

[137] AttributeName = QName

[129] AttributeTest = "attribute" "(" (AttribNameOrWildcard ("," TypeName )?)? ")"

[71] AxisStep = (ReverseStep | ForwardStep ) PredicateList

[20] BaseURIDecl = "declare" "base-uri" URILiteral

[11] BoundarySpaceDecl = "declare" "boundary-space" ("preserve" | "strip")

[107] CDataSection = "<![CDATA[" CDataSectionContents "]]>"

[108] CDataSectionContents = (Char * - (Char* ']]>' Char*))

[44] CaseClause = "case" ("$" VarName "as")? SequenceType "return" ExprSingle

[57] CastExpr = UnaryExpr ( "cast" "as" SingleType )?

[56] CastableExpr = CastExpr ( "castable" "as" SingleType )?

[127] CommentTest = "comment" "(" ")"

[102] CommonContent = PredefinedEntityRef | CharRef | "{{" | "}}" | EnclosedExpr

[113] CompAttrConstructor = "attribute" (QName | ("{" Expr "}")) "{" Expr ? "}"

[115] CompCommentConstructor = "comment" "{" Expr "}"

[110] CompDocConstructor = "document" "{" Expr "}"

[111] CompElemConstructor = "element" (QName | ("{" Expr "}")) "{" ContentExpr ? "}"

[116] CompPIConstructor = "processing-instruction" (NCName | ("{" Expr "}")) "{" Expr ? "}"

[114] CompTextConstructor = "text" "{" Expr "}"

[48] ComparisonExpr = RangeExpr ( (ValueComp | GeneralComp | NodeComp ) RangeExpr )?

[109] ComputedConstructor = CompDocConstructor | CompElemConstructor | CompAttrConstructor | CompTextConstructor | CompCommentConstructor | CompPIConstructor

[25] ConstructionDecl = "declare" "construction" ("strip" | "preserve")

[94] Constructor = DirectConstructor | ComputedConstructor

[112] ContentExpr = Expr

[90] ContextItemExpr = "."

[16] CopyNamespacesDecl = "declare" "copy-namespaces" PreserveMode "," InheritMode

[19] DefaultCollationDecl = "declare" "default" "collation" URILiteral

[12] DefaultNamespaceDecl = "declare" "default" ("element" | "function") "namespace" URILiteral

[97] DirAttributeList = (S (QName S ? "=" S ? DirAttributeValue )?)*

[98] DirAttributeValue = ('"' (EscapeQuot | QuotAttrValueContent )* '"')| ("'" (EscapeApos | AposAttrValueContent )* "'")

[103] DirCommentConstructor = "<!--" DirCommentContents "-->"

[104] DirCommentContents = ((Char - '-') | ('-' (Char - '-')))*

[96] DirElemConstructor = "<" QName DirAttributeList ("/>" | (">" DirElemContent * "</" QName S ? ">"))

[101] DirElemContent = DirectConstructor | CDataSection | CommonContent | ElementContentChar

[105] DirPIConstructor = "<?" PITarget (S DirPIContents )? "?>"

[106] DirPIContents = (Char * - (Char* '?>' Char*))

[95] DirectConstructor = DirElemConstructor | DirCommentConstructor | DirPIConstructor

[125] DocumentTest = "document-node" "(" (ElementTest | SchemaElementTest )? ")"

[136] ElementDeclaration = ElementName

[138] ElementName = QName

[134] ElementNameOrWildcard = ElementName | "*"

[133] ElementTest = "element" "(" (ElementNameOrWildcard ("," TypeName "?"?)?)? ")"

[15] EmptyOrderDecl = "declare" "default" "order" "empty" ("greatest" | "least")

[29] EnclosedExpr = "{" Expr "}"

[31] Expr = ExprSingle ("," ExprSingle )*

[32] ExprSingle = FLWORExpr | QuantifiedExpr | TypeswitchExpr | IfExpr | OrExpr

[65] ExtensionExpr = Pragma + "{" Expr ? "}"

[33] FLWORExpr = (ForClause | LetClause )+ WhereClause ? OrderByClause ? "return" ExprSingle

[81] FilterExpr = PrimaryExpr PredicateList

[34] ForClause = "for" "$" VarName TypeDeclaration ? PositionalVar ? "in" ExprSingle ("," "$" VarName TypeDeclaration ? PositionalVar ? "in" ExprSingle )*

[73] ForwardAxis = ("child" "::")| ("descendant" "::")| ("attribute" "::")| ("self" "::")| ("descendant-or-self" "::")| ("following-sibling" "::")| ("following" "::")

[72] ForwardStep = (ForwardAxis NodeTest ) | AbbrevForwardStep

[93] FunctionCall = QName "(" (ExprSingle ("," ExprSingle )*)? ")"

[26] FunctionDecl = "declare" "function" QName "(" ParamList ? ")" ("as" SequenceType )? (EnclosedExpr | "external")

[60] GeneralComp = "=" | "!=" | "<" | "<=" | ">" | ">="

[45] IfExpr = "if" "(" Expr ")" "then" ExprSingle "else" ExprSingle

[8] Import = SchemaImport | ModuleImport

[18] InheritMode = "inherit" | "no-inherit"

[54] InstanceofExpr = TreatExpr ( "instance" "of" SequenceType )?

[53] IntersectExceptExpr = InstanceofExpr ( ("intersect" | "except") InstanceofExpr )*

[121] ItemType = KindTest | ("item" "(" ")") | AtomicType

[123] KindTest = DocumentTest | ElementTest | AttributeTest | SchemaElementTest | SchemaAttributeTest | PITest | CommentTest | TextTest | AnyKindTest

[36] LetClause = "let" "$" VarName TypeDeclaration ? ":=" ExprSingle ("," "$" VarName TypeDeclaration ? ":=" ExprSingle )*

[4] LibraryModule = ModuleDecl Prolog

[85] Literal = NumericLiteral | StringLiteral

[3] MainModule = Prolog QueryBody

[1] Module = VersionDecl ? (LibraryModule | MainModule )

[5] ModuleDecl = "module" "namespace" NCName "=" URILiteral Separator

[23] ModuleImport = "import" "module" ("namespace" NCName "=")? URILiteral ("at" URILiteral ("," URILiteral )*)?

[51] MultiplicativeExpr = UnionExpr ( ("*" | "div" | "idiv" | "mod") UnionExpr )*

[79] NameTest = QName | Wildcard

[10] NamespaceDecl = "declare" "namespace" NCName "=" URILiteral

[62] NodeComp = "is" | "<<" | ">>"

[78] NodeTest = KindTest | NameTest

[86] NumericLiteral = IntegerLiteral | DecimalLiteral | DoubleLiteral

[120] OccurrenceIndicator = "?" | "*" | "+"

[13] OptionDecl = "declare" "option" QName StringLiteral

[46] OrExpr = AndExpr ( "or" AndExpr )*

[38] OrderByClause = (("order" "by") | ("stable" "order" "by")) OrderSpecList

[41] OrderModifier = ("ascending" | "descending")? ("empty" ("greatest" | "least"))? ("collation" URILiteral )?

[40] OrderSpec = ExprSingle OrderModifier

[39] OrderSpecList = OrderSpec ("," OrderSpec )*

[91] OrderedExpr = "ordered" "{" Expr "}"

[14] OrderingModeDecl = "declare" "ordering" ("ordered" | "unordered")

[128] PITest = "processing-instruction" "(" (NCName | StringLiteral )? ")"

[28] Param = "$" QName TypeDeclaration ?

[27] ParamList = Param ("," Param )*

[89] ParenthesizedExpr = "(" Expr ? ")"

[68] PathExpr = ("/" RelativePathExpr ?)| ("//" RelativePathExpr )| RelativePathExpr

[35] PositionalVar = "at" "$" VarName

[66] Pragma = "(#" S ? QName (S PragmaContents )? "#)"

[67] PragmaContents = (Char * - (Char* '#)' Char*))

[83] Predicate = "[" Expr "]"

[82] PredicateList = Predicate *

[17] PreserveMode = "preserve" | "no-preserve"

[84] PrimaryExpr = Literal | VarRef | ParenthesizedExpr | ContextItemExpr | FunctionCall | OrderedExpr | UnorderedExpr | Constructor

[6] Prolog = ((DefaultNamespaceDecl | Setter | NamespaceDecl | Import ) Separator )* ((VarDecl | FunctionDecl | OptionDecl ) Separator )*

[42] QuantifiedExpr = ("some" | "every") "$" VarName TypeDeclaration ? "in" ExprSingle ("," "$" VarName TypeDeclaration ? "in" ExprSingle )* "satisfies" ExprSingle

[30] QueryBody = Expr

[99] QuotAttrValueContent = QuotAttrContentChar | CommonContent

[49] RangeExpr = AdditiveExpr ( "to" AdditiveExpr )?

[69] RelativePathExpr = StepExpr (("/" | "//") StepExpr )*

[76] ReverseAxis = ("parent" "::")| ("ancestor" "::")| ("preceding-sibling" "::")| ("preceding" "::")| ("ancestor-or-self" "::")

[75] ReverseStep = (ReverseAxis NodeTest ) | AbbrevReverseStep

[131] SchemaAttributeTest = "schema-attribute" "(" AttributeDeclaration ")"

[135] SchemaElementTest = "schema-element" "(" ElementDeclaration ")"

[21] SchemaImport = "import" "schema" SchemaPrefix ? URILiteral ("at" URILiteral ("," URILiteral )*)?

[22] SchemaPrefix = ("namespace" NCName "=") | ("default" "element" "namespace")

[9] Separator = ";"

[119] SequenceType = ("empty-sequence" "(" ")")| (ItemType OccurrenceIndicator ?)

[7] Setter = BoundarySpaceDecl | DefaultCollationDecl | BaseURIDecl | ConstructionDecl | OrderingModeDecl | EmptyOrderDecl | CopyNamespacesDecl

[117] SingleType = AtomicType "?"?

[70] StepExpr = FilterExpr | AxisStep

[126] TextTest = "text" "(" ")"

[55] TreatExpr = CastableExpr ( "treat" "as" SequenceType )?

[118] TypeDeclaration = "as" SequenceType

[139] TypeName = QName

[43] TypeswitchExpr = "typeswitch" "(" Expr ")" CaseClause + "default" ("$" VarName )? "return" ExprSingle

[140] URILiteral = StringLiteral

[58] UnaryExpr = ("-" | "+")* ValueExpr

[52] UnionExpr = IntersectExceptExpr ( ("union" | "|") IntersectExceptExpr )*

[92] UnorderedExpr = "unordered" "{" Expr "}"

[63] ValidateExpr = "validate" ValidationMode ? "{" Expr "}"

[64] ValidationMode = "lax" | "strict"

[61] ValueComp = "eq" | "ne" | "lt" | "le" | "gt" | "ge"

[59] ValueExpr = ValidateExpr | PathExpr | ExtensionExpr

[24] VarDecl = "declare" "variable" "$" QName TypeDeclaration ? ((":=" ExprSingle ) | "external")

[88] VarName = QName

[87] VarRef = "$" VarName

[2] VersionDecl = "xquery" "version" StringLiteral ("encoding" StringLiteral )? Separator

[37] WhereClause = "where" ExprSingle

[80] Wildcard = "*"| (NCName ":" "*")| ("*" ":" NCName )

Definition: delimiting terminal symbol. The delimiting terminal symbols are: S, "-", (comma), (semi-colon), (colon), "::", ":=", "!=", "?", "?>", "/", "//", "/>", (dot), "..", StringLiteral, "(", "(#", ")", "[", "]", "]]>", "{", "}", "@", "$", "*", "#)", "+", "<", "<!--", "<![CDATA[", "<?", "</", "<<", "<=", "=", ">", "-->", ">=", ">>", "|"

Definition: non-delimiting terminal symbol. The non-delimiting terminal symbols are: IntegerLiteral, NCName, QName, DecimalLiteral, DoubleLiteral, "ancestor", "ancestor-or-self", "and", "as", "ascending", "at", "attribute", "base-uri", "boundary-space", "by", "case", "cast", "castable", "child", "collation", "comment", "construction", "copy-namespaces", "declare", "default", "descendant", "descendant-or-self", "descending", "div", "document", "document-node", "element", "else", "empty", "empty-sequence", "encoding", "eq", "every", "except", "external", "following", "following-sibling", "for", "function", "ge", "greatest", "gt", "idiv", "if", "import", "in", "inherit", "instance", "intersect", "is", "item", "lax", "le", "least", "let", "lt", "mod", "module", "namespace", "ne", "node", "no-inherit", "no-preserve", "of", "option", "or", "order", "ordered", "ordering", "parent", "preceding", "preceding-sibling", "preserve", "processing-instruction", "return", "satisfies", "schema", "schema-attribute", "schema-element", "self", "some", "stable", "strict", "strip", "text", "then", "to", "treat", "typeswitch", "union", "unordered", "validate", "variable", "version", "where", "xquery"

Definition: symbol separators. Whitespace and Comments function as symbol separators. For the most part, they are not mentioned in the grammar, and may occur between any two terminal symbols mentioned in the grammar, except where that is forbidden by the /* ws: explicit */ annotation in the EBNF, or by the /* xgc: xml-version */ annotation.

Definition: whitespace. A whitespace character is any of the characters defined by the S production of [XML 1.0].

Definition: ignorable whitespace. Ignorable whitespace consists of any whitespace characters that may occur between terminals, unless these characters occur in the context of a production marked with a ws: explicit annotation, in which case they can occur only where explicitly specified. Ignorable whitespace characters are not significant to the semantics of an expression. Whitespace is allowed before the first terminal and after the last terminal of a module. Whitespace is allowed between any two terminals. Comments may also act as "whitespace" to prevent two adjacent terminals from being recognized as one.
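The way whitespace and comments keep adjacent terminals apart can be illustrated with a toy longest-match tokenizer in Python. It is a deliberately simplified sketch: names use an ASCII approximation of NCName (which may contain hyphens), comments are not nested, and only names, integers and the minus sign are recognized. The function tokenize is hypothetical and is not the XQuery lexer.

```python
import re
from typing import List

TOKEN = re.compile(
    r"(?P<comment>\(:.*?:\))"        # (: comment :), nesting not modelled
    r"|(?P<space>[ \t\r\n]+)"        # ignorable whitespace
    r"|(?P<name>[A-Za-z_][\w.\-]*)"  # simplified name, hyphens allowed
    r"|(?P<integer>\d+)"
    r"|(?P<minus>-)",
    re.DOTALL,
)


def tokenize(text: str) -> List[str]:
    """Split text into tokens, dropping whitespace and comments."""
    tokens = []
    position = 0
    while position < len(text):
        match = TOKEN.match(text, position)
        if match is None:
            raise ValueError("unexpected character at offset %d" % position)
        if match.lastgroup not in ("comment", "space"):
            tokens.append(match.group())
        position = match.end()
    return tokens


print(tokenize("foo-foo"))          # one name: ['foo-foo']
print(tokenize("foo - foo"))        # ['foo', '-', 'foo']
print(tokenize("foo(: c :)- foo"))  # the comment separates: ['foo', '-', 'foo']
print(tokenize("foo- foo"))         # ['foo-', 'foo']
```

Because a name may contain hyphens, foo-foo is read as a single name, while a space or a comment before the hyphen is enough to split it into a subtraction.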