Understanding XPATH

1. Complex Path Checking
2. XPATH expressions

1.

Complex Path Checking

Jeni Tennison


> My problem is, I want to check the content 
>of the first <node2> (with "some content") before I decide if I 
>display the content of the other <node2> (with "other content"). I
>need to do this through the whole large document.
>
>Now, I'm not exactly experienced when it comes to XPath. I tried the 
>following, but as I already expected ;) , it didn't work.
>
> <xsl:template match="//node2">
    

I'm just going to go over your XPaths to make sure that you understand what they're doing before helping you with your actual problem. The syntax '//' within an XPath is short for '/descendant-or-self::node()/', so this expands to:

  /descendant-or-self::node()/node2

If you were using this expression to *select* nodes (e.g. in xsl:apply-templates or xsl:for-each), then it would select:

  all the 'node2' elements that are a child of any node that is a
  descendant of (or is itself) the root node

When you use an XPath to *match* nodes, as you are here (or if you were matching nodes to key into), it will match:

  any 'node2' element that is a child of any node that is a
  descendant of (or is itself) the root node

In other words, this matches any node2 element in the document. Another XPath that matches any node2 element in the document is simply 'node2', so you may as well say:

<xsl:template match="node2">
  ...
</xsl:template>

It's just a small thing, but it took me ages to get my head around the fact that 'match' attributes don't *select* nodes; they test a node you've already selected, which means they can usually be fairly simple.
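To make the select/match split concrete, here is a minimal sketch of a complete stylesheet. The element names (root, node1, node2) are assumed from the snippets in this thread; the real document may be shaped differently.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- 'select' decides which nodes get processed: here, every node2
       element anywhere below the root node. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//node2"/>
  </xsl:template>

  <!-- 'match' only tests a node that has already been selected, so the
       plain name is enough. match="//node2" would match the same nodes. -->
  <xsl:template match="node2">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The work of finding nodes sits in the select attribute; the match pattern just has to recognise what arrives.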

> <xsl:if 
>test="contains(ancestor::following-sibling::child::node(),'X')">

You're going wrong here because you're forgetting to separate your steps properly. Each step is made up of an axis (like ancestor, following-sibling or child) and a node test. The node test usually gives the name of the node. Each step is separated from the next step using a '/'. I think you were trying for something like:

  ancestor::*/following-sibling::*/child::node()

This will select:

  any node (of any type) that is a child of any element that is a
  sibling that follows any element that is an ancestor of the
  current node

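I doubt that you are really after any node of any type - are you really interested if a comment or a processing instruction contains an 'X'? (Attributes are never reached through the child axis, so they aren't included anyway.) If you are just after elements, then it would help to say so. You can also lose the child:: axis if you want - it's assumed by default. One more catch: when contains() is handed a node set, XPath 1.0 only looks at the string value of the first node in it, so to ask whether *any* of those elements contains an 'X', move the contains() call into a predicate. So try:

  <xsl:if test="ancestor::*/following-sibling::*/*[contains(., 'X')]">
    ...
  </xsl:if>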
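One thing worth checking with a test like this is how contains() treats a node set. In XPath 1.0 a node set passed where a string is expected is converted to the string value of its first node in document order, so the remaining nodes are never inspected. The sketch below puts the two forms side by side; the input document in the comment is made up for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input (hypothetical):
       <root>
         <node1>
           <node2>some content</node2>
           <node2>other content</node2>
         </node1>
         <node3>
           <item>abc</item>
           <item>X marks the spot</item>
         </node3>
       </root> -->

  <xsl:template match="/">
    <xsl:apply-templates select="root/node1/node2[1]"/>
  </xsl:template>

  <xsl:template match="node2">
    <!-- The node set is reduced to its first node, <item>abc</item>,
         so the second item is never looked at. -->
    <xsl:if test="contains(ancestor::*/following-sibling::*/*, 'X')">
      <xsl:text>first-node test succeeded&#10;</xsl:text>
    </xsl:if>

    <!-- The predicate is applied to every candidate element; a non-empty
         result counts as true. -->
    <xsl:if test="ancestor::*/following-sibling::*/*[contains(., 'X')]">
      <xsl:text>any-node test succeeded&#10;</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

With an XSLT 1.0 processor and the input above, only the second test should succeed, because the 'X' is in the second item rather than the first.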

Anyway, your actual problem was that you wanted to check the content of the first node2 element to see whether you should process the second node2 element.

Within XSLT, processing flows from the top of the *tree* to the bottom of the tree rather than from the top of the *document* to the bottom of the document. So, the children of the root node are processed, and their processing involves the processing of their children, which involves the processing of their children and so on.

This means that the right place to decide whether to process a particular node is higher up the *tree* rather than higher up the *document*. Another approach is to process the node, but not produce any output unless certain conditions are met.
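The top-down flow through the tree is what the built-in template rules of XSLT 1.0 give you whenever none of your own templates matches a node. Written out explicitly, they behave like the sketch below; you never need to include them yourself, they are only shown here to make the recursion visible.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Root node and elements: carry on with the children, which in turn
       carry on with their children, and so on down the tree. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy the text through. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: produce nothing. -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

Any template you write replaces one of these rules for the nodes it matches, which is why a decision about a node is naturally taken in the template for one of its ancestors.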

Here's an example that always processes the first node2, but only processes the second node2 if there's an ancestor of the first node2 which has a following sibling which has a child that contains an 'X':

<xsl:template match="root">
  <xsl:apply-templates select="node1/node2[1]" />
  <xsl:if
      test="node1/node2[1]/ancestor::*/following-sibling::*/*[contains(., 'X')]">
    <xsl:apply-templates select="node1/node2[2]" /> 
  </xsl:if>
</xsl:template>

Here's another example which decides whether to produce any output within the node2-matching template:

<xsl:template match="root">
  <xsl:apply-templates select="node1/node2" />
</xsl:template>

<xsl:template match="node2">
  <!-- the context node here is a node2, so step up to its parent first -->
  <xsl:if
    test="position() = 1 or
          ../node2[1]/ancestor::*/following-sibling::*/*[contains(., 'X')]">
    ...produce some output...
  </xsl:if>
</xsl:template>
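For reference, here is the second approach as a complete stylesheet you could run. It is only a sketch: it assumes a root element containing a single node1 with two node2 children, and it writes the test relative to the node2 being processed (hence the leading '../'), with contains() inside a predicate so that every candidate element is checked.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="root">
    <xsl:apply-templates select="node1/node2"/>
  </xsl:template>

  <xsl:template match="node2">
    <!-- position() counts within the list selected by apply-templates
         above. '../node2[1]' goes up to node1 and back down to its first
         node2, because the context node here is a node2 element. -->
    <xsl:if test="position() = 1 or
                  ../node2[1]/ancestor::*/following-sibling::*/*[contains(., 'X')]">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The first node2 is always output; the second is output only when some child of an element following node1 (or following one of its ancestors) contains an 'X'.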

2.

XPATH expressions

David Allouche

> <box>
>    <category name="someType">
>       <header>
>         <self>
>            <host>myhost</host>
>            <instance>9</instance>
>         </self>
>        <ref>
>           <host>thathost</host>
>           <instance>1010101</instance>
>        </ref>
> </header>
> 
> And the value of the header instance is 9 (passed from a web page to a
> servlet)
> What expression can I use to get the ref element under the same header
> parent?

/box/category[@name='someType']/header is a nice start.

Just carry on with your XPath until you reach the thing you want to test:

/box/category[@name='someType']/header/self[instance=$parameter]

Now use the parent axis to go up to the header element:

/box/category[@name='someType']/header/self[instance=$parameter]/parent::header

Then go on as usual using the implied child axis:

/box/category[@name='someType']/header/self[instance=$parameter]/parent::header/ref

It's done, but it needs to be shortened a bit to stay readable. First, since you know that the parent of the self element is a header element, you can use a wildcard element name without changing the meaning of the XPath:

/box/category[@name='someType']/header/self[instance=$parameter]/parent::*/ref

Actually, using the parent axis with an explicit element name is only useful to test that a parent has a given name. Then, if your file structure is regular enough, you can put in more wildcards without changing the meaning:

/*/category[@name='someType']/*/self[instance=$parameter]/parent::*/ref

I guess you know enough of XSLT to use the context to get rid of any unnecessary steps at the beginning of the XPath.
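As a sketch of what using the context looks like, the stylesheet below selects the header elements once and then evaluates a short relative path inside the header template. The parameter name $parameter comes from the expressions above; declaring it with a default of 9 is an assumption for illustration, and how the servlet passes the real value in depends on the processor's API, which is not shown.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Top-level parameter; the default value is for illustration only. -->
  <xsl:param name="parameter" select="9"/>

  <xsl:template match="/">
    <xsl:apply-templates select="box/category[@name='someType']/header"/>
  </xsl:template>

  <!-- The context node is already a header element, so the leading steps
       of the absolute path are no longer needed. -->
  <xsl:template match="header">
    <xsl:for-each select="self[instance=$parameter]/parent::*/ref">
      <xsl:value-of select="host"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Assuming the quoted fragment is completed with its closing category and box tags, this should output the host of the ref element that sits next to the matching self element.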

There are still other approaches, which make heavier use of predicates and may perform worse. But it's mainly a matter of programming style.

//category[@name='someType']/*/ref[parent::*/self[instance=$parameter]]

//ref[ancestor::category[@name='someType'] and
      parent::*/self[instance=$parameter]]

Or even the most perverse and obfuscated:

ancestor-or-self::node()[boolean(count(.|/)-2)]/node()[ancestor::*
[not(descendant-or-self::*/parent::category)]/child::self[instance=
$parameter]][self::ref][generate-id(self::ref)=generate-id(//ref
[ancestor::category[@name='someType'])]

(if someone can find a bug in this one I'll buy him/her a beer) :-)