Chapter 4. XProc steps

Revision History
Revision 0.1 2008-12-10T18:16:32Z Dave Pawson
Initial Issue

Table of Contents

Steps
Documenting steps
Step basics

Steps

The basic tool of a pipeline, from the user's point of view, is the step. XProc defines three types of step:

atomic steps.
compound steps.
multi-container steps.

The atomic step is the simplest: a single action, declared by a single p:declare-step. Atomic steps are the most useful and the easiest to use. Examples are p:delete and p:identity.

Note

Remember that an implementer may provide other steps not defined in XProc. Consult your implementer's documentation for the full list.

A compound step is a step that contains a subpipeline, i.e. multiple steps connected by either explicit pipes or implied connections.
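
As a minimal sketch of a compound step, the pipeline below uses p:for-each, whose subpipeline holds two atomic steps joined by an implied connection. The doc and title elements are borrowed from the example input used later in this chapter; the draft and seen attributes are hypothetical names chosen only for illustration.

```xml
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <!-- Compound step: its children form a subpipeline -->
  <p:for-each>
    <!-- Run the subpipeline once for each title element in the source -->
    <p:iteration-source select="/doc/title"/>
    <!-- Atomic step: remove a (hypothetical) draft attribute -->
    <p:delete match="/title/@draft"/>
    <!-- Atomic step: reads the output of p:delete through an implied connection -->
    <p:add-attribute match="/title" attribute-name="seen" attribute-value="yes"/>
  </p:for-each>
  <!-- Collect the sequence produced by p:for-each into a single document -->
  <p:wrap-sequence wrapper="titles"/>
</p:pipeline>
```

No p:pipe elements are needed here: each step reads from the step before it, which is the implied connection mentioned above.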

A multi-container step is a step that contains several alternative subpipelines. There are currently only two: p:choose and p:try. Each alternative is wrapped in an appropriate element. More on this later.
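
A minimal sketch of a multi-container step, assuming an input document shaped like the doc/title example used elsewhere in this chapter. Each alternative subpipeline of p:choose sits in its own wrapper, p:when or p:otherwise; the untitled attribute is a hypothetical name.

```xml
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:choose>
    <!-- First alternative: the document has a title, pass it through -->
    <p:when test="/doc/title">
      <p:identity/>
    </p:when>
    <!-- Second alternative: flag the document as untitled -->
    <p:otherwise>
      <p:add-attribute match="/doc" attribute-name="untitled" attribute-value="true"/>
    </p:otherwise>
  </p:choose>
</p:pipeline>
```

Only one of the alternatives runs for any given input, and both must offer the same outputs so that the step which follows does not care which one was chosen.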

Documenting steps

XProc provides a simple means of documenting your pipelines. The p:documentation element can appear just about anywhere to keep your intentions clear (and available). The only exception is the obvious one: as a child of p:inline, it is treated like any other element.

p:documentation elements, and their content, are ignored by the XProc processor. A good use is to extract the comments and the steps as a separate document, should you want additional out-of-line documentation.
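
As an illustration, here is a small sketch with p:documentation at two levels, once for the pipeline as a whole and once for an individual step. The wording of the comments is invented for the example.

```xml
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:documentation>
    Copies the source document to the result port unchanged.
  </p:documentation>
  <p:identity>
    <p:documentation>
      Placeholder step; replace with real processing later.
    </p:documentation>
  </p:identity>
</p:pipeline>
```

The processor runs this exactly as if the p:documentation elements were absent.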

Logging

p:log allows an output port to be named and used for logging output on any step. The principle is that any output port may be used to write logging information. Example 4.1 shows a simple case.

Example 4.1. An example log output

<p:pipeline
    xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:xslt>
    <p:input port="source">
      <!-- (1) -->
      <p:inline>
        <doc>
          <title>Two steps</title>
        </doc>
      </p:inline>
    </p:input>

    <!-- (2) port, (3) href -->
    <p:log
        port="result"
        href="log.txt"/>

    <p:input port="stylesheet">
      <p:inline>
        <xsl:stylesheet version="1.0"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:template match="/doc/title">
            <doc>
              <xsl:value-of select="."/>
              <xsl:text>. logging</xsl:text>
            </doc>
          </xsl:template>
          <xsl:template match="text()"/>
        </xsl:stylesheet>
      </p:inline>
    </p:input>
  </p:xslt>
</p:pipeline>

(1) A simple input file.

(2) The result port is used for the logging output,

(3) which is written to the log file log.txt.


The p:xslt step simply selects an element from the input (the value of the title element), appends a period and the word 'logging' to it, and writes the result to the log. The file shows

  [dpawson@marge tests]$ cat log.txt 
<!-- Start of Calabash output log for "result" to  \
"file:/files/xproc/tests/log.txt" at                  \
file:/files/xproc/tests/logging.xpl:30 -->
<!-- At FIXME: datetime -->
<px:document-sequence \
       xmlns:px='http://xmlcalabash.com/ns/document-sequence'>
<px:document>  \
      <doc>Two steps. logging</doc>\
</px:document>

when run on Calabash. The lines are broken using the '\' character for convenience.

In this example, the pipeline's normal output shows the same result. In a more practical pipeline, however, the logging output is less likely to be the sole output.

Step basics

An atomic step is defined using the p:declare-step element. The current schema lists the children for that element, i.e. the elements which are available for use within a step.

As you see, it's quite a list. Which ones you need depends on your pipeline. The XProc Candidate Recommendation (CR) provides a good description of the functionality and syntax of each one, starting at p:add-attribute.

Important

Note that the XProc specification uses the following format:


    <p:declare-step type="p:identity">
     <p:input port="source" sequence="true"/>
     <p:output port="result" sequence="true"/>
    </p:declare-step>
  

In the CR, the p:declare-step element is simply used as a wrapper that shows the signature of each atomic step. Don't start using it in your pipelines in the form provided in the CR!

Instead, use the element named in the type attribute of the declaration, p:identity in this case.
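
To make the difference concrete, here is a minimal sketch of how the p:identity step declared above is actually invoked in a pipeline. The step element takes its name from the type attribute of the declaration, and the ports it declares (source and result) are connected implicitly here.

```xml
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <!-- Invoke the step by the name given in its type attribute,
       not by copying the p:declare-step wrapper from the CR -->
  <p:identity/>
</p:pipeline>
```

The source port of p:identity reads the pipeline's input and its result port feeds the pipeline's output, so the document passes through unchanged.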

From section 3,

Elements in a pipeline document represent the pipeline, the steps it contains, the connections between those steps, the steps and connections contained within them, and so on. Each step is represented by an element; a combination of elements and attributes specify how the inputs and outputs of each step are connected and how options and parameters are passed.

What is implied, but not stated explicitly, is that the sequence of steps, i.e. the ordering of steps, is also critical from the pipeline author's viewpoint.

Also note that steps can contain other steps!

Step types, options, variables, and parameters are named with QNames. Steps and ports are named with NCNames. So we have p:declare-step (a QName) and <p:input port="source"></p:input>, where the port value is an NCName (see XML Names, part II).

Scope. XProc defines scope comprehensively. One of the goals is to make the target of a connection, a step and a port, easy to identify. Steps are (very nearly) globally visible, and port names must be unique within a step. That means that a step name and a port name together identify exactly one connection point.
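
The naming and scoping rules can be sketched with an explicit connection. In the pipeline below, the step names first and second and the checked attribute are hypothetical; the p:pipe element uses a step name and a port name together to pick out a single connection point.

```xml
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <!-- p:identity is the step type (a QName); "first" is the step name (an NCName) -->
  <p:identity name="first"/>
  <p:add-attribute name="second" match="/*"
                   attribute-name="checked" attribute-value="yes">
    <!-- "source" is a port name (an NCName), unique within this step -->
    <p:input port="source">
      <!-- Step name plus port name identify the connection point to read from -->
      <p:pipe step="first" port="result"/>
    </p:input>
  </p:add-attribute>
</p:pipeline>
```

Because second directly follows first, the implied connection would give the same result; the explicit p:pipe is written out only to show how the two names combine.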