Chapter 1. Background

Why? I think because it scratches an itch. An ongoing discussion on xml-dev asks whether using XML end to end is better/faster/bigger than a marshalling/unmarshalling approach. A side issue there is the assumption that chains of several processes are a common feature of XML data movement across the internet and local networks. Even in my tiny XML world, I often find myself stringing together four or five steps to get from my XML source to whatever I'm producing as output. On that basis I think that XProc is a worthwhile recommendation.

It seems that other groups have seen the advantage in stringing operations together. ndw lists a couple of interesting pipeline-related projects. Perhaps Apache Ant and Cocoon are the best known of these. They are certainly in widespread use, both commercially and in OSS development.

At xproc.org, ndw calls it composing XML processes. I hope you get the picture.

Back to basics then. Pipelines?

If you have played with Unix/Linux you may be aware of an early design decision to make lots of tools dedicated to single tasks, so that they could be strung together into bigger, more powerful tools. The unwritten (AFAIK) assumption behind all this was that each 'step' in the pipeline said nothing beyond its actual results if all went well, and left a deposit (an error message of some description) if something went wrong. This simplicity made it easy to build strings of commands, each of which took the output of the previous step as its own input. The means by which two operations were joined was called the pipe (see Wikipedia), since 'stuff' went in one end and came out of the other. I find that elegant.
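To make the pipe idea concrete, here is a minimal sketch in Python rather than in a shell. The step names (grep, sort_lines, uniq) merely imitate the Unix tools of the same names, and the pipe helper is my own invention for illustration, not part of any library.

```python
from functools import reduce


def grep(pattern):
    """Build a step that keeps only lines containing `pattern`."""
    def step(lines):
        return [line for line in lines if pattern in line]
    return step


def sort_lines(lines):
    """Step that returns the lines in sorted order."""
    return sorted(lines)


def uniq(lines):
    """Step that drops adjacent duplicate lines."""
    out = []
    for line in lines:
        if not out or out[-1] != line:
            out.append(line)
    return out


def pipe(*steps):
    """Join steps so that each one receives the previous step's output."""
    def run(data):
        return reduce(lambda acc, step: step(acc), steps, data)
    return run


# Roughly the shape of: grep xml | sort | uniq
pipeline = pipe(grep("xml"), sort_lines, uniq)
result = pipeline(["xml b", "plain text", "xml a", "xml b"])
print(result)  # ['xml a', 'xml b']
```

Each step knows nothing about its neighbours; it only takes a list of lines in and hands a list of lines on, which is what lets the steps be rearranged or reused freely.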

Other pipeline analogies come to mind. The chain of 'buckets' we used to use to try to extinguish a fire? There we were acting as a pipeline to get water from one point to another.

The simple idea is that XML is taken in, processed in a chain of steps joined one to another, and one or more outputs deliver the result, generally in XML.
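The same shape can be sketched for XML. This is not XProc, just an illustration in Python using the standard library's xml.etree.ElementTree; the two steps (drop_drafts, add_numbers), the run_pipeline helper and the sample document are all made up for the example.

```python
import xml.etree.ElementTree as ET


def drop_drafts(root):
    """Step 1: remove child elements marked status="draft"."""
    for child in list(root):
        if child.get("status") == "draft":
            root.remove(child)
    return root


def add_numbers(root):
    """Step 2: number the remaining children with an n attribute."""
    for i, child in enumerate(root, start=1):
        child.set("n", str(i))
    return root


def run_pipeline(xml_text, steps):
    """Parse XML in, pass the tree through each step in turn, serialise XML out."""
    root = ET.fromstring(xml_text)
    for step in steps:
        root = step(root)
    return ET.tostring(root, encoding="unicode")


source = '<doc><para status="draft">old</para><para>keep</para></doc>'
output = run_pipeline(source, [drop_drafts, add_numbers])
print(output)
```

XML goes in at one end, each step hands its tree to the next, and XML comes out at the other end. A real pipeline language adds a good deal more on top of that, such as multiple inputs and outputs per step, but the basic flow is the same.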

That's perhaps the reasoning behind the name. For those wanting precision on what it is, the W3C CR or REC (the Candidate Recommendation or the Recommendation) provides just that.