| Revision History | ||
|---|---|---|
| Revision 0.2 | 2007-09-17T09:00:27Z | Dave Pawson |
| First attempt at documenting xweb processing | ||
| Revision 0.3 | 2010-05-04T08:52:12Z | Dave Pawson |
| Reviewed. | ||
Literate programming integrates documentation and code (however you define that) into one instance; in this case, into one XML instance as outlined by Norman Walsh. I'm documenting it and bringing it up to date by using the RELAX NG schema and more modern stylesheets.
The terms are defined on Wikipedia.
It will come as no surprise to learn that this document is an instance of literate programming.
The source document is a DocBook document, which happens to
have an article as its document
element. Whilst writing documentation, retain the DocBook namespace
and make full use of the DocBook elements. When you want to start
writing code, XSLT templates or anything else other than
DocBook, the namespace changes. The namespace this
document uses for such content is http://nwalsh.com/xmlns/litprog/fragment
and the element name is fragment. This
is a block-level element, used wherever a para element might
be. Its content is namespaced and it requires an xml:id value. The content of the fragment is transformed to extract the code,
i.e. the tangle side of the pair. Extracting
the documentation is the weave process. For
this tutorial, the output is assumed to be XML as opposed to
code. The implication is that the
tangle stylesheet has its output method set to
xml, whereas were it producing C code or Java,
the output method would be set to
text.
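To make the split between the two namespaces concrete, here is a minimal Python sketch. It is not the xweb tangle or weave stylesheet, only an illustration of the idea. The sample document, the function name split_by_namespace and the plain-text fragment content are assumptions made for the example; the DocBook 5 namespace URI is the standard one and the fragment namespace is the one given above.

```python
import xml.etree.ElementTree as ET

DB = "http://docbook.org/ns/docbook"                 # DocBook 5 namespace
SRC = "http://nwalsh.com/xmlns/litprog/fragment"     # fragment namespace from the text
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # how ElementTree names xml:id

# Hypothetical minimal xweb source: one DocBook para and one fragment.
XWEB = f"""<article xmlns="{DB}" xmlns:src="{SRC}" version="5.0">
  <title>Demo</title>
  <para>This paragraph is documentation.</para>
  <src:fragment xml:id="top">print("hello")</src:fragment>
</article>"""


def split_by_namespace(source):
    """Return (documentation paragraphs, {fragment id: code text})."""
    root = ET.fromstring(source)
    # Elements in the DocBook namespace are documentation (the weave side).
    docs = [p.text for p in root.iter(f"{{{DB}}}para")]
    # Elements in the fragment namespace carry the code (the tangle side).
    code = {f.get(XML_ID): f.text for f in root.iter(f"{{{SRC}}}fragment")}
    return docs, code


docs, code = split_by_namespace(XWEB)
print(docs)
print(code)
```

The sketch only reads text content; a real fragment may hold XML elements, which the stylesheets copy through to the output.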
The basic processing sequence for an xweb file is shown below as a Linux shell script. File locations will need to be resolved for your setup.
Fragment frag.root
#!/bin/bash
java com.icl.saxon.StyleSheet -o tmp.xml litprog.xweb xsl/weave.xsl
java com.icl.saxon.StyleSheet -o litprog.html tmp.xml xsl/mydocbook.xsl
java net.sf.saxon.Transform -o code.xml litprog.xweb xsl/tangle.v2.xsl
The sequence outlined above is weave, to produce the documentation
(documentation always comes first :-), then tangle, to generate the
code. The weave step is a two-stage process: the first stage generates pure
DocBook XML, the second generates HTML from the DocBook using a
customization layer (mydocbook.xsl) which calls up the as-delivered
DocBook stylesheets. There are two options for running tangle: use either an XSLT 1.0 implementation,
or XSLT 2.0 if multiple output files are required. The output of this process is
litprog.html, the documentation, and
code.xml, which is the actual code.
That is the simple process outline. More follows, along with the details.
It was realised that people don't often build code or XML as a single one-off exercise. It is common to develop the output over a period and in many files. The impact on xweb is that the output can be built up over time and may need to be built from several files. Ordering then becomes a task, since a developer may not write the pieces in the order in which they are required in the final output. xweb answers this need by having a starting point and a sequence in which to build the output.
This is also the solution where documenting or even designing a solution consists of several parts. The author may design top down or bottom up. xweb comes to the rescue here, since I can document the parts as I design them, in pieces, then build the entire program up into a cohesive whole.
Fragment section2
<src:fragment xml:id="top">
<src:fragref linkend="frag.root"/>
<src:fragref linkend="section2"/>
<src:fragref linkend="markup"/>
<src:fragref linkend="section3"/>
</src:fragment>
The fragment above has an xml:id value
of 'top', which is special in that it identifies the starting point
for the collection of output. In turn, each fragref is resolved to a fragment, which is then added to the output. In
this way the source can be built up piece by piece, with the output
growing cumulatively. The actual order is defined by the sequence
in the fragment wrapper with xml:id value of 'top'. This file has such a
fragment.
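The resolution order described above can be sketched in a few lines of Python. This is an illustration only, not the logic of tangle.xsl: the sample source, the fragment ids part.a and part.b, and the function name tangle are all hypothetical, and the sketch handles text content only, with no protection against circular references.

```python
import xml.etree.ElementTree as ET

SRC = "http://nwalsh.com/xmlns/litprog/fragment"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical source: fragments are written out of order; 'top' fixes the order.
XWEB = f"""<article xmlns="http://docbook.org/ns/docbook" xmlns:src="{SRC}">
  <src:fragment xml:id="part.b">second </src:fragment>
  <src:fragment xml:id="top"><src:fragref linkend="part.a"/><src:fragref linkend="part.b"/></src:fragment>
  <src:fragment xml:id="part.a">first </src:fragment>
</article>"""


def tangle(source, start="top"):
    """Expand the start fragment, replacing each fragref by its target's content."""
    root = ET.fromstring(source)
    fragments = {f.get(XML_ID): f for f in root.iter(f"{{{SRC}}}fragment")}

    def expand(fragment):
        out = [fragment.text or ""]
        for child in fragment:
            if child.tag == f"{{{SRC}}}fragref":
                # Resolve the reference and splice the target in at this point.
                out.append(expand(fragments[child.get("linkend")]))
            # Text following the child element belongs to the enclosing fragment.
            out.append(child.tail or "")
        return "".join(out)

    return expand(fragments[start])


result = tangle(XWEB)
print(repr(result))
```

Although part.b appears first in the source, the output follows the order of the fragref elements inside 'top'.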
The output may need to vary between fragments, i.e. one fragment could be text (or
more accurately non-XML) whilst the next could be XML. A further
variation is that more than one output file may be built
from the single xweb file. This is controlled by the mode attribute on the fragment element. If present, it takes only one
value: chunked.
When the attribute is present and has that value, another
attribute is required to specify the output filename. This is the href attribute, which should contain the name
of a file. The final attribute, output,
determines whether the output method of the transform will be text or xml. All
three attributes must be present to get an output file separate from
the main output. To get this separate output file, an XSLT
2.0 processor must be used. The fragment in this section shows this. It
contains:
<src:fragment
xml:id="section3"
output='text'
mode='chunked'
href="output.txt"
>
.... content
</src:fragment>
So the section3 fragment has an
output format of text (the other alternative is xml) and will be written
to a file called output.txt, all triggered by the
chunked value of the mode attribute.
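The three-attribute rule can be stated as a small decision function. The Python below is a sketch of that rule as described in this section, not code taken from the tangle stylesheet; the function name output_target and the default main output of code.xml with method xml are assumptions based on the examples in this document.

```python
def output_target(attrs, main_output="code.xml", main_method="xml"):
    """Return (filename, method) for a fragment, given its attributes as a dict.

    A separate file is produced only when mode, href and output are all
    present and valid; otherwise the content goes to the main output.
    """
    chunked = attrs.get("mode") == "chunked"
    has_file = bool(attrs.get("href"))
    has_method = attrs.get("output") in ("text", "xml")
    if chunked and has_file and has_method:
        return attrs["href"], attrs["output"]
    return main_output, main_method


# The section3 fragment: all three attributes present.
section3 = {"mode": "chunked", "href": "output.txt", "output": "text"}
print(output_target(section3))

# A fragment missing href stays in the main output.
print(output_target({"mode": "chunked", "output": "text"}))
```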
Fragment section3
This stands as output code for section 3. The src:fragment element
has the output attribute set to text, so this content should be a simple
copy of the input content.
The difference is that the mode is set to 'chunked', so an output
file named after the href attribute (output.txt) should be generated.
This also implies that the processor understands xslt 2.0
The use of mundane-result-prefixes="dc ns a nvdl r" enables some namespace
cleanup in the output. The attribute contains
a list of prefixes for which namespace declarations will not be
generated. This can also be specified as a stylesheet parameter,
$mundane-result-prefixes.
This fragment is the first one to be processed (by the
'tangle' phase) and dictates the order in which the various
fragments are processed into the output. Each fragref is resolved in turn to add to the
normal output stream. The linkend
attribute values should match one (and only one) id value on another
fragment. Note that if you are using DocBook version 5, as this example does, the identifiers are xml:id attributes rather than id attributes.
Fragment top
frag.root
section2
section3
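The requirement that each linkend matches one, and only one, xml:id can be checked before running tangle. The Python sketch below is a hypothetical helper, not part of the xweb stylesheets; the sample source and the function name check_fragrefs are assumptions for illustration. It counts fragment ids and reports any fragref whose target is missing or duplicated.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SRC = "http://nwalsh.com/xmlns/litprog/fragment"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical source in which 'markup' is referenced but never defined.
XWEB = f"""<article xmlns="http://docbook.org/ns/docbook" xmlns:src="{SRC}">
  <src:fragment xml:id="frag.root">script</src:fragment>
  <src:fragment xml:id="top">
    <src:fragref linkend="frag.root"/>
    <src:fragref linkend="markup"/>
  </src:fragment>
</article>"""


def check_fragrefs(source):
    """Return a list of (linkend, match count) for every unresolvable fragref."""
    root = ET.fromstring(source)
    # Count how many fragments declare each xml:id value.
    ids = Counter(f.get(XML_ID) for f in root.iter(f"{{{SRC}}}fragment"))
    problems = []
    for ref in root.iter(f"{{{SRC}}}fragref"):
        target = ref.get("linkend")
        if ids[target] != 1:  # 0 means missing, more than 1 means ambiguous
            problems.append((target, ids[target]))
    return problems


problems = check_fragrefs(XWEB)
print(problems)
```

An empty list means every reference resolves to exactly one fragment.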
References to other documentation on tangle and weave.
Norman Walsh, Literate Programming in XML
Mark B Wroth, DBLP: DocBook-based Literate Programming
Robin Cover, a longer history of tangle and weave
Finally, the xweb source for this file and the modified stylesheets are all zipped up as xweb.zip.