Docbook and Ant

1. Ant and Docbook
2. How to use ant with docbook
3. Using ANT with XSLT
4. ant

1.

Ant and Docbook

dawid.weis

A project dedicated to using ant with docbook. Project home page is here. Dawid describes it as:

I started this small project when I needed to render certain documentation in DocBook using ANT. Although DocBook is a set of XSLT stylesheets and ANT has a built-in XSLT task, I wanted some more functionality that was difficult to achieve with pure ANT. Specifically I wanted to:

Process DocBook using ANT's default XSLT processor in Java if no other options were available, or automatically detect the presence of XSLTPROC processor and use it instead of Java's one. XSLTPROC is much faster at processing XSLT and DocBook's stylesheets are huge, so the gain is significant.

Simplify the process of rendering -- I wanted all resources, such as figures or CSS stylesheets, to be copied automatically to the destination folder.

I wanted a certain level of customization to the default DocBook, namely, inclusion of XML files using extension XML elements rather than the entity inclusion mechanism. In such an approach, each chapter or section can be a separate file and all files can have their own DTD headers -- a neat thing if you're using an XML editor with syntax helpers (such as JEdit, see figure on the right).

Capability to render to various formats using one script and one set of ANT tasks -- HTML, HTML chunked into sections, PDF... all these are simple to use with the styler package (although at a price of limited customization capabilities).

I also changed some of the default docbook CSS stylesheets so that the rendered documentation looks... well, it seems nicer to me at least. The same applies to XSL:FO customization layer.
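
The first point above, using XSLTPROC when it is present and falling back to the Java processor otherwise, can be sketched in plain ant. This is not Dawid's implementation, only a minimal illustration under stated assumptions: the project, target and property names, the file names and the stylesheet path are all hypothetical, and the PATH lookup assumes a Unix-like system where the executable is called xsltproc.

```xml
<project name="render-sketch" default="render">

  <!-- Hypothetical locations; adjust to your own installation -->
  <property name="stylesheet" value="/sgml/docbook/docbook-xsl/html/docbook.xsl"/>
  <property name="infile" value="dbtest.xml"/>
  <property name="outfile" value="op.html"/>

  <!-- Expose environment variables as env.* properties -->
  <property environment="env"/>

  <target name="detect">
    <!-- Sets xsltproc.present only if a file named xsltproc is found on PATH -->
    <available file="xsltproc" filepath="${env.PATH}" property="xsltproc.present"/>
  </target>

  <!-- Runs only when xsltproc was found -->
  <target name="render-native" depends="detect" if="xsltproc.present">
    <exec executable="xsltproc" failonerror="true">
      <arg value="-o"/>
      <arg value="${outfile}"/>
      <arg value="${stylesheet}"/>
      <arg value="${infile}"/>
    </exec>
  </target>

  <!-- Fallback: ant's built-in XSLT task, run by the Java processor -->
  <target name="render-java" depends="detect" unless="xsltproc.present">
    <xslt in="${infile}" out="${outfile}" style="${stylesheet}"/>
  </target>

  <target name="render" depends="render-native,render-java"/>
</project>
```

Exactly one of the two render targets does any work on a given machine, because they are guarded by the same property through if and unless.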

2.

How to use ant with docbook

DaveP (with lots of help from Norm)

Ant is a piece of Java software which acts like glue, sticking lots of bits of software together. If you've used make on a Unix box or under cygwin, then you'll get the idea.

What I've done is use ant as a glorified script or batch file to make it easier to do a few things:

  • To pull all the bits of docbook together (hence the glue analogy). This is XML based, so by all I mean the stylesheets and the DTD.

  • To keep up to speed with the team's stylesheet releases; they move way too fast for me.

  • To reduce the amount of work needed to install and use a new update of the stylesheets.

  • Perhaps inspire others to benefit from ant?

Pre-requisites. Unlike some, I'm not fussed where docbook is installed. Respecting others' decisions on this, I've presumed that your installation has some root location, for which I'm using a directory called docbook. I'm assuming the layout within that directory follows what I believe to be Norm's expectations for relative paths; I'll describe that in a minute.

  • I'm presuming you are XML based. I'm sure ant can work with SGML, but I haven't tried it.

  • I assume you have a Java 1.3 or later installation.

  • I assume you have a copy of ant installed. As of today I'm using version 1.4.1. If you haven't, you can get it from the apache site. I feel sure that this lot will work with later versions. You can read more about ant on the apache site; it's one of their projects.

  • The resolver has moved from being a Sun product to an Apache one; see the apache site. An article on catalogs is linked from there.
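
The assumptions above can be checked from ant itself before anything else is run. The following is a minimal sketch, not part of the original setup: the project, path and property names are made up for illustration, the jar location is the one used later in Example 2, and it assumes a reasonably current ant release.

```xml
<project name="check-prereqs" default="check">

  <!-- Jar location as in Example 2; point it at your own Saxon jar -->
  <path id="xslt.classpath">
    <pathelement location="/myjava/saxon655.jar"/>
  </path>

  <target name="check">
    <!-- ant.version and java.version are available in every build -->
    <echo>Running ${ant.version} on Java ${java.version}</echo>

    <!-- Sets saxon.present only if the class can be loaded from the path -->
    <available classname="com.icl.saxon.StyleSheet"
               classpathref="xslt.classpath"
               property="saxon.present"/>

    <!-- Stop early with a readable message if Saxon is missing -->
    <fail unless="saxon.present"
          message="Saxon not found; check the jar path in this file."/>
  </target>
</project>
```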

The main script / batch file

The purpose of this file is to launch ant. No more, no less. Example 1, “The script or batch file” shows the file.

Example 1. The script or batch file

set JAVA_HOME=/jdk
java -cp /apps/ant/lib/ant.jar org.apache.tools.ant.Main

This calls up java (I assume it's on your path), giving it the ant class to run, org.apache.tools.ant.Main. Since no parameters are passed to ant (it takes quite a few), ant makes assumptions about the location of its input. The main file which controls ant is called build.xml, as described later. Without a command line parameter telling it where to find this file, ant looks for a file named build.xml in the current directory. That means you must change to the directory holding build.xml, which sensibly is the same directory as the source file being processed. Once ant is running, everything else is controlled by the build.xml file.

Build.xml, the ant build file.

This is where the flexibility is, and where most things are controlled from. Example 2, “build.xml” shows this file.

Before looking at this file, a few points. One line in particular controls the sequence of operation within the build file. Another specifies the base location for a few files. Feel free to tweak these as needed; they are pointed out in the callouts.

A little explanation follows, though the ant documentation is pretty good. The key things to watch out for are the property and target elements. For properties, read variables: a dollar sign and braces obtain the value of the property, e.g. ${propertyName}. For targets, read actions. According to the documentation a target can do quite a lot, but I've only used the java task, for the XSLT processing. For detail on the target element see the ant documentation. Also note that the project element has an attribute called default, which names the 'output-all' target, which in turn depends on the others. This is the key to the sequence in which the targets are executed.
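
Stripped of everything docbook-specific, the mechanism just described looks like this. It is a minimal sketch with made-up names (sketch, greeting, first, second, all), meant only to show how a property, the default attribute and depends fit together.

```xml
<project name="sketch" default="all">

  <!-- A property is a named value, read back as ${greeting} -->
  <property name="greeting" value="hello from ant"/>

  <target name="first">
    <echo message="first: ${greeting}"/>
  </target>

  <target name="second">
    <echo message="second"/>
  </target>

  <!-- Running ant with no arguments runs 'all', which runs
       'first' and then 'second' before its own (empty) body -->
  <target name="all" depends="first,second"/>
</project>
```

Changing the depends list on the default target is all it takes to change what a plain run of ant does, which is how the full build file selects its work.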

Example 2. build.xml



<!-- 
This is the ant build file for use with Norm Walsh website DTD and stylesheets
for use with his resolver classes (found at sun site)
Revision: 1.2
Date    : 2008-02-06T13:09:16Z
Author  : DaveP
Updated to work with ant 1.7 and new resolver from Apache
 -->

 <!-- Set the base directory to the location of the xml files -->
<project 
     name="generate"   
     basedir="/sgml/site2/docbookx"              1
     default="output-all">                        2  
<description>Test for sun catalog resolver</description>
 
 <!-- docbook location: Everything taken relative to this for docbook stuff -->
 <property name="docbookHome" value="/sgml/docbook"/>         3                 
 <!-- Main stylesheets -->
 <property name="sSheetHome" value="${docbookHome}/docbook-xsl"/>      4
 <!-- Website DTD and stylesheets -->
<property name="websiteHome" value="${docbookHome}/website"/>        5             

 <!-- Main Docbook stylesheet -->
<property name="main.stylesheet" value="${sSheetHome}/html/docbook.xsl"/>    6
 <!-- Stylesheet to use for website processing. Normally
<property name="website.stylesheet" value="${websiteHome}/xsl/chunk-website.xsl"/> 
 I added extra templates, so I call this via an import in the actual one

 -->
<property name="website.stylesheet" value="mywebsite.xsl"/>          7


 <!-- Stylesheet to use for layout file -->
<property name="autolayout.stylesheet" value="${websiteHome}/xsl/autolayout.xsl"/>  8

              
<!-- Input properties:  -->                                      9
<!-- all files should be in this directory -->
<property name="in.dir" value="${basedir}"/>                          10  
<!-- input file for any docbook valid document -->
<property name="main.infile" value="dbtest.xml"/>                               11  

<!-- source file for doLayout target -->
<property name="autolayout.infile" value="newlayout.xml"/>                      12 
<!-- source file for website transform on second pass --> 
<property name="website.infile" value="autolayout.xml"/>                            13 
 

           <!-- Output Properties: Output directory -->                         14     
 
<property name="out.dir" value="${in.dir}" />  <!-- all files -->          15  
 <!-- Main output file used for docbook transform -->
 <property name="main.outfile" value="op.html"/>                                (16) 
 <!-- Null (dummy)output file for website transform -->
 <property name="website.outfile" value="op.html"/>                                   (17)   
 <!-- output file for website first pass, doLayout -->
 <property name="autolayout.outfile" value="autolayout.xml"/>                     (18) 

        <!-- Post XSLT transform parameter. Leave as is for Saxon -->
 <property name="param.args.post" value="saxon.extensions=1"/>           (19)



    <!-- XSLT engine class -->
 <property name="xslt.processor.class" value="com.icl.saxon.StyleSheet" />    (20)

 <!-- path for xslt processor. 
         Includes resolver and extensions and catalogManager.properties file.  -->
 <path id="xslt.processor.classpath">                                         (21)
  <pathelement path="/myjava/saxon655.jar" />  <!-- Saxon jar -->
  <pathelement path="/myjava/resolver.jar"/> <!-- resolver jar -->
  <pathelement path="${websiteHome}/extensions/saxon64.jar"/> <!-- docbook extensions -->
  <pathelement path="/sgml"/> <!-- for catalogManager.properties -->
 </path>

 <!-- Use latest javac -->
  <property name="build.compiler" value="modern"/>                           (22)


 <!--  -->
 <!--Initial processing: If needed.  -->
 <!--  -->
 <target name="init">
	<echo message="Do initialisational things" />                          (23)   
  <tstamp>
   <format property="TODAY_UK" pattern="d-MMM-yyyy" locale="en"/>
  </tstamp>
 <echo>building on ${TODAY_UK}</echo>
 </target>

    <!-- ================================================ -->
    <!-- Generate output (select as needed)               -->
    <!-- ================================================ -->
    <target name="output-all" depends="init,doMain,run-j">                    (24) 
 <!-- -->
    </target>
  
 
    <!-- ================================================ -->
    <!--      Use newlayout to create autolayout.xml             -->
    <!--      Needed if any files are added to the website       -->
    <!-- ================================================ -->
    <target name="doLayout" depends="init">                                                  
    <java classname="${xslt.processor.class}"                            (25) 
      fork="yes" 
      dir="${in.dir}"
      failonerror="true">
      <classpath refid="xslt.processor.classpath" />
      
     <arg line="-o ${out.dir}/${autolayout.outfile}"/>
     <arg line="-x org.apache.xml.resolver.tools.ResolvingXMLReader"/>
     <arg line="-y org.apache.xml.resolver.tools.ResolvingXMLReader"/>
     <arg line="-r org.apache.xml.resolver.tools.CatalogResolver"/>
     <arg line="${in.dir}/${autolayout.infile} ${autolayout.stylesheet} ${param.args.post}" /> 
    	</java>
    </target>
    
  
    <!-- ================================================ -->
    <!-- Generic XSLT-processor call (main docbook transform) -->
    <!-- ================================================ -->
    <target name="doMain">                                                     (26) 
	<java classname="${xslt.processor.class}" 
	      fork="yes" 
	      dir="${in.dir}"
	      failonerror="true">
	    <classpath refid="xslt.processor.classpath" />
  
     <arg line="-o ${out.dir}/${main.outfile}"/>
     <arg line="-l"/>
     <arg line="-x org.apache.xml.resolver.tools.ResolvingXMLReader"/>
     <arg line="-y org.apache.xml.resolver.tools.ResolvingXMLReader"/>
     <arg line="-r org.apache.xml.resolver.tools.CatalogResolver "/>
     <arg line="${in.dir}/${main.infile} ${main.stylesheet} ${param.args.post}" /> 
    	</java>
    </target>
    




    <!-- ================================================ -->
    <!-- Generic XSLT-processor call (website transform)  -->
    <!-- ================================================ -->
    <target name="doWebsite" >         <!-- depends="doLayout" -->            (27)   
	<java classname="${xslt.processor.class}" 
	      fork="yes" 
	      dir="${in.dir}"
	      failonerror="true">
	    <classpath refid="xslt.processor.classpath" />
  
     <arg line="-o ${out.dir}/${website.outfile}"/>
     <arg line="-x org.apache.xml.resolver.tools.ResolvingXMLReader"/>
     <arg line="-y org.apache.xml.resolver.tools.ResolvingXMLReader"/>
     <arg line="-r org.apache.xml.resolver.tools.CatalogResolver"/>
     <arg line="${in.dir}/${website.infile} ${website.stylesheet} ${param.args.post}" /> 
       	</java>
    </target>
    
  

    <!-- ================================================   -->
    <!--                run                               -->
    <!-- ================================================ -->

 <!-- Fill these in if you want to view files post build -->


<target name="run-j">                                                          (28) 

  <tstamp>
   <format property="TODAY_UK" pattern="d-MMM-yyyy" locale="en"/>
  </tstamp>


  <echo>View ${out.dir}/${main.outfile} output file.</echo>
  <echo>Completed at ${TODAY_UK}</echo>

</target>


</project>





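1

The base directory, set to the location of the input XML files: Modify to suit you.

2

The default target, output-all, which controls the sequence of targets that are run.

3

The root directory where all the docbook DTDs and stylesheets are: Modify to suit you.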

4

The directory where all the stylesheets are: Change if you change version.

5

The directory where the website DTD and stylesheets are: Modify if you change version

6

The main html stylesheet for docbook.

7

The website stylesheet

8

The autolayout stylesheet

9

The main input properties.

10

The directory holding the input XML documents

11

The input file being processed by the main XSL stylesheet: Modify to use your input file

12

Your layout file: Modify to use your input file for website layout

13

(Normally) the output of processing your layout file for website.

14

The section specifying the output files produced (N/A for website)

15

The output directory to use

(16)

The output file for use with main docbook stylesheet processing.

(17)

Final website output file, not normally used.

(18)

Output of the layout file processing

(19)

This is passed to the stylesheets as a parameter. It tells them that Saxon is the processor, so that they can use the Saxon extensions.

(20)

Selects the saxon processor class for XSLT processing

(21)

These set up the classpath for the xslt processor. Includes the resolver, saxon extensions by Norm and the catalogManager.properties file for use by the resolver.

(22)

Tells ant to use the latest (nearly) Java compiler. Nothing in this build file compiles Java, so it has little effect here.

(23)

Unused, but you could use it to clean out old files or whatever. This is the first 'target' that is run by ant.

(24)

Critical main control point. This selects which 'targets' are run. The example selects an init target, a doMain target and a run-j target. Modify this to select which targets you want. (Or add more targets)

(25)

This target runs the first stage of website processing. It creates the autolayout.xml file from your layout file (newlayout.xml in the example).

(26)

This target runs the main stylesheet against an input file to produce the required output file.

(27)

This target uses the output of the previous website processing, and produces all the separate files specified in autolayout.xml.

(28)

As in the pre-processing, this is unused, perhaps you might want some post processing.
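
Callout (24) says to edit the depends list to choose what runs. As a sketch, assuming you want the two website passes instead of the plain docbook transform, the output-all target could be changed as follows. The target names are those already defined in Example 2; this fragment replaces the existing output-all target and does not stand alone.

```xml
<!-- Replaces the output-all target in Example 2:
     doLayout builds autolayout.xml, then doWebsite builds the site -->
<target name="output-all" depends="init,doLayout,doWebsite,run-j"/>
```

Note that run-j still echoes the name of the main output file, so its message would need adjusting to match.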


Moving docbook.cat to Oasis xml format

Oh dear, that was a non-event. Two emacs macros and it's done! I wonder if the xml catalog was built with this in mind? There is a twist though. Since the team don't know where on your machine it is going to reside, there is no base statement in it. I've added one, as can be seen in Example 3, “docbook.cat.xml”.

Example 3. docbook.cat.xml

<?xml version="1.0" ?>
<catalog 
	 xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
	 xml:base="/usr/home/docbook/">                                              [1]
  <!-- ......................................................................-->
  <!-- Catalog data for DocBook XML V4.1.2 ....................................-->
  <!-- File docbook.cat.xml .....................................................-->
 <!-- Modified by DaveP, 2 May 2002 -->
  <!-- Please direct all questions, bug reports, or suggestions for
     changes to the docbook@lists.oasis-open.org mailing list. For more
     information, see http://www.oasis-open.org/.
 -->

  <!-- This is the catalog data file for DocBook XML V4.1.2. It is provided as
     a convenience in building your own catalog files. You need not use
     the filenames listed here, and need not use the filename method of
     identifying storage objects at all.  See the documentation for
     detailed information on the files associated with the DocBook DTD.
     See SGML Open Technical Resolution 9401 for detailed information
     on supplying and using catalog data.
 -->

  <!-- ......................................................................-->
  <!-- DocBook driver file ..................................................-->

<public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"                           [2]
   uri="docbookx.dtd"/>

  <!-- ......................................................................-->
  <!-- DocBook modules ......................................................-->

<public publicId= "-//OASIS//DTD DocBook XML CALS Table Model V4.1.2//EN" 
uri="calstblx.dtd"/>
<public publicId= "-//OASIS//DTD XML Exchange Table Model 19990315//EN" 
uri="soextblx.dtd"/>
<public publicId= "-//OASIS//ELEMENTS DocBook XML Information Pool V4.1.2//EN" 
uri="dbpoolx.mod"/>
<public publicId= "-//OASIS//ELEMENTS DocBook XML Document Hierarchy V4.1.2//EN"
uri= "dbhierx.mod"/>
<public publicId= "-//OASIS//ENTITIES DocBook XML Additional General Entities V4.1.2//EN"
uri= "dbgenent.mod"/>
<public publicId= "-//OASIS//ENTITIES DocBook XML Notations V4.1.2//EN"
uri= "dbnotnx.mod"/>
<public publicId= "-//OASIS//ENTITIES DocBook XML Character Entities V4.1.2//EN"
uri= "dbcentx.mod"/>

  <!-- ......................................................................-->
  <!-- ISO entity sets ......................................................-->

<public publicId= "ISO 8879:1986//ENTITIES Diacritical Marks//EN"
uri= "ent/iso-dia.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN"
uri= "ent/iso-num.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Publishing//EN"
uri= "ent/iso-pub.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES General Technical//EN"
uri= "ent/iso-tech.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Added Latin 1//EN"
uri= "ent/iso-lat1.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Added Latin 2//EN"
uri= "ent/iso-lat2.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Greek Letters//EN"
uri= "ent/iso-grk1.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Monotoniko Greek//EN"
uri= "ent/iso-grk2.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Greek Symbols//EN"
uri= "ent/iso-grk3.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Alternative Greek Symbols//EN"
uri= "ent/iso-grk4.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Added Math Symbols: Arrow Relations//EN"
uri= "ent/iso-amsa.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Added Math Symbols: Binary Operators//EN"
uri= "ent/iso-amsb.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Added Math Symbols: Delimiters//EN"
uri= "ent/iso-amsc.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Added Math Symbols: Negated Relations//EN"
uri= "ent/iso-amsn.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Added Math Symbols: Ordinary//EN"
uri= "ent/iso-amso.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Added Math Symbols: Relations//EN"
uri= "ent/iso-amsr.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Box and Line Drawing//EN"
uri= "ent/iso-box.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Russian Cyrillic//EN"
uri= "ent/iso-cyr1.ent"/>
<public publicId= "ISO 8879:1986//ENTITIES Non-Russian Cyrillic//EN"
uri= "ent/iso-cyr2.ent"/>

  <!-- End of catalog data for DocBook XML V4.1.2 .............................-->
  <!-- ......................................................................-->

</catalog>

I'd explain it, but I think the comments tell much of the story. This catalog tells the resolver classes where to find the actual docbook DTD and its ancillary bits. [1] is the major change I've made, simply to tell the resolver that this location is where all the docbook DTD stuff hangs out. [2] is the root of the docbook DTD.

I'd suggest this file is not modified.
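
To make the mapping concrete, here is a minimal sketch of a document that this catalog would serve. The public identifier in the DOCTYPE matches entry [2], so the resolver should hand the parser the local docbookx.dtd under the xml:base directory instead of fetching the system identifier over the network. The document content is made up for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<!-- With the catalog in place, the public identifier above resolves to
     /usr/home/docbook/docbookx.dtd rather than the remote URL -->
<article>
  <title>Catalog test</title>
  <para>If this parses without network access, the catalog is working.</para>
</article>
```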


CatalogManager.properties file

Moving on to the properties file.

The prime purpose of this file is to point to the catalog (the first one) to use. Example 4, “catalogManager.properties” shows an example.

Example 4. catalogManager.properties

#CatalogManager.properties

# 1 ..4
verbosity=2                                                              [1]

#If relative-catalogs is yes, relative catalogs in the catalogs property will be left relative; 
#otherwise they will be made absolute with respect to the base URI of this file. 
relative-catalogs=yes

# Always use semicolons in this list
catalogs=/sgml/catalog.xml                                               [2]

# either public or system
prefer=public

#this option controls whether or not a new instance of the resolver is constructed for each parse.
static-catalog=yes

#toggle whether or not the resolver classes obey the 
# <?oasis-xml-catalog?> processing instruction.
allow-oasis-xml-catalog-pi=yes

catalog-class-name=org.apache.xml.resolver.Resolver

[1] This controls the screen printout as catalogs are processed. Keep it as low as you need until something goes wrong. [2] This is the pointer to the catalog to use.
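
When something does go wrong, the verbosity can also be raised for a single run without editing the properties file, because the Apache resolver reads the Java system property xml.catalog.verbosity in preference to the file. The following is a sketch of a variant of the doMain target from Example 2; the target name doMainVerbose is made up, and the fragment relies on the properties and classpath defined in that build file.

```xml
<!-- Variant of the doMain target from Example 2, with a resolver
     setting passed as a system property to the forked JVM -->
<target name="doMainVerbose">
  <java classname="${xslt.processor.class}"
        fork="yes"
        dir="${in.dir}"
        failonerror="true">
    <classpath refid="xslt.processor.classpath"/>
    <!-- Raise the resolver's trace level for this run only -->
    <sysproperty key="xml.catalog.verbosity" value="4"/>
    <arg line="-o ${out.dir}/${main.outfile}"/>
    <arg line="-l"/>
    <arg line="-x org.apache.xml.resolver.tools.ResolvingXMLReader"/>
    <arg line="-y org.apache.xml.resolver.tools.ResolvingXMLReader"/>
    <arg line="-r org.apache.xml.resolver.tools.CatalogResolver"/>
    <arg line="${in.dir}/${main.infile} ${main.stylesheet} ${param.args.post}"/>
  </java>
</target>
```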


Next, the higher-level docbook catalog, catalog.xml. This file controls the operation of the resolver software, mapping one thing to another. As used here, it primarily resolves remote addresses (across the internet) to your locally installed copies of those files. Example 5, “Catalog.xml” shows this file, with comments.

Example 5. Catalog.xml

<?xml version="1.0" ?> 

 <!-- set the base to the location of your docbook installation
presumes a directory layout similar to Norm's:
  docbook
    file docbookx.dtd
  docbook-xsl-x.xx  (stylesheets directory, x.xx is the version)
  website-x.xxx     (website stuff)
 -->
<catalog 
	 xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
	 xml:base="/usr/home/docbook/">                                      [1]

 <!--This group handles all docbook items.  -->
 <!-- Use formal public identifiers in preference to SYSTEM identifiers -->
 <group prefer="public" >

 <!-- Main docbook DTD -->
 

<nextCatalog catalog="docbook.cat.xml"/>                                  [2]



 <!-- Main Website DTD (change uri for -custom and -full )-->
  <public publicId="-//Norman Walsh//DTD Website V2.1b1//EN"              [3]
	  uri="website-2.1b1/website.dtd"/>

 <!-- Autolayout dtd (for website) -->
<public publicId="-//Norman Walsh//DTD Website Auto Layout V1.0//EN"      [4]
	uri="website-2.1b1/autolayout.dtd"/>


 <!-- Entry for jrefentry -->
<public publicId="-//Norman Walsh//DTD JRefEntry V1.0//EN"                [5]
	uri="jrefentry-1.1/jrefentry.dtd"/>

 <!-- Chunk-website stylesheet -->
  <uri  
   name="http://docbook.sourceforge.net/release/xsl/current/html/chunker.xsl" [6]
   uri="docbook-xsl-1.50.0/html/chunker.xsl"/>

 <!-- main docbook stylesheet -->
<uri
     name="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl" [7]
   uri ="docbook-xsl-1.50.0/html/docbook.xsl"/>
 </group>

</catalog>


[1] All uri attributes use this as the base from which other files are found

[2] Next catalog in the chain, the docbook DTD catalog

[3] The website DTD

[4] The layout DTD, for setting file relationships in website.

[5] The jrefentry DTD

[6] The chunker stylesheets, for producing multiple html output files

[7] The main docbook stylesheet.
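
The entries above map public identifiers and two individual stylesheet URIs. The OASIS catalog format also allows matching on system identifiers and on URI prefixes, which saves listing every stylesheet one by one. The following is a sketch only, not part of the original catalog; it assumes the same xml:base and the same local directory names as the examples above.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
         xml:base="/usr/home/docbook/">

  <!-- One exact system identifier mapped to the local DTD -->
  <system systemId="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"
          uri="docbookx.dtd"/>

  <!-- Any URI starting with this prefix is redirected into the local
       stylesheet directory, so individual stylesheets need no entry -->
  <rewriteURI uriStartString="http://docbook.sourceforge.net/release/xsl/current/"
              rewritePrefix="docbook-xsl-1.50.0/"/>
</catalog>
```

As with the version numbers in Example 5, the rewritePrefix has to be changed whenever a new stylesheet release is installed.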


Conclusion.

For now, those are the main options (xsl-fo excepted). It wouldn't take much to repeat for fo, for plain single html output, etc. If needed, I'll add that.
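
As a rough idea of what repeating this for fo might look like, here is a sketch modelled on the doMain target. The names fo.stylesheet, fo.outfile and doFo are made up, and the fragment is meant to be added to the build file in Example 2, not run on its own. It assumes the stylesheet distribution has its usual fo/docbook.xsl.

```xml
<!-- Hypothetical additions to Example 2 for XSL-FO output -->
<property name="fo.stylesheet" value="${sSheetHome}/fo/docbook.xsl"/>
<property name="fo.outfile" value="op.fo"/>

<target name="doFo" depends="init">
  <java classname="${xslt.processor.class}"
        fork="yes"
        dir="${in.dir}"
        failonerror="true">
    <classpath refid="xslt.processor.classpath"/>
    <arg line="-o ${out.dir}/${fo.outfile}"/>
    <arg line="-x org.apache.xml.resolver.tools.ResolvingXMLReader"/>
    <arg line="-y org.apache.xml.resolver.tools.ResolvingXMLReader"/>
    <arg line="-r org.apache.xml.resolver.tools.CatalogResolver"/>
    <arg line="${in.dir}/${main.infile} ${fo.stylesheet} ${param.args.post}"/>
  </java>
</target>
```

The result is an XSL-FO file, which still has to be run through an FO formatter such as FOP to get PDF.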

Hope that's enough to get it working for you. If it's unclear, please let me know.

3.

Using ANT with XSLT

Robert Koberg


> Is there an easy way to invoke Saxon as the XSLT support within ANT?

The task, Multi-XSLT Ant Task: mtxslt at Sourceforge can be used to set up different processors.
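
As an alternative to a separate task, ant's built-in xslt task can be pointed at Saxon through its nested factory element. This is a sketch and not what Robert suggests above: the project, target and path names, the jar location and the file names are assumptions borrowed from the earlier examples, and it needs an ant release recent enough to support the factory element.

```xml
<project name="saxon-sketch" default="transform">

  <!-- Hypothetical jar location -->
  <path id="saxon.classpath">
    <pathelement location="/myjava/saxon655.jar"/>
  </path>

  <target name="transform">
    <!-- The nested factory element names the transformer factory to
         use; this class is the one shipped with Saxon 6.5.x -->
    <xslt in="dbtest.xml" out="op.html"
          style="/sgml/docbook/docbook-xsl/html/docbook.xsl">
      <classpath refid="saxon.classpath"/>
      <factory name="com.icl.saxon.TransformerFactoryImpl"/>
      <!-- Same stylesheet parameter as used in Example 2 -->
      <param name="saxon.extensions" expression="1"/>
    </xslt>
  </target>
</project>
```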

4.

ant

Dawid Weiss

Many people have expressed their satisfaction with what I've done to ease the burden of various DocBook configurations and ANT properties, so I think it would be valuable for others to know such a project exists. I basically wrote it for myself to have cross-project support for DocBook documentation generation into various output formats (PDF and HTML mostly) without having to reconfigure everything all over again or alter the build files.

Please feel free to have a look at the project, I'd be very eager to know your opinion about it. cs.put.poznan.pl (or feel lucky on Google: ant docbook)