1. Accessible docbook
OK, the short answer: the DocBook XSL stylesheets do a pretty good job of making it possible to create accessible HTML. The main features are:
There are still some hard-coded HTML formats that are not subject to control with CSS, but those are falling in number over time.
2. Cross Referencing using xref
The XSL stylesheets use another kind of template system to form xref text and other generated text strings (not to be confused with XSL apply-templates). It uses templates of text strings, much like the format strings used in printf statements, where some of the text is fixed and some is variable, to be substituted at runtime. The string templates are localized and can be heavily customized.
In your 1.41 distribution, you should find common/en.xml. That file contains all the generated text strings for processing with the default lang="en". Search in en.xml for <context name="xref">. That element contains <template> elements for each kind of element that can have automatically generated xref text. The default one for chapter says:
<template name="chapter" text="Chapter %n"/>
The %n represents the item number (the chapter number in this case). You can use %t to represent the title. So you could change it to:
<template name="chapter" text='Most Worthy Chapter %n, "%t"'/>
Note that I changed the enclosing attribute quote characters to permit using "" around the title.
BTW, don't be put off by the message at the top of the file, "Do not edit this file by hand!". That only applies if you are building the template files from the CVS source. Since the stylesheet distro doesn't come with the source or Makefile, you can ignore that comment and customize the file to suit your needs.
However, you can also do such customization without touching the original distro files. It takes a few more steps:
1. In the 'common' directory, copy en.xml to a new name like custom-en.xml and make your changes there.
2. Also in common, copy l10n.xml to a new name like custom-l10n.xml. Edit custom-l10n.xml to change <!ENTITY en SYSTEM "en.xml"> so that it references your customized file.
3. In your customization layer for your stylesheet, add another parameter: <xsl:param name="l10n.xml"
This sets the parameter named "l10n.xml" (yes, it sure looks like a filename) to your custom filename, which pulls in your custom English template file. This parameter should probably be listed in param.xsl, but it isn't.
Of course, if you were hoping for just a simple parameter that would turn on chapter titles, well, it isn't there. The template system gives you a great deal of control, but it is a bit more complex. I'm updating the doc on XSL customization to include stuff like this.
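As a rough sketch of steps 2 and 3 above, a customization layer might look something like the following. The file names custom-en.xml and custom-l10n.xml come from the steps above; the relative paths and the location of the imported docbook.xsl are assumptions that depend on where your customization layer lives.

```xml
<?xml version="1.0"?>
<!-- Customization layer (a sketch; all paths are assumptions, adjust to your layout).
     In common/custom-l10n.xml the entity declaration is assumed to have been
     changed to:  <!ENTITY en SYSTEM "custom-en.xml">  -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Pull in the stock HTML stylesheet first -->
  <xsl:import href="../html/docbook.xsl"/>
  <!-- Point the l10n.xml parameter at the customized localization file -->
  <xsl:param name="l10n.xml" select="document('../common/custom-l10n.xml')"/>
</xsl:stylesheet>
```

Because the parameter is set in the importing stylesheet, it takes precedence over the default value without any change to the distributed files.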
3. How do I make a cross reference to another part of my document?
You make a cross reference by identifying where you want to go, and then making a link to it. To identify where you want to go, you add an id attribute to the target element, for example <table id="MyJulySales">. To form a link to that table, you have two DocBook elements to choose from. In both cases you add a
Why not just use If you
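A minimal sketch of the two linking elements, assuming they are xref and link (the usual DocBook choices) and that both point at the target through a linkend attribute. The section and table content are made up for illustration.

```xml
<section id="sales-section">
  <title>Sales</title>
  <!-- The target: any element carrying an id attribute -->
  <table id="MyJulySales">
    <title>July sales</title>
    <tgroup cols="2">
      <tbody>
        <row><entry>Widgets</entry><entry>42</entry></row>
      </tbody>
    </tgroup>
  </table>
  <!-- xref is empty: the stylesheet generates the link text -->
  <para>The totals are in <xref linkend="MyJulySales"/>.</para>
  <!-- link wraps text that you supply yourself -->
  <para>See <link linkend="MyJulySales">the July figures</link> for details.</para>
</section>
```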
4. How do I link to someone else's website?
If you need to link to someone else's web site, then use the ulink element:
For more information visit <ulink url="http://www.acme.com">the Acme website</ulink>.
If you omit the text inside the element, then the url string is also used as the hot spot text. You can also pop the link up into a new window if you are processing to HTML. To get a new window, add to your ulink a
5. Can I do conditional text in DocBook?
Conditional text means you have some text in your document that you only want to output when certain conditions are met. For example, maybe you need to write documentation that covers an application running on two different operating systems. Rather than write two separate documents that are mostly the same, you can write one document and mark as conditional the text elements that are specific to each operating system. This is typically done by assigning different values to attributes such as
Once you have your document marked up this way, you can generate different versions by using the profiling stylesheet
The profiling stylesheet is included in the DocBook XSL distribution in
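To make the idea concrete, here is a small sketch of conditionally marked-up text. It assumes the os attribute is the one used for profiling, with the illustrative values linux and windows; the section content is invented. The profiling stylesheet would then be run with a matching parameter (profile.os in the DocBook XSL distribution) to keep only one variant.

```xml
<section id="install">
  <title>Installation</title>
  <!-- No profiling attribute: always included -->
  <para>Download the archive for your platform.</para>
  <!-- Kept only when profiling for linux -->
  <para os="linux">Unpack the archive with <command>tar</command>.</para>
  <!-- Kept only when profiling for windows -->
  <para os="windows">Unpack the archive with your zip utility.</para>
</section>
```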
6. How do I do conditional titles when only one title is allowed in DocBook elements?
If you are doing conditional text, you may run into the need to have conditional titles as well. But DocBook elements such as
The solution is to keep one
<chapter><title>
<phrase os="linux">Creating symbolic links</phrase>
<phrase os="windows">Creating shortcuts</phrase>
</title>
The
7. Is there any way to generate change bars from a DocBook document?
Yes, it can be done either by hand or semi-automatically. You can use the revisionflag attribute to track changes. If you have two versions of the document, you can use diffmk to add the revisionflag attributes automatically. Then process the document with changebars.xsl and you'll get something like change bars; see http://www.w3.org/TR/2000/REC-xml-20001006-review.html.
P.S. I have a Java version of diffmk that is in some ways better than the Perl version. I'm working on getting it released.
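A short sketch of hand-applied revisionflag markup, of the kind diffmk would otherwise generate. The paragraph text is invented; added, changed and deleted are among the values DocBook defines for the attribute.

```xml
<section id="setup">
  <title>Setup</title>
  <!-- Unchanged text carries no flag -->
  <para>Install the package.</para>
  <!-- Text that was reworded since the previous version -->
  <para revisionflag="changed">Edit the configuration file before first use.</para>
  <!-- Text that is new in this version -->
  <para revisionflag="added">Restart the service to pick up the changes.</para>
  <!-- Text that was removed; the stylesheet decides how to show it -->
  <para revisionflag="deleted">Reboot the machine.</para>
</section>
```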
8. Changing language in a document, how to markup
9. xref to figure, or numbering figures.
Then you want: <figure id="image_segments">
10. How to include C source code in docbook
One approach is the "literate programming" one of having the DocBook source write the C source when processed. This approach interests me enough that I wrote the code to do it -- it's available at http://www.west-point.org/users/usma1978/36200/LitProg/SGMLWEB/index.htm.
Rafael R. Sevilla adds
I have another approach that is a little more general:
11. How to mark up emacs keyboard codes
For the C-h C-f example, try this:
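One way to mark up such a key sequence, offered as a sketch rather than as the markup the answer originally showed, is to nest keycombo elements: an outer one with action="seq" for the sequence and inner ones with action="simul" for each chord. Spelling the modifier as Ctrl is an assumption about house style.

```xml
<para>Press
  <keycombo action="seq">
    <!-- C-h: Control and h pressed together -->
    <keycombo action="simul"><keycap>Ctrl</keycap><keycap>h</keycap></keycombo>
    <!-- C-f: Control and f pressed together, after C-h -->
    <keycombo action="simul"><keycap>Ctrl</keycap><keycap>f</keycap></keycombo>
  </keycombo>.
</para>
```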
12. Mathml in docbook
There is a MathML customization for DocBook (see oasis). With it you can use MathML inside equation and inlineequation. However, typing MathML without any supporting tools is very slow going.
Peter Ring adds
The 'free' comment was meant for something else: the WebEQ Math Viewer, the TeXaide equation editor and the IceSoft browser. But now that we are at it, do you know if there are any other good free (GPL, BSD, whatever) tools for authoring and rendering MathML besides Amaya, PassiveTeX and a few others? There's a somewhat lacking page at W3C, and a list with links at the same place.
Mozilla will also be able to display (some) MathML; there's a project developing MathML support in Gecko (the rendering engine) that will certainly 'spill over' to other projects. See Mozilla. To check if your current incarnation of Mozilla is built with MathML, try texvsmml.xml. You may also try with Amaya, but you'll be disappointed, because the page doesn't validate. There's an interesting MathML+Mozilla site at pear.math.pitt.edu.
13. remap DocBook elements to other element
Nik Clayton mentioned a way to remap DocBook elements to other element names -- that is, to take something like this:
<helpproject status="draft" remap="article">
and turn it into this:
<article status="draft">
Here's a simple stylesheet that I think will do it.
<?xml version='1.0'?>
To make the best use of it, you'd also want to do some steps to add support for it in your DTD customization layer:
If you're creating a DocBook customization layer, you'd end up with an ATTLIST that looked something like this: <!ATTLIST helpproject If you do that, you don't have to include "remap" attributes in your document instances. But your XSLT engine will need access to your customization layer DTD; you won't be able to correctly process a document if it doesn't include a DOCTYPE declaration that references your customization-layer DTD where the #FIXED attributes are declared. This might not seem like a proper use of "remap", but I think Eve Maler and Jeanne El Andaloussi's "Developing SGML DTDs: From Text to Model to Markup" provides some support for using it that way. They discuss a "remap" attribute in that book (Conversion Markup section (8.4.2) of the Markup Model Design and Implementation chapter) and outline a couple different things it might be used for, including "transforming SGML documents to conform to a different DTD (or to the same DTD but with different or augmented contents)." Bob Stayton adds Clearly the remap attribute is intended to capture the former element name for a transformed element so that something special can be done with it. Your concern about proper use of remap comes from the phrase "previous markup scheme" in the description of the attribute in the Definitive Guide. You are using remap essentially for the "next markup scheme" instead of previous, right? Well, consider modifying your transformation to make it round trip. Change your stylesheet to add a remap attribute to your generated Docbook elements to indicate which help element they came from. Then you can write the reverse stylesheet that transforms a Docbook document back to the help customization. It reads the remap value of a Docbook element to decide which help element to create, and adds a remap value to indicate which Docbook element it came from. That makes the Docbook element name "previous". 
Even if you don't actually perform the reverse transformation, you have defined it and therefore can use remap as you have described, in a virtual sense. The documents are isomorphic, remapping to each other.
Anyway, I think it would be useful for your transformation to capture which help element it came from in case some special handling is desired. Who knows, there may be cases where you need to do the reverse transformation. Consider the initial conversion of an existing DocBook document to a help document, for example.
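The forward remapping described above can be sketched in XSLT 1.0 as follows. This is an illustrative reconstruction under the stated idea (rename any element that carries a remap attribute to the name held in that attribute), not the stylesheet from the original posting.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- An element with a remap attribute is re-created under the name
       given in remap; its other attributes and its content are kept -->
  <xsl:template match="*[@remap]">
    <xsl:element name="{@remap}">
      <xsl:apply-templates select="@*[name() != 'remap']|node()"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Applied to <helpproject status="draft" remap="article">, this produces an article element that keeps status="draft". To make the transformation round trip as Bob Stayton suggests, the second template could also write a remap attribute holding the original element name.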
14. Markup for daemon name
<systemitem role="daemon"> or <command role="daemon"> depending on the
15. Add a title inside a list item
Use
16. Setting column widths
By using the colwidth attribute, for example <informaltable>
See the CALS table spec for a complete description of relative width specs in CALS tables.
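A small sketch of colwidth in use, mixing relative widths (the star notation from the CALS table model) with one fixed width. The column names and cell text are invented for illustration.

```xml
<informaltable>
  <tgroup cols="3">
    <!-- c2 gets twice the share of c1 from the space left over
         after the fixed-width column c3 is laid out -->
    <colspec colname="c1" colwidth="1*"/>
    <colspec colname="c2" colwidth="2*"/>
    <colspec colname="c3" colwidth="1.5in"/>
    <tbody>
      <row>
        <entry>narrow</entry>
        <entry>twice as wide</entry>
        <entry>fixed</entry>
      </row>
    </tbody>
  </tgroup>
</informaltable>
```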
17. How to mark up a translator of a document
I think this is the kind of thing <othercredit> ("A person or entity, other than an author or editor, credited in a document") is intended for. You can put a role attribute on <othercredit> to qualify it: <othercredit role="translator">.
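In context, that might look like the sketch below. The names, the book title and the contrib text are placeholders, and the role value translator is a convention you would need to handle in your own stylesheets.

```xml
<bookinfo>
  <title>Example Book</title>
  <author><firstname>Ann</firstname><surname>Author</surname></author>
  <!-- role qualifies what kind of credit this is -->
  <othercredit role="translator">
    <firstname>Tom</firstname><surname>Translator</surname>
    <contrib>French translation</contrib>
  </othercredit>
</bookinfo>
```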
18. How to cross reference to tables and images
You do this with <xref>s. An example is easiest. First give the element you want to reference an id, e.g. <figure id="my_figure">. Then anywhere you want to refer to it, use <xref linkend="my_figure"/>, which gets converted to "Figure x", where x is the figure number. With other elements, the appropriate noun is inserted in place of "Figure".
Incidentally, you probably want to use element ids that describe what they label, e.g. <figure id="fig-image_description">, <section id="section-section_about_topic_x">. It's not compulsory in the slightest, but I'd argue that it makes the xrefs more legible when you're writing the document.
19. Markup for synonyms in glossary
Example below
<glossentry id="loop.infinite">
20. Markup for notes
If the note is shown elsewhere in the document, like a Figure, Example, or Theorem, I would use xref. If the note is published externally then a bibliography would be appropriate.
21. Marking up protocols
I definitely wouldn't use 'token'. A token is supposed to be a logically atomic unit of information. Examples of this are reserved words, operators, variable names, and '{' braces in the C programming language. By "atomic", I mean that a token is logically indivisible in a given context (e.g. you can break the token 'return' into smaller groups of letters, but if you're trying to parse a C program, it isn't meaningful to do so).
I would use 'systemitem'. Or, failing that, 'phrase'. In either case, the 'role' attribute should probably be used. For example:
<systemitem role="protocol">HTTPS</systemitem>
Now, perhaps you're thinking "gee, this is pretty darn verbose". If so, don't feel ashamed of yourself. Verbosity has disadvantages: the more work you make it for someone to say or do something, the less likely it is that they will. This leads to inconsistent markup.
Furthermore, the 'role' attribute is an open-ended hook. Unless you customize your stylesheets to allow only certain values (but if you're doing this, why not just add a protocol element?), or check for non-sanctioned values in your processing application (i.e. stylesheets), there exists the possibility that someone may misspell "protocol", which will cause semantic loss of that instance.
Another issue to consider is that perhaps a 'protocol' element one day gets added to the DocBook vocabulary, or you use a customization layer with it. It seems troublesome to go back and change all the documents that were previously written in order to gain the advantages of having this element (which may prevent you from ever getting around to adding it).
"What's to be done about all this?", you may ask. I have a solution that I devised in a markup language with much more ad hoc semantics (dare I say it...? LaTeX): use indirection. Indirection allows you to centralize the translation, which enhances manageability and consistency.
Specifically, I recommend that you define a general entity, and reference it for every case in which you want to mention HTTPS:
<!ENTITY Https "
Then, in your document, you can use it as:
At the expense of host-side stream encryption and per-transaction key generation, as well as client-side decryption (requiring a compatible browser), &Https; enables secure web transactions to be performed over insecure networks.
Now, not only is it easy to type, but if you make a spelling error (aside from leaving off the "s", perhaps), the parser will flag it as an error. Furthermore, you've centralized this definition, so that if the name of the standard, or the tags you want to use, ever change, you only have to make an edit in one place.
Putting this definition in an external parameter entity allows you to share it between multiple documents. I usually group multiple such entity definitions, by subject matter, into external parameter entities. That way, I benefit from reuse, centralized management, and consistency across multiple documents and authors.
One thing to watch out for is naming conflicts (especially since the first definition of an entity will silently override any subsequent ones). Therefore, I prefix each entity name with a prefix that is common to the file in which it's defined. Each file then has a unique prefix. (You may recognize this technique from C or other programming languages that have no formal mechanism for precluding namespace collisions.)
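The indirection technique can be sketched as a complete small document. The entity body shown here is an assumption built from the systemitem example earlier in this answer, and the article content is invented.

```xml
<?xml version="1.0"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [
  <!-- One central definition: change the markup here and every use follows -->
  <!ENTITY Https '<systemitem role="protocol">HTTPS</systemitem>'>
]>
<article>
  <title>Secure transactions</title>
  <!-- A misspelled entity name such as &Htpps; would be a parse error -->
  <para>&Https; enables secure web transactions over insecure networks.</para>
</article>
```

If a protocol element is ever introduced, only the entity declaration needs to change; the documents that reference &Https; stay as they are.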
22. user name and groupname markup
The next version of DocBook will support 'username' and 'groupname' as explicit class values for systemitem. I suggest that you use role in the short term.
23. Whitespace problem in indexterm
Consider your source:
For clarity, let's replace spaces by '+' characters:
<para>Why+does+adding+indexterms+cause+spaces+to+appear+here:
Now, unless you've taken special care, multiple adjacent spaces are generally treated as a single space, so we can reduce this to:
<para>Why+does+adding+indexterms+cause+spaces+to+appear+here:
Now, remove the index terms and what's left?
<para>Why+does+adding+indexterms+cause+spaces+to+appear+here:
Those spaces are not "adjacent", unfortunately, so each one is produced in the output. That's the source of the extra spaces.
Unfortunately, the only way to avoid this problem is either to put the indexterms between paragraphs (which is logically wrong) or to make sure that you don't introduce extra spaces with your index terms:
<para>Why does adding indexterms cause spaces to appear here:<indexterm>
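A sketch of index terms placed so that they contribute no whitespace to the paragraph text. The index entries and the trailing sentence are made up. The line breaks fall inside the indexterm elements, which hold only element content, so the only text-level space left is the single one before "the".

```xml
<para>Why does adding indexterms cause spaces to appear here:<indexterm>
<primary>indexterms</primary><secondary>whitespace</secondary>
</indexterm><indexterm>
<primary>whitespace</primary>
</indexterm> the answer lies in where the line breaks fall.</para>
```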
24. Default encoding
Strictly speaking, it's "or use UTF-16 with a byte-order mark", since you can have a byte-order mark with UTF-8.
UTF-16 without a byte-order mark (BOM) can be mistaken for a number of other encodings, hence you need the BOM if you're omitting the encoding declaration. Both UTF-16 without the BOM and the 'number of other encodings' need to have the encoding declaration so the XML processor can determine the encoding. UTF-16 with both the BOM and an encoding declaration is okay, too.
8-bit text without an encoding declaration is expected to be UTF-8. Hence, if the text isn't UTF-8, you need the encoding declaration.
UTF-8 text with the BOM (EF BB BF) and without an encoding declaration should be recognised as UTF-8. However, using the BOM with UTF-8 wasn't mentioned in the Unicode Standard, Version 2.0 (which was current when XML 1.0 was published), so some early XML processors weren't designed to recognise the UTF-8 BOM. The UTF-8 BOM was not mentioned in Appendix F of XML 1.0, but is mentioned in Appendix F of XML 1.0 Second Edition (and was mentioned in the version of ISO/IEC 10646 current when XML 1.0 was published).
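For the 8-bit case, a sketch of a file that is not UTF-8 and therefore needs the declaration. The choice of ISO-8859-1 and the sample text are illustrative, and the file is assumed to be saved in that encoding.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Without the encoding declaration above, a parser would assume UTF-8
     and could reject the Latin-1 bytes of the accented characters below. -->
<para>Café, naïve, Zürich</para>
```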
25. HTML to docbook
Command Prompt makes a rather good and quite cheap product. It is also quite easy to use. commandprompt.com
26. MathML and docbook
If you need MathML, you can use DocBook with the MathML module, available on the oasis site.
27. Inserting external code into docbook
You can use the following construct to include external code. You only need the inlinemediaobject wrapper if you are using 4.1.2.
With the 4.2 DTD, you can use textobject directly in programlisting:
Also, the text insertion process is an extension function, and is not available in xsltproc. You can use saxon, but you must include the saxon extensions jar file in your CLASSPATH, and you must set these two parameters to nonzero: use.extensions and textinsert.extension.
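A sketch of the 4.2 form described above, with textobject placed directly in programlisting. It assumes textdata with a fileref attribute is the element that names the external file, and hello.c is a hypothetical file name.

```xml
<!-- Everything stays on one line so that no stray whitespace
     ends up in the verbatim listing -->
<programlisting><textobject><textdata fileref="hello.c"/></textobject></programlisting>
```

The file is only pulled in when the processor supports the text insertion extension and both use.extensions and textinsert.extension are nonzero, as noted above.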
28. Lists and white space
The content model for <para> includes #PCDATA, which means any white space is significant. The stylesheet should pass any white space through to the output. What happens to the white space then depends on the viewing application. How are you processing your content?
HTML browsers have their own idea of how to display whitespace. If you are generating HTML, can you tell if the whitespace is getting through in both cases 1 and 3 above? I don't see either of those spaces in my browser when I process your example, but the white space is in the HTML.
29. Executive Summary
Try <abstract><title>Executive Summary</title> <para>...</para> </abstract> in the *info wrapper for starters.
30. Docbook 4.2 Image semantics
I spent some time today (May 2002) working on new code to map DocBook V4.2 image semantics (a superset of previous semantics) to HTML. A number of compromises were required along the way. I probably won't be able to post the new code until I get back home, but here are the notes I wrote as I went. Comments, etc., most welcome.
The HTML img element only supports the notion of content-area scaling; it doesn't support the distinction between a content-area and a viewport-area, so we have to make some compromises.
1. If only the content-area is specified, everything is fine. (If you ask for a three inch image, that's what you'll get.)
2. If only the viewport-area is provided:
- If scalefit=1, treat it as both the content-area and the viewport-area. (If you ask for an image in a five inch area scaled to fit, we'll make the image five inches to fill that area.)
- If scalefit=0, ignore it.
Note: this is not quite the right semantic and has the additional problem that it can result in anamorphic scaling, which scalefit should never cause.
3. If both the content-area and the viewport-area are specified on a graphic element, ignore the viewport-area. (If you ask for a three inch image in a five inch area, we'll assume it's better to give you a three inch image in an unspecified area than a five inch image in a five inch area.)
Relative units also cause problems. As a general rule, the stylesheets are operating too early and are too loosely coupled with the rendering engine to know things like the current font size or the actual dimensions of an image. Therefore:
1. We use a fixed size for pixels, $pixels.per.inch
2. We use a fixed size for "em"s, $points.per.em
Percentages are problematic. In the following discussion, we speak of width and contentwidth, but the same issues apply to depth and contentdepth.
1. A width of 50% means "half of the available space for the image." That's fine. But note that in HTML, this is a dynamic property and the image size will vary if the browser window is resized.
2. A contentwidth of 50% means "half of the actual image width". But the stylesheets have no way to assess the image's actual size. Treating this as a width of 50% is one possibility, but it produces behavior (dynamic scaling) that seems entirely out of character with the meaning. Instead, the stylesheets define a $nominal.image.width.in.points and convert percentages to actual values based on that nominal size.
Scale can be problematic. Scale applies to the contentwidth, so a scale of 50 when a contentwidth is not specified is analogous to a width of 50%. (If a contentwidth is specified, the scaling factor can be applied to that value and no problem exists.) If scale is specified but contentwidth is not supplied, the nominal.image.width.in.points is used to calculate a base size for scaling.
Warning: as a consequence of these decisions, unless the aspect ratio of your image happens to be exactly the same as (nominal width / nominal height), specifying contentwidth="50%" and contentdepth="50%" is NOT going to scale the way you expect (or really, the way it should). Don't do that. In fact, a percentage value is not recommended for content size at all. Use scale instead.
Finally, align and valign are troublesome. Horizontal alignment is now supported by wrapping the image in a <div align="{@align}"> (in block contexts!). I can't think of anything (practical) to do about vertical alignment.
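The cases in these notes can be illustrated with a few imagedata fragments. The file name is hypothetical, and the comments restate the mapping described above rather than documenting released stylesheet behavior.

```xml
<mediaobject>
  <imageobject>
    <!-- Content-area only: a three inch wide image -->
    <imagedata fileref="figures/diagram.png" contentwidth="3in"/>
  </imageobject>
</mediaobject>

<mediaobject>
  <imageobject>
    <!-- Viewport only, scaled to fit: treated as a five inch wide image -->
    <imagedata fileref="figures/diagram.png" width="5in" scalefit="1"/>
  </imageobject>
</mediaobject>

<mediaobject>
  <imageobject>
    <!-- scale instead of a percentage contentwidth, centered horizontally -->
    <imagedata fileref="figures/diagram.png" scale="50" align="center"/>
  </imageobject>
</mediaobject>
```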
31. Image markup
Interesting problem. I never came across this one, although I have to admit that sooner or later it was bound to happen. As you already noticed, the percent sign is not allowed in entity values because it starts parameter entity references. [http://www.w3.org/TR/2000/REC-xml-20001006#NT-EntityValue]:
According to [http://www.w3.org/TR/2000/REC-xml-20001006#entproc]
I tried to use a character reference (&#37;) to replace the percent sign and it worked with saxon 6.5.2. The following example document...
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"/home/maiersn/share/sgml/docbkx412/docbookx.dtd" [
<!ENTITY figure '<mediaobject><imageobject>
<imagedata fileref="image-file" width="75&#37;"/></imageobject></mediaobject>'>
]>
<article class="techreport">
&figure;
</article>
...produces the following html output, which again includes the percent sign in the width attribute's value of the img element...
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title></title>
<meta name="generator" content="DocBook XSL Stylesheets V1.50.0">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<div class="article">
<div class="titlepage">
<hr>
</div>
<div class="mediaobject">
<img src="image-file" width="75%">
</div>
</div>
</body>
</html>
32. biblioentry and bibliomixed
The difference between biblioentry and bibliomixed is very simple. Between elements in bibliomixed you must put punctuation manually; when using biblioentry this punctuation is added automatically by the stylesheet (and of course it will probably be added in a different way than you want or expect).
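A side-by-side sketch of the two elements. The author, title and date are placeholders, not a real reference.

```xml
<bibliography>
  <!-- biblioentry: only the data; the stylesheet supplies punctuation -->
  <biblioentry id="author02">
    <author><firstname>Ann</firstname><surname>Author</surname></author>
    <title>Example Title</title>
    <pubdate>2002</pubdate>
  </biblioentry>

  <!-- bibliomixed: the punctuation between elements is typed by hand -->
  <bibliomixed id="author02mixed">
    <author><surname>Author</surname>, <firstname>Ann</firstname></author>.
    <title>Example Title</title>, <pubdate>2002</pubdate>.
  </bibliomixed>
</bibliography>
```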
33. Table problems in fo output
| 1) If I declare the size of the column in the colspec-tag like
If you don't specify any widths, implementations are free to choose widths in the manner that you describe. Some do, some don't. Some do better than others. Rather than use explicit units of measure, you can use relative widths. In a two column table with
<colspec colwidth="3*"/> <colspec colwidth="2*"/>
the implementation is free to choose the width, but the ratio of the width of the first column to the second will be 3/2.
It's a bug. It will be fixed in 1.50.1, when it's released. I believe that it's fixed in 1.50.1-EXP2. But that's an experimental release.
| 3) I tried to change the padding of the columns in a driver-file the
Look in the FO file. If they show up there, you're seeing a formatter bug. If they don't, it's a stylesheet bug. I believe it's the former, but I've been wrong before :-)
[Ednote:] That's a minority view!
34. website problems, setting directories
Try
35. Resume
Try XMLResume at sourceforge.net. It's a vocabulary specifically designed for resumes and includes stylesheets to generate text, HTML, and PDF.
Mark Derricutt adds
Take a look at HR-XML (hr-xml.org); there's an industry standard on resume XML layout.
HR-XSL: sourceforge
This is an open-source project that uses HR-XML to generate resumes from an XML master, so I think it answers the original question. Plus, HR-XSL uses DocBook XML as an intermediate representation, so it's definitely relevant to the DocBook project.
36. Content re-use in docbook
Just a small technical note. The current version of the DocBook XSL stylesheets is able to do profiling on the fly together with conversion to HTML or FO in a single transformation. This makes the whole process more user-friendly, as the user does not need to perform any additional steps. For more info look at: sourceforge.net
37. Text direction and language
I don't know of one. Wouldn't the direction be determined by the lang?
> 2) What is the correct way to specify the language used within the
Either the 'lang' or the 'xml:lang' attribute would be correct (note they use lowercase letters). The 'lang' attribute is declared in the DocBook DTD for just about every element. The 'xml:lang' attribute is outside the DocBook DTD, of course, but is defined in the xml namespace specifically for that purpose. The DocBook XSL stylesheets support both.
38. Produce a back cover
You need to decide which elements in the markup describe or contain the matter for the back cover. Then produce some DSSSL to handle the markup and lay it out as you want (borrow heavily from the stuff that does the front cover).
39. Table markup
Try it like this:
<informaltable frame="none">
<tgroup cols="4">
<colspec colnum="3" colsep="1"/>
<tbody>
<row>
<entry>a</entry>
<entry>b</entry>
<entry>c</entry>
<entry>d</entry>
</row>
<row>
<entry>e</entry>
<entry>f</entry>
<entry>g</entry>
<entry>h</entry>
</row>
<row rowsep="1">
<entry>i</entry>
<entry>j</entry>
<entry>k</entry>
<entry>l</entry>
</row>
<row>
<entry>m</entry>
<entry>n</entry>
<entry>o</entry>
<entry>p</entry>
</row>
</tbody>
</tgroup>
</informaltable>
That should work for FO and for HTML with CSS.
40. Link to biblioentry
see <xref linkend="walsh97"/>
41. Reference to biblioentry
>I have a need to write biblioentries that do not
For the HTML Style (V1.52) exchange:
<xsl:text>[</xsl:text>
<xsl:choose>
<xsl:when test="local-name($node/child::*[1]) = 'abbrev'">
<xsl:apply-templates select="$node/abbrev[1]"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$node/@id"/>
</xsl:otherwise>
</xsl:choose>
<xsl:text>] </xsl:text>
With:
<xsl:choose>
<xsl:when test="local-name($node/child::*[1]) = 'abbrev'">
<xsl:text>[</xsl:text><xsl:apply-templates
select="$node/abbrev[1]"/><xsl:text>]</xsl:text>
</xsl:when>
<xsl:otherwise>
<xsl:if test="$node/@id"><xsl:value-of
select="$node/@id"/></xsl:if>
</xsl:otherwise>
</xsl:choose>
42. How to set the language?
> I try all properties and it work fine except language. I make a
The correct way to set the language is to use the lang attribute in your document instance. E.g.:
<?xml ...?>
<!DOCTYPE ...>
<book lang="fr">
...
</book>
43. Including parts of a book using entities
I'm afraid a system entity cannot have a DOCTYPE declaration. It's a "feature" of the SGML and XML specs that I've never quite understood. I can understand not wanting to mix different doctypes, but if they all use the same doctype, why not permit it? The parser could certainly ignore any extra doctype declarations when it read the system entities.
So you end up with system entity files without a DOCTYPE, which means they are not valid files and cannot be validated. It breaks the concept of modular documentation. Some editor programs like emacs and ArborText can dynamically add a DOCTYPE declaration to a system entity based on a processing instruction.
If you care to switch to XML, I've developed a system of modular DocBook using XIncludes and olinks, where each module is maintained as a valid file. It is described at: my website
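A minimal sketch of the XInclude side of such a modular setup. The chapter file names are hypothetical and the DocBook version is an assumption. Each included file can carry its own DOCTYPE and be validated separately, while the master file needs an XInclude-aware processor to be assembled before validation or transformation.

```xml
<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Modular book</title>
  <!-- Each included file is a complete document with its own DOCTYPE -->
  <xi:include href="chapter1.xml"/>
  <xi:include href="chapter2.xml"/>
</book>
```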
44. Index markup
index starts an index. indexdiv divides an index into sections. indexentry is an entry in the index. I think those are pretty clear.
Within an indexentry, the primary, secondary, and tertiary elements are used in order to make sure that the index accurately reflects the common understanding of an index. The problem with
<indexentry>
<item>foo
<item>bar</item>
</item>
</indexentry>
is two-fold. First, did the indexer really mean that? Did they really mean to have bar as a second level term, or did they intend for it to be a third level term and accidentally delete one? Second, suppose you're trying to fix an index. You've printed it out and gone through and found inconsistencies. It's a lot easier to find the inconsistencies if the levels have distinct names than if you have to count start tags.
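For comparison, a sketch of the same entry with explicitly named levels. In a hand-built index the DocBook element names are primaryie, secondaryie and tertiaryie; the terms themselves are invented.

```xml
<index>
  <title>Index</title>
  <indexdiv>
    <title>F</title>
    <indexentry>
      <!-- Each level has its own element name, so a missing level
           is visible without counting start tags -->
      <primaryie>foo</primaryie>
      <secondaryie>bar</secondaryie>
      <tertiaryie>baz</tertiaryie>
    </indexentry>
  </indexdiv>
</index>
```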
45. Book Title, why are there two?
Well, much of the metadata in bookinfo is printed, on the title page. That's where the author information goes, for example. In general, metadata is information *about* the book, rather than part of the book's subject content.
I think title came first, and bookinfo was added later as a way to collect the little bits of metadata that were building up as DocBook evolved. It is possible to have a title in both places, but the stylesheet 'policy' in XSL is to use bookinfo/title if that is available, otherwise use title.
46. How to use callouts
Callouts are a good option. In HTML they do take a bit of tweaking, but they look good. It took me some time to imitate Norm, but this is what works for me now. I can put them at any position on the line.
<example id="bl.ch">
<title>Chapter page sequence</title>
<programlisting format="linespecific">
&lt;fo:page-sequence
  master-reference="chaps"
  initial-page-number="1"
  format="1">  <co id='pnf'/>
  &lt;fo:static-content
  flow-name="xsl-region-before">
  &lt;fo:block text-align="outside">  <co id='hd'/>
  Chapter &lt;fo:retrieve-marker
  retrieve-class-name="chapNum"/>  <co id='cn'/>
  &lt;fo:leader leader-pattern="space" />
  &lt;fo:retrieve-marker
  retrieve-class-name="chap"/>  <co id='ttl'/>
  &lt;fo:leader leader-pattern="space" />
  Page &lt;fo:page-number font-style="normal" /> <co id='pna'/>
  of &lt;fo:page-number-citation ref-id='end'/>  <co id='lp'/>
  &lt;/fo:block>
  &lt;/fo:static-content>
</programlisting>
<calloutlist>
<callout arearefs="pnf">
<para>The page number is formatted in Arabic numerals</para>
</callout>
<callout arearefs="hd">
<para>this block forms the header on these pages</para>
</callout>
<callout arearefs="cn">
<para>The chapter number is retrieved as a marker</para>
</callout>
<callout arearefs="ttl">
<para>The chapter title is retrieved as a marker.</para>
</callout>
<callout arearefs="pna">
<para>The page number is added</para>
</callout>
<callout arearefs="lp">
<para>Last page number of document</para>
</callout>
</calloutlist>
</example>
One additional refinement to Dave's example. You can make the links bidirectional by adding an id to the <callout> and a linkends to the <co>:
<co id='pnf' linkends="pnf-callout" />
...
<callout arearefs="pnf" id="pnf-callout">
Then you can click on the callout bug in the code to go to its explanatory text, and click on the callout label bug to go back to the code location. At least with XSL. | |||||||||||
47. | beginpage element | ||||||||||
You can't, unless you are willing to do some pretty heavy customizing of the chunking stylesheet. This element is not designed to produce chunking output. (although you are not alone in thinking that it is). See "DocBook: The Definitive Guide" for a description of the beginpage tag (docbook.org). The DocBook XSL stylesheets don't respond to <beginpage/> unless some customization was done. There is no facility in the chunking stylesheets for arbitrary chunks. A DocBook file is chunked at chapter and section breaks. You have some control over which sections cause a break using stylesheet parameters. See sourceforge for a description of the HTML stylesheet parameters. | |||||||||||
48. | Nesting Sections in Simplified docbook | ||||||||||
All the sections have to come "last". In other words, if you have this structure:
<section>
<title>Some Title</title>
<!-- point A -->
<para>Some material</para>
<!-- point B -->
</section>
You can add a new section at "point B" but not at "point A" (because that would put the paragraph after the section and that's not allowed). | |||||||||||
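As an illustration of the rule, a sketch with the nested section placed at "point B"; the nested title and paragraph text are invented for the example.

```xml
<section>
  <title>Some Title</title>
  <para>Some material</para>
  <!-- point B: nested sections follow all the paragraphs -->
  <section>
    <title>Nested Title</title>
    <para>Nested material</para>
  </section>
</section>
```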
49. | Image size problems | ||||||||||
I've been working on solutions for the same problem. I, too, found the two-different-imageobjects solutions less than satisfactory (that whole dual source issue). So far, my best solution has been external to DocBook. In GIMP or Photoshop, I can alter the resolution to find the optimum pixels/inch (pixels/mm) for a given size. An image that looks good in the browser at, say, 640 pixels wide will, at a resolution of 72 pixels/inch (2.8346 pixels/mm), be too wide (8.9 in [225.8 mm]) for standard page outputs. Resetting the resolution to 110 pixels/inch (4.33 pixels/mm) reduces the print version to a printable width of 5.8 in (147.8 mm) while leaving the total pixel width unaffected for HTML output. | |||||||||||
50. | Including parts of my docbook | ||||||||||
This is usually done by putting all your entity definitions and references in a single file (say "entities.ent"), and just referencing this one from your master doc. | |||||||||||
51. | alt text or d link on images? | ||||||||||
Enclose the description in a phrase, not a para. Text wrapped in <phrase> goes to the ALT attribute; text wrapped in anything else (such as <para>) is stored on a separate HTML page as a long description.
...
<textobject>
<phrase> This produces alt text </phrase>
</textobject>
....
...
<textobject>
<para> This produces a separate, d-linked HTML file </para>
</textobject>
| |||||||||||
52. | Indexing | ||||||||||
It looks like the FO indexing machinery needs a bit of work. The way I read the stylesheets, if you want an index in print output, you add an <index> element. The $generate.index parameter is not consulted. It should be. You'll still need to include an empty <index> element to tell the stylesheet where you want it, though. Also, the selection of index terms for an index in the 'generate-index' template is "//indexterm[...]", so make sure indexterm elements are present in the document. | |||||||||||
53. | Why simpara tag? | ||||||||||
Some users want to prevent paragraphs from containing "block" elements (as HTML does). The simpara element gives them an alternative to para that has the semantics they want. And they can make a customization layer that's a proper subset of DocBook simply by removing the 'para' element from the DTD. | |||||||||||
54. | Are there any good uses for pi's | ||||||||||
Absolutely. I am totally exasperated by the folks that want to remove PIs from XML. "Here's your Swiss Army Knife, norm, oh, but we broke off the small blade (the internal subset) and we've removed the tweezers (PIs), because you don't really need those. And for good measure we welded the corkscrew open (thou shalt always put elements in a namespace). Is there anything else we can do to help you?"
Inappropriate? Hmm. I'd say that it was inappropriate to put proper information content in there: <p>This is a paragraph</p> <?p This is a special paragraph that is really important but I jammed it in a PI?> That'd be bad.
More-or-less. Maybe PI targets should have been allowed to be QNames, I don't know. Clearly they have to be named with the same considerations as they are a flat space. I've used "dbhtml" and "dbfo" in the DocBook stylesheets which may be a bit too broad. What can you do with the PIs in the DocBook stylesheets?
Notations, alas, are a good idea that never really took off. And since the WXS idea of notations and the DTD idea are pretty different, they're probably dead.
I think so. Certainly I've seen other styles, but as a stylesheet author, it's very convenient. The alternative is either some other tokenizing strategy or a different PI target name for each possible parameter. | |||||||||||
55. | Image file extension selection | ||||||||||
That's a great method of selecting a graphic format if your build system is set up for it. I'm not sure why you are objecting, though, as you can do this now, without any change to the stylesheets, right? FYI, starting with the 1.59 XSL stylesheets, there is another way of selecting a graphic format at runtime. If the 'use.role.for.mediaobject' is nonzero, then a role="html" in an imageobject will select that imageobject (and its imagedata child) when processed by the html stylesheet. Likewise for a value of "fo". So you can set up a mediaobject like this:
<mediaobject>
<imageobject role="html">
<imagedata fileref="myImageFile.png" format="PNG"/>
</imageobject>
<imageobject role="fo">
<imagedata fileref="myImageFile.pdf" format="PDF"/>
</imageobject>
</mediaobject>
When you process this with the html stylesheet, you get the PNG graphic, and when you process with the fo stylesheet, you get the PDF graphic. For the xhtml stylesheet, you can add an object with role="xhtml" if you want a different one, otherwise the stylesheet falls back to selecting role="html". If you want finer control, such as the situation you describe here with PDF for PDF output and EPS for Postscript output, then you can use any role values you want. Then you pass the selected role value in a command line parameter 'preferred.mediaobject.role'.
<mediaobject>
<imageobject role="html">
<imagedata fileref="myImageFile.png" format="PNG"/>
</imageobject>
<imageobject role="eps">
<imagedata fileref="myImageFile.eps" format="EPS"/>
</imageobject>
<imageobject role="fo">
<imagedata fileref="myImageFile.pdf" format="PDF"/>
</imageobject>
</mediaobject>
To select the EPS format, set the stylesheet parameter preferred.mediaobject.role="eps" on the command line. This method is admittedly more verbose than yours, but is more flexible per object. It gives the author the opportunity to add graphics attributes for individual output formats to optimize each one, for example. It also fulfills the original design goal of the mediaobject wrapper: to allow the author to specify several potential objects, one of which is selected at processing time. Closely related: Bob Stayton tells us
You need to put the gif in single quotes within the double quotes: <xsl:param name="graphic.default.extension" select="'gif'"/> Otherwise the stylesheet thinks you are trying to select the element named gif instead of a string. It's a common mistake. | |||||||||||
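The same quoting rule applies to the role parameters described above. A minimal customization-layer sketch, assuming you prefer setting them in a stylesheet rather than on the command line; the import path is hypothetical and must point at your own installation.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Hypothetical path: point this at fo/docbook.xsl in your distribution -->
  <xsl:import href="/path/to/docbook-xsl/fo/docbook.xsl"/>
  <!-- Nonzero turns on role-based selection of imageobjects -->
  <xsl:param name="use.role.for.mediaobject" select="1"/>
  <!-- A string value: single quotes inside the double quotes -->
  <xsl:param name="preferred.mediaobject.role" select="'eps'"/>
</xsl:stylesheet>
```

Without the inner single quotes, select="eps" would be read as an XPath selecting an element named eps, which is the mistake described above.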
56. | How to not number figures | ||||||||||
If you don't want to put a title on a figure, then use 'informalfigure' instead of 'figure'. informalfigure doesn't take a title. If you want to change how the title is presented, i.e., without the "Figure 1", then you need to customize the generated text for figure. See: sagehill.net for how to do that. | |||||||||||
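A sketch of the first option, an untitled and therefore unnumbered figure; the file name diagram.png is hypothetical.

```xml
<informalfigure>
  <!-- no title element, so no "Figure N" label is generated -->
  <mediaobject>
    <imageobject>
      <imagedata fileref="diagram.png" format="PNG"/>
    </imageobject>
  </mediaobject>
</informalfigure>
```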
57. | abbrev tag needed in output | ||||||||||
There is no DocBook XSL template that outputs the HTML <abbr> tag. And there is no attribute on acronym that is for a title. There is some discussion in the DocBook Technical Committee about annotations like that, but nothing in the DTD yet. If you wanted to stuff your title in the xreflabel attribute, you could customize the stylesheet by adding something like this to your stylesheet customization layer:
<xsl:template match="acronym">
<abbr>
<xsl:attribute name="title"><xsl:value-of select="@xreflabel"/></xsl:attribute>
<xsl:call-template name="inline.charseq"/>
</abbr>
</xsl:template>
The xreflabel isn't quite the right attribute to use, but there isn't a better one at this point. | |||||||||||
58. | table titles | ||||||||||
If you don't want table titles, then you should use <informaltable> instead of <table>. The only difference is the <title> element. | |||||||||||
59. | Create an index | ||||||||||
Put empty <index/> element in a place where index should occur. | |||||||||||
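A minimal sketch of a book with one index term and the empty index element marking where the generated index should appear; the titles and text are invented for the example.

```xml
<book>
  <title>Example Book</title>
  <chapter>
    <title>Stacks</title>
    <para>A stack<indexterm><primary>stack</primary></indexterm>
    is a last-in, first-out structure.</para>
  </chapter>
  <!-- empty index: the stylesheet fills it from the indexterm elements -->
  <index/>
</book>
```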
60. | What is simpara for? | ||||||||||
So that customizers could limit users to a para element that didn't allow block content. | |||||||||||
61. | Reference to a glossary entry | ||||||||||
You can use glossterm element for this purpose. | |||||||||||
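For example, a sketch in which an inline glossterm points at a glossary entry through its linkend attribute; the id gl.xml and the definition text are hypothetical.

```xml
<book>
  <title>Example Book</title>
  <chapter>
    <title>Input</title>
    <!-- inline reference to the glossary entry below -->
    <para>The parser reads <glossterm linkend="gl.xml">XML</glossterm> input.</para>
  </chapter>
  <glossary>
    <glossentry id="gl.xml">
      <glossterm>XML</glossterm>
      <glossdef><para>Extensible Markup Language.</para></glossdef>
    </glossentry>
  </glossary>
</book>
```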
62. | Reference a bibliography | ||||||||||
This citation: <xref linkend="BrodyArticle"/> to this bibliography entry: <biblioentry id="BrodyArticle"> <abbrev>brody98</abbrev> <author>... </biblioentry> will generate this citation text: [brody98] At least with the XSL stylesheets. | |||||||||||
63. | Equations | ||||||||||
By figure tag, I presume you mean a title? There is the equation tag, but that requires a title. There is also informalequation, which does not require a title. You can enter math text this way:
<informalequation>
<mediaobject>
<textobject>
<phrase>Ψ(n,k)</phrase>
</textobject>
</mediaobject>
</informalequation>
| |||||||||||
64. | Repeated reference to screenshot | ||||||||||
An xref generates the link text from the object being pointed to. In this case, there is no text to generate. You could put the screenshot inside of a figure, give the figure a title, and put the id on the figure element. Then the xref has some text it can generate. | |||||||||||
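A sketch of that arrangement; the id fig.login, the title, and the file name are hypothetical. Every xref to the figure id can then generate its text from the figure title and number.

```xml
<section>
  <title>Logging in</title>
  <figure id="fig.login">
    <title>Login dialog</title>
    <screenshot>
      <mediaobject>
        <imageobject>
          <imagedata fileref="login.png" format="PNG"/>
        </imageobject>
      </mediaobject>
    </screenshot>
  </figure>
  <!-- can be repeated anywhere in the document -->
  <para>See <xref linkend="fig.login"/>.</para>
</section>
```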
65. | How to set the language | ||||||||||
You can put a lang="de" attribute in an element that starts a page-sequence (chapter, etc) to add a language="de" property for that page-sequence. If you want it for the whole document, put lang="de" in the document's root element, or set the command line parameter l10n.gentext.language="de". Either one should put the language="de" attribute in fo:root. | |||||||||||
66. | Entity problems | ||||||||||
Because SGML traditionally has used SDATA entities. There's a bug there, however, in that if %sgml.features; is true, the entity declaration should be: <!ENTITY euro SDATA "[euro ]"><!-- euro sign, U+20AC NEW --> If adding SDATA fixes the problem, please let me know. If not, try adding your own declaration for the euro entity pointing to the Unicode code point. That should work, if your SGML processor understands Unicode. (SGML has all sorts of magic in the SGML Declaration to handle multiple character sets.) | |||||||||||
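A sketch of such a local declaration, shown in XML form with the DocBook XML 4.2 DTD as an assumed context; an SGML document and processor may need a different character reference syntax. A declaration in the internal subset takes precedence over the one in the DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
  <!-- local override: euro sign as Unicode code point U+20AC -->
  <!ENTITY euro "&#x20AC;">
]>
<book>
  <title>Prices</title>
  <chapter>
    <title>Rates</title>
    <para>The fee is 10 &euro;.</para>
  </chapter>
</book>
```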
67. | Entity sets | ||||||||||
There is now a collection of entity definitions hosted at the W3C. This is a mixture of an update to the existing data at W3C mathml, with most of the text rephrased to be less MathML-specific, together with a new draft text of an update to ISO/IEC TR 9573 to include Unicode (ISO 10646) definitions of the entities rather than SGML SDATA entities. All the data and scripts to produce the site are also available, linked from the overview page. Currently the definitions are identical to the definitions in the forthcoming MathML 2 2nd edition PR draft. The draft ISO/IEC DTR 9573 contains tables detailing differences between these current definitions and the definitions used by DocBook, HTML, and the STIX Consortium. It is hoped that this _draft_ set of definitions might form the basis of a shared, compatible set of definitions between different XML languages, so that the current situation where <mo>&asymp;</mo> changes meaning if it is copied from a docbook+mathml document to an xhtml+mathml document might be avoided... | |||||||||||
68. | Shared entity definitions | ||||||||||
You can create a single file containing all of your shared entity declarations and then include it in each of your files. For example, you could have a file named "global.ent" that would look something like:
<!ENTITY mystring "My String">
<!ENTITY another_string "Another String">
Then, the DOCTYPE declaration for each file would be:
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "http://www...." [
<!ENTITY % global_entities SYSTEM "global.ent">
%global_entities;
]>
(Note the use of the percent sign (%) when declaring such "parameter entities" that include DTD declarations.) | |||||||||||
69. | Entity reference to include external code listings | ||||||||||
The entity pointed to by an entityref must have an NDATA type as declared in the DTD. DocBook declares linespecific for this, so change your entity declaration to: <!ENTITY generic-request.dtd SYSTEM "/home/r_exley/tmp/code/xml/generic-request.dtd" NDATA linespecific> Then the entityref should work. The validation process should have pointed out this problem.
Inside a programlisting element, all white space is preserved, including that before and after your textobject. So change it to: <programlisting><textobject>
<textdata entityref="generic-request.dtd" />
</textobject></programlisting>
The whitespace inside textobject should be ignored, though. | |||||||||||
70. | Adding another mediatype | ||||||||||
For validation, you need to add to the list of notations supported by the DTD. You can extend the list yourself in the internal subset of the DTD. In your docbook file: <!DOCTYPE book PUBLIC etc. [ <!ENTITY % local.notation.class "| SWF"> <!NOTATION SWF SYSTEM "SWF"> ]> <book> ... In your imagedata or videodata element, add a format attribute:
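For instance, as a sketch with a hypothetical file name (movie.swf), using the format value that matches the SWF notation declared above:

```xml
<mediaobject>
  <videoobject>
    <!-- format must match the notation name added to the DTD -->
    <videodata fileref="movie.swf" format="SWF"/>
  </videoobject>
</mediaobject>
```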
Then you need to customize the stylesheet to update the template named 'is.graphic.format' to accept your format string:
<xsl:template name="is.graphic.format">
<xsl:param name="format"></xsl:param>
<xsl:if test="$format = 'SVG'
or $format = 'PNG'
or $format = 'JPG'
or $format = 'JPEG'
or $format = 'linespecific'
or $format = 'GIF'
or $format = 'GIF87a'
or $format = 'GIF89a'
or $format = 'BMP'
or $format = 'SWF'">1</xsl:if>
</xsl:template>
If you use videoobject, then the HTML output will be <embed>. If you use imageobject, then the HTML output will be <img>. I'm not sure what happens after that. 8^) | |||||||||||
71. | Confusing entities | ||||||||||
Shows &#8216; - ed
One problem is that the entity names, e.g., rsquor, aren't always easily matched to Unicode names. At [1], I see: The latest entities at [3] and the draft revision of ISO 9573 at [4] agree with the above, but that's not too surprising since David, Norm, and I worked together on this, and I don't remember revisiting these quotes, so we might have missed something. I wonder where "rising" comes from in the iso-pub file? I wonder if the "r" at the end of these really means "reversed" in the case of the right quotes? I wonder what the 'r' means at the end of the left quotes, since there are neither reversed nor rising left quotes and, in fact, the iso-pub comment says they are "rising, low-9" quotes. I wonder if the answers to such questions will ever really be known or lost in time (somewhere around 1985 when this stuff was originally done)? My best guess is that the correct mappings are as follows: ldquo ISOnum 0x201C # LEFT DOUBLE QUOTATION MARK [correct now] but I could be wrong. Input from others who can substantiate the correct mapping would be appreciated. [1] http://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/SGML.TXT | |||||||||||
72. | markup for generated content | ||||||||||
Personally, I tend to use custom XML processing instructions for this, since these are clearly targeted at ... a processor. For your example I'd use something like the following mark-up:
<para> Signed <?gentext Location.City ?>, <?gentext Date.Long ?>: Christian </para>
<para> (Processed by <?gentext Processor.Name ?> at <?gentext Time.Short ?>) </para>
Of course you could use pseudo-attributes, like in "<?gentext type='Location.City' default='Munich' ?>" But these are just my own preferences. Others might do things differently. Btw., this template helps if you use that approach: http://docbook.sourceforge.net/release/xsl/current/doc/lib/lib.html#pi-attribute | |||||||||||
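One way such processing instructions might be handled, as a minimal XSLT 1.0 sketch rather than anything taken from the DocBook stylesheets. The parameter names gentext.city and gentext.processor and their default values are hypothetical; in a real customization layer you would also import the DocBook stylesheet.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Hypothetical parameters holding the values to substitute -->
  <xsl:param name="gentext.city" select="'Munich'"/>
  <xsl:param name="gentext.processor" select="'xsltproc'"/>

  <!-- Replace each <?gentext key ?> PI with the value for its key -->
  <xsl:template match="processing-instruction('gentext')">
    <!-- the PI data may carry trailing whitespace, so normalize it -->
    <xsl:variable name="key" select="normalize-space(.)"/>
    <xsl:choose>
      <xsl:when test="$key = 'Location.City'">
        <xsl:value-of select="$gentext.city"/>
      </xsl:when>
      <xsl:when test="$key = 'Processor.Name'">
        <xsl:value-of select="$gentext.processor"/>
      </xsl:when>
      <!-- unknown keys are flagged rather than silently dropped -->
      <xsl:otherwise>
        <xsl:text>[unknown gentext: </xsl:text>
        <xsl:value-of select="$key"/>
        <xsl:text>]</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Keys such as Date.Long and Time.Short are not covered here and would fall into the otherwise branch; XSLT 1.0 has no built-in date function, so those values would have to be passed in as parameters too.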
73. | How to mark up a persons middle initial? | ||||||||||
I would use <othername>. <personname> <firstname>Rune</firstname> <othername>E.</othername> <surname>Lausen</surname> </personname> | |||||||||||
74. | Multiple index terms | ||||||||||
I thought this was going to be really hard, because the indexing machinery [in the xsl] is so complex and difficult to follow. But actually, you can accomplish this without customizing the stylesheet at all. You can do it entirely in your source document. The sortas attribute can be used to separate them:
<indexterm><primary>stack</primary></indexterm>
<indexterm><primary sortas="stack classname">
<classname>stack</classname></primary></indexterm>
If all the instances of the classname indexterm have a sortas attribute that differs from the non-classname indexterms, then they will be treated as different index entries. You'll have to be careful to get them to sort together if you have other "stack somethings". You'll have to come up with a naming scheme, maybe using a character that occurs before all letters in the sort order. Combined with the earlier customization, you should get what you want. | |||||||||||
75. | Modular docbook books? | ||||||||||
I've written a description of using XInclude and olinks to create modular DocBook books. See this reference: Bobs site It means you don't have to have any special features in your XML editor, but it does mean your XSLT processor must be able to handle XIncludes. Ednote. Must read this. | |||||||||||
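A minimal sketch of a master document assembled with XInclude; the chapter file names are hypothetical. With xsltproc, the --xinclude option resolves the inclusions before the stylesheet runs.

```xml
<?xml version="1.0"?>
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Modular Book</title>
  <!-- each chapter lives in its own file -->
  <xi:include href="chapter1.xml"/>
  <xi:include href="chapter2.xml"/>
</book>
```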
76. | Including XML in programlisting tags | ||||||||||
You must either escape the characters that trigger parsing:
<programlisting>
&lt;settings&gt;
&lt;server name="identifier" hostname="fqdn" port="3000"/&gt;
&lt;server name="identifier" hostname="fqdn" port="3300"/&gt;
&lt;/settings&gt;
</programlisting>
Or enclose the code in a CDATA section:
<programlisting><![CDATA[
<settings>
<server name="identifier" hostname="fqdn" port="3000"/>
<server name="identifier" hostname="fqdn" port="3300"/>
</settings>
]]></programlisting> | |||||||||||
77. | Index markup | ||||||||||
I have a recommendation for the process. It will go faster if whoever is creating the indexterms is set up to process the book and generate the index. If you have done much indexing you know that a good index is made by an iterative process. The first pass of adding entries will have small inconsistencies in vocabulary, groupings, see, and see also. The indexer has to be able to process the entries, review the index, and make adjustments in the indexterms. |