DocBook Language Translation

1. Translation without markup change

1. Translation without markup change

Markus Schutz


> This is off-topic, but maybe some of you are sometimes faced with the 
> same problem:
>
> I have a long English DocBook document (440 pages in printed output)
> that I want to translate to German.
>
> There are two important points for me:
>
> - I don't want to retype all the markup in the English document,
> which took me so many hours. I want to reuse my precious
> markup. The structure of the target document will be exactly the same.
>
> - I want to use a translation program to produce a first, rough 
> version of a (mechanical) translation.
>
> What is a good approach to do this? Can these steps be mixed? Are
> there programs for this kind of job? Any experiences? Advice?

Well, I would not call this off-topic; it is in fact a common problem.

The good news is that there are tools to accomplish your task; the bad news is that they are commercial software, not open source.

The following ones have proven to do the job well:

- Trados Tag Editor by Trados (www.trados.com),
- Transit by Star Transit (www.star-transit.com),
- XLIFF Editor by Heartsome (www.heartsome.net) - which is the only Java-based and therefore platform-independent tool.

You could try to get an evaluation copy; there might also be specially priced licenses for research institutes.

All of these tools are translation memory systems. The basic idea is to use a fuzzy search engine to reuse phrases you have already translated. This implies that when you start from scratch, with no translation memory yet, you cannot pre-translate your document. There are machine-translation tools such as SYSTRAN, but I have no idea how well they fit the task or whether they preserve your markup.
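To make the translation memory idea concrete, here is a minimal Python sketch of a fuzzy lookup. It is not how any of the products above work internally; the `MEMORY` contents, the `best_match` helper, and the 0.75 threshold are all assumptions chosen for illustration, and the similarity score comes from the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: already translated source segments
# mapped to their approved German translations.
MEMORY = {
    "Click the Save button.": "Klicken Sie auf die Schaltfläche Speichern.",
    "Open the file menu.": "Öffnen Sie das Menü Datei.",
}


def best_match(segment, memory, threshold=0.75):
    """Return (source, translation, score) for the most similar stored
    segment, or None if nothing reaches the (assumed) threshold."""
    best = None
    for source, translation in memory.items():
        # ratio() is a similarity score between 0.0 and 1.0.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, translation, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


if __name__ == "__main__":
    # A new sentence that is close to, but not identical with, a stored one.
    print(best_match("Click the Cancel button.", MEMORY))
    # With an empty memory there is nothing to reuse.
    print(best_match("Click the Cancel button.", {}))
```

A translator would be shown the stored translation as a suggestion to adapt. The second call illustrates the point made above: without an existing memory, nothing can be pre-translated.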

Jordi Vilalta adds:

I'm currently working on a program called po4a (po for anything), which aims to help with the translation of documentation using the standard gettext po files. There is currently an SGML module that handles SGML DocBook documents (and some simple XML documents) pretty well. We are working on an XML module which will soon be able to translate DocBook XML documents completely.

po4a extracts the translatable strings from the original document and puts them in a po file. You can then edit this file manually or with one of the programs available for this (kbabel, gtranslator, poedit, emacs po mode...). The program itself can then generate a translated document from the original document and the translated po file, taking the structure from the original document and the translated strings from the po file.
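The extract-and-reinsert workflow described above can be sketched in a few lines of Python. This is not po4a's code or its file format: the `SOURCE` document, the `text_slots`, `extract`, and `apply_catalog` helpers, and the plain dictionary standing in for a po file are assumptions made for illustration. The sketch also splits text at every tag boundary, which is a simplification compared with translating whole paragraphs.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature DocBook-like document.
SOURCE = (
    "<article><title>User Guide</title>"
    "<para>Press the <emphasis>red</emphasis> button.</para></article>"
)


def text_slots(root):
    """Yield (element, 'text' or 'tail') for every non-blank text slot."""
    for elem in root.iter():
        for attr in ("text", "tail"):
            value = getattr(elem, attr)
            if value and value.strip():
                yield elem, attr


def extract(root):
    """Collect the translatable strings in document order."""
    return [getattr(elem, attr) for elem, attr in text_slots(root)]


def apply_catalog(root, catalog):
    """Replace each string that has a translation; keep the markup as is."""
    for elem, attr in list(text_slots(root)):
        original = getattr(elem, attr)
        setattr(elem, attr, catalog.get(original, original))


if __name__ == "__main__":
    root = ET.fromstring(SOURCE)
    print(extract(root))  # the strings a translator would see

    # Stand-in for the translated po file: original string -> translation.
    catalog = {
        "User Guide": "Benutzerhandbuch",
        "Press the ": "Drücken Sie den ",
        "red": "roten",
        " button.": " Knopf.",
    }
    apply_catalog(root, catalog)
    print(ET.tostring(root, encoding="unicode"))
```

The markup is never touched by the translator; only the strings travel through the catalog, so the structure of the target document stays exactly the same as the original. Strings without a catalog entry are left in the source language.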

This also helps with the maintenance of the translations. I hope it helps you. If it doesn't fit your needs, you're welcome to open a bug report or contact us on the project mailing list. You can find it at debian.org