At the Lisbon Geant4 International Workshop (2006), we agreed to move all Geant4 users' manuals except the 'Physics Reference Manual' from HTML to DocBook. This short note is a basic guide to using DocBook when you update or revise a G4 manual.
DocBook is a markup language just like (X)HTML. This means you use tags to mark up the structure and semantics of a document. For example, a paragraph is marked up by a <para> and </para> pair in DocBook, where a <p> and </p> pair is used in (X)HTML. Although DocBook has a much richer tag set, there is a very good correspondence between (X)HTML and DocBook tags. If you already know the (X)HTML tags well, using DocBook is therefore straightforward. A later section provides a table mapping (X)HTML tags to DocBook tags. By referring to this table, you can immediately start to update a Geant4 manual written in DocBook. However, before making a jump-start it is better to have some general knowledge of DocBook. In the next section, we give a gentle introduction to DocBook using a simple code example. We hope that this makes your jump-start a little smoother.
You may wonder what the merit of moving to DocBook is if there is such a good correspondence between (X)HTML and DocBook tags. The following are the major merits of moving.
Although we have said above that there is a very good correspondence between (X)HTML and DocBook tags, there are a number of key differences in the basic way a document is marked up in DocBook.
The easiest way to understand the basics of DocBook and its differences from (X)HTML is to read a simple DocBook document. In the following we provide a simple DocBook code and its annotation. We believe that, once you understand this simple example, you can make a jump-start on writing the DocBook version of a G4 users' manual by referring to the (X)HTML-DocBook mapping table in Section 3. If you want a more systematic tutorial, pick one of those shown in References. For a detailed explanation of all DocBook tags, see the "O'Reilly on-line book" shown in References.
The following example is based on a heavily abbreviated version of the 'Installation Guide' of the Geant4 User Manual.
<book> --- (1)
<bookinfo> --- (2)
<title>Installation Guide</title> --- (3)
</bookinfo> --- (4)
<!-- ================================================ Chapter -->
<chapter id="ch1"> --- (5)
<title>Installation Introduction</title> --- (6)
<para> --- (7)
This section describes the global computing environment required
for installing the Geant4 toolkit. To set up your specific
computing environment for Geant4, refer to Section 2 of this Guide.
.......
</para> --- (8)
</chapter> --- (9)
<!-- ================================================ Chapter -->
<chapter id="ch2"> --- (10)
<title>Installation Procedures</title> --- (11)
<para>
Before installing Geant4, the required software listed in
<link linkend="sec1.1">section 1.1</link> --- (12)
(and <link linkend="sec1.2">1.2</link>
in the case of graphics drivers) of this Installation Guide must
already be installed on your system.
</para>
<para>
In this section, a short tutorial on how to install the toolkit's
kernel libraries is given. The installation of the Geant4 kernel
libraries and ..............
</para>
<!-- ================================================ Section -->
<sect1 id="sec1.1"> --- (13)
<title>Using the <literal>Configure</literal> Script</title> --- (14)
<para>
A shell script is provided for building the libraries and to
allow easy installation in a specified area. .....
</para>
<itemizedlist> --- (15)
<listitem><para>the compiler to be used</para></listitem> --- (16)
<listitem><para>the path where the Geant4 toolkit is to be
installed (<literal>$G4INSTALL</literal>)</para></listitem>
<listitem><para>definition of installation directory paths
(optional)</para></listitem>
<listitem><para>.......</para></listitem>
</itemizedlist> --- (17)
<!-- ============================================ Sub-Section -->
<sect2 id="sec1.1.1"> --- (18)
<title>Configuring the Environment to Use Geant4</title> --- (19)
<para>
Once libraries have been installed, the user's environment must
be correctly set up for the usage of the Geant4 toolkit. ......
</para>
<para>
To generate the configuration scripts, the user should run
<literal>Configure</literal> placed in the installation area,
as follows:
<programlisting> --- (20)
> $G4INSTALL/Configure
</programlisting> --- (21)
.......
</para>
</sect2> --- (22)
</sect1> --- (23)
<!-- ================================================ Section -->
<sect1 id="sec1.2"> --- (24)
<title>Installing Geant4 Manually</title> --- (25)
<para>
Before proceeding with the installation, some key environment
variables must be defined in your user environment in order to
specify where all ......
</para>
</sect1> --- (26)
</chapter> --- (27)
</book> --- (28)
The following is an HTML output generated from the above DocBook code:
There are several ways to get this HTML output from the simple example code.
1) xsltproc
This is an XSLT stylesheet processor written in C. It is available in most
standard Linux environments. Under Windows it is available
through 'cygwin'.
For example, under the CERN lxplus environment, run the following to get an HTML output from the above example DocBook code:
lxplus$ xsltproc \
/usr/share/sgml/docbook/xsl-stylesheets-1.65.1-2/html/docbook.xsl \
example.xml > output.html
In the above 'xsltproc' command, '/usr/share/sgml/docbook/xsl-stylesheets-1.65.1-2/html/docbook.xsl' is the stylesheet used to convert DocBook to HTML. The file 'example.xml' contains the simple example code. The file 'output.html' is generated if there is no error in 'example.xml'. You can view 'output.html' with your favorite web browser.
When you want to run 'xsltproc' on your machine you have to take the following steps:
2) xalan
This is a Java-based stylesheet processor. If you have a standard
Java environment you can probably use it. Note that you cannot use it
under the CERN lxplus environment: when we tested it, we got an error
reporting not enough space for the object heap.
If 'xalan' is available in your environment, run the following.
your-machine$ java org.apache.xalan.xslt.Process -in ./example.xml \
-xsl /usr/share/xml/docbook/stylesheet/nwalsh/current/html/docbook.xsl \
> output.html
3) Firefox and Microsoft Internet Explorer
Using the so-called 'XML-stylesheet Processing Instruction', you can view
a DocBook document directly in a web browser that supports XSLT stylesheets,
for example Firefox or MS-IE.
For viewing, add the XML declaration, the 'xml-stylesheet' processing instruction, and the DOCTYPE declaration shown below at the top of your DocBook document:
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="/usr/share/sgml/docbook/xsl-stylesheets-1.65.1-2/html/docbook.xsl" ?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"/usr/share/sgml/docbook/sgml-dtd-4.3-1.0-25/docbook.dtd" >
<book>
<bookinfo>
<title>Installation Guide</title>
</bookinfo>
.......
Then you can directly view the xml file, for example:
your-machine$ firefox your-file.xml
[Memo]
The location of the two files ('docbook.xsl' and 'docbook.dtd') depends on your working environment. You may need to change the path to these files if you are not working under the 'cern/lxplus' environment.
4) XXE and Vex
You can also process a DocBook file using a so-called XML editor. XXE is an
XML editor provided by the company 'XmlMind'. It is available in two editions,
Standard Edition and Professional Edition. The Standard Edition is free and
still has enough capability for our purpose.
Good points of this tool are
You can get this tool from the following site:
Another choice of an XML editor is 'Vex'. It is also free though it requires a Java environment. You can get this tool from the following site:
5) Which tool to use?
More tools are available, either freely or commercially, though the ones
mentioned above are good enough for our purpose.
You may ask which of the four tools above we recommend. You may be attracted by WYSIWYG tools like XXE or Vex, but we recommend using 'xsltproc' or 'xalan' because they are the most commonly used in the DocBook community.
The following annotates the simple DocBook code above.
<book>
  <part>
    <chapter>
      <sect1>
        <sect2>
          .....
          <sect5>
            <para>
              .....
            </para>
          </sect5>
          .....
        </sect2>
      </sect1>
    </chapter>
  </part>
</book>
As seen in the above figure, you need to specify explicitly where a document element starts and ends. For example, you mark where a chapter starts with <chapter> and where it ends with </chapter>. Likewise, for a section you use <sectN> and </sectN> (N=1,2,..5).
This is one of the major differences from (X)HTML, where you do not specify where a chapter/section ends.
Also note that you may start a DocBook file from a lower-level tag in the hierarchy, such as <chapter>, <sect1>,.....
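As an illustration of the two points above, here is a minimal sketch of a file whose root element is <chapter> rather than <book>, with every section closed explicitly. The ids and titles are hypothetical and are not taken from any Geant4 manual.

```xml
<?xml version="1.0"?>
<!-- Hypothetical stand-alone file: the root element is chapter, not book -->
<chapter id="ch.example">
  <title>Example Chapter</title>
  <para>Introductory text of the chapter.</para>

  <sect1 id="sec.example.first">
    <title>First Section</title>
    <para>Text of the first section.</para>

    <sect2 id="sec.example.first.detail">
      <title>A Sub-Section</title>
      <para>Text of the sub-section.</para>
    </sect2>  <!-- the sub-section ends here, before its parent section -->
  </sect1>    <!-- the first section ends here -->

  <sect1 id="sec.example.second">
    <title>Second Section</title>
    <para>Text of the second section.</para>
  </sect1>
</chapter>
```

In (X)HTML the same structure would be a flat sequence of <h1>, <h2> and <h3> headings. In DocBook the nesting itself carries the hierarchy, so a <sect2> must be closed before its enclosing <sect1> is closed.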
In this section, we show a table mapping (X)HTML tags to DocBook tags. The mapping is not complete, but it covers all (X)HTML tags that appear in the most recent version of the Geant4 User Manuals. By referring to this table, we expect that you can immediately start to update a Geant4 manual in DocBook.
Once you have changed a DocBook version of a Geant4 document, you must verify your modifications using one of the tools described in Section 2.2. For this we recommend the following unit test, i.e. processing only the file you modified.
Suppose you have changed some part of the file 'introduction.xml' of 'User Manual - For Application Developers'. Then a possible way to validate your change is to execute the following command (under the CERN/lxplus environment):
lxplus$ xsltproc \
/usr/share/sgml/docbook/xsl-stylesheets-1.65.1-2/html/docbook.xsl \
introduction.xml > introduction.html
If there is an error in your modification, 'xsltproc' will tell you where it occurs. Correct the error and process the file again. Once there are no more errors, you will get the output 'introduction.html'. Please never commit an unverified DocBook file to CVS.
[Memo]
<book> <---- add this root tag
<para>
.....
.....
</para>
<other docbook tag>
.....
</book> <---- add this closing root tag
Please do not forget to remove these temporary tags before you commit
the file to CVS.
The following table shows the mapping of (X)HTML tags that appeared in the Geant4 User Manuals to DocBook.
| (X)HTML Tag - Alphabetical Order | DocBook Tag |
|---|---|
| <a href="http://www.cern.ch"> CERN </a> | <ulink url="http://www.cern.ch"> CERN </ulink> |
| <a href="#Section1"> Section1 </a> | <link linkend="Section1"> Section1 </link> or <xref linkend="Section1" /> |
| <b>emphasis</b> | <emphasis>emphasis</emphasis> |
| <blockquote>Quoted string</blockquote> | <blockquote> <para>Quoted string</para> </blockquote> |
| <body>body part</body> | No corresponding tag |
| <br /> | No corresponding tag (Use <para> and </para>) |
| <caption>title</caption> | <title>title</title> |
| <center>centering</center> | No corresponding tag |
| <code> program codes </code> | <programlisting> program codes </programlisting> |
| <dd>description</dd> (See <dl>) | <listitem>description</listitem> |
| <dl> <dt>list name</dt> <dd>description</dd> </dl> | <variablelist> <varlistentry> <term>list name</term> <listitem><para>description</para></listitem> </varlistentry> </variablelist> |
| <dt> list name </dt> (See <dl>) | <varlistentry> <term>list name</term> </varlistentry> |
| <em>emphasis</em> | <emphasis>emphasis</emphasis> |
| <font ...>strings</font> | No corresponding tag |
| <head>header part</head> | No corresponding tag |
| <hr /> | No corresponding tag |
| <html> ... </html> | No corresponding tag |
| <h1>head1</h1> <h2>head2</h2> ..... <h5>head5</h5> | <chapter>chapter</chapter> <sect1>section1</sect1> <sect2>section2</sect2> ..... <sect5>section5</sect5> (See Section 2.3 for the usage of these tags.) |
| <i>emphasis</i> | <emphasis>emphasis</emphasis> |
| <img ... /> | <imageobject> <imagedata fileref="..." format="..." /> </imageobject> |
| <kbd>commands</kbd> | <literal>commands</literal> |
| <li>list item</li> (See <ul>) | <listitem>list item</listitem> |
| <literal>strings</literal> | <literal>strings</literal> |
| <meta /> | No corresponding tag |
| <ol> <li>list item</li> </ol> | <orderedlist> <listitem><para>list item</para></listitem> </orderedlist> |
| <p>paragraph</p> | <para>paragraph</para> |
| <pre>strings</pre> | <literal>strings</literal> |
| <span>region</span> | No corresponding tag |
| <sub>subscript</sub> | <subscript>subscript</subscript> |
| <sup>superscript</sup> | <superscript>superscript</superscript> |
| <table> <tr> <th>header</th> </tr> <tr> <td>content</td> </tr> </table> | <table> <tgroup> <row> <entry> <emphasis>header</emphasis> </entry> </row> <row> <entry>content</entry> </row> </tgroup> </table> |
| <td>content</td> (See <table>) | <entry>content</entry> |
| <th>header</th> (See <table>) | <entry><emphasis> header </emphasis></entry> (There is no one-to-one mapping in this case. This shows only an example of mapping.) |
| <title>title</title> | <title>title</title> |
| <tr></tr> (See <table>) | <row></row> |
| <tt>strings</tt> | <literal>strings</literal> |
| <u>strings</u> | <emphasis>strings</emphasis> |
| <ul> <li>list item</li> </ul> | <itemizedlist> <listitem><para>list item</para></listitem> </itemizedlist> |
| <!-- comments --> | <!-- comments --> |
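
The table and image rows above show only the bare skeleton of the mapping. As a hedged sketch of what a fuller fragment might look like under the DocBook 4.x DTD: a <table> carries a <title>, <tgroup> takes a cols attribute, rows sit inside <thead> or <tbody>, and an <imageobject> is wrapped in a <mediaobject>. All ids, titles and the image file name below are hypothetical.

```xml
<sect1 id="sec.table-example">
  <title>Table and Figure Example</title>

  <para>
    <!-- xref generates its link text from the title of the target -->
    The options are summarized in <xref linkend="tab.options"/>.
  </para>

  <table id="tab.options">
    <title>Configuration Options</title>
    <tgroup cols="2">
      <thead>
        <!-- header row: plays the role of the th cells in (X)HTML -->
        <row>
          <entry>Option</entry>
          <entry>Meaning</entry>
        </row>
      </thead>
      <tbody>
        <row>
          <entry><literal>$G4INSTALL</literal></entry>
          <entry>Path where the Geant4 toolkit is installed</entry>
        </row>
      </tbody>
    </tgroup>
  </table>

  <figure id="fig.layout">
    <title>Directory Layout</title>
    <mediaobject>
      <imageobject>
        <!-- hypothetical image file -->
        <imagedata fileref="images/layout.png" format="PNG"/>
      </imageobject>
    </mediaobject>
  </figure>
</sect1>
```

If a table should have no title, <informaltable> can be used in place of <table>. Processing the fragment with one of the tools described earlier will report any structural error.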
As in (X)HTML, the following reserved characters have to be escaped.
| Reserved Character | Escaped Format |
|---|---|
| < | &lt; |
| > | &gt; |
| & | &amp; |
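
A short sketch of escaping in practice follows. The id and the sample expression are hypothetical; the CDATA form is a standard XML alternative that is often convenient for longer code listings.

```xml
<sect1 id="sec.escaping-example">
  <title>Escaping Reserved Characters</title>

  <para>
    <!-- Rendered as: x < 10 && y > 0 -->
    The condition <literal>x &lt; 10 &amp;&amp; y &gt; 0</literal>
    must be written with escaped characters.
  </para>

  <!-- Inside a CDATA section the characters may be written as they are -->
  <programlisting><![CDATA[
if (x < 10 && y > 0) { run(); }
]]></programlisting>
</sect1>
```

Strictly speaking, XML only requires < and & to be escaped in ordinary text, but escaping > as well keeps the source consistent with the table above.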
For symbols and Greek letters, you should use Unicode character codes directly.
| Not recommended to use | Recommended to use (Unicode) |
|---|---|
| &alpha; | &#x3B1; |
| &beta; | &#x3B2; |
| ..... | ..... |
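
As a minimal sketch, assuming the Unicode form meant here is a numeric character reference, a paragraph using Greek letters could be written as follows. The sentence itself is only a placeholder.

```xml
<para>
  <!-- &#x3B1; is U+03B1 (alpha), &#x3B2; is U+03B2 (beta),
       &#x3BC; is U+03BC (mu) -->
  This section covers &#x3B1; and &#x3B2; particles;
  lengths are given in &#x3BC;m.
</para>
```

One likely reason for the recommendation is that a named entity such as &alpha; is only defined when the DTD is loaded by the processor, whereas a numeric character reference is understood by any XML parser.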
See the following page for the Unicode code points of symbols and Greek letters.