The TeX4ht system can translate TeX or LaTeX documents into other markup formats such as SGML, HTML, XML, MathML, OpenOffice format, and Braille. The system comprises a large collection of TeX packages, hypertext fonts, a post-processor for dvi files, and a second post-processor that generates CSS and the image files for math formulae and equations. It works in three stages. The summary below, extracted from the TeX4ht documentation by Eitan Gurari, describes how the translation process works.

The system can be activated with a sequence of commands of the following form, typically embedded within a script.

       latex      x            (or ‘tex x’)
       latex      x
       latex      x
       tex4ht     x
       t4ht       x
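
As a minimal sketch of such a script, the sequence above can be wrapped in a shell function. The function name run_tex4ht and the RUN variable are hypothetical, introduced here only for illustration; the commands themselves are the ones listed above. Setting RUN=echo prints the commands instead of executing them, which is a convenient way to check the sequence without a TeX installation.

```shell
#!/bin/sh
# Sketch only: run_tex4ht and RUN are illustrative names, not part of TeX4ht.

run_tex4ht() {
    job=$1               # base name of the source file, e.g. "x" for x.tex
    engine=${2:-latex}   # "latex" by default, or "tex"
    run=${RUN:-}         # empty: execute the commands; "echo": dry run

    # Three compilations, as in the sequence above.
    for pass in 1 2 3; do
        $run "$engine" "$job" || return 1
    done

    $run tex4ht "$job" || return 1   # first post-processor, reads x.dvi
    $run t4ht "$job"                 # second post-processor, reads x.lg
}
```

Calling run_tex4ht x runs the five steps for x.tex and stops at the first step that fails; RUN=echo run_tex4ht x only lists them.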

The three compilations with LaTeX (or TeX) are needed so that cross-references, and therefore the links generated from them, are resolved correctly. The approach is illustrated in the following picture.

The schematic diagram of the translation process.

  • x.tex: This is a source file in TeX, LaTeX, or another TeX format that loads the style files tex4ht.sty and *.4ht. (The name x is arbitrary, chosen for the purpose of this discussion.) The style files define the features of the output.
  • tex4ht: The output of TeX is a standard dvi file interleaved with special instructions for the post-processor, namely tex4ht, to use. The special instructions come from implicit and explicit requests made in the source file through TeX4ht commands.

    The utility tex4ht, which is a binary program, translates the dvi code into standard text, while obeying the requests it gets from the special instructions. The special instructions may request the creation of files, insertion of HTML code, filtering of pictures, and so forth.

    In the extreme case where the source contains no TeX4ht commands, tex4ht receives pure dvi code and outputs (almost) plain text with no hypertext elements in it.

    The special (\special) instructions seeded in the dvi code are not understood by dvi processors other than those of TeX4ht.

  • x.idv: This is a dvi file extracted from x.dvi, and it contains the images needed in the HTML files.
  • x.lg: This is a log file listing the pictures of x.idv, the png files that should be created, CSS information, and user directives introduced through the \Needs{...} command.
  • t4ht: This is an interpreter for executing the requests made in the x.lg script.
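
To get a feel for what t4ht is asked to do, the contents of x.lg can be summarised with a few grep calls. This is a rough sketch under stated assumptions: the function name lg_summary is hypothetical, and the line prefixes it looks for ("File: ", "Css: ", and "--- needs ---") are assumed from typical .lg files rather than taken from the summary above, so they should be checked against a real x.lg before being relied upon.

```shell
#!/bin/sh
# Sketch only: lg_summary is an illustrative name, and the three line
# patterns below are ASSUMPTIONS about the .lg format, not a specification.

lg_summary() {
    lg=$1
    [ -r "$lg" ] || { echo "cannot read $lg" >&2; return 1; }

    # Count the lines that match each assumed kind of entry.
    printf 'files=%s css=%s needs=%s\n' \
        "$(grep -c '^File: ' "$lg")" \
        "$(grep -c '^Css: ' "$lg")" \
        "$(grep -c -e '--- needs ---' "$lg")"
}
```

Running lg_summary x.lg after the tex4ht step prints one line with the three counts; a non-zero needs count would suggest that pictures from x.idv still have to be converted by t4ht.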