eLyXer User Guide
Alex Fernández (address@hidden)
1 The Basics
elixir, n: a substance believed to cure all ills [1].
eLyXer (pronounced elixir) is a LyX to HTML converter. While there are a ton of such projects all over the web, eLyXer has a clear focus on flexibility and elegant output.
eLyXer (including this guide and all accompanying materials) is licensed under the GPL version 3 or, at your option, any later version. See the LICENSE file for details.
Please visit the main page to find out about the latest developments.
1.1 System Requirements
eLyXer requires Python 2.3.4 or later, and should work with versions up to 2.6.x; it converts documents generated by LyX 1.5.x to 1.6.y. It has been tested on the most common operating systems: Mac OS X, Linux and Windows, with and without Cygwin.
Resource usage should be quite frugal; eLyXer runs quite happily on my 1st-gen Asus Eee, with its puny address@hidden and 512 MB of RAM. It should also be fast — the Eee can convert ~200 pages of LyX text in just over 50 seconds. Performance is fairly linear: 200 pages take 10 × as long as 20, so there are no scalability problems. Memory usage stays low even when processing large documents, and conversion can be done on the fly (with --lowmem) for even lower memory requirements.
1.2 Installation
First you will need to fetch the official distribution file from the download area. Now, there are two ways to install eLyXer: the easy way and the elegant way. Both start by uncompressing the distributed file to a suitable directory. Just write at the command prompt:
$ tar -xzf elyxer-[version].tar.gz
Or for the .zip version:
$ unzip elyxer-[version].zip
A directory called elyxer should appear, where the main executable file elyxer.py resides.
The Easy Way
The easy way to install is just using this file directly, preferably placing it in the execution path. To do this on Linux, type as root:
# cp elyxer.py /usr/bin/
On Windows you can type (as a normal user):
> copy elyxer.py c:\Windows\
And on Mac OS X any user with administrative privileges (like the default user):
% sudo cp elyxer.py /usr/local/bin/
You can also put elyxer.py in any other directory in your path, like ~/bin, or even use an absolute path. In any case you will be able to access elyxer.py as an executable file:
$ elyxer.py --help
> elyxer.py --help
% elyxer.py --help
On Windows you may also call the Python executable explicitly, and locate the file elyxer.py on your disk:
> Python.exe "c:\route to\elyxer.py" --help
This last usage may be useful if you are having trouble with file paths that include spaces.
The Elegant Way
The elegant (and not much more complicated) way of installing eLyXer is using Python distutils: it will install eLyXer as a Python module, available to any Python programs and as a standalone utility module. Go to the elyxer directory and type, as root:
# python setup.py install
Once again, on Windows you don’t have to be root:
> python setup.py install
On Mac OS X you can use sudo to have permission:
% sudo python setup.py install
Now you can run eLyXer as a Python module:
$ python -m elyxer --help
The main advantage is that you can run eLyXer without knowing the location of the file elyxer.py, or having to play with any system variables such as $PATH or %PATH%. This procedure will also ensure that any Python packages trying to access eLyXer will find it. See section 1.6↓ for LyX integration.
From now on we will assume that you are running elyxer.py as an executable file, but you should substitute that for python -m elyxer if you have opted for this elegant procedure. We will see some examples in the next sections.
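The choice between the two invocations can also be made by a small helper script. The following is only a sketch, not part of eLyXer: the function name elyxer_command is hypothetical, and it assumes the helper runs under a Python 3 interpreter (newer than the versions listed in the system requirements) that belongs to the same installation where eLyXer was set up.

```python
import importlib.util
import shutil
import sys


def elyxer_command(which=shutil.which, find_spec=importlib.util.find_spec):
    """Return the argument list that starts eLyXer, or None if it is not found.

    The two lookup functions are parameters only so that the logic can be
    tried out without touching the real system.
    """
    # Easy way: elyxer.py was copied to a directory in the execution path.
    if which("elyxer.py"):
        return ["elyxer.py"]
    # Elegant way: eLyXer was installed as a module with setup.py.
    if find_spec("elyxer") is not None:
        return [sys.executable, "-m", "elyxer"]
    return None


if __name__ == "__main__":
    command = elyxer_command()
    if command is None:
        print("eLyXer was not found")
    else:
        print("Run eLyXer with: " + " ".join(command + ["--help"]))
```

The executable file is tried first, matching the convention used in the rest of this guide.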
1.3 Test Drive
Now you may want to try to convert the user guide:
$ elyxer.py --css lyx.css --title "eLyXer User Guide" docs/userguide.lyx docs/userguide2.html
or, for the elegant procedure:
$ python -m elyxer --css lyx.css --title "eLyXer User Guide" docs/userguide.lyx docs/userguide2.html
It should generate a working web page identical to the one distributed:
$ diff docs/userguide.html docs/userguide2.html
The typical output will contain just the changed lines, which in this case should be only the header with the file creation date. An example is shown on listing 1↓.
7c7
< <meta name="create-date" content="2009-09-11"/>
---
> <meta name="create-date" content="2009-09-12"/>
Algorithm 1 Example of diff output for functionally identical HTML files.
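Where the diff tool is not available, the same check can be approximated in Python. This is a sketch under assumptions: the function names are hypothetical, the create-date header is recognized simply by the text name="create-date" seen in the listing above, and a Python 3 interpreter is assumed.

```python
import difflib


def strip_create_date(html):
    """Split a page into lines, dropping the create-date header line."""
    return [line for line in html.splitlines()
            if 'name="create-date"' not in line]


def differences(original_html, converted_html):
    """Return the removed ("- ") and added ("+ ") lines, ignoring the date."""
    diff = difflib.ndiff(strip_create_date(original_html),
                         strip_create_date(converted_html))
    return [line for line in diff if line.startswith(("- ", "+ "))]
```

An empty list means that both pages are functionally equal; read docs/userguide.html and docs/userguide2.html into strings and pass them to differences() to repeat the check above.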
If nothing else appears (i.e. both files are functionally equal) then everything is working fine. If you have bash installed, to test that everything really works fine you can just run the included tests:
$ ./run-tests
It will run a number of tests and check the results, so you can see if everything is well. You also need to have installed the command-line tool diff to show differences between two files.
1.4 Usage
eLyXer is a standalone command line tool. It can be invoked from the command line as:
$ elyxer.py [options] [source file] [destination file]
or, again for the elegant procedure:
$ python -m elyxer [options] [source file] [destination file]
If the source file is omitted then STDIN is used; likewise, if no destination file is specified eLyXer will output to STDOUT. This allows its use in pipes and other flexible configurations. Some examples:
$ elyxer.py file.lyx file.html
converts file.lyx to file.html. Debug messages are shown.
$ python -m elyxer file.lyx file.html
Just as before, but running eLyXer as a module (installed using the elegant procedure).
$ cat file.lyx | elyxer.py > file.html
converts file.lyx to file.html, as before. This time debug messages are not shown.
$ elyxer.py file.lyx | grep "<blockquote>" | wc
counts all blockquote paragraphs.
$ elyxer.py file.lyx | wget --no-check-certificate --spider -nv -F -i -
checks all external links in a document recursively. (Local links will appear as unresolved, but they can be ignored.)
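The same STDIN and STDOUT behavior can be used from a script. The sketch below is an assumption-laden illustration, not eLyXer code: run_filter and count_tag are hypothetical names, UTF-8 is assumed for both streams, and Python 3.6 or later is assumed for the helper.

```python
import subprocess


def run_filter(command, text):
    """Send text to the command's STDIN and return what it writes to STDOUT."""
    result = subprocess.run(
        command,
        input=text,
        stdout=subprocess.PIPE,
        encoding="utf-8",
        check=True,  # raise an error if the command fails
    )
    return result.stdout


def count_tag(html, tag):
    """Count plain opening tags such as <blockquote> in a page."""
    return html.count("<" + tag + ">")
```

With eLyXer in the execution path, run_filter(["elyxer.py"], lyx_text) would play the role of the cat pipeline above, and count_tag(page, "blockquote") that of the grep example. Note that count_tag counts occurrences while grep piped to wc counts lines, so the numbers may differ if a line holds several tags.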
1.5 Image Processing
HTML pages do not embed images as PDF documents do; rather, they hold pointers to image locations on disk. (Luckily LyX documents are the same in this respect.) When the original images are PNG files, eLyXer will generate pages that point to the same locations. Other image types like Encapsulated PostScript cannot be used directly from within pages. If the ImageMagick package is installed eLyXer will use the convert tool to create PNG versions of the images embedded in the document. If it is not installed eLyXer will show an error message and will not try to convert further images.
Image location is fragile. All images should be placed in the same location (and with the same structure) as the original document, and they should all be referenced relative to the current document. During conversion from within LyX (in those versions where it is available) image locations can be lost. Your best bet is to do conversion in place.
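To see which images would need ImageMagick, the rule described in this section can be sketched as follows. This is not eLyXer's own code: the function name is hypothetical, and placing the PNG file next to the original with the same base name is an assumption made for illustration.

```python
import os.path


def png_conversion_command(image_path):
    """Return an ImageMagick command for a non-PNG image, or None for a PNG."""
    base, extension = os.path.splitext(image_path)
    if extension.lower() == ".png":
        return None  # PNG images can be referenced directly
    # convert takes the source file first and the destination file last.
    return ["convert", image_path, base + ".png"]
```

The returned list can be handed to subprocess.call() when the convert tool is installed.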
1.6 LyX Integration
If you followed the elegant installation procedure (installing eLyXer as a module) LyX will automatically detect it after reconfiguration. Just make sure that you are running LyX 1.6.5 or later, and click on Tools ▷ Reconfigure.
If you followed the easy way (installing eLyXer as an executable file) integration of eLyXer within LyX requires a few steps depending on your operating system.
Unix/Linux: Make sure that the eLyXer program file (elyxer.py) is in the execution path (e.g. /usr/bin on Linux). Now run LyX and click on Tools ▷ Reconfigure. HTML export should now run elyxer.py.
Windows: Locate the directory where LyX was installed, and inside it the directory called bin. Now place the eLyXer program file (elyxer.py) in the bin directory, and run Tools ▷ Reconfigure. As before, HTML export should now run elyxer.py.
Mac OS X: Just like on every other Unix, place elyxer.py inside a directory in the execution path (like /opt/local/bin) and click on Tools ▷ Reconfigure.
On Windows, the alternate installer comes bundled with a version of eLyXer, and it is enabled by default in View ▷ HTML.
2 Advanced Use
There are some advanced uses for eLyXer if you want to get the most out of it.
2.1 Command Line Options
eLyXer supports a few command line options:
--help: show command line help.
--debug: show debug messages. They may help a developer understand your problem.
--quiet: be quiet and do not output messages (except errors). This way you can avoid the comforting “Parsing line 1000” messages. When STDIN or STDOUT are used (e.g. in a pipeline) --quiet is always enabled.
--nocopy: do not show the copyright notice at the bottom. If there is no author in the document --nocopy is always enabled.
--title "title": change the title of the generated web page.
--directory "images_dir": look for images in the directory specified.
--destdirectory "dest_dir": converted images will end up in this directory.
--css "new.css": change default CSS. See section 2.2↓: CSS.
--version: show version number and date. Use to check which version you are actually running.
--html: generate HTML 4.0 (instead of XHTML). The resulting pages should be easier to import from certain word processors.
--unicode: restore full Unicode output. Right now switches midspaces to medium mathematical spaces.
--forceformat ".extension": Force the format implied by the extension (e.g. ".jpg" for JPEG) for output images.
--lyxformat: Return the highest LyX version that eLyXer understands. This parameter is provided to help with lyx2lyx integration, so that this tool knows if it must convert the file to a lower LyX format.
--toc: Activate table of contents generation; this option is on by default. See section 2.4↓: TOC.
--toctarget "original.html": Generate a table of contents with links to the original HTML file. See section 2.4↓: TOC.
--target "frame": Add a target attribute to every link in the generated HTML, making all links point to the provided frame. Again, see section 2.4↓: TOC.
--lowmem: Activate a low memory mode which does not keep the whole document in memory: conversion is done on the fly. Keep in mind that some features, such as the TOC, will be missing from the generated document.
When an option accepts an argument it can be added after a space as --target "frame", or with an equals sign as in --target="frame". The quotes are optional and can be useful if your arguments include e.g. spaces.
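The two ways of attaching an argument can be illustrated with a small sketch for scripts that assemble eLyXer command lines. The helper names are hypothetical, Python 3 is assumed, and shlex.quote() produces POSIX-style single quotes rather than the double quotes used in this guide.

```python
import shlex


def option_arguments(name, value=None, use_equals=False):
    """Build the argument list for one option, with or without an argument."""
    if value is None:
        return ["--" + name]                # e.g. --quiet
    if use_equals:
        return ["--" + name + "=" + value]  # e.g. --target=frame
    return ["--" + name, value]             # e.g. --target frame


def command_line(arguments):
    """Join arguments for display, quoting only those that need it."""
    return " ".join(shlex.quote(argument) for argument in arguments)
```

When the list is passed directly to subprocess no quoting is needed at all; quotes only matter when the command is typed at a shell prompt.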
Adding Options In LyX
To add one of these options so that it is used from within LyX, you have to modify the converter line. To this effect open View ▷ HTML, find the converter for “LyX -> HTML” and edit the converter line. It should read something like this, if you followed the easy installation procedure:
elyxer.py --directory $$r $$i $$o
If you want to generate pure HTML instead of XHTML, change the line to:
elyxer.py --html --directory $$r $$i $$o
If you followed the elegant procedure it will rather be:
python -m elyxer --directory $$r $$i $$o
so change it to
python -m elyxer --html --directory $$r $$i $$o
And so on.
2.2 CSS
HTML output, as generated, can be a bit crude. Some CSS wizardry can go a long way to make eLyXer output look nicer.
eLyXer tags most elements with their type, so you can later style them with a CSS file. The HTML header reads like listing 2↓, so the default remote CSS file is used.
<head>
<title>Your title here</title>
<link rel="stylesheet" href="" type="text/css" media="screen"/>
</head>
Algorithm 2 CSS link automatically added to HTML
This sample CSS file is published on nongnu.org and distributed along with the scripts, docs/lyx.css. (You may have found that your document changes its appearance with time — this is the reason. The main author regularly publishes a new, updated version of lyx.css on nongnu.org, and all documents using it automatically appear with the changes.)
To give your document a customized appearance (or for pages to be accessible offline) you probably will want to use your own CSS file; to use it first copy it to the directory where your document resides (e.g. renaming it to custom.css), and customize as needed. Then run elyxer.py with the following option:
$ elyxer.py --css=custom.css document.lyx page.html
This will make the generated page.html use your custom.css file. The ‘=’ sign between the constant ‘--css’ and the name of the CSS file is optional.
2.3 Title
By default the generated web pages have the title “Converted Document”. If a PDF title is found then it is used instead. The proper LyX title (a paragraph of type “Title” embedded in the text) will also be used if found. But when --lowmem is in use eLyXer does not try to get the proper title, since it may be found in the middle of the document or not be present at all; scanning for it would mean doing two passes, one to look for the title in all the document and another to output the web page, and --lowmem implies on-the-fly conversion to save memory.
You can change the title of the generated web page with the --title option:
$ elyxer.py --title "My Beautiful Document" document.lyx page.html
2.4 Table Of Contents
A table of contents (or TOC) can be generated for every converted LyX document. You can optionally also add a target frame to every link. The trick is to combine both options to generate a TOC that links to the original document on a different frame. For example, if the original page is called page.html and you generated it with this command:
$ elyxer.py document.lyx page.html
you can generate the TOC linking to this page, and at the same time point it to frame contents:
$ elyxer.py --toc --toctarget page.html --target contents document.lyx page-toc.html
Then you can put it all together with a simple frameset generated manually. Just remember to place the original document in the frame called contents.
<html>
  <frameset cols="30%,70%">
    <frame name="toc" src="" />
    <frame name="contents" src="" />
  </frameset>
</html>
Algorithm 3 An example frameset for TOC navigation
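Instead of writing the frameset by hand it can be produced by a few lines of Python. This is a sketch only: frameset_page is a hypothetical helper, and the file names page-toc.html and page.html are taken from the commands in this section as plausible values for the two src attributes.

```python
import html


def frameset_page(toc_page, contents_page, toc_width="30%", contents_width="70%"):
    """Return a frameset with the TOC on the left and the document on the right."""
    lines = [
        "<html>",
        '  <frameset cols="%s,%s">' % (toc_width, contents_width),
        # The frame holding the original document must be called "contents",
        # the same name that was given to --target.
        '    <frame name="toc" src="%s" />' % html.escape(toc_page, quote=True),
        '    <frame name="contents" src="%s" />' % html.escape(contents_page, quote=True),
        "  </frameset>",
        "</html>",
    ]
    return "\n".join(lines) + "\n"
```

Writing the result of frameset_page("page-toc.html", "page.html") to a file such as index.html (a hypothetical name) gives a page that ties both documents together.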
TOC generation accepts the same options as normal document conversion. For example, if you follow these instructions literally you will notice that the TOC has very wide margins and looks a bit weird; that is because it is using the default CSS. A special CSS file for TOC files is provided in docs/toc.css, so better results should be obtained with the --css option:
$ elyxer.py --toc --toctarget page.html --css docs/toc.css --target contents document.lyx page-toc.html
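For documents that are converted often, the pair of commands can be assembled by a script. The sketch below uses only the options described in this guide; the function names are hypothetical, and elyxer.py is assumed to be in the execution path.

```python
def page_command(source, page):
    """Arguments that convert the document itself."""
    return ["elyxer.py", source, page]


def toc_command(source, page, toc_page, frame="contents", css=None):
    """Arguments that generate a TOC linking into page, shown in frame."""
    command = ["elyxer.py", "--toc", "--toctarget", page]
    if css:
        command += ["--css", css]  # e.g. docs/toc.css for narrower margins
    command += ["--target", frame, source, toc_page]
    return command
```

Each list can be run with subprocess.call(); toc_command("document.lyx", "page.html", "page-toc.html", css="docs/toc.css") reproduces the command shown above.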
With a little bit of practice you will be able to generate useful (and nice looking) TOC files. You can see an example for the user guide (if you are not already looking at it).
2.5 Segmenting Pages
Quite often you don’t want a huge monolithic page, but a set of linked pages. At the moment eLyXer does not allow you to do that, but a planned extension will.
2.6 HTML Code
The HTML code generated is technically XHTML Transitional, version 1.0 [2], using UTF-8 encoding. Some programs have (in this day and age) trouble importing XHTML, notably some popular word processors. To work around this problem and provide more flexible output in general you can output HTML 4.0:
$ elyxer.py --html document.lyx page-to-import.html
Again, technically the code generated is HTML 4.01 Transitional [3] using UTF-8 encoding. Both versions should pass the W3C tests [4]. If your particular web page doesn’t pass the tests, then it is a bug and it will be treated as such.
The CSS file for eLyXer uses some CSS2 features for math structures (fractions, arrays). This makes the output incompatible with older browsers; it requires Microsoft Internet Explorer 7, Firefox 3, Safari 3 or Chrome 1. Check the Math Showcase to see if your browser can render eLyXer output correctly.
For better browser compatibility, medium mathematical spaces are substituted in the output with midspaces — improving the output for some popular browsers. If you want your mathematical spaces back, just use the --unicode option:
$ elyxer.py --unicode document.lyx page-to-import.html
3 Work in Progress
As you have already seen eLyXer is very much a work in progress.
3.1 Known Issues
The following issues (including bugs and missing features) are acknowledged. Some of them should be solved soon; others may take longer.
- On Mac OS X the output of a message with Unicode characters may cause an error. Workaround: run elyxer.py with the --quiet option.
- Phonetic alphabet symbols are not supported; if generated with LyX they only appear in a different color: [sample].
- Vertical spacing is not preserved. (It is hard to do in CSS without significant mangling.)
- Multi-column layouts are lost. This one is almost impossible to get right in CSS, so there are no plans to even try.
- BibTeX support is primitive. (Bear in mind that the minimalistic approach using templates can be improved, but it will never be perfect.)
- ERT (bare TeX code) is ignored.
- Many AMS environments (like alignat, gather…) are not working or look strange; some non-AMS environments too.
- Images are never scaled above their nominal resolution. This is seldom needed if at all, so there are no plans to change it; if people really need the feature just let the author know so it can be added as an option.
- Compressed documents do not work. They are generated by checking Document ▷ Compressed from within LyX.
- Internationalization: fixed texts (such as “Table of Contents”) appear always in English. This is a planned extension. For the moment users may translate the constants that appear in the main elyxer.py script under TranslationConfig, like in listing 4↓. Note that only the values (i.e. the constants after the colon) are translated: u'algorithm':u'Listing ' becomes u'algorithm':u'Listado '.
class TranslationConfig(object):
  "Configuration class from config file"
  …
  floats = {
      u'algorithm':u'Listado ',
      u'figure':u'Figura ',
      u'listing':u'Listado ',
      u'table':u'Tabla ',
      u'tableau':u'Tabla ',
      }
Algorithm 4 Translated section in TranslationConfig.
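The rule that only the values change can be shown with a short sketch. It does not touch elyxer.py; translate_values is a hypothetical helper, and the only translation pair used is the one quoted in this section.

```python
def translate_values(constants, translations):
    """Return a copy with translated values; keys are left untouched."""
    return dict(
        (key, translations.get(value, value))  # keep untranslated values as-is
        for key, value in constants.items()
    )


english = {u'algorithm': u'Listing '}
spanish = translate_values(english, {u'Listing ': u'Listado '})
```

The key u'algorithm' is what eLyXer looks up, so it must stay the same in every language; only the text shown to the reader changes.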
3.2 Contact Information
If your problem does not appear in the above list, please let the author know; you can find him at address@hidden. In the words of Rich Talley: “the tool’s author really likes getting challenging documents and making eLyXer work with them”. You can send your sample documents and we will try to make eLyXer convert them acceptably. Any documents sent will be treated with the utmost confidentiality.
You can also join the mailing list to discuss any information related to eLyXer. The author monitors the official LyX lists for mentions of eLyXer. Bugs can also be reported at the Savannah page.
3.3 Extending eLyXer
eLyXer does not at the moment support all LyX features; sometimes it will ignore a command, sometimes it will signal it, and it might even refuse to work with certain documents. What can you do if eLyXer does not work with your LyX file? Worry not! Its flexible approach to processing allows anyone to write support for the missing commands.
eLyXer is written in Python so that it does not need to be compiled; its code is interpreted on the fly. See the accompanying developer guide to learn how to extend eLyXer for your own purposes. If you know how to program in Python it should not be difficult to support other LyX features. If you don’t, your best bet is to ask the author.
4 FAQ
Q: What versions of LyX are supported?
A: The tool is slowly improving; it now should work with LyX versions 1.5.5 to 1.6.5. It has been tested on Linux, Mac OS X and Windows.
Q: There are indeed a ton of similar projects over the web. Why add another one?
A: The four tools supported by LyX (tex4ht, hevea, tth and latex2html) gave inferior results a couple of years ago, and were quite inflexible. The author found the need for a good converter, while at the same time acknowledging the difficulty of the problem.
Q: Speaking of that: why build a LyX to HTML converter, instead of a more generic LaTeX to HTML converter?
A: The problem space is quite simplified, and therefore progress is much faster. To make it even easier eLyXer has historically centered on the subset of LyX functionality that is useful to most LyX users, leaving the rest for a later stage. Nowadays eLyXer aims to support the full LyX feature set.
Q: What can we expect from the tool in the future?
A: eLyXer should fulfill the needs of 99% of LyX users in the long term. It should also learn a couple of tricks of its own such as page segmenting. Eventually it could be distributed along with LyX as part of the standard installer.
Q: Why did you leave out my favorite feature <insert random LyX command here>?
A: In short, because nobody asked for it. Usually it is better to first aim for your own needs and what others request, and then worry about supporting everything, or you will never get anything done. But if you let me know I will be glad to help.
Q: My document changed its appearance without my intervention. Was it black magic, elves or what?
A: It probably uses the online CSS file, which is regularly updated. See section 2.2↑ for details.
Q: Why use an online CSS, instead of placing the CSS file in the same directory as the converted file?
A: There were pros and cons. An online CSS resource allowed me to update it for everyone at the same time, but might make it more difficult for people without an internet connection; local CSS files are more flexible but can also be confusing to novice users. In the end the online solution was preferred, with the --css option as a fallback.
Q: I found a bug, what should I do?
Bibliography
[1] WordReference.com: “definition of elixir”, accessed March 2009. http://www.wordreference.com/definition/elixir
[2] W3C: “XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition)”, revised 1 August 2002. http://www.w3.org/TR/xhtml1/
[3] W3C: “HTML 4.01 Specification”, 24 December 1999. http://www.w3.org/TR/REC-html40/
[4] W3C: “Markup Validation Service”, accessed March 2009. http://validator.w3.org/
Copyright (C) 2010 Alex Fernández (address@hidden)