Re: [Groff] doclifter on groffer.man


From: Deri James
Subject: Re: [Groff] doclifter on groffer.man
Date: Tue, 2 Jan 2007 03:33:47 +0000
User-agent: KMail/1.9.4

On Monday 01 January 2007 19:52, Eric S. Raymond wrote:
> Here is a slightly expanded version of a diagram I posted back towards
> the beginning of the discussion:
>
[...]
>
> The box in the middle is intended to indicate the use of DocBook as a
> common interchange format.

I may have, on occasion, over-imbibed seasonal refreshments in recent days, 
so I thought I'd try to set down the nub of this discussion.

Technical documentation has three elements: content (the actual words 
written), structure (which gives context to the content), and style (which 
controls the presentation of the structure and content).

1. It would be desirable to be able to browse/navigate/search *nix technical 
documentation in a consistent manner - HTML/Browser posited as solution.

(Dealing just with 'man' pages now)

2. Currently man pages generally use the -man macros, although there are no 
restrictions on using any *roff command/escape.

3. The 'man' page author intends to present technical information in the way 
he thinks it will be easiest for the audience to absorb, i.e. he will be more 
interested in presentation and content than structure. This tends to be 
counter to the aims of (1), since a common structure is required to add 
navigational tags and intelligent searching.

4. Fortunately, in the real world, most man page authors have used the 
standard -man macros, so some structural information can be derived from 
them. By using AI techniques, further structure can be deduced.

5. The structure and content of a man page could be captured in DocBook XML 
by a program called 'doclifter'.

6. Using just content and structure, 'clean' HTML could be produced, relying 
on a standard CSS to control presentation. This would mean that the 
presentation of the man page would not be preserved, but it would put 
presentation under the control of the user (should high contrast colours, 
larger fonts, different text to speech voices to differentiate structure 
elements, etc. be required).

7. Since HTML output would not preserve the original presentation, it would be 
desirable to offer the user a way of viewing the man page as it was 
originally intended (groffer??!). The user can then choose whether to print 
using the browser formatted output, or groff formatted output.
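
As a rough illustration of point 4, the sketch below pulls an outline out of 
man source by looking only at the standard -man section macros (.SH and .SS). 
It is not how doclifter works; the function name `man_outline` and the sample 
page are made up for the example, and it ignores harder cases such as a 
heading given on the line after the macro.

```python
import re

# Hypothetical helper: the name and the returned shape are illustrative only.
def man_outline(source: str):
    """Return (level, title) pairs for .SH and .SS requests in man source."""
    outline = []
    for line in source.splitlines():
        # A request line starts with the control character '.' (or "'"),
        # optionally followed by spaces, then the macro name.
        m = re.match(r"^[.']\s*(SH|SS)\s+(.*)$", line)
        if not m:
            continue
        title = m.group(2).strip()
        # Multi-word headings are usually quoted: .SH "SEE ALSO"
        if len(title) >= 2 and title[0] == '"' and title[-1] == '"':
            title = title[1:-1]
        level = 1 if m.group(1) == "SH" else 2
        outline.append((level, title))
    return outline

# Made-up sample page, not taken from any real man page.
SAMPLE = """.TH EXAMPLE 1
.SH NAME
example \\- a made-up page
.SH "SEE ALSO"
.SS Options
"""

if __name__ == "__main__":
    for level, title in man_outline(SAMPLE):
        print("  " * (level - 1) + title)
```

Pages that stick to the standard macros yield a usable outline this way; 
pages that build headings from raw *roff requests are where deduction is 
needed.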

Point 6 above may be "new", in that it appears 'doclifter' is attempting to 
derive presentation information from the troff source (as well as content and 
structure). I would argue this is unnecessary; it would be more desirable to 
completely divorce content from presentation, storing content and structure 
in the XML and relying on a CSS to control presentation. If 'doclifter' 
solely concentrates on extracting content and structure it may be 
considerably simplified (to extract content 'nroff' is your friend ;-)). 
Mapping troff source (to extract structure info) to an nroff image of the 
page may be easier than trying to track all groff commands and escapes.
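
A minimal sketch of that mapping idea, under some loud assumptions: the nroff 
image is already plain text (for instance after piping through `col -b` to 
drop overstrikes), the list of headings has been taken from the .SH lines of 
the source, and headings appear flush left in the image while body text is 
indented. The function name `split_by_headings` and the sample text are 
hypothetical.

```python
# Hypothetical helper: assumes plain-text nroff output with flush-left
# section headings and indented body lines.
def split_by_headings(rendered: str, headings):
    """Group body lines of an nroff image under headings known from the source."""
    wanted = set(headings)
    sections = {}
    current = None
    for line in rendered.splitlines():
        text = line.strip()
        if text in wanted and not line.startswith(" "):
            # Flush-left line matching a known heading starts a new section.
            current = text
            sections[current] = []
        elif current is not None and text:
            # Non-blank line after a heading is content for that section.
            sections[current].append(text)
    return sections

# Made-up rendering of a made-up page.
RENDERED = (
    "EXAMPLE(1)\n"
    "\n"
    "NAME\n"
    "       example - a made-up page\n"
    "\n"
    "SEE ALSO\n"
    "       other(1)\n"
)

if __name__ == "__main__":
    for name, body in split_by_headings(RENDERED, ["NAME", "SEE ALSO"]).items():
        print(name, "->", body)
```

The content comes straight from the formatter, so no escape or macro 
expansion has to be reimplemented; the source is consulted only for the 
heading names.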

Just my tuppence.

Cheers 

Deri
