Re: [Groff] Generating HTML / XML


From: Bill Ward
Subject: Re: [Groff] Generating HTML / XML
Date: Wed, 03 May 2006 11:36:53 -0500
User-agent: Microsoft-Entourage/10.1.4.030702.0

I've basically done what you suggested: I rewrote a subset of the ms macro
package to emit HTML tags, plus some awk scripts to clean up the resulting
output.  Some things are kludgy.  My goal was for the basic heading and
paragraph macros to work so that input like

.TL
Hello World
.SH 1
Subheading
.PP
Some stuff

yields

<h4 class="TL">
Hello World
</h4>
<h4 class="SH_1">
Subheading
</h4>
<p class="PP">
Some stuff
</p>

And it seems to work on days that I don't break it ;-)

Just about everything gets a class or an id.  It requires a stack (as you
know) that I've implemented inside the macro package.  The stack
implementation is pretty unsophisticated because there is a deadline (Isn't
it always that way?).

My limited further thoughts on this convinced me that some sort of
"pop-until tag==X and class==Y" might work if the nest level (.RS/.RE) were
somehow taken into account.  One issue is that ms allows paragraphs inside
paragraphs (via .RS and .RE), but HTML objects to this.  Maybe the nesting
could be implemented with <div>...</div>.
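
To make the pop-until idea concrete, here is a minimal sketch.  It is in
Python rather than groff macros purely for illustration, and it is not the
code in my macro package.  Everything in it is an assumption: the names
TagStack, open_tag and pop_until are hypothetical, and "level" stands in
for the current relative-indent nesting depth.

```python
class TagStack:
    """Stack of open elements; each entry is (tag, cls, level)."""

    def __init__(self):
        self.items = []   # open elements, innermost last
        self.out = []     # emitted markup, one string per tag

    def open_tag(self, tag, cls, level):
        self.items.append((tag, cls, level))
        self.out.append(f'<{tag} class="{cls}">')

    def pop_until(self, tag, cls, level):
        """Close elements down to and including the nearest match.

        Returns True if a match was found.  If there is none at this
        nest level, the stack is left untouched and False is returned.
        """
        # Search from the top of the stack downwards.
        for i in range(len(self.items) - 1, -1, -1):
            t, c, lv = self.items[i]
            if lv < level:
                break     # left the requested nest level: stop looking
            if (t, c, lv) == (tag, cls, level):
                while len(self.items) > i:
                    closed = self.items.pop()[0]
                    self.out.append(f"</{closed}>")
                return True
        return False


s = TagStack()
s.open_tag("div", "RS", 1)                # hypothetical: indent start opens a div
s.open_tag("p", "PP", 1)
s.open_tag("b", "mib", 1)
found_inner = s.pop_until("p", "PP", 1)   # closes the b, then the p
found_outer = s.pop_until("p", "PP", 0)   # no such entry: nothing happens
s.pop_until("div", "RS", 1)               # hypothetical: indent end closes the div
print("\n".join(s.out))
```

The level check is what keeps a paragraph macro inside an indented block
from closing anything that belongs to the enclosing level.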

As you say, some sort of stack with a level/priority associated with stack
items seems to be necessary.   I realize my thoughts on this are still
pretty primitive.
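
For what it's worth, the priority part on its own might look roughly like
the sketch below (again Python for illustration only, not grohtml or my
macros).  The PRIORITY table and the render function are hypothetical and
the numbers are made up.  The rule assumed here is that an open tag may
contain a new tag only if its priority is strictly higher; otherwise it is
popped first.  Nest level is ignored entirely in this sketch.

```python
# Hypothetical priorities: a higher number may contain a lower one.
PRIORITY = {"h4": 2, "p": 2, "b": 1}


def render(events):
    """events is a list of (tag, cls, text); returns the output lines."""
    stack, out = [], []
    for tag, cls, text in events:
        # Pop every open tag that is not allowed to contain the new one.
        while stack and PRIORITY[stack[-1]] <= PRIORITY[tag]:
            out.append(f"</{stack.pop()}>")
        stack.append(tag)
        out.append(f'<{tag} class="{cls}">')
        out.append(text)
    # Close whatever is still open at end of input.
    while stack:
        out.append(f"</{stack.pop()}>")
    return out


lines = render([
    ("h4", "TL", "Hello World"),
    ("h4", "SH_1", "Subheading"),
    ("p", "PP", "Some stuff"),
])
print("\n".join(lines))
```

With equal priorities for headings and paragraphs, each new one closes its
predecessor, which gives the flat output shown earlier; a lower-priority
tag such as b would nest inside the open paragraph instead.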

Regards,

Bill Ward

> From: Larry Kollar <address@hidden>
> Date: Tue, 2 May 2006 09:02:07 -0400
> To: Gaius Mulley <address@hidden>
> Cc: Bill Ward <address@hidden>, address@hidden
> Subject: Re: [Groff] Generating HTML / XML
> 
> 
> Gaius Mulley wrote:
> 
>>> Thanks for the quick reply.  W.r.t. your question below, I like
>>> making "the www macro set initialise post-grohtml with the correct
>>> set of tags for headings, titles, preformatted text etc."  Sounds
>>> great to me.
>> 
>> OK, I wonder whether it could be improved if, say, www.tmac told
>> post-grohtml which tags to use together with a tag priority (similar
>> to operator precedence), from which post-grohtml could then determine
>> whether a tag should be nested within the current tag stack or whether
>> the current top-of-stack tag should be popped.
>> 
>> This modification _might_ allow post-grohtml to emit trivial XML or
>> any tag-based output (well, in basic form anyway).
> 
> I gave this idea (turning grohtml into a general groxml post-
> processor) some passing thought a while back, but never really
> developed it to the point where I felt it worth sharing.
> 
> The idea I had was to make tag generation table-driven, similar to
> what AT&T nroff used for printer drivers (I wrote one for the NEC
> SpinWriter back when). The default table would still generate HTML,
> and probably live in the (version).tmac directory. An alternate table
> (for DocBook, DITA, OpenDoc, or whatever) could be specified with a
> command-line option, and live anywhere along the GROFF_TMAC_PATH.
> 
> 
> In the meantime, unless you have specific needs, grohtml produces
> HTML that can be cleaned up fairly easily & then transformed to
> something else. You can add a layer on top of ms/mwww to produce more
> customized output or add CLASS-based hooks for transformation. Here's
> an example:
> 
> .\" format MIB variables
> .de MIB
> .ie '\\*[.T]'html' .HTML \\$3<b class=\"mib\">\\$1</b>\\$2
> .el \\$3\f[HB]\s-1\\$1\s0\fP\\$2
> ..
> 
> For character-class formatting like that, a character-level tag (like
> B) gives you a fallback format if the CSS goes missing or a browser
> doesn't support it (or the user turned off CSS). Of course, you can
> use SPAN if you want the fallback to be plain text.
> 
> --
> Larry Kollar     k  o  l  l  a  r  @  a  l  l  t  e  l  .  n  e  t
> Unix Text Processing: "UTP Revival"
> http://unixtext.org/
> 
> 




