[Groff] Re: Groff Digest, Vol 79, Issue 11


From: justin
Subject: [Groff] Re: Groff Digest, Vol 79, Issue 11
Date: Sat, 26 Mar 2011 12:42:11 -0600 (MDT)
User-agent: Alpine 1.10 (DEB 962 2008-03-14)


That did it. It seems that nroff uses -Tutf8 by default; specifying
-Tascii along with the -mms option when running nroff worked. Your
suggestion to simply run groff with -ms and the -Thtml option also
worked.

Thank you for your time in helping me understand my problem;
I'm impressed with the support I've received.

Justin

On Sat, 26 Mar 2011, address@hidden wrote:

Today's Topics:

  1. Re: Having a problem with parsing output to html... (justin)
  2. Re: Having a problem with parsing output to html...
     (Tadziu Hoffmann)


----------------------------------------------------------------------

Message: 1
Date: Fri, 25 Mar 2011 12:20:25 -0600 (MDT)
From: justin <address@hidden>
Subject: Re: [Groff] Having a problem with parsing output to html...
To: Keith Marshall <address@hidden>
Cc: address@hidden, address@hidden
Message-ID: <address@hidden>
Content-Type: text/plain; charset="iso-8859-1"


Hello

Yes, Keith was right: all the mumbo jumbo I wrote, with the exception of
the two sentences swapping places, was indeed related to the hyphenation.

I would have caught this if I had been looking at the parsed HTML text
and not at the actual HTML file.

I have now installed groff 1.21 to see if it would make a difference. The
problem with the two sentences swapping places is now resolved.

The problem with hyphens, apostrophes, and dashes still remains.

I'm including a sample of the results.



On Fri, 25 Mar 2011, Keith Marshall wrote:

On 25 March 2011 04:38, Werner LEMBERG wrote:

Justin,

a simple example says more than a thousand words...  So please give us
an example we can examine.

Hear!  Hear!

At first glance, it seems you have an encoding problem (but this
doesn't explain the strange things you see).  The default encoding of
groff is latin1, and your input file is probably UTF8.  Starting with
version 1.20, groff can handle UTF8 by using a new preprocessor.

The HTML output driver is still experimental (and basically
unmaintained currently due to lack of time and interest); it is easily
possible that you've found a bug.

Equally -- perhaps more -- likely, Justin has encountered a hyphenation
issue.  This:

On the 11th line in my groff file, an "?" character is found after 64
characters have been printed, within the word hamburger, the text gets
parsed and printed as "ham?burger". If I change hamburger to donations
I have the "?" character show up at the 60th character on the line,
with donations being "dona?tions".

is reminiscent of an issue I myself observed, earlier this week.  I had
run some informally structured ASCII text through a sed filter, and then
through nroff, (v1.20.1), to produce an alternative layout.  Although I
had suppressed hyphenation (.hy 0), I did have several explicit ASCII
hyphen characters in the input stream; each of these was replaced, in
the output stream, by the three byte octal sequence 342 200 220, (which
I guess represents u2010 -- the Unicode hyphen which groff_char(7)
documents as the output form for hyphen).
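
For anyone wanting to check the byte sequence described above, here is a minimal sketch, assuming a POSIX shell with printf, od, and wc available. It builds the three bytes by hand and dumps them back in octal; the variable names are only illustrative.

```shell
# U+2010 (HYPHEN) in UTF-8 is the bytes E2 80 90, i.e. octal 342 200 220.
# Build it from octal escapes so the script itself stays pure ASCII.
hyphen=$(printf '\342\200\220')

# Byte count of the encoded character (wc -c counts bytes, not characters)
nbytes=$(printf '%s' "$hyphen" | wc -c | tr -d ' ')

# Octal dump, one byte per field, with od's padding squeezed out
octal=$(printf '%s' "$hyphen" | od -An -to1 | tr -s ' ' | sed 's/^ //; s/ $//')

echo "$nbytes bytes: $octal"
```

A viewer that decodes those bytes with a single-byte encoding would see three separate characters rather than one hyphen, which is consistent with the three characters of garbage reported for Internet Explorer.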

Viewing this output with "less", on my UTF-8 aware console, it looked
absolutely fine, but after uploading as a package description file on my
SourceForge downloads page, each hyphen was rendered, by Firefox, with
unwanted whitespace surrounding it; rendered by Internet Explorer, each
hyphen was replaced by three characters of garbage, amongst it being the
"?" observed by Justin, IIRC.

So yes, I guess what you actually see is dependent on encoding, (and how
the viewer interprets the u2010 sequence, however it is encoded).  In my
case, I wanted real ASCII hyphens in my output stream; adding "-Tascii"
to my nroff command gave me that.
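
The difference between the two output devices can be checked roughly as follows. This is only a sketch: the sample file and the count_non_ascii helper are made up for illustration, and the nroff calls are skipped if nroff is not installed. The exact count under -Tutf8 may vary between groff versions.

```shell
tmp=$(mktemp -d)

# Hypothetical input: hyphenation off, one explicit ASCII hyphen
printf '%s\n' '.hy 0' 'A well-known example.' > "$tmp/sample.tr"

# Count the bytes on stdin that fall outside 7-bit ASCII
count_non_ascii() {
    LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' '
}

if command -v nroff >/dev/null 2>&1; then
    # Same input, two output devices
    utf8_count=$(nroff -Tutf8 "$tmp/sample.tr" | count_non_ascii)
    ascii_count=$(nroff -Tascii "$tmp/sample.tr" | count_non_ascii)
    echo "non-ASCII bytes: -Tutf8=$utf8_count -Tascii=$ascii_count"
fi
```

With -Tascii the count should be zero, since that device can only emit ASCII characters.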

--
Regards,
Keith.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bill_hicks.tr
Type: application/x-troff
Size: 5167 bytes
Desc: groff file
Url : http://lists.gnu.org/archive/html/groff/attachments/20110325/b8fa798f/bill_hicks.tr

------------------------------

Message: 2
Date: Fri, 25 Mar 2011 20:28:30 +0100
From: Tadziu Hoffmann <address@hidden>
Subject: Re: [Groff] Having a problem with parsing output to html...
To: address@hidden
Message-ID: <address@hidden>
Content-Type: text/plain; charset=us-ascii


cat file.tr | nroff -mms | groff -Thtml

I'm not sure what you're trying to accomplish.  I don't think it
makes sense to process already-formatted text (from the nroff
run) with groff again.  It's also unclear whether your error
messages are from the "nroff" or the "groff -Thtml" process.
(Btw, "nroff" is basically "groff -Tascii" or "groff -Tlatin1"
or "groff -Tutf8".)

You could try

 groff -ms -Thtml file.tr >file.html


Also, you have several lines beginning with an apostrophe.
The apostrophe is by default the no-break control character,
so groff complains about an undefined macro it thinks you're
trying to call.  You should substitute apostrophes which are
meant to be left quotes by "`" (character code 96 in ASCII)
or "\[oq]" (or "\[lq]" if you want double quotes).
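
One possible way to find and fix such lines is sketched below with grep and sed. The sample text and file names are invented for illustration, and the blanket replacement assumes the file contains no intentional no-break control lines (for example 'br), which it would also rewrite.

```shell
tmp=$(mktemp -d)

# Hypothetical sample: line 2 starts with an apostrophe meant as an
# opening quote, which troff would instead read as a control line.
printf '%s\n' "He paused." "'Hello' was all he said." > "$tmp/sample.tr"

# Report the offending lines with their line numbers
grep -n "^'" "$tmp/sample.tr"

# Replace a leading apostrophe with \[oq]; other apostrophes are untouched.
# Assumes no line is an intentional no-break control line such as 'br.
sed "s/^'/\\\\[oq]/" "$tmp/sample.tr" > "$tmp/fixed.tr"

cat "$tmp/fixed.tr"
```

Reviewing the grep output first is probably wise, so that any genuine control lines can be excluded before running the sed step.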





------------------------------

End of Groff Digest, Vol 79, Issue 11
*************************************



