[Groff] HTML output with &amp instead of &


From: Bruno Haible
Subject: [Groff] HTML output with &amp instead of &
Date: Mon, 9 Apr 2001 15:36:18 +0200 (CEST)

Hi,

HTML output of named characters like \(dq comes out wrong: the resulting HTML
will have "&amp;quot;" instead of "&quot;". Here is a sample input:

===================== feed this to = groff -mandoc -Thtml ====================
.TH WPRINTF 3  "November 20, 1999" "GNU" "Linux Programmer's Manual"
.SH NAME
\(dq hello \(dq
==============================================================================

This is because the get_html_translation function is called twice. I got
the following two backtraces (for \-, which I'd defined to be &minus;):

1)
#0  0x804d109 in get_html_translation (f=0x8066a78, name=0x8066a30 "\\-") at post-html.cc:2431
#1  0x804cd4e in html_printer::add_to_sbuf (this=0x80618f0, code=18, name=0x8066a30 "\\-") at post-html.cc:2271
#2  0x804d567 in html_printer::set_char (this=0x80618f0, i=726, f=0x8066a78, env=0xbffff4a8, w=24, name=0x8066a30 "\\-") at post-html.cc:2590
#3  0x8050c3a in printer::set_special_char (this=0x80618f0, nm=0x8066a30 "\\-", env=0xbffff4a8, widthp=0x0) at printer.cc:132
#4  0x804fe40 in do_file (filename=0x80568f2 "-") at input.cc:161
#5  0x804df6d in main (argc=1, argv=0xbffff564) at post-html.cc:2831

2)
#0  0x804d109 in get_html_translation (f=0x8066a78, name=0xbfffd31a "&") at post-html.cc:2431
#1  0x804d171 in char_translate_to_html (f=0x8066a78, buf=0xbfffe39c "vswprintf", buflen=4096, ch=38 '&', b=0, and_single=1) at post-html.cc:2465
#2  0x804d42c in str_translate_to_html (f=0x8066a78, buf=0xbfffe39c "vswprintf", buflen=4096, str=0x8066b74 "&minus;", len=7, and_single=1) at post-html.cc:2556
#3  0x804c716 in html_printer::translate_to_html (this=0x80618f0, g=0x806bd88) at post-html.cc:2003
#4  0x804bf8d in html_printer::flush_globs (this=0x80618f0) at post-html.cc:1747
#5  0x804c00f in html_printer::flush_page (this=0x80618f0) at post-html.cc:1764
#6  0x804d800 in html_printer::end_page (this=0x80618f0) at post-html.cc:2656
#7  0x80504f6 in do_file (filename=0x80568f2 "-") at input.cc:388
#8  0x804df6d in main (argc=1, argv=0xbffff564) at post-html.cc:2831
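
To illustrate what the two backtraces add up to, here is a minimal standalone
sketch of the double translation. It is not grohtml code: glyph_to_entity,
escape_html and render_twice are hypothetical stand-ins for the first pass
(glyph name to entity) and the second pass (escaping of HTML special
characters), and the glyph table holds only the two names mentioned above.

```cpp
#include <string>

// Hypothetical stand-in for the first pass: map a glyph name to an HTML entity.
std::string glyph_to_entity(const std::string &name) {
  if (name == "dq") return "&quot;";
  if (name == "\\-") return "&minus;";
  return "";  // unknown glyph: nothing to emit in this sketch
}

// Hypothetical stand-in for the second pass: escape HTML special characters.
std::string escape_html(const std::string &text) {
  std::string out;
  for (char c : text) {
    switch (c) {
      case '&': out += "&amp;"; break;
      case '<': out += "&lt;"; break;
      case '>': out += "&gt;"; break;
      default:  out += c; break;
    }
  }
  return out;
}

// Both passes applied in sequence: the '&' that starts the entity produced
// by the first pass is escaped again by the second pass.
std::string render_twice(const std::string &name) {
  return escape_html(glyph_to_entity(name));
}
```

With these stand-ins, render_twice("dq") gives "&amp;quot;" rather than
"&quot;", which matches the symptom described at the top.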

There are two ways to fix this:
a) Don't call get_html_translation in the first pass at all. But then the buffer
would have to store 'int's, not 'unsigned char's. I don't understand why the
variable 'code' in html_printer::set_char is of type 'unsigned char'.
b) Hide get_html_translation's returned name from expansion in the second pass.

Here is a patch that implements b. I don't know whether it is correct. It only
appears to work :-)


2001-04-08  Bruno Haible  <address@hidden>

        * src/devices/grohtml/post-html.cc (html_printer::add_to_sbuf): Escape
        the html_glyph in the buffer.
        (str_translate_to_html): Output the unescaped escaped_char.

diff -r -c3 groff-current.orig/src/devices/grohtml/post-html.cc groff-current/src/devices/grohtml/post-html.cc
*** groff-current.orig/src/devices/grohtml/post-html.cc Sun Apr  8 19:10:26 2001
--- groff-current/src/devices/grohtml/post-html.cc      Sun Apr  8 17:14:39 2001
***************
*** 2276,2284 ****
--- 2276,2290 ----
        int   l          = strlen(html_glyph);
        int   i;
  
+       // Escape the name, so that "&" doesn't get expanded to "&amp;"
+       // later during translate_to_html.
+       add_char_to_sbuf('\\'); add_char_to_sbuf('(');
+ 
        for (i=0; i<l; i++) {
          add_char_to_sbuf(html_glyph[i]);
        }
+ 
+       add_char_to_sbuf('\\'); add_char_to_sbuf(')');
        }
      }
    }
***************
*** 2545,2550 ****
--- 2551,2560 ----
            if (f->contains(index) && (index != 0)) {
              buf[b] = f->get_code(index);
              b++;
+           } else {
+             t = max(0, min(e, buflen-b));
+             strncpy(&buf[b], escaped_char, t);
+             b += t;
            }
          }
        }
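
The idea behind approach b can also be sketched in isolation. This is only an
illustration under simplifying assumptions, not the code from post-html.cc:
protect and escape_unprotected are hypothetical names, the markers are the
same "\(" and "\)" pairs the patch writes into the buffer, and the second pass
here escapes only '&'.

```cpp
#include <string>

// Markers written around an already translated entity (as in the patch).
const std::string kOpen = "\\(";
const std::string kClose = "\\)";

// First pass (hypothetical): wrap the entity so the second pass can spot it.
std::string protect(const std::string &entity) {
  return kOpen + entity + kClose;
}

// Second pass (hypothetical): escape '&' everywhere except inside a
// protected span, whose contents are copied through without the markers.
std::string escape_unprotected(const std::string &buf) {
  std::string out;
  std::size_t i = 0;
  while (i < buf.size()) {
    if (buf.compare(i, kOpen.size(), kOpen) == 0) {
      std::size_t start = i + kOpen.size();
      std::size_t end = buf.find(kClose, start);
      if (end != std::string::npos) {
        out.append(buf, start, end - start);  // copy the entity verbatim
        i = end + kClose.size();
        continue;
      }
      // No closing marker: fall through and treat the text as ordinary.
    }
    if (buf[i] == '&')
      out += "&amp;";
    else
      out += buf[i];
    ++i;
  }
  return out;
}
```

For example, escape_unprotected("a & b " + protect("&quot;")) yields
"a &amp; b &quot;": the literal ampersand is escaped once, and the protected
entity passes through unchanged.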
