[Groff] HTML output with &amp; instead of &
From: Bruno Haible
Date: Mon, 9 Apr 2001 15:36:18 +0200 (CEST)
Hi,
HTML output of named characters like \(dq comes out wrong: the resulting HTML
will have "&amp;quot;" instead of "&quot;". Here is a sample input:
===================== feed this to = groff -mandoc -Thtml ====================
.TH WPRINTF 3 "November 20, 1999" "GNU" "Linux Programmer's Manual"
.SH NAME
\(dq hello \(dq
==============================================================================
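This is because the get_html_translation function is called twice for the
same glyph: first when the glyph is added to the buffer, and again when the
buffer is flushed, where the "&" of the already translated name gets
translated once more. I got the following two backtraces (for \- which I'd
defined to &minus;):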
1)
#0 0x804d109 in get_html_translation (f=0x8066a78, name=0x8066a30 "\\-") at
post-html.cc:2431
#1 0x804cd4e in html_printer::add_to_sbuf (this=0x80618f0, code=18,
name=0x8066a30 "\\-") at post-html.cc:2271
#2 0x804d567 in html_printer::set_char (this=0x80618f0, i=726, f=0x8066a78,
env=0xbffff4a8, w=24,
name=0x8066a30 "\\-") at post-html.cc:2590
#3 0x8050c3a in printer::set_special_char (this=0x80618f0, nm=0x8066a30 "\\-",
env=0xbffff4a8, widthp=0x0)
at printer.cc:132
#4 0x804fe40 in do_file (filename=0x80568f2 "-") at input.cc:161
#5 0x804df6d in main (argc=1, argv=0xbffff564) at post-html.cc:2831
2)
#0 0x804d109 in get_html_translation (f=0x8066a78, name=0xbfffd31a "&") at
post-html.cc:2431
#1 0x804d171 in char_translate_to_html (f=0x8066a78, buf=0xbfffe39c
"vswprintf", buflen=4096, ch=38 '&', b=0,
and_single=1) at post-html.cc:2465
#2 0x804d42c in str_translate_to_html (f=0x8066a78, buf=0xbfffe39c
"vswprintf", buflen=4096,
str=0x8066b74 "&minus;", len=7, and_single=1) at post-html.cc:2556
#3 0x804c716 in html_printer::translate_to_html (this=0x80618f0, g=0x806bd88)
at post-html.cc:2003
#4 0x804bf8d in html_printer::flush_globs (this=0x80618f0) at post-html.cc:1747
#5 0x804c00f in html_printer::flush_page (this=0x80618f0) at post-html.cc:1764
#6 0x804d800 in html_printer::end_page (this=0x80618f0) at post-html.cc:2656
#7 0x80504f6 in do_file (filename=0x80568f2 "-") at input.cc:388
#8 0x804df6d in main (argc=1, argv=0xbffff564) at post-html.cc:2831
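The effect of the two calls can be reproduced outside groff. What follows is
only a minimal sketch, not the grohtml code: translate_glyph, escape_text and
render are hypothetical names standing in for the first call path (via
add_to_sbuf) and the second one (via str_translate_to_html), and only the two
glyph names from this report are handled.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for pass 1: map a named glyph to its HTML entity.
// Only the two names mentioned in this report are known here.
std::string translate_glyph(const std::string &name) {
  if (name == "dq") return "&quot;";
  if (name == "\\-") return "&minus;";
  return name;
}

// Hypothetical stand-in for pass 2: escape every HTML-special character
// found in the buffered text, with no memory of what pass 1 already did.
std::string escape_text(const std::string &buf) {
  std::string out;
  for (char ch : buf) {
    switch (ch) {
      case '&': out += "&amp;"; break;
      case '<': out += "&lt;"; break;
      case '>': out += "&gt;"; break;
      default:  out += ch;
    }
  }
  return out;
}

// Both passes in sequence, as in the two backtraces.
std::string render(const std::string &name) {
  assert(!name.empty());  // a glyph always has a name
  return escape_text(translate_glyph(name));
}
```

With these stand-ins, render("dq") yields "&amp;quot;", because the "&" that
pass 1 produced is escaped again by pass 2.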
There are two ways to fix this:
a) Don't call get_html_translation in the first pass at all. But then the buffer
would have to store 'int's, not 'unsigned char's. I don't understand why the
variable 'code' in html_printer::set_char is of type 'unsigned char'.
b) Hide get_html_translation's returned name from expansion in the second pass.
Here is a patch that implements b. I don't know whether it is correct. It only
appears to work :-)
2001-04-08 Bruno Haible <address@hidden>
* src/devices/grohtml/post-html.cc (html_printer::add_to_sbuf): Escape
the html_glyph in the buffer.
(str_translate_to_html): Output the unescaped escaped_char.
diff -r -c3 groff-current.orig/src/devices/grohtml/post-html.cc
groff-current/src/devices/grohtml/post-html.cc
*** groff-current.orig/src/devices/grohtml/post-html.cc Sun Apr 8 19:10:26 2001
--- groff-current/src/devices/grohtml/post-html.cc Sun Apr 8 17:14:39 2001
***************
*** 2276,2284 ****
--- 2276,2290 ----
int l = strlen(html_glyph);
int i;
+ // Escape the name, so that "&" doesn't get expanded to "&amp;"
+ // later during translate_to_html.
+ add_char_to_sbuf('\\'); add_char_to_sbuf('(');
+
for (i=0; i<l; i++) {
add_char_to_sbuf(html_glyph[i]);
}
+
+ add_char_to_sbuf('\\'); add_char_to_sbuf(')');
}
}
}
***************
*** 2545,2550 ****
--- 2551,2560 ----
if (f->contains(index) && (index != 0)) {
buf[b] = f->get_code(index);
b++;
+ } else {
+ t = max(0, min(e, buflen-b));
+ strncpy(&buf[b], escaped_char, t);
+ b += t;
}
}
}
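
The idea behind approach b, taken in isolation, might look like the sketch
below. It is not the grohtml code and makes assumptions: buffer_glyph and
flush_buffer are hypothetical names, the markers mirror the "\(" and "\)"
pairs the patch writes around html_glyph, and the second pass here only
escapes "&".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

const std::string kOpen = "\\(";   // written before the translated name
const std::string kClose = "\\)";  // written after the translated name

// Pass 1 (hypothetical): store the already translated name between markers.
std::string buffer_glyph(const std::string &entity) {
  return kOpen + entity + kClose;
}

// Pass 2 (hypothetical): escape plain text, but copy marked spans verbatim
// and drop the markers themselves.
std::string flush_buffer(const std::string &buf) {
  std::string out;
  std::size_t i = 0;
  while (i < buf.size()) {
    if (buf.compare(i, kOpen.size(), kOpen) == 0) {
      std::size_t start = i + kOpen.size();
      std::size_t end = buf.find(kClose, start);
      assert(end != std::string::npos);  // every opening marker must be closed
      out.append(buf, start, end - start);
      i = end + kClose.size();
    } else if (buf[i] == '&') {
      out += "&amp;";
      ++i;
    } else {
      out += buf[i];
      ++i;
    }
  }
  return out;
}
```

Here a literal "&" in running text is still escaped, while a name stored by
buffer_glyph passes through the second pass unchanged.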