[Lynx-dev] table formatting issue
From: Larry W. Virden
Subject: [Lynx-dev] table formatting issue
Date: Mon, 18 Apr 2005 14:17:05 -0400 (EDT)
I'm trying to convert an HTML page to plain text, and I'm struggling with
a behavior that seems counterintuitive. Perhaps there's a flag I am
missing.
My lynx (2.8.6dev11) invocation is this (I'm in an xterm which has 300 columns):
$ unset COLUMNS
$ /projects/intranet/bin/lynx -dont_wrap_pre -width=999 -hiddenlinks=ignore \
    -nobold -nocolor -nolist -dump -force_html /tmp/lib.html \
    > /volws/$USER/ldatae/$USER.lib.txt
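For anyone wanting to reproduce this from a script, the invocation above can be sketched in Python as follows. This is only a minimal sketch: the helper names `build_lynx_command` and `dump_to_text` are hypothetical, the flags are exactly the ones listed above, and removing `COLUMNS` from the child environment stands in for the `unset COLUMNS` step.

```python
import os
import subprocess

# The same flags as in the shell invocation above.
LYNX_FLAGS = [
    "-dont_wrap_pre", "-width=999", "-hiddenlinks=ignore",
    "-nobold", "-nocolor", "-nolist", "-dump", "-force_html",
]


def build_lynx_command(html_path, lynx="lynx"):
    """Return the argument list for a lynx text dump (hypothetical helper)."""
    return [lynx, *LYNX_FLAGS, html_path]


def dump_to_text(html_path, out_path, lynx="lynx"):
    """Run lynx and write its dump to out_path (hypothetical helper)."""
    # Equivalent of `unset COLUMNS`: drop it from the child's environment.
    env = {k: v for k, v in os.environ.items() if k != "COLUMNS"}
    with open(out_path, "w") as out:
        subprocess.run(build_lynx_command(html_path, lynx),
                       stdout=out, env=env, check=True)
```

Calling `dump_to_text("/tmp/lib.html", "lib.txt")` would run the same command, assuming a `lynx` binary is on the `PATH`.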
My intent is that lynx not wrap any lines of text.
However, I have not been successful.
A sample of the HTML giving me fits (please understand that this is an extremely
cut-down version) looks like this:
<!doctype html public "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3c.org/TR/html4/loose.dtd">
<html>
<head>
</head>
<body>
<table border=0 cellspacing="15">
<tr>
<th align=left> Title/Author </th>
<th align=left> Format </th>
<th align=left> Status </th>
<th align=left> </th>
<th align=left> </th>
</tr>
<tr>
<td valign="top">Those who walk in darkness / Ridley,
John,</td>
<td valign="top">Book</td>
<td valign="top">
<table bgcolor="yellow">
<tr>
<td valign="top">
ready for pickup at Reynoldsburg
by 22Apr2005
</td>
</tr>
</table>
</td>
<td>
<form method=post action="/cgi-bin/wpcr1075.shtml">
<input type=hidden name="requestcd" value=1>
<input type=hidden name="title"
value="Those%20who%20walk%20in%20darkness%20/%20Ridley,%20John,">
<input type=hidden name="recordnumber"
value=59423>
<input type=hidden name="patronid"
value=210188336>
<input type=hidden name="rsp_reserve_cnt"
value="2">
<input type=hidden name="rsp_reserve_video_cnt"
value="5">
<input type=hidden name="rsp_print_count"
value="2">
<input type=hidden name="rsp_audio_count"
value="3">
<input type=hidden name="rsp_video_count"
value="0">
<input type=hidden name="rsp_im_count"
value="0">
<input type=hidden name="rsp_first_name"
value="LARRY">
<input type=hidden name="rsp_fine_balance"
value="0.0">
<input type=hidden name="matl_type_desc"
value="Book">
<input type=submit value="Cancel"
style="background-color: red">
</form>
</td>
<td valign=top>
</td>
</tr>
</table>
</body>
</html>
I have no control over what html is generated, and I'd rather not
massage the html after I fetch it if I don't have to.
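For illustration, the kind of massaging being avoided here might look like the sketch below, which strips the markup of any table nested inside another table while keeping its text. It uses only Python's standard `html.parser`; the class and function names are hypothetical, and whether lynx would then keep the row on one line is an untested assumption.

```python
from html.parser import HTMLParser

# Tags that make up table structure; dropped when inside a nested table.
TABLE_TAGS = {"table", "tbody", "thead", "tfoot", "tr", "td", "th"}


class NestedTableFlattener(HTMLParser):
    """Re-emit HTML, dropping the markup (not the text) of nested tables."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.depth = 0   # current <table> nesting depth
        self.out = []    # pieces of the rewritten document

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.depth += 1
        if tag in TABLE_TAGS and self.depth > 1:
            return  # structural tag of a nested table: drop it
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        nested = self.depth > 1
        if tag == "table" and self.depth > 0:
            self.depth -= 1
        if tag in TABLE_TAGS and nested:
            return
        self.out.append(f"</{tag}>")

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def flatten_nested_tables(html):
    """Return html with nested-table markup removed (hypothetical helper)."""
    parser = NestedTableFlattener()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Applied to the sample above, this would leave the outer table intact and reduce the yellow status table to its bare text inside the enclosing cell.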
What I expected was:
Title/Author Format Status
Those who walk in darkness / Ridley, John, Book ready for pickup at
Reynoldsburg by 22Apr2005
[1]Cancel
What I get is:
Title/Author Format Status
Those who walk in darkness / Ridley, John, Book
ready for pickup at Reynoldsburg by 22Apr2005
[1]Cancel
If I view the HTML in Firefox or Mozilla with a window 300 characters
wide, the page is displayed the way I was hoping lynx would display
it.
Is there some additional flag I need in order to keep lynx from wrapping
things the way it is doing?
--
Tcl - The glue of a new generation. <URL: http://wiki.tcl.tk/ >
Larry W. Virden <mailto:address@hidden> <URL: http://www.purl.org/NET/lvirden/ >
Even if explicitly stated to the contrary, nothing in this posting should
be construed as representing my employer's opinions.
-><-