lynx-dev

lynx-dev Tables


From: John Summerfield
Subject: lynx-dev Tables
Date: Fri, 3 Jul 1998 11:58:03 +0800 (WST)

I've begun using Lynx to fetch documents at fixed URLs so that I can
collect information regularly. I then run the result through perl scripts
to extract the information so I can feed it into other programs.

See http://www.stockrocket.com.au/com/clsprice/clsprice.html for an
example of the kind of file I'm retrieving (but be aware it's quite
large).

Here is an excerpt that's well-presented:
             _Company Name_ _Code_ _Open_ _High_ _Low_ _Close_ _Volume_
_Value_
     ACACIA RESOURCES LIMITED AAA 1.73 1.81 1.73 1.81 3648921 6397007.0
         ACACIA RESOURCES LIMITED AAAWMA 0.00 0.00 0.00 0.00 0 0.0

Unfortunately for me, it goes on:
      ACACIA RESOURCES LIMITED AAAWMB 0.00 0.00 0.00 0.02 0 0.0 ACACIA
         RESOURCES LIMITED AAAWXA 0.35 0.40 0.35 0.40 50000 19300.0
   Aapc Limited AAD 0.00 0.00 0.00 0.65 0 0.0 AAPT LIMITED AAP 3.10 3.10
                          3.06 3.06 19759 60986.2

The problem is that company names don't always start on new lines: a row
can begin partway along the line that holds the end of the previous row.
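
As a rough illustration of how the run-together text can still be split
into records, here is a minimal sketch in Python (not the perl scripts
mentioned above, which are not shown). It assumes every record ends with
exactly six numeric fields (open, high, low, close, volume, value), that the
single token before them is the code, and that header tokens are the ones
Lynx wrapped in underscores. The names split_records and NUMBER are made up
for this example.

```python
import re

# A field counts as numeric if it is an integer or a simple decimal.
NUMBER = re.compile(r"\d+(\.\d+)?")

def split_records(dump_text):
    """Split Lynx -dump table text into [name, code, 6 numbers] records.

    Line breaks are ignored entirely; records are recognised by their
    trailing run of six numeric fields.
    """
    # Drop header tokens such as _Code_ (assumption: only headers are
    # rendered with leading or trailing underscores).
    tokens = [t for t in dump_text.split()
              if not (t.startswith("_") or t.endswith("_"))]
    records = []
    start = 0  # index of the first token of the record being built
    i = 0
    while i < len(tokens):
        window = tokens[i:i + 6]
        is_numeric_run = len(window) == 6 and all(
            NUMBER.fullmatch(t) for t in window)
        # Need at least one name token and one code token before the numbers.
        if is_numeric_run and i - start >= 2:
            name = " ".join(tokens[start:i - 1])
            code = tokens[i - 1]
            records.append([name, code] + window)
            i += 6
            start = i
        else:
            i += 1
    return records
```

Calling split_records on the text above and printing each record with
" ".join(record) gives one company per line, wherever Lynx broke the lines.
A company name that itself contains a bare number would confuse this sketch.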


Here is a snippet of the html that creates this display:
<script>trclr(1);</script>
<td><font size=2>ACACIA RESOURCES LIMITED</font></td>
<td><font size=2>AAAWXA</font></td>
<td align=right><font size=2>0.35</font></td>
<td align=right><font size=2>0.40</font></td>
<td align=right><font size=2>0.35</font></td>
<td align=right><font size=2>0.40</font></td>
<td align=right><font size=2>      50000</font></td>
<td align=right><font size=2>      19300.0</font></td>
</tr>
<tr>
<td><font size=2>Aapc Limited                 </font></td>
<td><font size=2>AAD</font></td>
<td align=right><font size=2>0.00</font></td>
<td align=right><font size=2>0.00</font></td>
<td align=right><font size=2>0.00</font></td>
<td align=right><font size=2>0.65</font></td>
<td align=right><font size=2>          0</font></td>
<td align=right><font size=2>          0.0</font></td>
</tr>

I've no idea what the snippet of javascript does, but in this case I think
ignoring it is the right thing to do.

What I'd like to see is for Lynx to format each table row as a separate
line, either unconditionally or in conjunction with yet another
command-line switch.

I propose this behaviour be implemented in conjunction with the -dump
option. I don't have any particular opinion as to what it should do in
normal display mode.
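
To make the requested output concrete, here is a minimal sketch in Python
of "one table row per line", working from the HTML rather than from Lynx's
rendering. It is not Lynx code and not a description of how Lynx would do
it. It uses only the standard html.parser module; the names RowPerLine and
rows_as_lines are made up for this example, and joining cells with a tab is
an arbitrary choice.

```python
from html.parser import HTMLParser

class RowPerLine(HTMLParser):
    """Collect the text of each table row as one tab-separated line."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, one string per <tr>
        self._cells = []    # cells of the row being built
        self._text = None   # text pieces of the open cell, None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._flush()           # a new row closes any unfinished one
        elif tag in ("td", "th"):
            self._end_cell()        # tolerate a missing </td>
            self._text = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._end_cell()
        elif tag in ("tr", "table"):
            self._flush()

    def handle_data(self, data):
        # Text outside a cell (including script bodies) is ignored.
        if self._text is not None:
            self._text.append(data)

    def _end_cell(self):
        if self._text is not None:
            # Collapse runs of whitespace, including the padding in cells.
            self._cells.append(" ".join("".join(self._text).split()))
            self._text = None

    def _flush(self):
        self._end_cell()
        if self._cells:
            self.rows.append("\t".join(self._cells))
            self._cells = []

def rows_as_lines(html):
    """Return a list of strings, one per table row found in html."""
    parser = RowPerLine()
    parser.feed(html)
    parser.close()
    parser._flush()  # pick up a final row that has no closing </tr>
    return parser.rows
```

Fed the html snippet above, rows_as_lines returns two lines, one for AAAWXA
and one for AAD, and the script element contributes nothing because its
text lies outside any cell.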

If anyone wishes to discuss the matter with me, please send me a copy of
any correspondence: I don't plan to increase the volume of mail I receive
by subscribing to yet another mailing list.

I feel I've argued my case and am content to assume either that my
arguments are overwhelming (and the difficulty not too great) or that those
with the expertise and too little time are competent to decide it shouldn't
be done.

Either way, unless it's changed extraordinarily quickly, I'll be parsing
& reformatting this stuff in perl first.


Cheers
John Summerfield
http://os2.ami.com.au/os2/ for OS/2 support.
Configuration, networking, combined IBM ftpsites index.
