
Re: [Lynx-dev] List-of-links encoded improperly


From: Dimitrios Semitsoglou-Tsiapos
Subject: Re: [Lynx-dev] List-of-links encoded improperly
Date: Thu, 23 Feb 2017 12:28:52 +0200
User-agent: NeoMutt/20170206-210-ea631c-dirty (1.7.2)

On  Wed 22-Feb-17 19:38, Thomas Dickey wrote:
> On Wed, Feb 22, 2017 at 10:32:24PM +0200, Dimitrios Semitsoglou-Tsiapos wrote:
> > Greetings Lynx developers and users!
> >
> > I have noticed that in `-dump` mode lynx will percent-encode reserved
> > characters in the "list of links" if `-display_charset=UTF-8` is set (or
> > perhaps any value other than ISO-8859-1). This can cause some URLs to
> > effectively break.
> >
> > Would it perhaps be correct to simply ignore `display_charset` while
> > printing these URLs?
>
> not really - it's generating the file (not passing it on), and is
> using a known encoding.
>

I am probably misinterpreting the problem, so I will give an example. I
have received email from ebay where they encode URLs multiple times
within all their links. For example, here are three successive (but not
necessarily consecutive) chunks of a single URL:

HTML source                         lynx -dump
---------------------------------   ---------------------------
http://rover.ebay.com               http://rover.ebay.com
https%3A%2F%2Fsvcs.ebay.com         https://svcs.ebay.com
L%252B                              L%2B
http%253A%252F%252Frover.ebay.com   http%3A%2F%2Frover.ebay.com
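
The right-hand column looks consistent with exactly one round of
percent-decoding applied to each chunk. A minimal Python sketch of that
observation follows; `decode_once` is a hypothetical helper built on the
standard `urllib.parse.unquote`, used only as a stand-in and not a claim
about how lynx itself processes the URL:

```python
from urllib.parse import unquote

# Chunks copied from the "HTML source" column of the table above.
chunks = [
    "http://rover.ebay.com",
    "https%3A%2F%2Fsvcs.ebay.com",
    "L%252B",
    "http%253A%252F%252Frover.ebay.com",
]


def decode_once(s: str) -> str:
    """Apply a single round of percent-decoding (stand-in, not lynx's code)."""
    return unquote(s)


# Print each chunk next to its once-decoded form for comparison
# with the "lynx -dump" column.
for chunk in chunks:
    print(f"{chunk:36}{decode_once(chunk)}")
```

A doubly encoded chunk such as `L%252B` loses only its outer layer and
becomes `L%2B`, while a singly encoded one is decoded completely.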

From those I have come up with a minimal example (they probably encode
too much personal information in their arguments for me to upload the
whole URL).


    # Verify the example URL redirects to their home page:
    $ url='https://svcs.ebay.com/delstats/email/location?ch=7%26di=12345'
    $ lynx -dump "$url" | head -1
       #[1]alternate [2]alternate [3]alternate [4]alternate [5]alternate5

    # Verify opening the URL from within lynx works
    $ echo '<a href='"$url"'>click me</a>' > /tmp/foo.html
    $ lynx /tmp/foo.html  # Now press return

    # Dump this working file:
    $ lynx -dump /tmp/foo.html
       [1]click me


      References

          1. https://svcs.ebay.com/delstats/email/location?ch=7&di=12345

    # Try to open the resulting URL:
    $ lynx 'https://svcs.ebay.com/delstats/email/location?ch=7&di=12345'

    Looking up svcs.ebay.com
    Making HTTPS connection to svcs.ebay.com
    SSL callback:ok, preverify_ok=1, ssl_okay=0
    SSL callback:ok, preverify_ok=1, ssl_okay=0
    SSL callback:ok, preverify_ok=1, ssl_okay=0
    Verified connection to svcs.ebay.com (cert=svcs.ebay.com)
    Certificate issued by: /C=US/O=Symantec Corporation/OU=Symantec Trust Network/CN=Symantec Class 3 Secure Server CA - G4
    Secure 128-bit TLSv1/SSLv3 (AES128-GCM-SHA256) HTTP connection
    Sending HTTP request.
    HTTP request sent; waiting for response.
    HTTP/1.0 307 Temporary Redirect
    'A'lways allowing from domain '.ebay.com'.
    Alert!: Got redirection with no Location header.
    Data transfer complete
    /bin/gzip -d --no-name /tmp/lynxXXXXuQX0AJ/L15600-7565TMP.bin.gz
    Using file://localhost/tmp/lynxXXXXuQX0AJ/L15600-7565TMP.bin
    hexdump '/tmp/lynxXXXXuQX0AJ/L15600-7565TMP.bin'

    lynx: Start file could not be found or is not text/html or text/plain
          Exiting...


    # This is in fact the error I get when opening a real dumped URL.
    # Now dump with ISO-8859-1:
    $ lynx -display_charset=ISO-8859-1 -dump /tmp/foo.html
       [1]click me


    References

       1. https://svcs.ebay.com/delstats/email/location?ch=7%26di=12345

    # The resulting URL works as expected.
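
To show why the two dumped forms are not interchangeable, here is a
small Python sketch that parses both query strings with the standard
`urllib.parse` module. `query_params` is a hypothetical helper, and I am
assuming the server splits the query on `&` before decoding, which is the
conventional behaviour but not something I have verified for ebay:

```python
from urllib.parse import parse_qs, urlsplit

# URL as written in the HTML source (and as dumped with ISO-8859-1).
original = "https://svcs.ebay.com/delstats/email/location?ch=7%26di=12345"
# URL as it appears in the list of links with a UTF-8 display charset.
dumped = "https://svcs.ebay.com/delstats/email/location?ch=7&di=12345"


def query_params(url: str) -> dict:
    """Return the query parameters of url, split on '&' and then decoded."""
    return parse_qs(urlsplit(url).query)


print(query_params(original))  # {'ch': ['7&di=12345']}
print(query_params(dumped))    # {'ch': ['7'], 'di': ['12345']}
```

Under that assumption the original URL carries a single `ch` parameter
whose value contains a literal `&`, whereas the decoded form carries two
separate parameters, so the two URLs are different requests.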


Who would be at fault here: ebay (for their encoding or server handling),
lynx, or me for using the dumped URL directly?


