Re: [Lynx-dev] Problems accessing wikipedia with lynx 2.8.8


From: Dick Wesseling
Subject: Re: [Lynx-dev] Problems accessing wikipedia with lynx 2.8.8
Date: Fri, 27 Oct 2017 03:14:01 +0200

address@hidden wrote:

> About two weeks ago, lynx 2.8.8 rel 2, running on this Solaris system,
> ceased to be able to access wikipedia, although it had successfully done
> so for many years.
>
> Is this a known problem?  Any suggestions appreciated.

This is a known problem, known to me at least. Wikipedia used to trigger
a bug in Lynx, then they apparently changed something which made Lynx
work again, but recently it stopped working again.

The bug is that if a server sends a large number of HTTP headers, then
Lynx does not read all of the data sent by the server and you get a
truncated .gz file.

The reason is as follows:

- Network data is delivered to Lynx in chunks.

- HTLoadHTTP() reads the first chunk and processes the data in that
  chunk, in particular the HTTP response line and the Content-Length
  header.
  At this moment the input stream can be positioned anywhere!

- Next HTLoadHTTP() calls HTCopy() to read and process the bulk
  of the data. HTCopy() thinks it can limit the amount read to
  anchor->content_length, but that is not true because the stream
  is not positioned at the start of the content.

  However, HTCopy() usually gets away with this because it always
  reads in portions of INPUT_BUFFER_SIZE. This mitigates the problem,
  but not enough for Wikipedia.

The following patch solves the problem with Wikipedia, but it probably
breaks other things.

--- lynx2.8.9dev.16/WWW/Library/Implementation/HTFormat.c.bak   Sun Jul  2 19:09:45 2017
+++ lynx2.8.9dev.16/WWW/Library/Implementation/HTFormat.c       Fri Oct 27 02:26:46 2017
@@ -729,11 +729,11 @@
           void *handle GCC_UNUSED,
           HTStream *sink)
 {
     HTStreamClass targetClass;
     BOOL suppress_readprogress = NO;
-    off_t limit = anchor ? anchor->content_length : 0;
+    off_t limit = 0;
     off_t bytes = 0;
     int rv = 0;

     /*  Push the data down the stream
      */


Here's a test program that demonstrates the bug.

#include <sys/socket.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

#define FILLER "filler:  blabla blabla blabla blabla blabla\r\n"
#define FILLER2 FILLER FILLER
#define FILLER3 FILLER2 FILLER2
#define FILLER4 FILLER3 FILLER3
#define FILLER5 FILLER4 FILLER4
#define FILLER6 FILLER5 FILLER5
#define FILLER7 FILLER6 FILLER6
#define FILLER8 FILLER7 FILLER7

char sink[0x10000];

int transac(int fd)
{
    char *response =
        "HTTP/1.0 200 OK\r\n"
        "content-type: text/plain\r\n"
        "content-length: 3\r\n"
        FILLER8
        "\r\n"
        "X\r\n" ;

    /* Discard request.
    */
    read(fd, sink, sizeof(sink));

    write(fd, response, strlen(response));
    close(fd);
    return 0;
}

int main(int argc, char **argv)
{   struct sockaddr_in serveraddr;
    int serverfd;

    int portno = 12345;
    if (argc >1) portno = atoi(argv[1]);
    /* memset() is declared in <string.h>; bzero() would need <strings.h>. */
    memset(&serveraddr, 0, sizeof(serveraddr));
    serveraddr.sin_port = htons((unsigned short)portno);
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);

    serverfd = socket(AF_INET, SOCK_STREAM, 0);
    if (bind(serverfd, (struct sockaddr *) &serveraddr, sizeof(serveraddr)) < 0) {
        perror("bind");
        exit(1);
    }
    if (listen(serverfd, 5) < 0) {
        perror("listen");
        exit(1);
    }

    while(1) {
       struct sockaddr_in clientaddr;
       /* accept() expects clientlen to hold the size of clientaddr on entry. */
       socklen_t clientlen = sizeof(clientaddr);
       int fd = accept(serverfd, (struct sockaddr *) &clientaddr, &clientlen);
       if (fd>=0) transac(fd);
    }
    return 0;
}


