
Re: [Bug-wget] GSoC15: Speed up Wget's Download Mechanism


From: Darshit Shah
Subject: Re: [Bug-wget] GSoC15: Speed up Wget's Download Mechanism
Date: Thu, 30 Apr 2015 11:19:15 +0530
User-agent: Mutt/1.5.23 (2014-03-12)

Hi Hubert!

Congrats on your selection. I look forward to a great summer of code in Wget this time around.

On 04/29, Hubert Tarasiuk wrote:
Hello developers,

My proposal for *Speed up Wget's Download Mechanism* has been accepted
by the mentors!

There are two tasks to be done there:
- conditional GET requests (if-modified-since) (RFC7232)
- TCP Fast Open (RFC7413)
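
As a rough illustration of the first task, here is a minimal Python sketch of how a client might build the conditional header from a local file's timestamp. The helper names `conditional_headers` and `should_download` are hypothetical and are not taken from the Wget source or from the proposal.

```python
import email.utils
import os


def conditional_headers(path):
    """Return an If-Modified-Since header for a local copy, or {} if none exists."""
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        # No local copy yet, so an unconditional GET is needed.
        return {}
    # HTTP dates are expressed in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
    return {"If-Modified-Since": email.utils.formatdate(mtime, usegmt=True)}


def should_download(status):
    """A 304 (Not Modified) response means the local copy is still current."""
    return status != 304
```

With a header like this, a server that supports RFC 7232 can answer 304 without a body, so the client skips the transfer entirely.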

A summarized version of my proposal is available:
http://pliki.h.trsk.org/gsoc/wget_public.pdf

IMHO it is quite obvious how the first feature should be implemented in
Wget. However, there is some more moving around needed to use TFO. I
have proposed two possible ways in the above PDF. Perhaps you can
express your opinion about the approaches, or you have another idea for
accomplishing it?

There are two separate points I want to make here:

1. With respect to the changes in the Wget source, I think it is saner to merge the connect methods. Just ensure that we can handle proxies and FTP connections without any code duplication. I don't think anything special is needed when making an HTTPS connection?

2. Regarding the socket options, we should spend some more time evaluating our options. My understanding of TCP_CORK is that it may be a useful option for servers, but it doesn't really affect TCP clients in any useful way. This is because TCP_CORK effectively raises the minimum TCP packet size by buffering as much data as possible before sending it out. With the small request sizes that an HTTP client would generally send, I think it is better to follow Nagle's algorithm, since TCP_CORK will not afford us any noticeable advantage. On the other hand, its non-portability will be a nightmare for us when trying to support OSX, BSD and Windows.
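
To make the portability concern concrete, here is a small Python sketch of how an optional, platform-specific option like TCP_CORK would have to be guarded. The function name `configure_request_socket` and its `cork` parameter are made up for illustration; the only assumption about the platform is that Python's `socket` module exposes `TCP_CORK` where the OS supports it (Linux does).

```python
import socket


def configure_request_socket(sock, cork=False):
    """Apply optional send-side tuning; return the option applied, or None."""
    if cork and hasattr(socket, "TCP_CORK"):
        # Linux-specific: hold back partial segments until the option is cleared.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
        return "TCP_CORK"
    # Default path: leave the socket untouched, so Nagle's algorithm
    # stays in effect (TCP_NODELAY is off by default).
    return None
```

Every such option needs its own availability check and fallback, which is the maintenance cost being weighed against a gain that is probably negligible for small requests.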


Another issue I am thinking about is how to test the TFO feature. I am
not very familiar with network API in Python, but my first idea would be
to count the TCP segments sent and received and/or to check that the
first packet (with SYN flag) contains data (the request). What do you think?

I haven't gone through this code thoroughly yet, but they tried to reproduce the results of the original TFO whitepaper using a Python HTTP Server, like the one we use for our test suite. Maybe we can borrow some code from them?
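
For reference, a test client along these lines could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `send_request_tfo` is a hypothetical helper, `socket.MSG_FASTOPEN` is only available on platforms that support it (such as Linux), and the code falls back to an ordinary connect when the flag is missing or rejected.

```python
import socket


def send_request_tfo(host, port, request):
    """Send `request`, trying TCP Fast Open first.

    Returns (sock, used_tfo_api). The boolean only says that the
    MSG_FASTOPEN call succeeded, not that the SYN carried the data.
    """
    flag = getattr(socket, "MSG_FASTOPEN", None)
    if flag is not None:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # sendto() with MSG_FASTOPEN connects and sends in one call.
            sent = sock.sendto(request, flag, (host, port))
            sock.sendall(request[sent:])
            return sock, True
        except OSError:
            # TFO disabled or unsupported here: retry the classic way.
            sock.close()
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))
    sock.sendall(request)
    return sock, False
```

Whether the first SYN really carried the request still has to be checked from outside the process, for example with a packet capture or, on Linux, with kernel counters such as TCPFastOpenActive in /proc/net/netstat. The first connection to a server normally only obtains a cookie, so a test would likely need at least two connections.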

I will be thankful for any other suggestions, as well.

Have a good day,
Hubert





--
Thanking You,
Darshit Shah
