
[Bug-wget] GSoC project proposals, speed up


From: Laura Rueda
Subject: [Bug-wget] GSoC project proposals, speed up
Date: Tue, 10 Mar 2015 17:24:07 +0100

Hello,

It is great to see so many students taking an interest in Wget. When I
went through the list of proposed projects, yours really caught my eye:
C, protocols, unambiguous and certainly useful, oh yes! :)

I have been exploring the speed-up ideas, namely the If-Modified-Since
header and the TCP Fast Open implementation, and I would like to make sure
I am heading in the right direction, both with the approach and the
assumptions.

*if-modified-since*

The idea is to reduce the number of requests needed to obtain modified
documents, moving from the current three steps (HEAD, last-modified check
and GET) to a single GET with the new conditional header. This should
include better handling of the possible responses as well, like
HTTP_STATUS_NOT_MODIFIED, which seems to be defined but not handled. Plus a
new argument (e.g. --if-modified), config, tests…
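
To make the idea concrete, here is a minimal sketch of the single
conditional request, not actual Wget code. The names format_http_date(),
build_conditional_get() and classify_status() are hypothetical helpers
made up for illustration, and I am assuming the local file's modification
time is already available as a time_t.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* What the caller should do with the response (hypothetical names). */
enum action { ACTION_SAVE, ACTION_SKIP, ACTION_OTHER };

/* Format t as an HTTP-date, which is always expressed in GMT.
   Returns 0 on success, -1 on failure. */
int format_http_date(time_t t, char *buf, size_t len)
{
    assert(buf != NULL);
    struct tm *tm = gmtime(&t);
    if (tm == NULL)
        return -1;
    /* %a and %b are English in the default "C" locale. */
    return strftime(buf, len, "%a, %d %b %Y %H:%M:%S GMT", tm) > 0 ? 0 : -1;
}

/* Build one GET carrying If-Modified-Since, replacing HEAD + check + GET.
   Returns the request length, or -1 on bad input or a too-small buffer. */
int build_conditional_get(char *out, size_t len, const char *host,
                          const char *path, time_t local_mtime)
{
    char date[64];
    assert(out != NULL && host != NULL && path != NULL);
    /* Refuse CR/LF so the caller cannot inject extra header lines. */
    if (strpbrk(host, "\r\n") || strpbrk(path, "\r\n"))
        return -1;
    if (format_http_date(local_mtime, date, sizeof date) != 0)
        return -1;
    int n = snprintf(out, len,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "If-Modified-Since: %s\r\n"
                     "\r\n",
                     path, host, date);
    return (n > 0 && (size_t)n < len) ? n : -1;
}

/* 304 means the local copy is still current; 200 carries a new body. */
enum action classify_status(int status)
{
    switch (status) {
    case 200: return ACTION_SAVE;
    case 304: return ACTION_SKIP;
    default:  return ACTION_OTHER;
    }
}
```

With something like this, a 304 response ends the exchange after a single
round trip, and a 200 response already carries the updated document.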

It is a nice improvement for particular applications, e.g. efficient
updates for caches or time-saving crawlers, and for an overall bandwidth
reduction.

*TCP Fast Open*

TFO, on the other hand, has a wider application, since it lives at a lower
level. It lets the client send data in the SYN and lets the server start
sending its response right after the SYN/ACK, without waiting for the final
ACK of the three-way handshake. It is based on a secure token/cookie that
is exchanged during the first connection, and it saves one RTT on each
subsequent connection.

Particularly for HTTP requests, with their short data flows, the overall
impact can be very high (the RFC estimates it at up to 40%, which sounds
like a rather optimistic best case). Taking some measurements after the
implementation to verify this would complement the project nicely.

Linux contains a full implementation of TFO and, since 3.13, it is
enabled by default. Most other common operating systems don’t support it
(Windows, Mac OS), but others are considering it (FreeBSD), maybe by
summer…

For this task, I see I should introduce the MSG_FASTOPEN flag, moving from
connect() to sendmsg()/sendto(). Should this become the default or should
it be configurable? It sounds like the kind of thing that could live in
your .wgetrc, but I honestly can't find any reason to force conventional
TCP. The fallback to a regular handshake should just happen automatically
if the remote server doesn’t support TFO.
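
As a rough sketch of the connect() to sendto() change, assuming Linux and a
blocking socket: tfo_connect_send() is a hypothetical helper, not an
existing Wget function. As far as I understand, the kernel itself falls
back to a normal handshake when it has no cookie for the server or the
server doesn't support TFO, so the explicit fallback below only covers the
case where the local system refuses the flag.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a TCP connection to addr and send the first bytes of the request.
   Returns the connected socket, or -1 on error. *sent receives the number
   of bytes accepted by the kernel. */
int tfo_connect_send(const struct sockaddr *addr, socklen_t addrlen,
                     const void *data, size_t len, ssize_t *sent)
{
    assert(addr != NULL && data != NULL && sent != NULL);

    int fd = socket(addr->sa_family, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0)
        return -1;

#ifdef MSG_FASTOPEN
    /* Connect and send in one call; the data may travel in the SYN. */
    *sent = sendto(fd, data, len, MSG_FASTOPEN, addr, addrlen);
    if (*sent >= 0)
        return fd;
    if (errno != EOPNOTSUPP) {
        /* A real failure (e.g. connection refused): give up. */
        close(fd);
        return -1;
    }
    /* TFO is disabled locally: fall through to conventional TCP. */
#endif

    if (connect(fd, addr, addrlen) < 0) {
        close(fd);
        return -1;
    }
    *sent = send(fd, data, len, 0);
    if (*sent < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Written this way the caller never needs to know which path was taken, which
is one argument for making TFO the default rather than an option.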

I would love to hear your ideas and comments to improve my proposal
draft. In the meantime, I will start reading the codebase and try fixing
small bugs, as already suggested on the list.

Many thanks in advance,

Laura

