bug-wget

Re: [Bug-wget] persistence with multiple hostnames


From: Ángel González
Subject: Re: [Bug-wget] persistence with multiple hostnames
Date: Tue, 17 Apr 2012 20:36:34 +0200
User-agent: Thunderbird

On 17/04/12 18:16, Ryan Rawdon wrote:
> I was speaking with Micah on IRC today regarding a behavior in wget which is 
> different than curl and most or all browsers.
>
> Generally, HTTP clients do not use a given persistent connection for more than 
> one hostname, which is why tricks such as spreading static content across 
> multiple name-based vhosts on the same IP address work to encourage more 
> parallelization in the fetching of a page's static elements.
>
> However, wget appears to use persistent connections for multiple hostnames 
> (see below).  In the case below, a connection is opened to soldat.pl which 
> 302s to a new hostname.  Wget resolves the new hostname and selects the same 
> address, and decides to reuse the existing connection to this IP address.
>
> The RFC does not appear to address the re-use of persistent connections with 
> regard to hostname, so the behavior is permissible (and fine from a protocol 
> standpoint since Host is specified with each request).
>
> The problem stems from usage of privilege separation between virtualhosts.  
> In the case below, before I fixed it today, wget was receiving 403 on the 
> second request because the user that owned this fd on the server side did not 
> have privileges to access the content for the soldat.thd.vg vhost.
The bug was on the server side. Instead of sending a 403, it should have
closed the connection, thus forcing the client to retry on a new
connection. To the client this looks like a timeout, which is perfectly
permissible in the protocol. Apache does the same thing when one of its
children reaches the maximum number of requests it is allowed to serve.
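
To make the client-side behaviour under discussion concrete, here is a
minimal sketch of how the choice of cache key decides whether a persistent
connection is reused across hostnames. It is not wget's actual code: the
ConnectionCache class, its key_by_hostname flag, and the 192.0.2.10 address
(a documentation address standing in for the shared server IP) are all
assumptions made for illustration, and connections are represented by plain
labels rather than sockets.

```python
from typing import Dict, Tuple


class ConnectionCache:
    """Hypothetical persistent-connection cache (illustration only)."""

    def __init__(self, key_by_hostname: bool) -> None:
        # key_by_hostname=False: reuse whenever the resolved IP and port match.
        # key_by_hostname=True:  reuse only for the same hostname and port.
        self.key_by_hostname = key_by_hostname
        self._cache: Dict[Tuple[str, int], str] = {}
        self._opened = 0

    def _key(self, hostname: str, ip: str, port: int) -> Tuple[str, int]:
        name = hostname.lower() if self.key_by_hostname else ip
        return (name, port)

    def get(self, hostname: str, ip: str, port: int = 80) -> str:
        """Return a cached connection label, or 'open' a new one."""
        key = self._key(hostname, ip, port)
        conn = self._cache.get(key)
        if conn is None:
            self._opened += 1
            conn = f"conn-{self._opened}"  # stands in for a real socket
            self._cache[key] = conn
        return conn


SHARED_IP = "192.0.2.10"  # assumed address; both vhosts resolve to it

# Keyed by IP: the redirect target reuses the first connection.
by_ip = ConnectionCache(key_by_hostname=False)
first = by_ip.get("soldat.pl", SHARED_IP)
second = by_ip.get("soldat.thd.vg", SHARED_IP)
print(first, second)

# Keyed by hostname: the redirect target gets its own connection.
by_name = ConnectionCache(key_by_hostname=True)
third = by_name.get("soldat.pl", SHARED_IP)
fourth = by_name.get("soldat.thd.vg", SHARED_IP)
print(third, fourth)
```

With the IP-keyed cache, both requests share one connection, which is the
situation Ryan describes. With the hostname-keyed cache, the redirect to
soldat.thd.vg opens a second connection, as curl and browsers would.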



