
Re: [Bug-wget] Writing whole blocks?


From: Micah Cowan
Subject: Re: [Bug-wget] Writing whole blocks?
Date: Tue, 22 Mar 2011 08:57:39 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.13) Gecko/20101208 Thunderbird/3.1.7

On 03/22/2011 07:26 AM, Hrvoje Niksic wrote:
> Sebastian Pipping <address@hidden> writes:
> 
>> I noticed that wget writes data to disk as it comes in.
> 
> This is not strictly true; it is up to the OS to write data to disk.
> What Wget does is that it doesn't hold the data in stdio buffers after
> receiving it from the network.  Since the data comes from the network in
> buffers, this is exactly what you want.  It would be a bad idea to
> interrupt Wget only to find that some data is missing because Wget was
> unnecessarily buffering it.

Well, but this problem is the same one suffered by any program that uses
stdio. And if wget is interrupted, I don't think the user ought to
expect any particular state of data for files that were being written to
at the time of interruption.

But your mention of stdio brings up an important point: if folks want to
buffer the data into page-sized chunks before writing, they are much
better off letting stdio do the buffering, since the system library has
the best chance of finding an optimal way to write out the data
efficiently for the system.
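For instance, a fully buffered page-sized stream can be requested with
setvbuf rather than buffering by hand. (An illustrative sketch: the
4096 and the helper name `open_page_buffered` are assumptions, not
Wget's code, and the real page size varies by system.)

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical sketch: open a file for writing and ask stdio for full
 * buffering with a page-sized buffer.  setvbuf must be called before
 * any other operation on the stream. */
static FILE *open_page_buffered(const char *path)
{
    FILE *out = fopen(path, "wb");
    if (out == NULL)
        return NULL;
    /* _IOFBF = full buffering; stdio flushes only when the buffer
     * fills (or on fflush/fclose). */
    if (setvbuf(out, NULL, _IOFBF, 4096) != 0) {
        fclose(out);
        return NULL;
    }
    return out;
}
```

Passing NULL as the buffer lets stdio allocate it itself, which is
usually the right choice.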

Of course, your point that data will typically arrive already in buffers
is also salient; though these buffers might not turn out to be the right
size for solid performance locally, particularly if the server-side
scripts do a lot of flushing or something (for dynamically generated
pages). I definitely think we should see some comparisons between the
proposed changes and the current code, with real-world, typical cases,
before we decide that it's a good idea.

>> I was wondering if you would be interested to incorporate a patch
>> buffering writes to full page caches sizes (e.g. 4096 in my machine)
>> by default and adding a parameter to override this behavior.
> 
> Can you describe a specific problem that this additional parameter would
> address?

Do you mean, if he implemented buffering, would disabling it be useful?
I would think so. I could imagine scenarios where one would wish to tail
-f data as it came in. Or there's your explanation above of someone not
wanting to lose extra data because it had been buffered. But I'm
guessing what you really wanted to know was whether the feature as a
whole had a specific problem to address.

-- 
Micah J. Cowan
http://micah.cowan.name/


