
Re: [Bug-wget] [RFC] Extend concurrency support


From: Daniel Stenberg
Subject: Re: [Bug-wget] [RFC] Extend concurrency support
Date: Thu, 22 May 2014 22:52:37 +0200 (CEST)
User-agent: Alpine 2.00 (DEB 1167 2008-08-23)

On Wed, 21 May 2014, Tim Ruehsen wrote:

libcurl offers substantially more functionality in the network layer than wget has. And yes, libcurl has a DNS cache.

As I understood Giuseppe, he wants to concentrate on FTP(S) and HTTP(S). Additional functionality like POP3, IMAP, ... is not going to be used.

Let me rephrase: libcurl supports many more features in HTTP(S) and FTP(S) than wget does. Like more auths, more proxy auths, SOCKS, FTPS, http2, more TLS library support, happy eyeballs and more. All of that can even be event-based if wanted.

You are using chained lists to store cache entries !?

I'm not sure what cache you're talking about. libcurl has several. But we also don't expose those things in the API, so if we ran into a bottleneck it wouldn't be that big of a deal to improve it. I figure you're talking about the DNS cache here, and that only stores names for a short period (since it doesn't have the real underlying TTL), so it'll typically never grow very large.
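To illustrate why a fixed, short entry lifetime keeps such a cache small, here is a minimal sketch in C. It is not libcurl's internal code; the names (dns_entry, cache_add, cache_lookup), the 60-second lifetime, and the plain linked list are all assumptions made for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define CACHE_LIFETIME 60 /* seconds; assumed value for this sketch */

struct dns_entry {
    char host[256];
    char addr[64];
    time_t stored;            /* when the entry was added */
    struct dns_entry *next;
};

/* Add an entry at the front of the list; returns the new head
   (or the old head if allocation fails). */
static struct dns_entry *cache_add(struct dns_entry *head, const char *host,
                                   const char *addr, time_t now)
{
    struct dns_entry *e = malloc(sizeof(*e));
    if (!e)
        return head;
    snprintf(e->host, sizeof(e->host), "%s", host);
    snprintf(e->addr, sizeof(e->addr), "%s", addr);
    e->stored = now;
    e->next = head;
    return e;
}

/* Drop every entry older than CACHE_LIFETIME; returns the new head. */
static struct dns_entry *cache_prune(struct dns_entry *head, time_t now)
{
    struct dns_entry **link = &head;
    while (*link) {
        struct dns_entry *e = *link;
        if (now - e->stored >= CACHE_LIFETIME) {
            *link = e->next;
            free(e);
        } else {
            link = &e->next;
        }
    }
    return head;
}

/* Prune first, then search; returns the cached address or NULL. */
static const char *cache_lookup(struct dns_entry **head, const char *host,
                                time_t now)
{
    struct dns_entry *e;
    *head = cache_prune(*head, now);
    for (e = *head; e; e = e->next)
        if (strcmp(e->host, host) == 0)
            return e->addr;
    return NULL;
}

/* Number of entries currently held. */
static size_t cache_size(const struct dns_entry *head)
{
    size_t n = 0;
    for (; head; head = head->next)
        n++;
    return n;
}
```

Because every lookup prunes expired entries, the list never holds more than the names resolved within the last lifetime window, however long the program runs.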

Does this scale with Wget corner cases like 'download the internet'? I know it sounds a bit nit-picky... but I'd like to mention it sooner rather than too late.

libcurl is used to run thousands of parallel transfers in existing applications today, in way more "heavy" use than wget ever sees, and it works perfectly fine and with speed. But of course, there is always room for improvement and we accept patches.

I am not sure how we would find enough people-power for this task. On the other hand, that's what I've done in the Mget project. I guess a merge of Mget and Wget would be less work. Mget already implements most of Wget's options plus a bunch more.

If mget > wget, why bother to merge anything to wget? Why not just either keep working on mget as it is or just rename mget to wget2?

- Cookie logic (incl. public suffix handling)

libcurl also provides cookie support.

Yes, to set cookie headers in requests and get cookies from responses.
This would not replace relevant code in Wget's cookie.c

libcurl offers more cookie API than that, but more importantly I did explicitly say that libcurl's functionality in FTP/HTTP/cookies etc. would NOT completely remove the need to add necessary adaptations. Perhaps the cookie code would be an example of an area that would need more adaptations. Or perhaps not.

the checksum has to be calculated and verified. With a non-threaded approach, you would serialize this task to a single CPU core. While checksumming, the parallel download is paused. Not so in a threaded model.

If you ask me, the overhead and complexity of threading are not justified by the small performance win this gives you. But that's just me and I'm not forcing this opinion on anyone.
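For reference, the threaded model being discussed looks roughly like the sketch below: the checksum runs in a worker thread while the caller keeps servicing transfers and joins the worker later. This is neither Wget nor Mget code. The names (checksum_job, checksum_start, checksum_finish) are hypothetical, and Adler-32 is used only because it is short; a downloader would more likely verify a cryptographic hash.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

struct checksum_job {
    const unsigned char *data; /* buffer to checksum; must outlive the thread */
    size_t len;
    uint32_t result;           /* valid only after checksum_finish() */
};

/* Plain Adler-32 over a memory buffer. */
static uint32_t adler32(const unsigned char *data, size_t len)
{
    uint32_t a = 1, b = 0;
    size_t i;
    for (i = 0; i < len; i++) {
        a = (a + data[i]) % 65521;
        b = (b + a) % 65521;
    }
    return (b << 16) | a;
}

/* Thread entry point: compute the checksum and store it in the job. */
static void *checksum_worker(void *arg)
{
    struct checksum_job *job = arg;
    job->result = adler32(job->data, job->len);
    return NULL;
}

/* Start checksumming in the background; returns 0 on success. */
static int checksum_start(pthread_t *tid, struct checksum_job *job)
{
    return pthread_create(tid, NULL, checksum_worker, job);
}

/* Wait for the worker; afterwards job->result holds the checksum. */
static int checksum_finish(pthread_t tid)
{
    return pthread_join(tid, NULL);
}
```

Between checksum_start() and checksum_finish() the calling thread is free to keep driving downloads. The cost is the thread lifecycle and the shared-buffer lifetime rules, which is the complexity being weighed against the gain.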

The same goes for DNS resolving as long as the resolving does not work asynchronously.

libcurl offers asynchronous integrated name resolving already.

Not sure how and if this works with libcurl, but I guess that it will make the client's code more complex.

Not at all. The client doesn't have to know nor care.

One question that came to my mind while I was looking at libcurl API. What about type safety (thinking of e.g. curl_easy_setopt()) ?

It only has limited type checks (especially if gcc isn't used), but that is typically not a big problem for users, and the API has been tested and has remained stable for quite some time with a fair number of users. I'm sure it wouldn't be a problem for you either.

--

 / daniel.haxx.se


