wget-dev

Re: Why GNUnet prefers curl over wget2


From: Tim Rühsen
Subject: Re: Why GNUnet prefers curl over wget2
Date: Sat, 10 Sep 2022 20:47:20 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.1.0

Hi Bastian,

thanks for the heads-up!

TL;DR the wget2 project is willing to help and a multi API may be easy to implement. But someone has to write down the exact needs.

The long version:
I'll try to explain what libwget does today and why a multi API seems very straightforward to implement. By the way, everybody is invited to do that :-)

Libwget has several layers of abstraction for accessing the network stack(s).

So you do indeed have the synchronous, super-simplified
  response = wget_http_get(...) like in example/http_get.c

Then there is the async HTTP API layer with more control over the details, see example/http_get2.c.
  err = wget_http_open(&conn, url); // comes back immediately
  err = wget_http_send_request(conn, req); // comes back immediately
  resp = wget_http_get_response(conn); // waits until error/timeout or response
  wget_http_close(&conn);

While you could send several requests over a single connection, HTTP/1.1 has issues with that. It works fine with HTTP/2, though. In this case wget_http_get_response(conn) can be called in a loop, returning the finished responses in the order they came in. You can also have as many open connections as you like - but what is missing, if I understood correctly, is an API that fetches the responses from more than one connection at once, like
  resp = wget_http_get_response(conn1, conn2, ..., NULL);

Now there is also a TCP+SSL API (used by the above-mentioned high-level functions). This API works asynchronously. It is like
  tcp = wget_tcp_init()
  ... set all kinds of configuration on the 'tcp' handle ...
  err = wget_tcp_connect(tcp, host, port) // returns immediately
  wget_tcp_write() // returns after timeout or immediately if no timeout
  wget_tcp_read() // returns after timeout or immediately if no timeout
  wget_tcp_close()
  wget_tcp_deinit()

Internally, this API uses select/poll, but just uses a single 'tcp' handle.

Now, what a "multi" API would basically look like is, e.g.,
a function wget_tcp_select(array of tcp handles, timeout) which can be called in a loop and which returns an array of "ready" tcp handles (ready for write or read, configurable per tcp handle).

To me, this looks straightforward to implement (depending on the details / requirements).

Internally, wget_ready_2_transfer(int fd, int timeout, int mode) in
io.c just needs a companion function that takes a list/array of fds.
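
To illustrate the idea of such a companion function, here is a minimal sketch built directly on POSIX poll(). It does not use libwget and is not how wget_ready_2_transfer() is actually implemented; the names io_wait, io_wait_multi, IO_READABLE, IO_WRITABLE, IO_MAX_FDS and demo_two_pipes are hypothetical and exist only for this example. Each entry carries its own requested mode (read, write or both), and the function reports which entries became ready.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical mode flags, for illustration only (not libwget constants). */
#define IO_READABLE 1
#define IO_WRITABLE 2
#define IO_MAX_FDS  64   /* arbitrary limit to keep the sketch allocation-free */

/* One entry per descriptor: what the caller waits for, and what became ready. */
struct io_wait {
    int fd;
    int mode;   /* in:  IO_READABLE and/or IO_WRITABLE */
    int ready;  /* out: subset of 'mode' that is ready, 0 if none */
};

/*
 * Wait on several descriptors at once.
 * timeout_ms < 0 blocks, 0 returns immediately.
 * Returns -1 on error, 0 on timeout, else the number of ready entries.
 */
int io_wait_multi(struct io_wait *items, size_t n, int timeout_ms)
{
    struct pollfd pfds[IO_MAX_FDS];

    assert(items != NULL || n == 0);
    if (n > IO_MAX_FDS)
        return -1;

    for (size_t i = 0; i < n; i++) {
        pfds[i].fd = items[i].fd;
        pfds[i].events = 0;
        pfds[i].revents = 0;
        if (items[i].mode & IO_READABLE) pfds[i].events |= POLLIN;
        if (items[i].mode & IO_WRITABLE) pfds[i].events |= POLLOUT;
        items[i].ready = 0;
    }

    int rc = poll(pfds, (nfds_t)n, timeout_ms);
    if (rc <= 0)
        return rc;

    int count = 0;
    for (size_t i = 0; i < n; i++) {
        /* Hang-up and error are reported as "readable" so that the
         * caller's next read() surfaces the condition. */
        if ((items[i].mode & IO_READABLE) &&
            (pfds[i].revents & (POLLIN | POLLHUP | POLLERR)))
            items[i].ready |= IO_READABLE;
        if ((items[i].mode & IO_WRITABLE) && (pfds[i].revents & POLLOUT))
            items[i].ready |= IO_WRITABLE;
        if (items[i].ready)
            count++;
    }
    return count;
}

/* Usage: two pipes stand in for two connections; only the second has data. */
int demo_two_pipes(void)
{
    int a[2], b[2];
    if (pipe(a) != 0)
        return -1;
    if (pipe(b) != 0) {
        close(a[0]); close(a[1]);
        return -1;
    }

    struct io_wait items[2] = {
        { a[0], IO_READABLE, 0 },
        { b[0], IO_READABLE, 0 },
    };

    int n = -1;
    if (write(b[1], "x", 1) == 1)
        n = io_wait_multi(items, 2, 1000);  /* only items[1] is ready */

    close(a[0]); close(a[1]);
    close(b[0]); close(b[1]);
    return n;
}
```

A wget_tcp_select() as described above could presumably be a thin wrapper that maps tcp handles to their descriptors and calls something like this in a loop, but that depends on the requirements still to be written down.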

If there really is interest from the GNUnet community, why not open an issue at https://gitlab.com/gnuwget/wget2/issues to discuss the details and the needs. Once we agree upon the details, the implementation can be done by anyone - whoever likes to pick it up.

Maybe you can pass this, or parts of it, on to the GNUnet ML.
I currently only have time to read my email on the weekends, so it's probably not a good idea for me to jump in myself.


Thank you so much, Bastian!

Regards, Tim


On 08.09.22 19:10, hyazinthe@emailn.de wrote:
Hello GNU fellows,

I'm an associate of another project that is part of GNU: GNUnet -
https://www.gnunet.org/en/

Yesterday on the mailing list of the GNUnet developers -
https://lists.gnu.org/archive/html/gnunet-developers/2022-09/msg00034.html - a
question was raised which is certainly of interest to you:
On 7. Sep 2022, at 15:46, madmurphy <madmurphy333@gmail.com> wrote:

I don't know all the reasons behind using curl and all GNUnet's requirements,
but have you guys thought about switching to wget2? It is a GNU package and
has a nice library (libwget). It supports GNU TLS natively, it is supposed to
download faster than curl, and if a minor feature is missing it might be an
opportunity to make libwget grow.

A comparison table (by curl):

https://curl.se/docs/comparison-table.html

--madmurphy

The answer:
On Wed, Sep 7, 2022 at 2:54 PM Schanzenbach, Martin <mschanzenbach@posteo.de> 
wrote:

We need a non-blocking API such as curl_multi.
Last time I checked, libwget2 does not have that.

BR

Back and forth with details:
(1/2)
On 7. Sep 2022, at 16:28, madmurphy <madmurphy333@gmail.com> wrote:

I never used the curl API, so I don't know what the multi interface is, but
if I remember correctly wget2 introduced non-blocking sockets. That's all I
know. I did not find a lot of info on Google, except maybe for this email on
gnutls mailing list:
https://lists.gnutls.org/pipermail/gnutls-devel/2019-June/014051.html

--madmurphy

(2/2)
On Wed, Sep 7, 2022 at 3:47 PM Schanzenbach, Martin <mschanzenbach@posteo.de> 
wrote:

Imagine that a "GET /download" downloads 1GB of data.
If your code looks like this (not the actual API but for demonstration
purposes):

data = wget_get("/download")
// Wait until download completes

Then you have a blocking API.

Instead you can have a non-blocking API that allows you to "select" or "epoll"
file descriptors of the download.
See https://curl.se/libcurl/c/libcurl-multi.html

I would rather like to see GNUnet switch to wget2,
if only because wget2's licensing leans more strongly towards copyleft and
libre computing:
"It is licensed under the GPL-3.0-or-later license, and is wrapped around Libwget
which is under the LGPL-3.0-or-later license." -
https://en.wikipedia.org/wiki/Wget#Wget2

So, take this hint as a suggestion for improvement, a feature request.
As soon as wget2 overcomes the functional barrier mentioned in the cited
conversation, be sure that I will advocate the switch to wget2 within the GNUnet project.


Greetings,
Bastian Schmidt


