[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH ssh] ssh: Do not switch session to non-blocking mode
From: |
Kevin Wolf |
Subject: |
Re: [PATCH ssh] ssh: Do not switch session to non-blocking mode |
Date: |
Thu, 14 Nov 2024 16:21:37 +0100 |
Am 13.11.2024 um 14:00 hat Richard W.M. Jones geschrieben:
> On Wed, Nov 13, 2024 at 03:02:59PM +0300, Michael Tokarev wrote:
> > Heh. I was creating a qemu bug report on gitlab when this email arrived :)
> >
> > 13.11.2024 14:49, Richard W.M. Jones wrote:
> > >From: Jakub Jelen <jjelen@redhat.com>
> > >
> > >The libssh does not handle non-blocking mode in SFTP correctly. The
> > >driver code already changes the mode to blocking for the SFTP
> > >initialization, but for some reason changes to non-blocking mode.
> >
> > "changes to non-blocking mode LATER", I guess, - or else it's a bit
> > difficult to read. But this works too.
> >
> > >This used to work accidentally until libssh in 0.11 branch merged
> > >the patch to avoid infinite looping in case of network errors:
> > >
> > >https://gitlab.com/libssh/libssh-mirror/-/merge_requests/498
> > >
> > >Since then, the ssh driver in qemu fails to read files over SFTP
> > >as the first SFTP messages exchanged after switching the session
> > >to non-blocking mode return SSH_AGAIN, but that message is lost
> > >int the SFTP internals and interpretted as SSH_ERROR, which is
> > >returned to the caller:
> > >
> > >https://gitlab.com/libssh/libssh-mirror/-/issues/280
> > >
> > >This is indeed an issue in libssh that we should address in the
> > >long term, but it will require more work on the internals. For
> > >now, the SFTP is not supported in non-blocking mode.
> >
> > The comment at init where the code sets socket to blocking mode, says:
> >
> > /*
> > * Make sure we are in blocking mode during the connection and
> > * authentication phases.
> > */
> > ssh_set_blocking(s->session, 1);
> >
> >
> > There are a few other places where the code expect "some" blocking
> > mode, changes it to blocking, and restores the mode later, - eg,
> > see ssh_grow_file(). It looks all this has to be fixed too.
I agree, if we're moving away from non-blocking sessions, then we should
remove the switching everywhere.
But obviously...
> I'll just note that I'm only forwarding on the patch from Jakub.
> I didn't write it.
>
> I did lightly test it, and it seems to work. However it seems also
> likely that it is causing qemu to block internally. Probably not
> noticable for light use, but not something that we'd want for serious
> use. However if libssh doesn't support non-blocking SFTP there's not
> much we can do about that in qemu.
...just making it blocking is not acceptable. It will potentially make
the guest hang while we're waiting for sftp responses.
I see that there is an sftp_aio_*() API, but it looks weird. Instead of
allowing you to just poll the next request that is ready, you have to
call a (blocking) wait on a specific request.
co_yield(), which is currently used when sftp_read() returns SSH_AGAIN,
makes sure that we poll the socket fd, so we can know that _something_
new has arrived. However it's unclear to me how to know _which_ request
received a reply and can be completed now. It seems you have to call
sftp_aio_wait_*() in non-blocking mode on all requests to do that, which
probably is affected by the libssh bug, too.
So I'm not sure if sftp_aio_*() can be combined with something else into
a working solution, and I also don't know if it's affected by the same
libssh bug.
Jakub, can you help with that?
> I would recommend using nbdkit-ssh-plugin instead anyway as it is much
> more featureful and doesn't have this problem as we use real threads
> instead of coroutines.
Telling people to switch away from QEMU is not an appropriate fix for
the problem.
QEMU has all of the infrastructure with thread pools etc., that's not a
unique thing of nbdkit. So if libssh can't provide working non-blocking
connections, we'll have to use blocking sftp_read() in a worker thread.
It's uglier than using a proper asynchronous interface, but we'll have
to work with whatever we get from the library.
As far as I can see, libssh sessions aren't thread safe, so we'll have
to make sure to have only one request going at the same time, but I
assume that calling ssh_read/write() from different threads sequentially
isn't a problem?
> > I wonder if qemu ssh driver needs to mess with blocking mode of this
> > socket in the first place, ever. Is there a way qemu can get non-blocking
> > socket in this context? I can only think of fd=NNN, but is it
> > possible for this socket to be non-blocking?
I'm not sure if this is actually related to blocking sockets
specifically. It seems to me that it's more about blocking behaviour in
libssh itself, while it internally uses poll() to avoid blocking.
Kevin
- [PATCH ssh] ssh: Do not switch session to non-blocking mode, Richard W.M. Jones, 2024/11/13
- Re: [PATCH ssh] ssh: Do not switch session to non-blocking mode, Michael Tokarev, 2024/11/13
- Re: [PATCH ssh] ssh: Do not switch session to non-blocking mode, Richard W.M. Jones, 2024/11/13
- Re: [PATCH ssh] ssh: Do not switch session to non-blocking mode,
Kevin Wolf <=
- Re: [PATCH ssh] ssh: Do not switch session to non-blocking mode, Jakub Jelen, 2024/11/14
- Re: [PATCH ssh] ssh: Do not switch session to non-blocking mode, Kevin Wolf, 2024/11/14
- Re: [PATCH ssh] ssh: Do not switch session to non-blocking mode, Jakub Jelen, 2024/11/18
- Re: [PATCH ssh] ssh: Do not switch session to non-blocking mode, Kevin Wolf, 2024/11/22
- Re: [PATCH ssh] ssh: Do not switch session to non-blocking mode, Jakub Jelen, 2024/11/25