qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 49/51] io/channel-watch: Fix socket watch on Windows


From: Bin Meng
Subject: Re: [PATCH 49/51] io/channel-watch: Fix socket watch on Windows
Date: Tue, 6 Sep 2022 15:00:43 +0800

Hi Clément,

On Mon, Sep 5, 2022 at 4:10 PM Clément Chigot <chigot@adacore.com> wrote:
>
> Hi all,
>
> I did reach the same issue while trying to connect a gdb to qemu on
> Windows hosts. Some inputs send by gdb aren't getting correctly pulled,
> they will be retrieved only once g_poll times out.
>
> As you explained earlier, the issue, AFAICT, is that WSAEnumNetworkEvents
> will reset the internal events before select can detect them.

No, I don't think WSAEnumNetworkEvents and select cannot be used together.

MSDN says:
https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-select

"The select function has no effect on the persistence of socket events
registered with WSAAsyncSelect or WSAEventSelect."

That sounds to me like current usage in QEMU codes does not have a problem.

> Sadly, I didn't find any way to adjust the current code using select to
> avoid that. As select and any cleaner (ever WSAEnumNetworkEvents or
> WSEResetEvent) cannot be called in an atomic way, it seems that events
> can always be received between the reset and the select. At least, all
> my attempts failed.

According to MSDN:
https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsaeventselect

"Having successfully recorded the occurrence of the network event (by
setting the corresponding bit in the internal network event record)
and signaled the associated event object, no further actions are taken
for that network event until the application makes the function call
that implicitly reenables the setting of that network event and
signaling of the associated event object."

So events will be kept unsignaled after they are signaled, until the
reenable routine is called. For example, recv() for the FD_READ event.

>
> The only solution I've found is to move completely to the Windows API
> and stop calling select. This, however, needs some tricks. In Particular, we
> need to "remove" the support of GSources having only G_IO_HUP.
> This is already kind of done as we currently don't detect them in
> qio_channel_socket_source_check anyway.
> This is mandatory because of the two GSources created on the same socket.
> IIRC, the first one (I'll call it the master GSource) is created during
> the initialization of the channel-socket by update_ioc_handlers and will
> have only G_IO_HUP to catch up.
> The second one is created during the "prepare" phase of the first one,
> in io_watch_poll_prepare. This one will have all the events we want
> to pull (G_IO_IN here).
> As they are referring to the same socket, the Windows API cannot know
> on which GSources we want to catch which events. The solution is then

I think Windows knows which events to catch. As WSAEventSelect() in
channel-watch.c::qio_channel_create_socket_watch() tells Windows which
event it should monitor.

Both the "master" and "child" GSources are watching on the same socket
with G_IO_IN condition (see qio_channel_create_socket_watch), and GLib
is smart enough to merge these two resources into one in the event
handle array which is passed to WaitForMultipleObjectsEx() in
g_poll().

I checked your patch, what you did seems to be something one would
naturally write, but what is currently in the QEMU sources seems to be
written intentionally.

+Paolo Bonzini , you are the one who implemented the socket watch on
Windows. Could you please help analyze this issue?

> to avoid WSAEnumNetworkEvents for the master GSource which only has
> G_IO_HUP (or for any GSource having only that).
> As I said above, the current code doesn't do anything with it anyway.
> So, IMO, it's safe to do so.
>
> I'll send you my patch attached. I was planning to send it in the following
> weeks anyway. I was just waiting to be sure everything looks fine on our
> CI. Feel free to test and modify it if needed.

I tested your patch. Unfortunately there is still one test case
(migration-test.exe) throwing up the "Broken pipe" message.

Can you test my patch instead to see if your gdb issue can be fixed?

>
> PS: I don't know if it will correctly extend if I simply attach it to
> my mail. If not, tell me I'll simply copy-paste it, even if it might
> mess up the space/tab stuff.
>
> > >> >>
> > >> >> ---
> > >> >> During the testing, I removed the following codes in 
> > >> >> update_ioc_handlers():
> > >> >>
> > >> >>     remove_hup_source(s);
> > >> >>     s->hup_source = qio_channel_create_watch(s->ioc, G_IO_HUP);
> > >> >>     g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup,
> > >> >>                           chr, NULL);
> > >> >>     g_source_attach(s->hup_source, chr->gcontext);
> > >> >>
> > >> >> and such change also makes the symptom go away.
> > >> >>
> > >> >> And if I moved the above codes to the beginning, before the call to
> > >> >> io_add_watch_poll(), the symptom also goes away.
> > >> >>
> > >> >> It seems two sources watching on the same socket event object is
> > >> >> the key that leads to the instability. The order of adding a source
> > >> >> watch seems to also play a role but I can't explain why.
> > >> >> Hopefully a Windows and glib expert could explain this behavior.
> > >> >>
> > >> >
> > >> > Feel free to leave that comment in the commit message.
> > >>
> > >> Sure, will add the above message into the commit in v2.
> > >>
> > >> >
> > >> > This is strange, as both sources should have different events, 
> > >> > clearing one shouldn't affect the other.
> > >>
> > >> Both sources have the same event, as one QIO channel only has one
> > >> socket, and one socket can only be bound to one event.
> > >
> > >
> > >  "one socket can only be bound to one event", is that a WSAEventSelect 
> > > limitation?
> > >
> >
> > Yes, please see the MSDN:
> >
> > It is not possible to specify different event objects for different
> > network events. The following code will not work; the second call will
> > cancel the effects of the first, and only the FD_WRITE network event
> > will be associated with hEventObject2:
> >
> > rc = WSAEventSelect(s, hEventObject1, FD_READ);
> > rc = WSAEventSelect(s, hEventObject2, FD_WRITE); //bad
>
> Yes, the Windows API is handled at socket levels. That's why having
> two GSources on the same sockets is problematic.
> Note that maybe there is a mix to be done between your patch with
> the update_ioc_handlers part to be removed and my patch which improves
> the polling of channel-watch.

Regards,
Bin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]