qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v8 01/15] ui & main loop: Redesign of system-specific main th


From: BALATON Zoltan
Subject: Re: [PATCH v8 01/15] ui & main loop: Redesign of system-specific main thread event handling
Date: Tue, 12 Nov 2024 02:50:54 +0100 (CET)

On Mon, 11 Nov 2024, Phil Dennis-Jordan wrote:
On Mon, 11 Nov 2024 at 13:41, BALATON Zoltan <balaton@eik.bme.hu> wrote:
On Mon, 11 Nov 2024, Phil Dennis-Jordan wrote:
On Mon, 11 Nov 2024 at 10:08, Daniel P. Berrangé <berrange@redhat.com>
wrote:

On Sun, Nov 10, 2024 at 08:08:16AM +0100, Phil Dennis-Jordan wrote:
On Sun 10. Nov 2024 at 08:01, Akihiko Odaki <akihiko.odaki@daynix.com>
wrote:

On 2024/11/08 23:46, Phil Dennis-Jordan wrote:
macOS's Cocoa event handling must be done on the initial (main)
thread
of the process. Furthermore, if library or application code uses
libdispatch, the main dispatch queue must be handling events on the
main
thread as well.

So far, this has affected Qemu in both the Cocoa and SDL UIs,
although
in different ways: the Cocoa UI replaces the default qemu_main
function
with one that spins Qemu's internal main event loop off onto a
background thread. SDL (which uses Cocoa internally) on the other
hand
uses a polling approach within Qemu's main event loop. Events are
polled during the SDL UI's dpy_refresh callback, which happens to run
on the main thread by default.

GTK should also do the same as SDL and requires treatment; I forgot to
note that in previous reviews.


Although it‘s possible to build Qemu with GTK support enabled on macOS,
that UI doesn’t actually work on macOS at all, and apparently hasn’t
been
supported since 2018, see:
https://stackoverflow.com/a/51474795

I don’t think there’s any point making adjustments to the GTK code by
guessing what might be needed if someone did fix that to work with
macOS
at
some point.

If we don't support GTK on macOS, then we should update meson.build
to actively prevent users enabling GTK on macOS builds, rather than
letting them suffer random runtime crashes.


Agreed - I'm now more confused than ever though because
https://gitlab.com/qemu-project/qemu/-/issues/2539 sort of implies that
Philippe has previously been using it successfully. Or perhaps this was
created in response to
https://gitlab.com/qemu-project/qemu/-/issues/2515 ?
But it seems like even the SDL implementation isn't perfect:
https://gitlab.com/qemu-project/qemu/-/issues/2537

Basically, it seems like Qemu's UI threading on macOS is currently a bit
of
a mess, except in the Cocoa UI.

Maybe it worked with older MacOS X releases but broke around the same time
when commit 5588840ff77800 was needed to fix the cocoa UI? Maybe gtk needs
a similar fix or whatever cocoa was changed to use since somewhere in gtk
or QEMU?


Possible! Calling the Cocoa UI APIs from anything other than the main
thread has never officially been supported as far as I'm aware, but perhaps
some subset silently worked on earlier releases. Modern versions definitely
enforce the rule to some extent.

Also I find it strange to require UI backends to NULL init a global. If
the cocoa UI is the only one that needs it could that also set it instead
of doing it in /system? That would also confine macOS specific code to
cocoa.m and the other UIs would not need any change that way.


Well, that's the whole point, it's not just the Cocoa UI - other macOS
system frameworks also need a runloop or libdispatch to be handling events
on thread 0. Relevant here are the ParavirtualizedGraphics.framework that's
integrated with patches 2-4 from this patch set, as well as the
Metal.framework, which PVG uses internally, and which we need to use
directly to some extent to complete the generated guest frames. These
frameworks internally use libdispatch, and they just don't work if nothing
is processing events on the main dispatch queue/runloop. So without patch
01/16, PV graphics will only work in conjunction with Cocoa or SDL, because
those do process the runloop on the main thread. But if you were to run
with -nographic, or VNC/SPICE-only, Mac PV graphics just wouldn't work. So
the idea is to uncouple the main thread's runloop from the Cocoa UI.

OK, I think I got that now.

An then, the GTK+ and SDL libraries themselves call down into Cocoa on
macOS. Windows also has specific rules about its Win32 UI API and
threading. SDL says outright that everything needs to be done from the main
thread. GTK+ says everything needs to be called from a single thread on
Win32; it seems like the rule on macOS is the same, except that thread must
be thread 0. Both those UIs in Qemu seem to violate the threading rules of
the libraries they integrate, as evidenced by the bug reports listed above.
This brokenness is entirely independent of this patch set here, it's just
that we're bumping up against the same underlying issue of needing runloop
handling on thread 0 in macOS.

With regard to NULL'ing the qemu_main function pointer from the SDL or GTK
UIs in Qemu: I had implemented a different approach to solving the problem
in v4, where each UI declared up-front what kind of threading arrangement
it needed rather than each one just overwriting a global variable. This was
somewhat more complex than the current one though.
https://patchew.org/QEMU/20241024102813.9855-1-phil@philjordan.eu/20241024102813.9855-2-phil@philjordan.eu/

That does not seem to be better than the current version.

Ultimately, it seems like someone needs to take a look at the SDL and GTK
integrations in Qemu and rework them in a way that doesn't violate the SDL
and GTK+ libraries' own threading rules. Once we've figured out what
requirements fall out of that, we can tidy up the qemu_main arrangement.

But that's an undertaking that's out of scope for this patch - I see the
current patch as a step towards a global solution to the problem.
Definitely not the last word on the matter, but at least starting to get
away from the situation where each UI does whatever it wants with zero
regard for the rest of the code base.

Agreed. As long as it's not more broken than it is already, i.e. the current behaviour is preserved, that should be good enough but I'm not a maintainer who has a word in this so that's just my opinion. Then maybe you could add a comment when the global variable needs to be set to NULL to note that this may not be the correct way and some hint of the above that this may need fixing later. I think we don't have a graphics maintainer at the moment who has time to look at this so then at least leave a comment to help whoever comes later who wants to fix this eventually.

Regards,
BALATON Zoltan

reply via email to

[Prev in Thread] Current Thread [Next in Thread]