On 25/06/2020 10:48, Daniel P. Berrangé wrote:
On Wed, Jun 24, 2020 at 09:00:09PM +0200, Laurent Vivier wrote:
qemu_set_nonblock() checks that the file descriptor can be used and, if
not, crashes QEMU. An assert() is used for that. The use of assert() is
used to detect programming error and the coredump will allow to debug
the problem.
But in the case of the tap device, this assert() can be triggered by
a misconfiguration by the user. At startup, it's not a real problem,
but it
can also happen during the hot-plug of a new device, and here it's a
problem because we can crash a perfectly healthy system.
If the user/mgmt app is not correctly passing FDs, then there's a whole
pile of bad stuff that can happen. Checking whether the FD is valid is
only going to catch a small subset. eg consider if fd=9 refers to the
FD that is associated with the root disk QEMU has open. We'll fail to
setup the TAP device and close this FD, breaking the healthy system
again.
I'm not saying we can't check if the FD is valid, but lets be clear that
this is not offering very much protection against a broken mgmt apps
passing bad FDs.
I agree with you, but my only goal here is to avoid the crash in this
particular case.
The punishment should fit the crime.
The user can think the netdev_del doesn't close the fd, and he can try
to reuse it. Sending back an error is better than crashing his system.
After that, if the system crashes, it will be for the good reasons, not
because of an assert.