Re: [PATCH] net: tap: check if the file descriptor is valid before using
From: Laurent Vivier
Subject: Re: [PATCH] net: tap: check if the file descriptor is valid before using it
Date: Tue, 30 Jun 2020 14:42:38 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.9.0
On 30/06/2020 14:35, Daniel P. Berrangé wrote:
> On Tue, Jun 30, 2020 at 02:00:06PM +0200, Laurent Vivier wrote:
>> On 30/06/2020 13:03, Daniel P. Berrangé wrote:
>>> On Tue, Jun 30, 2020 at 12:35:46PM +0200, Laurent Vivier wrote:
>>>> On 30/06/2020 12:03, Jason Wang wrote:
>>>>>
>>>>> On 2020/6/30 5:45 PM, Laurent Vivier wrote:
>>>>>> On 30/06/2020 11:31, Daniel P. Berrangé wrote:
>>>>>>> On Tue, Jun 30, 2020 at 10:23:18AM +0100, Daniel P. Berrangé wrote:
>>>>>>>> On Tue, Jun 30, 2020 at 05:21:49PM +0800, Jason Wang wrote:
>>>>>>>>> On 2020/6/30 3:30 AM, Laurent Vivier wrote:
>>>>>>>>>> On 28/06/2020 08:31, Jason Wang wrote:
>>>>>>>>>>> On 2020/6/25 7:56 PM, Laurent Vivier wrote:
>>>>>>>>>>>> On 25/06/2020 10:48, Daniel P. Berrangé wrote:
>>>>>>>>>>>>> On Wed, Jun 24, 2020 at 09:00:09PM +0200, Laurent Vivier wrote:
>>>>>>>>>>>>>> qemu_set_nonblock() checks that the file descriptor can be
>>>>>>>>>>>>>> used and, if
>>>>>>>>>>>>>> not, crashes QEMU. An assert() is used for that. assert() is
>>>>>>>>>>>>>> meant to detect programming errors, and the coredump allows
>>>>>>>>>>>>>> debugging the problem.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But in the case of the tap device, this assert() can be
>>>>>>>>>>>>>> triggered by
>>>>>>>>>>>>>> a misconfiguration by the user. At startup, it's not a real
>>>>>>>>>>>>>> problem,
>>>>>>>>>>>>>> but it
>>>>>>>>>>>>>> can also happen during the hot-plug of a new device, and here
>>>>>>>>>>>>>> it's a
>>>>>>>>>>>>>> problem because we can crash a perfectly healthy system.
>>>>>>>>>>>>> If the user/mgmt app is not correctly passing FDs, then there's
>>>>>>>>>>>>> a whole
>>>>>>>>>>>>> pile of bad stuff that can happen. Checking whether the FD is
>>>>>>>>>>>>> valid is
>>>>>>>>>>>>> only going to catch a small subset. eg consider if fd=9 refers
>>>>>>>>>>>>> to the
>>>>>>>>>>>>> FD that is associated with the root disk QEMU has open. We'll
>>>>>>>>>>>>> fail to
>>>>>>>>>>>>> setup the TAP device and close this FD, breaking the healthy
>>>>>>>>>>>>> system
>>>>>>>>>>>>> again.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not saying we can't check if the FD is valid, but let's be
>>>>>>>>>>>>> clear that
>>>>>>>>>>>>> this is not offering very much protection against broken mgmt
>>>>>>>>>>>>> apps
>>>>>>>>>>>>> passing bad FDs.
>>>>>>>>>>>>>
>>>>>>>>>>>> I agree with you, but my only goal here is to avoid the crash in
>>>>>>>>>>>> this
>>>>>>>>>>>> particular case.
>>>>>>>>>>>>
>>>>>>>>>>>> The punishment should fit the crime.
>>>>>>>>>>>>
>>>>>>>>>>>> The user can think the netdev_del doesn't close the fd, and he
>>>>>>>>>>>> can try
>>>>>>>>>>>> to reuse it. Sending back an error is better than crashing his
>>>>>>>>>>>> system.
>>>>>>>>>>>> After that, if the system crashes, it will be for the good
>>>>>>>>>>>> reasons, not
>>>>>>>>>>>> because of an assert.
>>>>>>>>>>> Yes. And on top of this we may try to validate the TAP via st_dev
>>>>>>>>>>> through fstat[1].
>>>>>>>>>> I agree, but the problem I have is to know which major(st_dev) we can
>>>>>>>>>> allow to use.
>>>>>>>>>>
>>>>>>>>>> Do we allow only macvtap major number?
>>>>>>>>>
>>>>>>>>> Macvtap and tuntap.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> How to know the macvtap major number at user level?
>>>>>>>>>> [it is allocated dynamically: do we need to parse /proc/devices?]
>>>>>>>>>
>>>>>>>>> I think we can get them through fstat for /dev/net/tun and
>>>>>>>>> /dev/macvtapX.
>>>>>>>> Don't assume QEMU has any permission to access these device nodes,
>>>>>>>> only the pre-opened FDs it is given by libvirt.
>>>>>>> Actually permissions are the least of the problem - the device nodes
>>>>>>> won't even exist, because QEMU's almost certainly running in a private
>>>>>>> mount namespace with a minimal /dev populated
>>>>>>>
>>>>>> I'm working on a solution using /proc/devices.
>>>>>
>>>>>
>>>>> Similar issue with /dev. There's no guarantee that qemu can access
>>>>> /proc/devices or it may not exist (CONFIG_PROCFS).
>>>>
>>>> There are a lot of things that will not work without /proc (several tools
>>>> rely on /proc, like ps, top, lsof, mount, ...). Some information is
>>>> only available from /proc, and if /proc is there, I think /proc/devices
>>>> is always readable by everyone. Moreover /proc is already used by qemu
>>>> in several places.
>>>>
>>>> It can also be a best-effort check.
>>>>
>>>> The problem with fstat() on /dev files is to guess the /dev/macvtapX as
>>>> X varies (the same with /dev/tapY).
>>>>
>>>>>
>>>>>> macvtap has its own major number, but tuntap uses the "misc" (10) major
>>>>>> number.
>>>>
>>>> Another question: it is possible to use the "fd=" parameter with macvtap
>>>> as macvtap creates a /dev/tapY device, but how to do that with tuntap
>>>> that does not create a /dev/tapY device?
>>>
>>>
>>> I think we should step back and ask why we need to check this at all.
>>>
>>> IMHO, if the passed-in FD works with the syscalls that tap-linux.c
>>> is executing, then that shows the FD is suitable for QEMU. The problem
>>> is that many of the tap APIs don't use "Error **errp" parameters to
>>> report errors, so we can't catch the failures. IOW, instead of checking
>>> the FD major/minor number, we should make the existing code be better
>>> at reporting errors, so they can be fed back to the QMP console
>>> gracefully.
>>
>> The problem here is the very first operation of net_init_tap() is a
>> qemu_set_nonblock() that has an assert() and crashes QEMU.
>>
>> It's why I was only checking for the validity of the file descriptor,
>> not if it is a tap device or not.
>
> Yep, checking that it is really a FD is sufficient to avoid the
> assert in nonblock.
>
> As for whether it is really a tap device, I think we just need to
> improve error reporting of the functions that come later, instead
> of doing a literal "is it a tap" check.
I agree. I will update my patches to have a series with my patch
checking for the validity of fd and another patch to return the errors
to QMP from the tap functions.
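For illustration only, a minimal sketch of what "checking for the validity of fd" could look like. This is not the actual patch: the helper name fd_is_valid() is hypothetical, and the sketch only assumes that fcntl(fd, F_GETFD) fails with EBADF when fd is not an open descriptor. It says nothing about whether the descriptor is a tap device.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper (name chosen for this sketch): report whether fd
 * is an open file descriptor in this process.
 */
static bool fd_is_valid(int fd)
{
    /* F_GETFD has no side effects; it fails with EBADF on a bad fd. */
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}
```

A caller could run such a check before qemu_set_nonblock() and report an error to the monitor instead of hitting the assert().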
> That's what I'd tried in my old patch from a few years back
>
> https://patchwork.kernel.org/patch/10029443/
>
> I can't remember why we didn't merge this back then
Jason already gave the link in the thread.
I'm going to try to use your patch in my series.
Thanks,
Laurent