qemu-devel

From: Christian Schoenebeck
Subject: Re: [PATCH 1/1] 9pfs: avoid iterator invalidation in v9fs_mark_fids_unreclaim
Date: Tue, 27 Sep 2022 19:14:33 +0200

On Tuesday, 27 September 2022 15:05:13 CEST Linus Heckemann wrote:
> Christian Schoenebeck <qemu_oss@crudebyte.com> writes:
> > Ah, you sent this fix as a separate patch on top. I actually just meant
> > that you would take my already queued patch as the latest version (just
> > because I had made some minor changes on my end) and adjust that patch
> > further as v4.
> > 
> > Anyway, there are still some things to do here, so maybe you can send your
> > patch squashed in the next round ...
> 
> I see, will do!
> 
> >> @Christian: I still haven't been able to reproduce the issue that this
> >> commit is supposed to fix (I tried building KDE too, no problems), so
> >> it's a bit of a shot in the dark. It certainly still runs and I think it
> >> should fix the issue, but it would be great if you could test it.
> > 
> > No worries about reproduction, I will definitely test this thoroughly. ;-)
> > 
> >>  hw/9pfs/9p.c | 46 ++++++++++++++++++++++++++++++----------------
> >>  1 file changed, 30 insertions(+), 16 deletions(-)
> >> 
> >> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> >> index f4c1e37202..825c39e122 100644
> >> --- a/hw/9pfs/9p.c
> >> +++ b/hw/9pfs/9p.c
> >> @@ -522,33 +522,47 @@ static int coroutine_fn v9fs_mark_fids_unreclaim(V9fsPDU *pdu, V9fsPath *path)
> >>      V9fsFidState *fidp;
> >> 
> >>      gpointer fid;
> >>      GHashTableIter iter;
> >> 
> >> +    /*
> >> +     * The most common case is probably that we have exactly one
> >> +     * fid for the given path, so preallocate exactly one.
> >> +     */
> >> +    GArray *to_reopen = g_array_sized_new(FALSE, FALSE, sizeof(V9fsFidState*), 1);
> >> +    gint i;
> > 
> > Please use `g_autoptr(GArray)` instead of `GArray *`, that avoids the need
> > for explicit calls to g_array_free() below.
> 
> Good call. I'm not familiar with glib, so I didn't know about this :)
> 
> >> -            fidp->flags |= FID_NON_RECLAIMABLE;
> > 
> > Why did you remove that? It should still be marked as FID_NON_RECLAIMABLE,
> > no?
> Indeed, that was an accident.
> 
> >> +            /*
> >> +             * Ensure the fid survives a potential clunk request during
> >> +             * v9fs_reopen_fid or put_fid.
> >> +             */
> >> +            fidp->ref++;
> > 
> > Hmm, bumping the refcount here makes sense, as the 2nd loop may be
> > interrupted and the fid otherwise disappear in between, but ...
> > 
> >> +            g_array_append_val(to_reopen, fidp);
> >> 
> >>          }
> >> 
> >> +    }
> >> 
> >> -        /* We're done with this fid */
> >> +    for (i=0; i < to_reopen->len; i++) {
> >> +        fidp = g_array_index(to_reopen, V9fsFidState*, i);
> >> +        /* reopen the file/dir if already closed */
> >> +        err = v9fs_reopen_fid(pdu, fidp);
> >> +        if (err < 0) {
> >> +            put_fid(pdu, fidp);
> >> +            g_array_free(to_reopen, TRUE);
> >> +            return err;
> > 
> > > ... this return would then leak all remaining fids that you have already
> > > bumped the refcount for above.
> 
> You're right. I think the best way around it, though it feels ugly, is
> to add a third loop in an "out:".

Either that, or continuing the loop to the end. Not that this would become 
much prettier. I must admit I also don't really have a good idea for a clean 
solution in this case.
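
For illustration, a minimal sketch of the "continue the loop to the end"
variant in plain C. It is not the actual patch: SketchFid, sketch_reopen()
and sketch_put() are hypothetical stand-ins for V9fsFidState,
v9fs_reopen_fid() and put_fid(), and the sketch is synchronous, so it does
not model the coroutine yielding during reopen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for V9fsFidState; only what the sketch needs. */
typedef struct {
    int ref;         /* reference count */
    int reopen_err;  /* error the fake reopen should return (0 = ok) */
} SketchFid;

/* Hypothetical stand-in for v9fs_reopen_fid(). */
static int sketch_reopen(SketchFid *f)
{
    return f->reopen_err;
}

/* Hypothetical stand-in for put_fid(): drop one reference. */
static void sketch_put(SketchFid *f)
{
    assert(f->ref > 0);
    f->ref--;
}

/*
 * Phase 1 takes a reference on every collected fid; phase 2 reopens them.
 * After the first failure no further reopen is attempted, but the loop
 * still runs to the end so that every reference taken in phase 1 is
 * dropped again. Returns 0 or the first error seen.
 */
static int sketch_reopen_all(SketchFid **fids, size_t n)
{
    int err = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        fids[i]->ref++;
    }
    for (i = 0; i < n; i++) {
        if (err == 0) {
            err = sketch_reopen(fids[i]);
        }
        sketch_put(fids[i]);
    }
    return err;
}
```

Compared with a separate cleanup loop under an "out:" label, this keeps a
single exit path at the cost of an extra condition inside the loop.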

> > Also: I noticed that your changes in virtfs_reset() would need the same
> > 2-loop hack to avoid hash iterator invalidation, as it would also call
> > put_fid() inside the loop and be prone to hash iterator invalidation
> > otherwise.
> Good point. Will do.
> 
> One more thing has occurred to me. I think the reclaiming/reopening
> logic will misbehave in the following sequence of events:
> 
> 1. QEMU reclaims an open fid, losing the file handle
> 2. The file referred to by the fid is replaced with a different file
>    (e.g. via rename or symlink) outside QEMU
> 3. The file is accessed again by the guest, causing QEMU to reopen a
>    _different file_ from before without the guest having performed any
>    operations that should cause this to happen.
> 
> This is neither introduced nor resolved by my changes. Am I overlooking
> something that avoids this (be it documentation that directories exposed
> via 9p should not be touched by the host), or is this a real issue? I'm
> thinking one could at least detect it by saving inode numbers in
> V9fsFidState and comparing them when reopening, but recovering from such
> a situation seems difficult.

Well, in that specific scenario, where the rename/move happens outside of QEMU,
yes, this might unfortunately happen. The point of this "reclaim fid" stuff is
to deal with the fact that systems impose an upper limit on the number of open
file descriptors a process may hold at a time. And on some systems like macOS
I think that limit is quite low by default (like 100?).
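
The detection idea quoted above (remember the inode, compare on reopen)
could look roughly like the following POSIX sketch. SketchFileId and both
helpers are hypothetical; V9fsFidState has no such field today. Note that
inode numbers can be reused once the original file is gone, so this is a
heuristic that detects many replacements, not a guarantee, and it does not
address recovery at all.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of which file a fid referred to before reclaim. */
typedef struct {
    dev_t dev;
    ino_t ino;
} SketchFileId;

/* Remember the identity of the file behind an open descriptor. */
static int sketch_save_id(int fd, SketchFileId *id)
{
    struct stat st;

    if (fstat(fd, &st) < 0) {
        return -1;
    }
    id->dev = st.st_dev;
    id->ino = st.st_ino;
    return 0;
}

/*
 * Reopen by path and compare against the saved identity.
 * Returns 1 if it is still the same file, 0 if the path now refers to a
 * different file, -1 on error. A real implementation would keep the
 * descriptor on a match; the sketch closes it for simplicity.
 */
static int sketch_reopen_matches(const char *path, const SketchFileId *id)
{
    struct stat st;
    int same;
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        return -1;
    }
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    same = (st.st_dev == id->dev && st.st_ino == id->ino);
    close(fd);
    return same;
}
```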

There is also another issue pending that affects pure inner-guest behaviour; 
the infamous use-after-unlink() use pattern:
https://wiki.qemu.org/Documentation/9p#Implementation_Plans
https://gitlab.com/qemu-project/qemu/-/issues/103

It would make sense to look at how other file servers deal with the limit on
the number of open file descriptors before starting to just fight the
symptoms. This whole reclaim fid stuff is in general a PITA.
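
As a starting point for such a comparison, the limit a process actually
runs under can be queried with getrlimit(). A minimal sketch; the helper
name is made up for illustration and is not part of QEMU:

```c
#include <limits.h>
#include <sys/resource.h>

/*
 * Returns the soft limit on open file descriptors for this process,
 * LONG_MAX if the limit is unlimited or does not fit into a long,
 * or -1 if getrlimit() fails.
 */
static long sketch_fd_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        return -1;
    }
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)LONG_MAX) {
        return LONG_MAX;
    }
    return (long)rl.rlim_cur;
}
```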

Best regards,
Christian Schoenebeck




