qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 06/12] block: Deep-clear inherits_from


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH v2 06/12] block: Deep-clear inherits_from
Date: Wed, 17 Jul 2019 14:01:12 +0200
User-agent: Mutt/1.11.3 (2019-02-01)

Am 17.07.2019 um 11:07 hat Max Reitz geschrieben:
> On 17.07.19 10:17, Kevin Wolf wrote:
> > Am 17.07.2019 um 09:47 hat Max Reitz geschrieben:
> >> On 16.07.19 19:01, Kevin Wolf wrote:
> >>> Am 03.07.2019 um 19:28 hat Max Reitz geschrieben:
> >>>> BDS.inherits_from does not always point to an immediate parent node.
> >>>> When launching a block job with a filter node, for example, the node
> >>>> directly below the filter will not point to the filter, but keep its old
> >>>> pointee (above the filter).
> >>>>
> >>>> If that pointee goes away while the job is still running, the node's
> >>>> inherits_from will not be updated and thus point to garbage.  To fix
> >>>> this, bdrv_unref_child() has to check not only the parent node's
> >>>> immediate children for nodes whose inherits_from needs to be cleared,
> >>>> but its whole subtree.
> >>>>
> >>>> Signed-off-by: Max Reitz <address@hidden>
> >>>
> >>> Isn't the real bug that we keep pointing to a node that isn't a parent
> >>> of the node any more? I think this will effectively disable option
> >>> inheritance in bdrv_reopen() as long as the filter node is present,
> >>> which is certainly not what we intended.
> >>
> >> Well, it breaks it while a block job is running.  I don’t know whether I
> >> would advise doing a reopen across a block job filter.  It’s a case of
> >> “Why wouldn’t it work?”, but I’m sure there’s something that doesn’t.
> >> (Like this here, for example, but it at least has the decency of just
> >> letting the reopen fail.)
> > 
> > Why would it let the reopen fail? Failing would be justifiable, but I
> > expect it just succeeds without updating the options of the child node.
> 
> I actually don’t know what you’re referring to, because I’m not familiar
> with either the inherits_from paths nor with bdrv_reopen.
> 
> I took it you meant the loop over the children in
> bdrv_reopen_queue_child(), and the fact that it skips the children that
> do not inherit from this node.
> 
> So I took it that options that do not get passed to children remain at
> the parent level and would throw an error at some point, because options
> that cannot be handled should throw an error at some point.  (Which does
> happen, as far as I can tell.)

I'm talking about cases where you don't explicitly set an option for a
child, but it inherits the (possibly changed) option from its parent.

For example, consider a qcow2 image that was originally opened as a
backing file, so both its qcow2 node and its file-posix node are
read-only; the file-posix node inherits its read-only=on setting from
the qcow2 node. Now you reopen the qcow2 node read-write (starting a
commit job) and the expected result is that the file-posix node
automatically inherits the updated value and becomes read-write, too.

This happens indeed in bdrv_reopen_queue_child(), but the interesting
case is where an option isn't explicitly set in the recursive call for
the child node. This happens in the following line in the nested call:

    role->inherit_options(&flags, options, parent_flags, parent_options);

If child->bs->inherits_from didn't point to the parent that is being
reopened, we wouldn't even recurse, so .inherit_options would never be
called.

> >>> The intuitive thing would be that after inserting a filter, the image
> >>> now inherits from the filter node, and when the filter is removed, it
> >>> inherits from the filter's bs->inherit_from if that becomes a parent of
> >>> the node. (Though I'm not necessarily saying that my intuition is to be
> >>> trusted here.)
> >>
> >> I tried that first, but I couldn’t get it to work.  I don’t remember
> >> why, though.  I suppose my problem was that removing the filter node
> >> make inherits_from NULL.  I guess I stopped at that point and went this
> >> route instead.
> >>
> >> I suppose we could add a flag to the BDS that says whether an heir
> >> node’s inherits_from should be cleared or set to the bequeather’s
> >> inherits_from, like so:
> >>
> >>     if (parent->inherit_inherits_from) {
> >>         child->bs->inherits_from = parent->inherits_from;
> >>     } else {
> >>         child->bs->inherits_from = NULL;
> >>     }
> >>
> >> And then, if you insert a node between a child and its inherits_from
> >> parent, that node copies inherits_from from the child and gets its
> >> inherit_inherits_from set to true.
> >>
> >> The problem I see is that it doesn’t always appear clear to me that this
> >> intermediate node should actually copy its inherits_from from the child.
> >>
> >> So the same question applies here, too, I guess; should the filter node
> >> even inherit its options from the parent?
> > 
> > An explicitly created filter node that is inserted with blockdev-reopen
> > certainly doesn't inherit from its parent. An automatically created
> > filter node where you didn't have control of its options - hm... good
> > question.
> > 
> > If we want to keep it simple, maybe it would be defensible if we just
> > break the inheritance relationship as soon as the parent is detached
> > (i.e. when inserting the filter). At least that would be more consistent
> > than silently disabling option inheritance while a filter is present and
> > then suddenly reenabling it when the filter goes away.
> 
> That sounds wrong to me.
> 
> As I said, doing a reopen across a block job filter sounds like there
> can be many things to go wrong.  I don’t know why anyone would do so
> (and live to tell the tale).

You do reopen on a single node that could be way up the graph, not
really related to the job. The question is what this reopen can do to
the subtree below it, including the job filter.

> So from that perspective, if you do a reopen before or after, it would
> work.  You don’t do anything while the filter is there because it’s a
> bad idea anyway.  If it just failed while the job is running and then
> started working again afterwards, that would actually sound good to me.
> 
> 
> What does sound bad to me is breaking it just because you ran a block
> job in the meantime.
> 
> Well, it doesn’t really sound bad, but it doesn’t sound ideal either.

Fair enough (though I'm not completely convinced that a filter job
somewhere down the tree can reasonably block all of its parent nodes
from being reopened), but at the very least we must make sure that
reopen actually does report an error instead of returning success and
silently ignoring the subtrees that contain filter nodes.

> > The other option would be making it actually work, i.e. the child node
> > keeps inheriting from its original parent node. This would not only
> > require modifications to bdrv_reopen(), but also to the data structures
> > so that the parent has a list of nodes that inherit from it: Nobody can
> > even guarantee that the child node always stays in the subtree of the
> > original parent.
> 
> I don’t quite see how that cannot be guaranteed.  If the child goes out
> of the subtree, we should clear inherits_from.
> 
> The only way the child can go out of the subtree is by virtue of it or
> one of its recursive parents until the inherits_from pointee being detached.
> 
> I don’t see how that can happen without going through
> bdrv_unref_child(), which will clear all of those inherits_from pointers.
> 
> bdrv_replace_node() comes to mind, but currently it’s actually only
> called by bdrv_append(), by the block jobs to undo bdrv_append(), and by
> mirror to take the to_replace node.  The last case may be an issue.

Hm, okay. Maybe it's actually not possible. I'd have to have a closer
look at the code.

> > This is in fact a reason why this patch isn't only ugly, but actually
> > wrong - you may still miss some inherits_from references.
> 
> Well, the unfortunate thing is that this patch is in master now because
> I preferred fixing access of a dangling pointer over being sure to keep
> inherits_from working in all cases.
> 
> I know, that’s all the more reason for me to fix it now.  But as I said,
> I don’t have much experience with bdrv_reopen.

Yeah, let's see if we can improve the situation a bit at least for 4.1.

Kevin

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]