10.08.2020 11:12, Max Reitz wrote:
On 07.08.20 12:29, Vladimir Sementsov-Ogievskiy wrote:
16.07.2020 17:59, Max Reitz wrote:
On 10.07.20 19:41, Andrey Shinkevich wrote:
On 10.07.2020 18:24, Max Reitz wrote:
On 09.07.20 16:52, Andrey Shinkevich wrote:
On 25.06.2020 18:21, Max Reitz wrote:
Because of the (not so recent anymore) changes that make the
stream job
independent of the base node and instead track the node above
it, we
have to split that "bottom" node into two cases: The bottom COW
node,
and the node directly above the base node (which may be an R/W
filter
or the bottom COW node).
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
qapi/block-core.json | 4 +++
block/stream.c | 63
++++++++++++++++++++++++++++++++------------
blockdev.c | 4 ++-
3 files changed, 53 insertions(+), 18 deletions(-)
diff --git a/qapi/block-core.json b/qapi/block-core.json
index b20332e592..df87855429 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2486,6 +2486,10 @@
# On successful completion the image file is updated to
drop the
backing file
# and the BLOCK_JOB_COMPLETED event is emitted.
#
+# In case @device is a filter node, block-stream modifies the
first
non-filter
+# overlay node below it to point to base's backing node (or
NULL if
@base was
+# not specified) instead of modifying @device itself.
+#
# @job-id: identifier for the newly-created block job. If
# omitted, the device name will be used. (Since 2.7)
#
diff --git a/block/stream.c b/block/stream.c
index aa2e7af98e..b9c1141656 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -31,7 +31,8 @@ enum {
typedef struct StreamBlockJob {
BlockJob common;
- BlockDriverState *bottom;
+ BlockDriverState *base_overlay; /* COW overlay (stream from
this) */
+ BlockDriverState *above_base; /* Node directly above the
base */
Keeping the base_overlay is enough to complete the stream job.
Depends on the definition. If we decide it isn’t enough, then it
isn’t
enough.
The above_base may disappear during the job and we can't rely on
it.
In this version of this series, it may not, because the chain is
frozen.
So the above_base cannot disappear.
Once we insert a filter above the top bs of the stream job, the
parallel
jobs in
the iotests #030 will fail with 'frozen link error'. It is because of
the
independent parallel stream or commit jobs that insert/remove their
filters
asynchroniously.
I’m not sure whether that’s a problem with this series specifically.
We can discuss whether we should allow it to disappear, but I think
not.
The problem is, we need something to set as the backing file after
streaming. How do we figure out what that should be? My proposal
is we
keep above_base and use its immediate child.
We can do the same with the base_overlay.
If the backing node turns out to be a filter, the proper backing
child will
be set after the filter is removed. So, we shouldn't care.
And what if the user manually added some filter above the base (i.e.
below base_overlay) that they want to keep after the job?
It's automatically kept, if we use base_overlay->backing->bs as final
backing node.
You mean, that they want it to be dropped?
Er, yes. Point is, the graph structure below with @base at the root may
be different than the one right below @base_overlay.
so, assuming the following:
top -(backing)-> manually-inserted-filter -(file)-> base
and user do stream with base=base, and expects filter to be removed by
stream job?
Hmm, yes, such use-case is broken with our proposed way...
====
Let me now clarify the problem we'll have with your way.
When stream don't have any filter, we can easily imagine two parallel
stream jobs:
top -(backing)-> mid1 -(backing)-> mid2 -(backing)-> base
stream1: top=top, base=mid2
stream2: top=mid2, base=NULL
final picture is obvious:
top (merged with mid1) -(backing)-> mid2 (merged with base)
Yes, and I don’t think this currently working case is broken by this
series.
But we want stream job has own filter, like mirror.
Which it does not have yet, right? Which is why I was saying that I
don’t think this is a problem with this series. We could try to address
it later.
Or do you think we can’t address it later because right now all filter
cases are broken anyway so now would be the time to make a breaking
change (which the suggestion to not use @base as the final backing
node is)?
I think, we can address it later, but it would be good to fit into one
release cycle with these series, to not make incompatible behavior
changes later.
So the picture becomes more complex.
Assume stream2 starts first.
top -(backing)-> mid1 -(backing)-> stream2-filter -(backing)-> mid2
-(backing)-> base
stream2-filter would be on top of mid2, right?
Right. In my picture, "-(backing)->" means backing link. Hmm, most
probably stream-filter is COR, which actually have file child. It
doesn't matter here.
Now, when we run stream1, with your solution, stream1 will freeze
stream2-filter
(wrong thing, stream2 will fail to remove it if it finished first), and
stream1 will
remove stream2-filter on finish (which is wrong as well, stream2 is not
prepared to
removing of its filter)..
Note that the user first needs to pass “mid2” as the base to the stream
job stream1. Why don’t they just pass “stream2-filter”? In my model,
the user should specify exactly which node they want not to be touched
by this stream job, and so that would be stream2-filter, not mid1.
Hmm. I'm sure we already tried/discussed this.. But I don't remember
the results. Seems, such logic should work as well. Andrey don't you
remember? I think, the only difficulty is to update iotests 30 (in our
stream-filter series) to use node-names, not filename.. Can we go this
way?
I feel like the answer to this question has to do with implicit nodes.
AFAIU you wanted to remove them, so I don’t think we’d want to
special-case them here.
Agree. I think stream-filter should be explicit, like recently
introduced backup-top.
If you think that we can’t expect users to pass “stream2-filter” because
currently it should work with “mid2”, then that’s a case of implicit
nodes and it means we should ascend from @base up to the first
non-implicit node to get the @above_base we want.
But, with our proposed way (freeze only chain up to base_overlay
inclusively, and use backing(base_overlay) as final backing), all will
work as expected, and two parallel jobs will work..
I don’t think it will work as expected because users can no longer
specify which node should be the base node after streaming. And the
QAPI schema says that base-node is to become the backing file of the top
node after streaming.
But this will never work with either way: base node may disappear
during stream. Even with you way, they only stable thing is
"above-base", which backing child may be completely another node at
stream finish.
I suppose you’re arguing that streaming through filters basically just
doesn’t work at all right now, so we’re free to do whatever?
I don't, but I like the idea :)
Well, that still leaves the problem that users should be able to specify
which node is to become the base after streaming, and that that node
maybe shouldn’t be restricted to immediate children of COW images.
And again, this is impossible even with your way. I have an idea:
What about making the whole thing explicit?
We add an optional parameter to stream-job: bottom-node, which is
mutally exclusive with specifying base.
Then, if user specified base node, we freeze base as well, so it can't
disappear. User will not be able to start parallel stream with this
base node as top (because new stream can not insert a filter into
frozen chain), but for sure it's rare case, used only in iotest 30
:)). Benefit: user have guarantee of what would be final backing node.
Otherwise, if user specified bottom-node, we use the way of this
patch. So user can run parallel streams (iotest 30 will have to use
bottom-node argument). No guarantee of final base-node, it would be
backing of bottom-node at job finish.
But, this is incompatible change, and we probably should wait for 3
releases for deprecation of old behavior..
Anyway, I feel now, that you convinced me. I'm not sure that we will
not have to change it make filter work, but not reason to change
something now. Andrey, could you try to rebase your series on top of
this and fix iotest 30 by just specifying exact node-names in it?..
Hmmm. My thought goes further. Seems, that in this way, introducing
explicit filter would be incompatible change anyway: it will break
scenario with parallel stream jobs, when user specifies filenames, not
node names (user will have to specify filter-node name as base for
another stream job, as you said). So, it's incompatible anyway.
What do you think of it? Could we break this scenario in one release
without deprecation and don't care? Than I think my idea about base vs
bottom-node arguments for stream job may be applied. Or what to do?
If we can't break this scenario without a deprecation, we'll have to
implement "implicit" filter, like for mirror, when filter-node-name is
not specified. And for this implicit filter we'll need additional
logic (closer to what I've proposed in a previous mail). Or, try to
keep stream without a filter (not insert it at all and behave the old
way), when filter-node-name is not specified. Than new features based
on filter will be available only when filter-node-name is specified,
but this is OK. The latter seems better for me.
====
So, these are two mutually exclusive cases.. I vote for freezing up to
base_overlay, and use backing(base_overlay) as final backing, because:
1. I can't imaging other way to fix the case with parallel streams with
filters (it's not a problem of current master, but we have pending
series which will introduce stream job filter, and the problem will
appear and even break iotest 30)
Besides the question of whether the top job could just use the bottom
job’s filter node as the base, there’s also the alternative of admitting
defeat and declaring that you just cannot use a single node in two
streams, because we didn’t find a way to make it work after all.
You could still create a temporary overlay in between that’s never used
and then drop it with a trivial stream afterwards.
(But that just in case specifying the bottom job’s stream node somehow
wouldn’t work.)
We can break existing user scenarios.. Seems unlikely. I don't know.
2. I don't think that removing filters above base node by stream job is
so important case to break parallel stream jobs in future:
- Stream job is not intended to remove filters, but to stream data.
Filters between base_overlay and base don't contain any data and
unrelated to stream process
Well, it is intended to remove nodes. You can only remove data-bearing
nodes by moving data around. I suspect if there was a way to get the
to-be-removed nodes removed without having to move their data around,
that would be popular.
- I think, that filters are "more related" to their children than to
their parents. So, removing filters related to base node, when we just
remove all data-containing nodes between top and base (and are not
going
to remove base node) is at least questionable.
Yes.
Although it could be argued that it is a handy way to remove filters, in
a backing chain at least. (Thanks to bdrv_find_overlay(), @base and
@top still need to refer to different levels of the backing chain, but
if we lifted that restriction, I suppose it could work for any filter
chain.)
*shrug*
On the contrary, removing
all intermediate data containing nodes _together_ with their filters is
absolutely correct thing to do.
I don’t think so, actually. Like, you have a throttle node somewhere in
the chain, shouldn’t you maybe want to move it down below the chain? Or
a COR node, shouldn’t that go above the chain after streaming?
I’m not making an argument here, I just don’t quite understand why you’d
bring up what happens with intermediate filters here. The only reason
to drop them is because that’s what I expect users to expect of the
stream job.
Next, with your way, what about filters, inserted above base during
stream job? They will be between above_base and base, and will not be
removed. So with your way, filters above base, existing before job
start
will be frozen during the job and removed after it, but filters
appended
above base during the job will be untouched. With our way, just all
base
node related filters are untouched by the job. It seems simpler
definition for me and simpler to document.
Hm. The documentation seems the same to me. Either it’s “The backing
node (at the end of the job) of @base’s parent node (when starting the
job)” or “The backing node (at the end of the job) of the next
non-filter node above @base (when starting the job)”.
The problem you describe (that @above_base at the end of the job isn’t
necessarily above @base anymore) also exists with your suggestion,
namely that you can add overlays above @base after the job has started,
so @base_overlay at the end of the job isn’t necessarily the first
non-filter node above @base anymore.
OK, so after all this text, maybe some more original problem searching.
I think it the root of the problem is that the stream job takes a
@base
parameter, but as of c624b015bf14fe01, it doesn’t really matter anymore.
Maybe c624b015bf14fe01 should have introduced a new parameter for
users
to specify the bottom node instead of @base.
Yes, absolutely agree. If we do it now, would it be incompatible
change or not?
Well, that would have made everything a parameter mess, but it would
have saved us the trouble now.
In any case, the problem we have now is that we want a way to
automagically find out which node the bottom node should be, because the
user can’t specify it. So the documentation is always going to be
written as “The backing node (at the end of the job) of $bottom”, where
“$bottom” is what we’re interested in figuring out.
I thought it would be best if we stick as close as possible to the
spirit of the current documentation, which basically requires @base-node
to be the backing node of the top after streaming. (If you do graph
modifications during the job, that’s on you, because since
c624b015bf14fe01 we can’t keep the base frozen.)
Your suggestion to do basically what you consider to be right, which
comes at the caveat of being untruthful to the current documentation
even if there are no graph modifications during the job. Luckily, the
stream job right now doesn’t work in the cases we’re looking at, so it
wouldn’t be a breaking change.
I also try to think about future introduction of stream filter. It
shouldn't be a breaking change as well. But now I think it can be done
the hard-way if needed: just work without a filter if filter-node-name
is not given. but probably I care to much. Finally we can just drop
the test-case from 030 or insert additional intermediate node into it..
The problem I have with it is that
you’re assuming what is right and what isn’t (i.e. “Who would want to
remove filter nodes directly above @base”) without giving the user a
chance to specify.
I think it would be nice if we could have something that remains
truthful to the current documentation.
Then, we should freeze base node again, so just revert c624b015bf14fe01
I go and see it cover-letter of the series introduced c624b015bf14fe01:
This series introduces a bottom intermediate node that eliminates the
dependency on the base that may change while stream job is running.
It happens when stream/commit parallel jobs are running on the same
backing chain. The base node of the stream job may be a top node of
the parallel commit job and can change before the stream job is
completed. We avoid that dependency by introducing the bottom node.
Hmm. Bad that we didn't add an iotest, but the series solved existing
problem:
parallel stream and commit, as commit already has filter. Commit has
filter
with "backing" child, so, I assume that the case worked prior to
introducing
frozen chains, broken then by frozen chains and then fixed by
c624b015bf14fe01.
Still I don't know are there some real users of such parallel jobs..
If just ignoring implicit
filters above @base would work, then I’d find that nice. If it doesn’t,
I suppose there’s indeed little we can do but to indeed forego
@above_base and just use @base_overlay for all cases.
OK, at this point, I think, I'm OK with your patch in context of these
series.
But I fill that something more should be done. Could we just revert
c624b015bf14fe01 as "non corresponding to specification"?