On 24.08.20 16:07, Kevin Wolf wrote:
> On 24.08.2020 at 15:18, Max Reitz wrote:
>> On 21.08.20 17:50, Kevin Wolf wrote:
>>> On 25.06.2020 at 17:22, Max Reitz wrote:
>>>> We have to perform an active commit whenever the top node has a parent
>>>> that has taken the WRITE permission on it.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>  blockdev.c | 24 +++++++++++++++++++++---
>>>>  1 file changed, 21 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/blockdev.c b/blockdev.c
>>>> index 402f1d1df1..237fffbe53 100644
>>>> --- a/blockdev.c
>>>> +++ b/blockdev.c
>>>> @@ -2589,6 +2589,7 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
>>>>      AioContext *aio_context;
>>>>      Error *local_err = NULL;
>>>>      int job_flags = JOB_DEFAULT;
>>>> +    uint64_t top_perm, top_shared;
>>>>  
>>>>      if (!has_speed) {
>>>>          speed = 0;
>>>> @@ -2704,14 +2705,31 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
>>>>          goto out;
>>>>      }
>>>>  
>>>> -    if (top_bs == bs) {
>>>> +    /*
>>>> +     * Active commit is required if and only if someone has taken a
>>>> +     * WRITE permission on the top node.
>>>
>>> ...or if someone wants to take a WRITE permission while the job is
>>> running.
>>>
>>> Future intentions of the user are something that we can't know, so maybe
>>> this should become an option in the future (not in this series, of
>>> course).
>>>
>>>>                                            Historically, we have always
>>>> +     * used active commit for top nodes, so continue that practice.
>>>> +     * (Active commit is never really wrong.)
>>>> +     */
>>>
>>> Changing the practice would break compatibility with clients that start
>>> an active commit job and then attach it to a read-write device, so we
>>> must continue the practice. I think the comment should be clearer about
>>> this; as written, it sounds more like "no reason, but why not".
>>
>> I think that’s what I meant by “historically”.  Is “legacily” a word?
>>
>> But sure, I can make it more explicit.
>>
>>> This is even more problematic because the commit job doesn't unshare
>>> BLK_PERM_WRITE yet, so it would lead to silent corruption rather than an
>>> error.
>>>
>>>> +    bdrv_get_cumulative_perm(top_bs, &top_perm, &top_shared);
>>>> +    if (top_perm & BLK_PERM_WRITE ||
>>>> +        bdrv_skip_filters(top_bs) == bdrv_skip_filters(bs))
>>>> +    {
>>>>          if (has_backing_file) {
>>>>              error_setg(errp, "'backing-file' specified,"
>>>>                               " but 'top' is the active layer");
>>>
>>> Hm, this error message isn't accurate any more.
>>>
>>> In fact, the implementation isn't consistent with the QAPI documentation
>>> any more, because backing-file is only an error for the top level.
>>
>> Hm.  I wanted to agree, and then I wanted to come up with a QAPI
>> documentation that fits the new behavior (because I think it makes more
>> sense to change the QAPI documentation along with the behavior change,
>> rather than to force us to allow backing-file for anything that isn’t on
>> the top layer).
>>
>> But in the process of coming up with a better description, I noticed
>> that this doesn’t say “is a root node”, it says “is the active layer”.
>> I would say a node in the active layer is a node that has some parent
>> that has taken a WRITE permission on it.  So actually I think that the
>> documentation is right, and this code only now fits.
> 
> Then you may have not only "the" active layer, but multiple active
> layers. I find this a bit counterintuitive.

Depends on what you count as a layer.  I don’t think that’s a clearly
defined term, is it?  I only know of “active layer”, “format layer”,
“protocol layer”, and you can at least have multiple format layers above
each other.  So I don’t find it counterintuitive.

But perhaps it’d be best to just get away from the term “active layer”,
as you propose below.

> There is a simple reason why backing-file is an error for a root node:
> It doesn't have overlays, so a value to write to the header of overlay
> images just doesn't make sense.

Ah, yeah...

> The same reasoning doesn't apply for writable images that do have
> overlays. Forbidding backing-file is a more arbitrary restriction there.
> I'm not saying that we can't make arbitrary restrictions where allowing
> an option is not worth the effort, but I feel they should be spelt out
> more explicitly instead of twisting words like "active layer" until they
> fit the code.

I’m all for spelling it out more explicitly.  I just noticed that I
couldn’t clearly distinguish “active layer” from “other” cases of nodes
with writers on them, which is why I noted that “active” to me means the
post-patch behavior already.

You’re right that there is no semantic reason for making it an error.
So I just want it to be an error out of laziness.  I hope you’ll let me
do that.  (I don’t think there’s much of a problem with it, considering
that commits on nodes that have the WRITE permission taken are basically
just completely broken right now.)
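
To make the restriction concrete, here is a minimal sketch of the check as
I understand it from this discussion.  check_backing_file() and its error
string are hypothetical and not taken from blockdev.c; the real code
reports the error through error_setg() instead of returning a string.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of blockdev.c.  Returns NULL if the
 * option combination is accepted, or a static error message otherwise.
 */
static const char *check_backing_file(bool active_commit,
                                      bool has_backing_file)
{
    /* Rejected for every active commit, not just for root nodes. */
    if (active_commit && has_backing_file) {
        return "'backing-file' specified, but 'top' has no overlay "
               "or is in use by a writer";
    }
    return NULL;
}
```

Only the combination of an active commit with backing-file is refused;
every other combination is accepted.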

>> Though I do think this needs some clarification.  Perhaps “If 'top'
>> is the active layer (i.e., is a node that may be written to), specifying
>> a backing [...]”?
> 
> "If 'top' doesn't have an overlay image or is in use by a writer..."?

I.e., avoiding the term “active layer” altogether?  Sounds good.  Only,
I don’t know about “writer”...  But it’s already used in
BlockdevOptionsFile.dynamic-auto-read-only’s description, so I suppose
we can use it here, too.  (I just don’t know whether, as someone who is
not a block layer developer, I’d know what it means.)

(Also, yes, you’re right, the current behavior of giving all root nodes
an active commit of course remains, even when there are no writers.)
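
Spelled out as a rough sketch, the resulting rule looks like this.
needs_active_commit() and SKETCH_PERM_WRITE are made-up names for
illustration; the patch itself calls bdrv_get_cumulative_perm() and
compares bdrv_skip_filters(top_bs) with bdrv_skip_filters(bs) instead of
taking a boolean.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for QEMU's BLK_PERM_WRITE; the value is assumed here. */
#define SKETCH_PERM_WRITE 0x02u

/*
 * Hypothetical condensed form of the decision in qmp_block_commit():
 * an active commit is used if some parent holds the WRITE permission on
 * 'top', or if 'top' is the root node (kept for compatibility, even
 * when nobody is writing to it).
 */
static bool needs_active_commit(uint64_t top_cumulative_perm,
                                bool top_is_root)
{
    return (top_cumulative_perm & SKETCH_PERM_WRITE) || top_is_root;
}
```

A non-root node without writers is the only case that would still get a
regular (non-active) commit job.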

>> There’s more wrong with the specification, namely the whole part under
>> @backing-file past the “(Since 2.1)”, starting with “If top == base”.  I
>> think all of that should go to the top level.  (And “If top == active”
>> should be changed to “If top is active (i.e., may be written to)”.)
> 
> At least the latter only becomes wrong with this patch, so I think it
> needs to be changed by this patch.

Sure.  So I understand you agree with moving the whole chunk, right?

Max