All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Max Reitz <mreitz@redhat.com>, qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, Kevin Wolf <kwolf@redhat.com>
Subject: Re: [PATCH 1/2] block/mirror: Fix mirror_top's permissions
Date: Fri, 12 Feb 2021 13:48:15 +0300	[thread overview]
Message-ID: <4cb8e49d-d0b9-215c-b469-fae89db3faac@virtuozzo.com> (raw)
In-Reply-To: <10b1af76-a335-012a-1750-855b7a8e2992@redhat.com>

12.02.2021 12:23, Max Reitz wrote:
> On 12.02.21 10:04, Vladimir Sementsov-Ogievskiy wrote:
>> 11.02.2021 20:22, Max Reitz wrote:
>>> mirror_top currently shares all permissions, and takes only the WRITE
>>> permission (if some parent has taken that permission, too).
>>>
>>> That is wrong, though; mirror_top is a filter, so it should take
>>> permissions like any other filter does.  For example, if the parent
>>> needs CONSISTENT_READ, we need to take that, too, and if it cannot share
>>> the WRITE permission, we cannot share it either.
>>>
>>> The exception is when mirror_top is used for active commit, where we
>>> cannot take CONSISTENT_READ (because it is deliberately unshared above
>>> the base node) and where we must share WRITE (so that it is shared for
>>> all images in the backing chain, so the mirror job can take it for the
>>> target BB).
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>   block/mirror.c | 32 +++++++++++++++++++++++++-------
>>>   1 file changed, 25 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/block/mirror.c b/block/mirror.c
>>> index 8e1ad6eceb..1edfc3cc14 100644
>>> --- a/block/mirror.c
>>> +++ b/block/mirror.c
>>> @@ -89,6 +89,7 @@ typedef struct MirrorBlockJob {
>>>   typedef struct MirrorBDSOpaque {
>>>       MirrorBlockJob *job;
>>>       bool stop;
>>> +    bool is_commit;
>>>   } MirrorBDSOpaque;
>>>   struct MirrorOp {
>>> @@ -1513,13 +1514,27 @@ static void bdrv_mirror_top_child_perm(BlockDriverState *bs, BdrvChild *c,
>>>           return;
>>>       }
>>> -    /* Must be able to forward guest writes to the real image */
>>> -    *nperm = 0;
>>> -    if (perm & BLK_PERM_WRITE) {
>>> -        *nperm |= BLK_PERM_WRITE;
>>> -    }
>>> +    bdrv_default_perms(bs, c, role, reopen_queue,
>>> +                       perm, shared, nperm, nshared);
>>> -    *nshared = BLK_PERM_ALL;
>>> +    if (s->is_commit) {
>>> +        /*
>>> +         * For commit jobs, we cannot take CONSISTENT_READ, because
>>> +         * that permission is unshared for everything above the base
>>> +         * node (except for filters on the base node).
>>> +         * We also have to force-share the WRITE permission, or
>>> +         * otherwise we would block ourselves at the base node (if
>>> +         * writes are blocked for a node, they are also blocked for
>>> +         * its backing file).
>>> +         * (We could also share RESIZE, because it may be needed for
>>> +         * the target if its size is less than the top node's; but
>>> +         * bdrv_default_perms_for_cow() automatically shares RESIZE
>>> +         * for backing nodes if WRITE is shared, so there is no need
>>> +         * to do it here.)
>>> +         */
>>> +        *nperm &= ~BLK_PERM_CONSISTENT_READ;
>>
>> this works only because we don't assert READ permission on generic read path in block/io.c, like we do for WRITE..
>> but this is better than pre-patch anyway..
> 
> Yes, because you can read regardless of CONSISTENT_READ, the question is just whether you’ll receive consistent data.  The caller needs to decide whether that’s the case.
> 
> Taking that permission kind of is deferring the question whether the data will be consistent to the block layer.
> 
> In case of commit, we unshare CONSISTENT_READ above the base, because the data between base and top will not be consistent.  Starting from top, we know it is, so we do not need to take the permission, because we do not need that guarantee from the block layer; the commit job can give that guarantee itself.
> 
> (I suppose we could add some ALLOW_INCONSISTENT flag to read requests, and the permission is checked when that flag is not present, but I don’t think we really need to.)
> 
> (Technically we have the problem that there could be something else between top and base that unshares CONSISTENT_READ, and we’ll never know, but addressing that would be complicated and it’s only a hypothetical problem, AFAIA.)
> 
>> How block-jobs and filters works - definitely goes beyond our permissions architecture..
> 
> FWIW, AFAIR, the first job filter node was commit_top, whose purpose was precisely to allow unsharing CONSISTENT_READ on the base and then offering it again on the top.

Hmm reasonable. But sharing writes though backing chain is not that safe

> 
>>> +        *nshared |= BLK_PERM_WRITE;
>>> +    }
>>>   }
>>>   /* Dummy node that provides consistent read to its users without requiring it
>>> @@ -1583,6 +1598,8 @@ static BlockJob *mirror_start_job(
>>>           return NULL;
>>>       }
>>> +    target_is_backing = bdrv_chain_contains(bs, target);
>>
>> may be initialized together with definition..
> 
> Could be, but would that be better? :)
> 
>>> +
>>>       /* In the case of active commit, add dummy driver to provide consistent
>>>        * reads on the top, while disabling it in the intermediate nodes, and make
>>>        * the backing chain writable. */
>>> @@ -1605,6 +1622,8 @@ static BlockJob *mirror_start_job(
>>>       bs_opaque = g_new0(MirrorBDSOpaque, 1);
>>>       mirror_top_bs->opaque = bs_opaque;
>>> +    bs_opaque->is_commit = target_is_backing;
>>> +
>>>       /* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
>>>        * it alive until block_job_create() succeeds even if bs has no parent. */
>>>       bdrv_ref(mirror_top_bs);
>>> @@ -1646,7 +1665,6 @@ static BlockJob *mirror_start_job(
>>>       target_perms = BLK_PERM_WRITE;
>>>       target_shared_perms = BLK_PERM_WRITE_UNCHANGED;
>>> -    target_is_backing = bdrv_chain_contains(bs, target);
>>>       if (target_is_backing) {
>>>           int64_t bs_size, target_size;
>>>           bs_size = bdrv_getlength(bs);
>>>
>>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> Thanks!
> 
> Max
> 


-- 
Best regards,
Vladimir


  reply	other threads:[~2021-02-12 10:49 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-11 17:22 [PATCH 0/2] file-posix: Cache next hole Max Reitz
2021-02-11 17:22 ` [PATCH 1/2] block/mirror: Fix mirror_top's permissions Max Reitz
2021-02-11 19:33   ` Eric Blake
2021-02-12  9:04   ` Vladimir Sementsov-Ogievskiy
2021-02-12  9:23     ` Max Reitz
2021-02-12 10:48       ` Vladimir Sementsov-Ogievskiy [this message]
2021-02-11 17:22 ` [PATCH 2/2] file-posix: Cache next hole Max Reitz
2021-02-11 20:00   ` Eric Blake
2021-02-12  9:25     ` Max Reitz
2021-02-11 20:38   ` Vladimir Sementsov-Ogievskiy
2021-02-12  9:14     ` Max Reitz
2021-02-12 10:25       ` Kevin Wolf
2021-02-12 12:11         ` Max Reitz
2021-03-29 16:21 ` [PATCH 0/2] " Max Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4cb8e49d-d0b9-215c-b469-fae89db3faac@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.