On 12.08.19 13:09, Vladimir Sementsov-Ogievskiy wrote: > 09.08.2019 19:13, Max Reitz wrote: >> This includes some permission limiting (for example, we only need to >> take the RESIZE permission for active commits where the base is smaller >> than the top). >> >> Signed-off-by: Max Reitz >> --- >> block/mirror.c | 117 ++++++++++++++++++++++++++++++++++++++----------- >> blockdev.c | 47 +++++++++++++++++--- >> 2 files changed, 131 insertions(+), 33 deletions(-) >> >> diff --git a/block/mirror.c b/block/mirror.c >> index 54bafdf176..6ddbfb9708 100644 >> --- a/block/mirror.c >> +++ b/block/mirror.c > > > [..] > >> @@ -1693,15 +1734,39 @@ static BlockJob *mirror_start_job( >> /* In commit_active_start() all intermediate nodes disappear, so >> * any jobs in them must be blocked */ >> if (target_is_backing) { >> - BlockDriverState *iter; >> - for (iter = backing_bs(bs); iter != target; iter = backing_bs(iter)) { >> - /* XXX BLK_PERM_WRITE needs to be allowed so we don't block >> - * ourselves at s->base (if writes are blocked for a node, they are >> - * also blocked for its backing file). The other options would be a >> - * second filter driver above s->base (== target). */ >> + BlockDriverState *iter, *filtered_target; >> + uint64_t iter_shared_perms; >> + >> + /* >> + * The topmost node with >> + * bdrv_skip_rw_filters(filtered_target) == bdrv_skip_rw_filters(target) >> + */ >> + filtered_target = bdrv_filtered_cow_bs(bdrv_find_overlay(bs, target)); >> + >> + assert(bdrv_skip_rw_filters(filtered_target) == >> + bdrv_skip_rw_filters(target)); >> + >> + /* >> + * XXX BLK_PERM_WRITE needs to be allowed so we don't block >> + * ourselves at s->base (if writes are blocked for a node, they are >> + * also blocked for its backing file). The other options would be a >> + * second filter driver above s->base (== target). >> + */ >> + iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE; >> + >> + for (iter = bdrv_filtered_bs(bs); iter != target; >> + iter = bdrv_filtered_bs(iter)) >> + { >> + if (iter == filtered_target) { >> + /* >> + * From here on, all nodes are filters on the base. >> + * This allows us to share BLK_PERM_CONSISTENT_READ. >> + */ >> + iter_shared_perms |= BLK_PERM_CONSISTENT_READ; > > > Hmm, I don't understand, why read from upper nodes is not shared? Because they don’t represent a consistent disk state during the commit. Please don’t ask me details about CONSISTENT_READ, because I always pretend I understand it, but I never really do, actually. (My problem is that I do understand why the intermediate nodes shouldn’t share CONSISTENT_READ: It’s because they only read garbage, effectively. But I don’t understand how any block job target (like our base here) can have CONSISTENT_READ. Block job targets are mostly written front to back (except with sync=none), so they too don’t “[represent] the contents of a disk at a specific point.” But that is how it was, so that is how it should be kept.) If it makes you any happier, BLK_PERM_CONSISTENT_READ’s description explicitly notes that it will not be shared on intermediate nodes of a commit job. Max >> + } >> + >> ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0, >> - BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE, >> - errp); >> + iter_shared_perms, errp); >> if (ret < 0) { >> goto fail; >> } > > [..] >