On 05.02.20 16:55, Kevin Wolf wrote: > Am 11.11.2019 um 17:02 hat Max Reitz geschrieben: >> Signed-off-by: Max Reitz >> --- >> block/quorum.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 62 insertions(+) >> >> diff --git a/block/quorum.c b/block/quorum.c >> index 3a824e77e3..8ee03e9baf 100644 >> --- a/block/quorum.c >> +++ b/block/quorum.c >> @@ -825,6 +825,67 @@ static bool quorum_recurse_is_first_non_filter(BlockDriverState *bs, >> return false; >> } >> >> +static bool quorum_recurse_can_replace(BlockDriverState *bs, >> + BlockDriverState *to_replace) >> +{ >> + BDRVQuorumState *s = bs->opaque; >> + int i; >> + >> + for (i = 0; i < s->num_children; i++) { >> + /* >> + * We have no idea whether our children show the same data as >> + * this node (@bs). It is actually highly likely that >> + * @to_replace does not, because replacing a broken child is >> + * one of the main use cases here. >> + * >> + * We do know that the new BDS will match @bs, so replacing >> + * any of our children by it will be safe. It cannot change >> + * the data this quorum node presents to its parents. >> + * >> + * However, replacing @to_replace by @bs in any of our >> + * children's chains may change visible data somewhere in >> + * there. We therefore cannot recurse down those chains with >> + * bdrv_recurse_can_replace(). >> + * (More formally, bdrv_recurse_can_replace() requires that >> + * @to_replace will be replaced by something matching the @bs >> + * passed to it. We cannot guarantee that.) >> + * >> + * Thus, we can only check whether any of our immediate >> + * children matches @to_replace. >> + * >> + * (In the future, we might add a function to recurse down a >> + * chain that checks that nothing there cares about a change >> + * in data from the respective child in question. For >> + * example, most filters do not care when their child's data >> + * suddenly changes, as long as their parents do not care.) >> + */ >> + if (s->children[i].child->bs == to_replace) { >> + Error *local_err = NULL; >> + >> + /* >> + * We now have to ensure that there is no other parent >> + * that cares about replacing this child by a node with >> + * potentially different data. >> + */ >> + s->children[i].to_be_replaced = true; >> + bdrv_child_refresh_perms(bs, s->children[i].child, &local_err); >> + >> + /* Revert permissions */ >> + s->children[i].to_be_replaced = false; >> + bdrv_child_refresh_perms(bs, s->children[i].child, &error_abort); > > Quite a hack. The two obvious problems are: > > 1. We can't guarantee that we can actually revert the permissions. I > think we ignore failure to loosen permissions meanwhile so that at > least the &error_abort doesn't trigger, but bs could still be in the > wrong state afterwards. I thought we guaranteed that loosening permissions never fails. (Well, you know. It may “leak” permissions, but we’d never get an error here so there’s nothing to handle anyway.) > It would be cleaner to use check+abort instead of actually setting > the new permission. Oh. Yes. Maybe. It does require more code, though, because I’d rather not use bdrv_check_update_perm() from here as-is. > 2. As aborting the permission change makes more obvious, we're checking > something that might not be true any more when we actually make the > change. True. I tried to do it right by having a post-replace cleanup function, but after a while that was just going nowhere, really. So I just went with what’s patch 13 here. But isn’t 13 enough, actually? It check can_replace right before replacing in a drained section. I can’t imagine the permissions to change there. Max > Pragmatically, a hack might be good enough here, but it should be > documented as such (with a short explanation of its shortcomings) at > least.