Am 06.02.2020 um 17:47 hat Max Reitz geschrieben: > On 06.02.20 17:43, Max Reitz wrote: > > On 06.02.20 16:51, Kevin Wolf wrote: > >> Am 06.02.2020 um 16:21 hat Max Reitz geschrieben: > >>> On 06.02.20 15:58, Kevin Wolf wrote: > >>>> Am 06.02.2020 um 11:11 hat Max Reitz geschrieben: > >>>>> On 05.02.20 16:38, Kevin Wolf wrote: > >>>>>> Am 30.01.2020 um 22:44 hat Max Reitz geschrieben: > >>>>>>> We will need this to verify that Quorum can let one of its children be > >>>>>>> replaced without breaking anything else. > >>>>>>> > >>>>>>> Signed-off-by: Max Reitz > >>>>>>> --- > >>>>>>> block/quorum.c | 25 +++++++++++++++++++++++++ > >>>>>>> 1 file changed, 25 insertions(+) > >>>>>>> > >>>>>>> diff --git a/block/quorum.c b/block/quorum.c > >>>>>>> index 59cd524502..6a7224c9e4 100644 > >>>>>>> --- a/block/quorum.c > >>>>>>> +++ b/block/quorum.c > >>>>>>> @@ -67,6 +67,13 @@ typedef struct QuorumVotes { > >>>>>>> > >>>>>>> typedef struct QuorumChild { > >>>>>>> BdrvChild *child; > >>>>>>> + > >>>>>>> + /* > >>>>>>> + * If set, check whether this node can be replaced without any > >>>>>>> + * other parent noticing: Unshare CONSISTENT_READ, and take the > >>>>>>> + * WRITE permission. > >>>>>>> + */ > >>>>>>> + bool to_be_replaced; > >>>>>> > >>>>>> I don't understand these permission changes. How does (preparing for) > >>>>>> detaching a node from quorum make its content invalid? > >>>>> > >>>>> It doesn’t, of course. What we are preparing for is to replace it by > >>>>> some other node with some other content. > >>>>> > >>>>>> And why do we > >>>>>> suddenly need WRITE permissions even if the quorum node is only used > >>>>>> read-only? > >>>>>> > >>>>>> The comment is a bit unclear, too. "check whether" implies that both > >>>>>> outcomes could be true, but it doesn't say what happens in either case. > >>>>>> Is this really "make sure that"? > >>>>> > >>>>> I think the comment is not only unclear, it is the problem. (Well, > >>>>> maybe the code is also.) > >>>>> > >>>>> This series is about fixing at least some things about replacing nodes > >>>>> by mirroring. The original use cases this was introduced for was to fix > >>>>> broken quorum children: The other children are still intact, so you read > >>>>> from the quorum node and replace the broken child (which maybe shows > >>>>> invalid data, or maybe just EIO) by the fixed mirror result. > >>>>> > >>>>> Replacing that broken node by the fixed one changes the data that’s > >>>>> visible on that node. > >>>> > >>>> Hm, yes, that's true. But I wonder if this is really something that the > >>>> permission system must catch. Like other graph manipulations, it's > >>>> essentially the user saying "trust me, I know what I'm doing, this node > >>>> makes sense in this place". > >>>> > >>>> Because if you assume that the user could add a node with unsuitable > >>>> content and you want to prevent this, where do we stop? > >>>> blockdev-snapshot can insert a non-empty overlay, which would result in > >>>> visible data change. Should we therefore only allow snapshots when > >>>> shared writes are allowed? This doesn't work obviously. > >>>> > >>>> So I'm inclined to say that this is the user's responsibility and we > >>>> don't have to jump through hoops to prevent every possible way that the > >>>> user could mess up. (Which often also result in preventing legitimate > >>>> cases like here a quorum of read-only nodes.) > >>> > >>> Well, if you ask the question “where do we stop”, we also have to ask > >>> the question “where do we start”. If we say the user knows what they’re > >>> doing, we might as well drop the whole can_replace infrastructure > >>> altogether and just assume that you can replace any node by anything. > >> > >> Well, I don't actually know if that would be completely unreasonable. > >> The idea was obviously to keep graph changes restricted to very specific > >> cases to avoid nasty surprises like triggering latent bugs. Meanwhile we > >> have quite a few more operations that allow changing the graph. > >> > >> So if preventing some cases gives us headaches and is probably more work > >> than dealing with any bugs they might reveal, maybe preventing them is > >> wrong. > >> > >> I'm just afraid that we might be overengineering this and waste time on > >> things that we don't actually get much use from. > > > > That’s why I’m asking. > > (One thing to consider here, though, is that this series exists and has > been reviewed by Vladimir in full, so most of the engineering effort has > already been done. In contrast, writing a new series to drop the whole > can_replace infrastructure with no replacement may actually cost more.) Fair enough. If the artificial permission changes didn't feel so wrong, I think I would just say "let's merge it and forget about it". But they do feel wrong, so I'm not as sure. Kevin