On 11.09.19 13:00, Max Reitz wrote: > On 11.09.19 12:31, Kevin Wolf wrote: >> Am 11.09.2019 um 12:00 hat Max Reitz geschrieben: >>> On 11.09.19 10:27, Kevin Wolf wrote: >>>> Am 11.09.2019 um 09:37 hat Max Reitz geschrieben: >>>>> On 11.09.19 08:55, Kevin Wolf wrote: >>>>>> Well, by default the primary child, which should cover like 90% of the >>>>>> drivers? >>>>> >>>>> Hm, yes. >>>>> >>>>> But I still think that the drivers that do not want to count every >>>>> single non-COW child are the exception. >>>> >>>> They are, but drivers that want to count more than their primary node >>>> are exceptions, too. And I think you're more likely to remember adding >>>> the callback when you want to have a certain feature, not when you don't >>>> want to have it. >>>> >>>> I really think we're likely to forget adding the callback where we need >>>> to disable the feature. >>> >>> Well, I mean, we did forget adding it for qcow2. >> >> I'm afraid I have to agree. So the conclusion is that we won't get it >> right anyway? >> >>>> I can see two options that should address both of our views: >>>> >>>> 1. Just don't have a fallback at all, make the callback mandatory and >>>> provide implementations in block.c that can be referred to in >>>> BlockDriver. Not specifying the callback causes an assertion failure, >>>> so we'd hopefully notice it quite early (assuming that we run either >>>> 'qemu-img info' or 'query-block' on a configuration with the block >>>> driver, but I think that's faily safe to assume). >>> >>> Hm. Seems a bit much, but if we can’t agree on what’s a good general >>> implementation that works for everything, this is probably the only >>> thing that would actually keep us from forgetting to add special cases. >>> >>> Though I actually don’t know. I’d probably add two globally available >>> helpers, one that returns the sum of everything but the backing node, >>> and one that just returns the primary node. >> >> Yes, I think this is the same as I meant by "provide implementations in >> block.c". >> >>> Now if I were to make qcow2 use the primary node helper function, would >>> we have remembered changing it once we added a data file? >>> >>> Hmm. Maybe not, but it should be OK to just make everything use the sum >>> helper, except the drivers that want the primary node. That should work >>> for all cases. (I think that whenever a format driver suddenly gains >>> more child nodes, we probably will want to count them. OTOH, everything >>> that has nodes that shouldn’t be counted probably always wants to use >>> the primary node helper function from the start.) >> >> The job filter nodes have only one child currently, which should be >> counted. We'll add other children that shouldn't be counted only later. >> >> But we already have an idea of what possible extensions look like, so we >> can probably choose the right function from the start. > > Yep. > >>>> 2. Make the 90% solution a 100% solution: Allow drivers to have multiple >>>> storage children (for vmdk) and then have the fallback add up the >>>> primary child plus all storage children. This is what I suggested as >>>> the documented semantics in my initial reply to this patch (that you >>>> chose not to answer). >>> >>> I didn’t answer that because I didn’t disagree. >>> >>>> Adding the size of storage children covers qcow2 and vmdk. >>> >>> That’s of course exactly what we’re trying to do, but the question is, >>> how do we figure out that storage children? Make it a per-BdrvChild >>> attribute? That seems rather heavy-handed, because I think we’d need it >>> only here. >> >> Well, you added bdrv_storage_child().I'd argue this interface is wrong > > Yes, it probably is. > >> because it assumes that only one storage child exists. You just didn't >> implement it for vmdk so that the problem didn't become apparent. It >> would have to return a list rather than a single child. So fixing the >> interface and then using it is what I was thinking. >> >> Now that you mention a per-BdrvChild attribute, however, I start to >> wonder if the distinction between COW children, filter children, storage >> children, metadata children, etc. isn't really what BdrvChildRole was >> supposed to represent? > > That’s a good point. > >> Maybe we want to split off child_storage from child_file, though it's >> not strictly necessary for this specific case because we want to treat >> both metadata and storage nodes the same. But it could be useful for >> other users of bdrv_storage_child(), if there are any. > > Possible. Maybe it turns out that at least for this series I don’t need > bdrv_storage_child() at all. > >>>> As the job filter won't declare the target or any other involved >>>> nodes their storage nodes (I hope), this will do the right thing for >>>> them, too. >>>> >>>> For quorum and blkverify both ways could be justifiable. I think they >>>> probably shouldn't declare their children as storage nodes. They are >>>> more like filters that don't have a single filtered node. So some >>>> kind of almost-filters. >>> >>> I don’t think quorum is a filter, and blkverify can only be justified to >>> be a filter because it quits qemu when there is a mismatch. >>> >>> The better example is replication, but that has a clear filtered child >>> (the primary node). >>> >>> >>> So all in all I think it’s best to make the callback mandatory and add >>> two global helper functions. That’s simple enough and should prevent >>> us from making mistakes by forgetting to adjust something in the >>> future. >> >> Yes, that should work. >> >> We should probably still figure out what the relationship between the >> child access functions and child roles is, even if we don't need it for >> this solution. But it feels like an important part of the design. > > Hm. It feels like something that should be done before this series, > actually. > > So I think we should add at least a child role per child access function > so that they match? And then maybe in bdrv_attach_child() assert that a > BDS never has more than one primary or filtered child (a filtered child > acts as a primary child, too), or more than one COW child. (And that > these are always in bs->file or bs->backing so the child access > functions do work.) I’ve been trying to make this work, but I don’t think it does. It just feels all wrong and I need up with things like “child_metadata_and_data”. The last straw was that blkverify should have the raw file be the filtered child (because, well, it’s bs->file), but then the format file would need to be a non-filtered child, and those would default to BDRV_O_PROTOCOL (which we decidedly don’t want). Anyway, I’m currently attempting to solve this differently: BdrvChildRole isn’t suitable for the job, I think. The name is completely what we want, but it actually doesn’t look like something that describes the child role to me. Instead, I’m introducing a new BdrvChildRole enum mask that describes how the child is going to be used: stay-at-node, cow, metadata, data, etc. I’m going to rename the current BdrvChildRole structure to BdrvChildParent (in want of a better name), because really most of what it does is describe the parent, but precisely not the child. I’m moving .stay_as_node to the new BdrvChildRole enum. I hope this lets me unify child_file, child_backing, and child_format into a child_of_bds object. The callbacks should then decide the particularities based on the BdrvChildRole enum. Hope that makes sense. (? :S) At least I feel much happier implementing it this way, which I suppose is a good sign. Max