On 05.05.20 14:54, Kevin Wolf wrote: > Am 05.05.2020 um 13:59 hat Max Reitz geschrieben: >> On 05.05.20 13:19, Kevin Wolf wrote: >>> Am 18.02.2020 um 13:42 hat Max Reitz geschrieben: [...] >>>> + /* Useful combination of flags */ >>>> + BDRV_CHILD_IMAGE = BDRV_CHILD_DATA >>>> + | BDRV_CHILD_METADATA >>>> + | BDRV_CHILD_PRIMARY, >>>> +}; >>>> + >>>> +/* Mask of BdrvChildRoleBits values */ >>>> +typedef unsigned int BdrvChildRole; >>>> + >>>> char *bdrv_perm_names(uint64_t perm); >>>> uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission qapi_perm); >>> >>> The list intuitively makes sense to me. Let me try to think of some >>> interesting cases to see whether the documentation is complete or >>> whether it could be improved. >>> >>> >>> qcow2 is what everyone has in mind, so it should be obvious: >>> >>> * Without a data file: >>> * file: BDRV_CHILD_IMAGE >>> * backing: BDRV_CHILD_COW >>> >>> * With a data file: >>> * file: BDRV_CHILD_PRIMARY | BDRV_CHILD_METADATA >>> * data-file: BDRV_CHILD_DATA >>> * backing: BDRV_CHILD_COW >>> >>> >>> We can use VMDK to make things a bit more interesting: >>> >>> * file: BDRV_CHILD_PRIMARY | BDRV_CHILD_METADATA >>> * extents.*: BDRV_CHILD_METADATA | BDRV_CHILD_DATA >>> * backing: BDRV_CHILD_COW >>> >>> In other words, we can have multiple data and metadata children. Is this >>> correct or should extents not be marked as metadata? (Checked the final >>> code: yes we do have multiple of them in vmdk.) Should this be mentioned >>> in the documentation? >> >> If the extents contain metadata (I thought not, but I think I was just >> wrong; sparse extents must contain their respective mapping structures), >> then yes, they should all be marked as metadata children. >> >> I’m not sure whether that needs to be mentioned explicitly in the doc, >> because “Child stores metadata” seems sufficient to me. > > When you're the author, the meaning of everything is clear to you. :-) > > In case of doubt, I would be more explicit so that the comment gives a > clear guideline for which role to use in which scenario. OK, so you mean just noting everywhere explicitly how many children can get a specific flag, and not just in some cases? That is, make a note for DATA and METADATA that they can be given to an arbitrary number of children, and COW only to at most one. >>> Do we then also want to allow multiple BDRV_CHILD_COW children? We don't >>> currently have a driver that needs it, but maybe it would be consistent >>> with DATA and METADATA then. However, it would contradict the >>> documentation that it's the "Child from which to read all data". >> >> Yes. I would revisit that problem when the need arises. >> >> It seems to me like this would open a whole can of worms, just like >> allowing multiple filtered children does. > > Okay. Shall we document it explicitly like we do for the filter role? Yep. >>> blkverify: >>> >>> * x-image: BDRV_CHILD_PRIMARY | BDRV_CHILD_DATA | BDRV_CHILD_FILTERED >>> * x-raw: BDRV_CHILD_DATA | BDRV_CHILD_FILTERED >>> >>> Hm, according to the documentation, this doesn't work, FILTERED can be >>> set only for one node. But the condition ("the parent forwards all reads >>> and writes") applies to both children. I think the documentation should >>> mention what needs to be done in such cases. >> >> I don’t know. blkverify is a rare exception by design, because it can >> abort when both children don’t match. (I suppose we could theoretically >> have a quorum mode where a child gets ejected once a mismatch is >> detected, but that isn’t the case now.) > > Well, yes, this is exceptional. I would ignore that property for > assigning roles because when it comes to play, roles don't matter any > more because the whole process is gone. So... > >> Furthermore, I would argue that blkverify actually expects the formatted >> image to sometimes differ from the raw image, if anything, because the >> format driver is to be tested. This is the reason why I chose x-raw to >> be the filtered child. > > ...I don't think this case is relevant. If blkverify returns something, > both children have the same data. Another argument is that right now, bs->file points to x-raw, and .is_filter is set. So x-raw is already treated as the filtered child. >> So there is no general instruction on what to do in such cases that I >> followed here, I specifically chose one child based on what blkverify is >> and what it’s supposed to do. Therefore, I can’t really give a general >> instruction on “what needs to be done in such cases”. > > Maybe the missing part for me is what FILTERED is even used for. I > assume it's for skipping over filters in certain functions in the > generic block layer? Yes. > In this case, maybe the right answer is that... > >>> For blkverify, both >>> children are not equal in intention, so I guess the "real" filtered >>> child is x-image. But for quorum, you can't make any such distinction. I >>> assume the recommendation should be not to set FILTERED for any child >>> then. >> >> Quorum just isn’t a filter driver. > > ...blkverify isn't one either because performing an operation on only > one child usually won't be correct. Good point. It would work if filters are just skipped for functions that read/query stuff, which I think is the case. I don’t think we ever skip filters when it comes to modifying data. In any case, I wouldn’t lose too much sleep over blkverify whatever we do. It’s a driver used purely for debugging purposes. >>> Looking at the final code... Hm, your choice looks quite different: You >>> don't have DATA for x-raw, but you make it the PRIMARY and FILTERED >>> child. I think PRIMARY/FILTERED is just a bug (e.g. getlength and flush >>> being forwarded only to x-image show that it's primary). >> >> I rather consider getlength() a special case. Ideally, we’d forward >> getlength() to both and compare the results; however, image formats >> might have different size resolution than raw files, so there could be a >> difference, but it’d be irrelevant. >> >> It makes then sense to forward it to the formatted image, because >> generally formats have byte resolution for the disk size, whereas for >> raw files it depends on caching and the filesystem, I think. >> >> As for flush, yes, why do we forward it only to x-image? Why is “the >> raw file not important”? > > Because it's the copy that is used to check whether the main image is > correct. If things break, you just create a new copy. At least that's > how blkverify was supposed to be used. I wonder why blkverify decides that. This should be up to the user, and if the user wants to keep the verification image around, blkverify shouldn’t prevent that. > In fact, I guess in the typical use cases for blkverify, cache=unsafe is > enough anyway because you would start over from scratch, so... not a > strong argument. That too. >>> I do wonder >>> whether I have a different interpretation of DATA than you, though. >> >> I never set DATA for FILTERED, because I consider FILTERED to be >> stronger than DATA, so once FILTERED is set, it doesn’t matter whether >> DATA is set or not. I suppose that should either be mentioned in the >> documentation, or we decide that we should always set DATA regardless. > > Either option should be fine. I guess documenting it is less work. OK. >>> Also, the comparison makes me wonder whether FILTERED always implies >>> PRIMARY? Would there ever be a scenario where a child is FILTERED, but >>> not PRIMARY? >> >> I don’t know. I suppose it does. But what’s the implication? > > *shrug* I was just asking to see if I understand things right. We could > document it, but I don't have a good reason why we must do that. OK. > Maybe the more relevant question would be if a FILTERED child must be > the only child to avoid the problems we're discussing for blkverify. But > I think I already answered that question for myself with "no", so > probably not much use asking it. blkverify is just a bit weird, and I personally don’t mind just treating it as something “special”, considering it’s just a debugging aid. Regardless of blkverify, I don’t think FILTERED children must be the only children, though, because I can well imagine filter drivers having metadata children on the side, e.g. config data or bitmaps (not just dirty bitmaps, but also e.g. what to cache for a hypothetical cache driver). Max