On Thu, Dec 17, 2020 at 11:58:30AM +0100, Kevin Wolf wrote:
> Am 17.12.2020 um 10:37 hat Sergio Lopez geschrieben:
> > Do you think it's safe to re-enter backup-top, or should we look for a
> > way to avoid this?
>
> I think it should be avoided, but I don't understand why putting all
> children of backup-top into the ignore list doesn't already avoid it. If
> backup-top is in the parents list of qcow2, then qcow2 should be in the
> children list of backup-top and therefore the BdrvChild should already
> be in the ignore list.
>
> The only way I can explain this is that backup-top and qcow2 have
> different ideas about which BdrvChild objects exist that connect them.
> Or that the graph changes between both places, but I don't see how that
> could happen in bdrv_set_aio_context_ignore().

I've been digging around with gdb, and found that, at that point, the
backup-top BDS is actually referenced by two different BdrvChild
objects:

(gdb) p *(BdrvChild *) 0x560c40f7e400
$84 = {bs = 0x560c40c4c030, name = 0x560c41ca4960 "root",
  klass = 0x560c3eae7c20 , role = 20, opaque = 0x560c41ca4610,
  perm = 3, shared_perm = 29, has_backup_perm = false, backup_perm = 0,
  backup_shared_perm = 31, frozen = false, parent_quiesce_counter = 2,
  next = {le_next = 0x0, le_prev = 0x0},
  next_parent = {le_next = 0x0, le_prev = 0x560c40c44338}}

(gdb) p sibling
$72 = (BdrvChild *) 0x560c40981840
(gdb) p *sibling
$73 = {bs = 0x560c40c4c030, name = 0x560c4161be20 "main node",
  klass = 0x560c3eae6a40 , role = 0, opaque = 0x560c4161bc00,
  perm = 0, shared_perm = 31, has_backup_perm = false, backup_perm = 0,
  backup_shared_perm = 0, frozen = false, parent_quiesce_counter = 2,
  next = {le_next = 0x0, le_prev = 0x0},
  next_parent = {le_next = 0x560c40c442d0, le_prev = 0x560c40c501c0}}

When the chain of calls to switch AIO contexts is started, backup-top is
the first one to be processed.
blk_do_set_aio_context() instructs bdrv_child_try_set_aio_context() to
add blk->root (0x560c40f7e400) as the first element in the ignore list,
but the referenced BDS is still re-entered through the other BdrvChild
(0x560c40981840) by one of the children of the latter.

I can't think of a way of preventing this other than keeping track of
BDS pointers in the ignore list too. Do you think there are any
alternatives?

Thanks,
Sergio.
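
For illustration only, here is a minimal standalone sketch of the idea of
keeping node pointers in the ignore list as well. It is not QEMU code:
Node, Edge, IgnoreList, link_nodes(), set_ctx() and count_entries() are
hypothetical stand-ins for the BDS, BdrvChild and ignore-list handling,
and the toy graph (backup-top, qcow2, a job with two edges) is only an
assumed approximation of the situation in the gdb output above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EDGES  4
#define MAX_IGNORE 16

typedef struct Node Node;

/* Hypothetical stand-in for a BdrvChild: one link between two nodes. */
typedef struct Edge {
    const char *name;
    Node *a, *b;
} Edge;

/* Hypothetical stand-in for a BDS (or any parent object such as a job). */
struct Node {
    const char *name;
    Edge *edges[MAX_EDGES];
    int n_edges;
    int entered;            /* how many times set_ctx() processed this node */
};

/* The ignore list stores opaque pointers, so it can hold edges and nodes. */
typedef struct IgnoreList {
    const void *items[MAX_IGNORE];
    int n;
} IgnoreList;

static bool ignored(const IgnoreList *l, const void *p)
{
    for (int i = 0; i < l->n; i++) {
        if (l->items[i] == p) {
            return true;
        }
    }
    return false;
}

static void ignore_add(IgnoreList *l, const void *p)
{
    assert(l->n < MAX_IGNORE);
    l->items[l->n++] = p;
}

static void link_nodes(Edge *e, const char *name, Node *a, Node *b)
{
    e->name = name;
    e->a = a;
    e->b = b;
    a->edges[a->n_edges++] = e;
    b->edges[b->n_edges++] = e;
}

/* Walk the graph, skipping ignored edges (and, optionally, ignored nodes). */
static void set_ctx(Node *n, IgnoreList *ignore, bool track_nodes)
{
    if (track_nodes) {
        if (ignored(ignore, n)) {
            return;         /* node already processed via another edge */
        }
        ignore_add(ignore, n);
    }
    n->entered++;
    for (int i = 0; i < n->n_edges; i++) {
        Edge *e = n->edges[i];
        if (ignored(ignore, e)) {
            continue;
        }
        ignore_add(ignore, e);
        set_ctx(e->a == n ? e->b : e->a, ignore, track_nodes);
    }
}

/* Build the toy graph and return how often backup-top was entered. */
static int count_entries(bool track_nodes)
{
    Node blk = { .name = "blk" }, backup_top = { .name = "backup-top" };
    Node qcow2 = { .name = "qcow2" }, job = { .name = "job" };
    Edge root, child, main_node, target;
    IgnoreList ignore = { .n = 0 };

    link_nodes(&root, "root", &blk, &backup_top);
    link_nodes(&child, "backup-top -> qcow2", &backup_top, &qcow2);
    link_nodes(&main_node, "main node", &job, &backup_top);
    link_nodes(&target, "job -> qcow2", &job, &qcow2);

    /* Only the root edge is ignored up front, as with blk->root. */
    ignore_add(&ignore, &root);
    set_ctx(&backup_top, &ignore, track_nodes);
    return backup_top.entered;
}
```

In this toy graph, count_entries(false) walks backup-top -> qcow2 -> job
and comes back into backup-top through the "main node" edge, while
count_entries(true) stops at that point because the node pointer is
already in the list.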