On Tue, Dec 15, 2020 at 04:01:19PM +0100, Kevin Wolf wrote: > Am 15.12.2020 um 14:15 hat Sergio Lopez geschrieben: > > On Tue, Dec 15, 2020 at 01:12:33PM +0100, Kevin Wolf wrote: > > > Am 14.12.2020 um 18:05 hat Sergio Lopez geschrieben: > > > > While processing the parents of a BDS, one of the parents may process > > > > the child that's doing the tail recursion, which leads to a BDS being > > > > processed twice. This is especially problematic for the aio_notifiers, > > > > as they might attempt to work on both the old and the new AIO > > > > contexts. > > > > > > > > To avoid this, add the BDS pointer to the ignore list, and check the > > > > child BDS pointer while iterating over the children. > > > > > > > > Signed-off-by: Sergio Lopez > > > > > > Ugh, so we get a mixed list of BdrvChild and BlockDriverState? :-/ > > > > I know, it's effective but quite ugly... > > > > > What is the specific scenario where you saw this breaking? Did you have > > > multiple BdrvChild connections between two nodes so that we would go to > > > the parent node through one and then come back to the child node through > > > the other? > > > > I don't think this is a corner case. If the graph is walked top->down, > > there's no problem since children are added to the ignore list before > > getting processed, and siblings don't process each other. But, if the > > graph is walked bottom->up, a BDS will start processing its parents > > without adding itself to the ignore list, so there's nothing > > preventing them from processing it again. > > I don't understand. child is added to ignore before calling the parent > callback on it, so how can we come back through the same BdrvChild? > > QLIST_FOREACH(child, &bs->parents, next_parent) { > if (g_slist_find(*ignore, child)) { > continue; > } > assert(child->klass->set_aio_ctx); > *ignore = g_slist_prepend(*ignore, child); > child->klass->set_aio_ctx(child, new_context, ignore); > } Perhaps I'm missing something, but the way I understand it, that loop is adding the BdrvChild pointer of each of its parents, but not the BdrvChild pointer of the BDS that was passed as an argument to b_s_a_c_i. > You didn't dump the BdrvChild here. I think that would add some > information on why we re-entered 0x555ee2fbf660. Maybe you can also add > bs->drv->format_name for each node to make the scenario less abstract? I've generated another trace with more data: bs=0x565505e48030 (backup-top) enter bs=0x565505e48030 (backup-top) processing children bs=0x565505e48030 (backup-top) calling bsaci child=0x565505e42090 (child->bs=0x565505e5d420) bs=0x565505e5d420 (qcow2) enter bs=0x565505e5d420 (qcow2) processing children bs=0x565505e5d420 (qcow2) calling bsaci child=0x565505e41ea0 (child->bs=0x565505e52060) bs=0x565505e52060 (file) enter bs=0x565505e52060 (file) processing children bs=0x565505e52060 (file) processing parents bs=0x565505e52060 (file) processing itself bs=0x565505e5d420 (qcow2) processing parents bs=0x565505e5d420 (qcow2) calling set_aio_ctx child=0x5655066a34d0 bs=0x565505fbf660 (qcow2) enter bs=0x565505fbf660 (qcow2) processing children bs=0x565505fbf660 (qcow2) calling bsaci child=0x565505e41d20 (child->bs=0x565506bc0c00) bs=0x565506bc0c00 (file) enter bs=0x565506bc0c00 (file) processing children bs=0x565506bc0c00 (file) processing parents bs=0x565506bc0c00 (file) processing itself bs=0x565505fbf660 (qcow2) processing parents bs=0x565505fbf660 (qcow2) calling set_aio_ctx child=0x565505fc7aa0 bs=0x565505fbf660 (qcow2) calling set_aio_ctx child=0x5655068b8510 bs=0x565505e48030 (backup-top) enter bs=0x565505e48030 (backup-top) processing children bs=0x565505e48030 (backup-top) calling bsaci child=0x565505e3c450 (child->bs=0x565505fbf660) bs=0x565505fbf660 (qcow2) enter bs=0x565505fbf660 (qcow2) processing children bs=0x565505fbf660 (qcow2) processing parents bs=0x565505fbf660 (qcow2) processing itself bs=0x565505e48030 (backup-top) processing parents bs=0x565505e48030 (backup-top) calling set_aio_ctx child=0x565505e402d0 bs=0x565505e48030 (backup-top) processing itself bs=0x565505fbf660 (qcow2) processing itself So it seems this is happening: backup-top (5e48030) <---------| (5) | | | | | (6) ------------> qcow2 (5fbf660) | ^ | | (3) | | (4) |-> (1) qcow2 (5e5d420) ----- |-> file (6bc0c00) | |-> (2) file (5e52060) backup-top (5e48030), the BDS that was passed as argument in the first bdrv_set_aio_context_ignore() call, is re-entered when qcow2 (5fbf660) is processing its parents, and the latter is also re-entered when the first one starts processing its children again. Sergio.