On 09.09.19 12:01, Kevin Wolf wrote:
> Am 09.09.2019 um 10:31 hat Max Reitz geschrieben:
>> On 05.09.19 18:24, Kevin Wolf wrote:
>>> Am 12.08.2019 um 14:58 hat Max Reitz geschrieben:
>>>> On 10.08.19 17:36, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 09.08.2019 19:13, Max Reitz wrote:
>>>>>> If the driver does not support .bdrv_co_flush() so bdrv_co_flush()
>>>>>> itself has to flush the children of the given node, it should not flush
>>>>>> just bs->file->bs, but in fact all children.
>>>>>>
>>>>>> In any case, the BLKDBG_EVENT() should be emitted on the primary child,
>>>>>> because that is where a blkdebug node would be if there is any.
>>>>>>
>>>>>> Suggested-by: Vladimir Sementsov-Ogievskiy
>>>>>> Signed-off-by: Max Reitz
>>>>>> ---
>>>>>>  block/io.c | 23 +++++++++++++++++------
>>>>>>  1 file changed, 17 insertions(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/block/io.c b/block/io.c
>>>>>> index c5a8e3e6a3..bcc770d336 100644
>>>>>> --- a/block/io.c
>>>>>> +++ b/block/io.c
>>>>>> @@ -2572,6 +2572,8 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
>>>>>>
>>>>>>  int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>>>>>  {
>>>>>> +    BdrvChild *primary_child = bdrv_primary_child(bs);
>>>>>> +    BdrvChild *child;
>>>>>>      int current_gen;
>>>>>>      int ret = 0;
>>>>>>
>>>>>> @@ -2601,7 +2603,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>>>>>      }
>>>>>>
>>>>>>      /* Write back cached data to the OS even with cache=unsafe */
>>>>>> -    BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS);
>>>>>> +    BLKDBG_EVENT(primary_child, BLKDBG_FLUSH_TO_OS);
>>>>>>      if (bs->drv->bdrv_co_flush_to_os) {
>>>>>>          ret = bs->drv->bdrv_co_flush_to_os(bs);
>>>>>>          if (ret < 0) {
>>>>>> @@ -2611,15 +2613,15 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>>>>>
>>>>>>      /* But don't actually force it to the disk with cache=unsafe */
>>>>>>      if (bs->open_flags & BDRV_O_NO_FLUSH) {
>>>>>> -        goto flush_parent;
>>>>>> +        goto flush_children;
>>>>>>      }
>>>>>>
>>>>>>      /* Check if we really need to flush anything */
>>>>>>      if (bs->flushed_gen == current_gen) {
>>>>>> -        goto flush_parent;
>>>>>> +        goto flush_children;
>>>>>>      }
>>>>>>
>>>>>> -    BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_DISK);
>>>>>> +    BLKDBG_EVENT(primary_child, BLKDBG_FLUSH_TO_DISK);
>>>>>>      if (!bs->drv) {
>>>>>>          /* bs->drv->bdrv_co_flush() might have ejected the BDS
>>>>>>           * (even in case of apparent success) */
>>>>>> @@ -2663,8 +2665,17 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>>>>>      /* Now flush the underlying protocol. It will also have BDRV_O_NO_FLUSH
>>>>>>       * in the case of cache=unsafe, so there are no useless flushes.
>>>>>>       */
>>>>>> -flush_parent:
>>>>>> -    ret = bs->file ? bdrv_co_flush(bs->file->bs) : 0;
>>>>>> +flush_children:
>>>>>> +    ret = 0;
>>>>>> +    QLIST_FOREACH(child, &bs->children, next) {
>>>>>> +        int this_child_ret;
>>>>>> +
>>>>>> +        this_child_ret = bdrv_co_flush(child->bs);
>>>>>> +        if (!ret) {
>>>>>> +            ret = this_child_ret;
>>>>>> +        }
>>>>>> +    }
>>>>>
>>>>> Hmm, you said that we want to flush only children with write-access from parent..
>>>>
>>>> Good that you remember it, I must have overlooked it (when reading the
>>>> replies to the previous version). :-)
>>>>
>>>>> Shouldn't we check it? Or we assume that it's always safe to call bdrv_co_flush on
>>>>> a node?
>>>>
>>>> I think it’s always safe. But checking it seems like a nice touch, yes.
>>>
>>> I'm not sure why we would unconditionally flush all children anyway.
>>> The only drivers I can think of that really need to flush more than one
>>> child are blkverify and quorum, and both of them already implement this.
>>> blkverify implements .bdrv_co_flush, so it's not affected by the change
>>> anyway, but quorum children will be flushed twice now.
>>>
>>> But more than this, I'm worried about the overhead of needlessly
>>> recursing through the whole backing chain and calling flush on every
>>> node there. Maybe bs->write_gen saves us so that at least this doesn't
>>> result in an fdatasync() call for each, but still... Without a use case,
>>> I'd rather not do this.
>>>
>>> Oh, well, after having written all of this, I see that qcow2 with an
>>> external data file is buggy... This could be fixed in the qcow2 driver,
>>> but maybe restricting the recursion to read-only is actually good enough
>>> then. Can you mention this case in the commit message and maybe build a
>>> test for it?
>>
>> And I should thus probably drop vmdk’s .bdrv_co_flush_to_disk()
>> implementation.
>>
>> I will indeed try to write a test, but to be completely honest, I feel
>> like this series is long enough.
>
> I guess I could already merge patch 1 to give you space for another test
> patch. ;-)

Don’t forget the mirror-top patch. And AFAIR, there was some comment from
Vladimir that also required an additional patch. So it would need to be
three! (Or I just start squashing from the back?)

Max
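
For illustration, the write-access check Vladimir asks about above could take
roughly the following shape in the flush_children loop of the patch. This is
only a sketch, not necessarily what the next revision will do; it assumes the
check is based on the permissions the parent has taken on the child, i.e. the
BdrvChild.perm field together with the BLK_PERM_WRITE and
BLK_PERM_WRITE_UNCHANGED bits:

    /* Sketch only: flush just those children this node may have written to,
     * judging from the permissions it holds on them.  Error handling keeps
     * the first failure, as in the loop from the patch above. */
    flush_children:
        ret = 0;
        QLIST_FOREACH(child, &bs->children, next) {
            if (child->perm & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) {
                int this_child_ret = bdrv_co_flush(child->bs);
                if (!ret) {
                    ret = this_child_ret;
                }
            }
        }

Such a filter would also address Kevin's overhead concern, since purely
read-only children (e.g. backing files opened without write permission) would
no longer be recursed into.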