On Wed, Mar 03 2021, edwardh wrote: > From: Edward Hsieh > > For chained bio, trace_block_bio_complete in bio_endio is currently called > only by the parent bio once upon all chained bio completed. > However, the sector and size for the parent bio are modified in bio_split. > Therefore, the size and sector of the complete events might not match the > queue events in blktrace. > > The original fix of bio completion trace ("block: trace > completion of all bios.") wants multiple complete events to correspond > to one queue event but missed this. > > md/raid5 read with bio cross chunks can reproduce this issue. > > To fix, move trace completion into the loop for every chained bio to call. Thanks. I think this is correct as far as tracing goes. However the code still looks a bit odd. The comment for the handling of bio_chain_endio suggests that the *only* purpose for that is to avoid deep recursion. That suggests it should be at the end of the function. As it is blk_throtl_bio_endio() and bio_unint() are only called on the last bio in a chain. That seems wrong. I'd be more comfortable if the patch moved the bio_chain_endio() handling to the end, after all of that. So the function would end. if (bio->bi_end_io == bio_chain_endio) { bio = __bio_chain_endio(bio); goto again; } else if (bio->bi_end_io) bio->bi_end_io(bio); Jens: can you see any reason why that functions must only be called on the last bio in the chain? Thanks, NeilBrown > > Fixes: fbbaf700e7b1 ("block: trace completion of all bios.") > Reviewed-by: Wade Liang > Reviewed-by: BingJing Chang > Signed-off-by: Edward Hsieh > --- > block/bio.c | 13 ++++++------- > 1 file changed, 6 insertions(+), 7 deletions(-) > > diff --git a/block/bio.c b/block/bio.c > index a1c4d29..2ff72cb 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -1397,8 +1397,7 @@ static inline bool bio_remaining_done(struct bio *bio) > * > * bio_endio() can be called several times on a bio that has been chained > * using bio_chain(). The ->bi_end_io() function will only be called the > - * last time. At this point the BLK_TA_COMPLETE tracing event will be > - * generated if BIO_TRACE_COMPLETION is set. > + * last time. > **/ > void bio_endio(struct bio *bio) > { > @@ -1411,6 +1410,11 @@ void bio_endio(struct bio *bio) > if (bio->bi_bdev) > rq_qos_done_bio(bio->bi_bdev->bd_disk->queue, bio); > > + if (bio->bi_bdev && bio_flagged(bio, BIO_TRACE_COMPLETION)) { > + trace_block_bio_complete(bio->bi_bdev->bd_disk->queue, bio); > + bio_clear_flag(bio, BIO_TRACE_COMPLETION); > + } > + > /* > * Need to have a real endio function for chained bios, otherwise > * various corner cases will break (like stacking block devices that > @@ -1424,11 +1428,6 @@ void bio_endio(struct bio *bio) > goto again; > } > > - if (bio->bi_bdev && bio_flagged(bio, BIO_TRACE_COMPLETION)) { > - trace_block_bio_complete(bio->bi_bdev->bd_disk->queue, bio); > - bio_clear_flag(bio, BIO_TRACE_COMPLETION); > - } > - > blk_throtl_bio_endio(bio); > /* release cgroup info */ > bio_uninit(bio); > -- > 2.7.4