* FAILED: patch "[PATCH] dm: fix redundant IO accounting for bios that need splitting" failed to apply to 4.20-stable tree
From: gregkh @ 2019-01-28 12:50 UTC
  To: snitzer, bgurney, ming.lei; +Cc: stable


The patch below does not apply to the 4.20-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From a1e1cb72d96491277ede8d257ce6b48a381dd336 Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer@redhat.com>
Date: Thu, 17 Jan 2019 10:48:01 -0500
Subject: [PATCH] dm: fix redundant IO accounting for bios that need splitting

The risk of redundant IO accounting was not taken into consideration
when commit 18a25da84354 ("dm: ensure bio submission follows a
depth-first tree walk") introduced IO splitting in terms of recursion
via generic_make_request().

Fix this by subtracting the split bio's payload from the IO stats that
were already accounted for by start_io_acct() upon dm_make_request()
entry.  This repeat oscillation of the IO accounting, up then down,
isn't ideal but refactoring DM core's IO splitting to pre-split bios
_before_ they are accounted turned out to be an excessive amount of
change that will need a full development cycle to refine and verify.

Before this fix:

  /dev/mapper/stripe_dev is a 4-way stripe using a 32k chunksize, so
  bios are split on 32k boundaries.

  # fio --name=16M --filename=/dev/mapper/stripe_dev --rw=write --bs=64k --size=16M \
    	--iodepth=1 --ioengine=libaio --direct=1 --refill_buffers

  with debugging added:
  [103898.310264] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=0 len=128
  [103898.318704] device-mapper: core: __split_and_process_bio: recursing for following split bio:
  [103898.329136] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=64 len=64
  ...

  16M written yet 136M (278528 * 512b) accounted:
  # cat /sys/block/dm-2/stat | awk '{ print $7 }'
  278528

After this fix:

  16M written and 16M (32768 * 512b) accounted:
  # cat /sys/block/dm-2/stat | awk '{ print $7 }'
  32768

Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
Cc: stable@vger.kernel.org # 4.16+
Reported-by: Bryan Gurney <bgurney@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index fcb97b0a5743..fbadda68e23b 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1584,6 +1584,9 @@ static void init_clone_info(struct clone_info *ci, struct mapped_device *md,
 	ci->sector = bio->bi_iter.bi_sector;
 }
 
+#define __dm_part_stat_sub(part, field, subnd)	\
+	(part_stat_get(part, field) -= (subnd))
+
 /*
  * Entry point to split a bio into clones and submit them to the targets.
  */
@@ -1638,6 +1641,19 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
 				struct bio *b = bio_split(bio, bio_sectors(bio) - ci.sector_count,
 							  GFP_NOIO, &md->queue->bio_split);
 				ci.io->orig_bio = b;
+
+				/*
+				 * Adjust IO stats for each split, otherwise upon queue
+				 * reentry there will be redundant IO accounting.
+				 * NOTE: this is a stop-gap fix, a proper fix involves
+				 * significant refactoring of DM core's bio splitting
+				 * (by eliminating DM's splitting and just using bio_split)
+				 */
+				part_stat_lock();
+				__dm_part_stat_sub(&dm_disk(md)->part0,
+						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
+				part_stat_unlock();
+
 				bio_chain(b, bio);
 				ret = generic_make_request(bio);
 				break;
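
The up-then-down accounting described in the commit message can be sketched as a small userspace model. This is only an illustration under stated assumptions: `toy_stats`, `toy_submit`, `toy_run` and `sectors_to_mib` are hypothetical names, the recursion through generic_make_request() is modelled as a loop, and nothing here is kernel code. It uses the numbers from the example above (64k bios, i.e. 128 sectors, split on 32k, i.e. 64-sector, boundaries). The model does not attempt to reproduce the 278528 figure reported before the fix; it only shows that re-accounting the remainder inflates the counter and that subtracting the remainder at split time brings the total back to the 32768 sectors actually written.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-device "sectors written" counter. */
struct toy_stats {
	unsigned long sectors;
};

/*
 * Model the DM entry point for one bio of `len` sectors starting at
 * `sector`. The target accepts I/O only up to the next `boundary`-aligned
 * sector; whatever is left over re-enters the entry point (a loop here,
 * recursion via generic_make_request() in the real code).
 */
static void toy_submit(struct toy_stats *st, unsigned long sector,
		       unsigned long len, unsigned long boundary,
		       bool apply_fix)
{
	assert(boundary > 0);

	while (len > 0) {
		/* Accounting happens on entry, for the whole remaining bio. */
		st->sectors += len;

		unsigned long room = boundary - (sector % boundary);
		if (len <= room)
			break;	/* fits in the current chunk, no split */

		unsigned long remainder = len - room;

		/*
		 * The remainder will be accounted again on re-entry, so the
		 * fix subtracts it here (ci.sector_count in the patch).
		 */
		if (apply_fix)
			st->sectors -= remainder;

		sector += room;
		len = remainder;
	}
}

/* Submit `nr_bios` back-to-back bios and return the accounted sectors. */
static unsigned long toy_run(unsigned long nr_bios, unsigned long bio_len,
			     unsigned long boundary, bool apply_fix)
{
	struct toy_stats st = { 0 };

	for (unsigned long i = 0; i < nr_bios; i++)
		toy_submit(&st, i * bio_len, bio_len, boundary, apply_fix);
	return st.sectors;
}

/* Convert a count of 512-byte sectors to whole MiB. */
static unsigned long sectors_to_mib(unsigned long sectors)
{
	return sectors * 512UL / (1024UL * 1024UL);
}
```

With 256 bios of 128 sectors (16M in total) and a 64-sector boundary, toy_run(256, 128, 64, true) returns 32768, which sectors_to_mib() converts to 16, while toy_run(256, 128, 64, false) returns 49152. The same helper confirms the units quoted in the commit message: sectors_to_mib(278528) is 136.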



* Re: FAILED: patch "[PATCH] dm: fix redundant IO accounting for bios that need splitting" failed to apply to 4.20-stable tree
From: Mike Snitzer @ 2019-01-28 15:31 UTC
  To: gregkh; +Cc: bgurney, ming.lei, stable

On Mon, Jan 28 2019 at  7:50am -0500,
gregkh@linuxfoundation.org <gregkh@linuxfoundation.org> wrote:

> 
> The patch below does not apply to the 4.20-stable tree.

Seems to apply fine.. not sure what the problem is on your end:

$ git checkout stable/linux-4.20.y
Previous HEAD position was 8fe28cb58bcb... Linux 4.20
HEAD is now at 9f1a389a0b5b... Linux 4.20.5

$ git show a1e1cb72d96491277ede8d257ce6b48a381dd336 | patch -p1 --dry
patching file drivers/md/dm.c
Hunk #1 succeeded at 1578 (offset -6 lines).
Hunk #2 succeeded at 1626 (offset -15 lines).

$ git cherry-pick a1e1cb72d96491277ede8d257ce6b48a381dd336
[detached HEAD 3d6015ea633a] dm: fix redundant IO accounting for bios that need splitting
 Date: Thu Jan 17 10:48:01 2019 -0500
 1 file changed, 16 insertions(+)


* Re: FAILED: patch "[PATCH] dm: fix redundant IO accounting for bios that need splitting" failed to apply to 4.20-stable tree
From: Greg KH @ 2019-01-28 16:00 UTC
  To: Mike Snitzer; +Cc: bgurney, ming.lei, stable

On Mon, Jan 28, 2019 at 10:31:41AM -0500, Mike Snitzer wrote:
> Seems to apply fine.. not sure what the problem is on your end:
> 
> $ git checkout stable/linux-4.20.y
> Previous HEAD position was 8fe28cb58bcb... Linux 4.20
> HEAD is now at 9f1a389a0b5b... Linux 4.20.5
> 
> $ git show a1e1cb72d96491277ede8d257ce6b48a381dd336 | patch -p1 --dry
> patching file drivers/md/dm.c
> Hunk #1 succeeded at 1578 (offset -6 lines).
> Hunk #2 succeeded at 1626 (offset -15 lines).
> 
> $ git cherry-pick a1e1cb72d96491277ede8d257ce6b48a381dd336
> [detached HEAD 3d6015ea633a] dm: fix redundant IO accounting for bios that need splitting
>  Date: Thu Jan 17 10:48:01 2019 -0500
>  1 file changed, 16 insertions(+)

Try building it, it blows up into tiny pieces :)

I guess I need a different script that says, "the patch applied, but
broke the build", but it is so rare it's almost not worth it...

thanks,

greg k-h


* Re: FAILED: patch "[PATCH] dm: fix redundant IO accounting for bios that need splitting" failed to apply to 4.20-stable tree
From: Mike Snitzer @ 2019-01-28 17:18 UTC
  To: Greg KH; +Cc: bgurney, ming.lei, stable, axboe

On Mon, Jan 28 2019 at 11:00am -0500,
Greg KH <gregkh@linuxfoundation.org> wrote:

> Try building it, it blows up into tiny pieces :)
> 
> I guess I need a different script that says, "the patch applied, but
> broke the build", but it is so rare it's almost not worth it...

Ah, gotcha.  Because of part_stat_get() it implicitly depends on commit
1226b8dd0e913 but that shouldn't go to stable.

stable@ would need to factor out __part_stat_sub(), like
__part_stat_add(), and update part_stat_sub() to use __part_stat_sub().

As is, existing part_stat_sub() is broken on all kernels (and with no
callers nobody cares).

Now that I've woken the dragon (Jens) and told him I papered over block
core's broken part_stat_sub() in DM.. I'll do whatever Jens wants me to
do ;)

Mike

p.s. since this bug has existed for 1.5 years maybe nobody cares that
DM's io stats are completely bogus and we can just ignore it?  Yeah,
that is a cop-out...
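
The layering suggested above (a low-level subtract helper mirroring the add helper, with the public subtract built on it) can be sketched in userspace roughly as follows. Everything here is a hypothetical stand-in: `toy_part`, the `toy_stat_*` macros and `toy_demo` are invented for illustration, the `_raw` helpers play the role that __part_stat_add() and __part_stat_sub() would play, and the wrapper's extra step of also updating a parent counter is an assumption made to give the wrapper something to do. It is not the block layer's implementation and says nothing about why the existing part_stat_sub() is broken.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_READ, TOY_WRITE, TOY_NR_GROUPS };

/* Toy partition with per-group sector counters (not a kernel structure). */
struct toy_part {
	unsigned long sectors[TOY_NR_GROUPS];
	struct toy_part *whole;	/* NULL when this already is the whole disk */
};

/* Low-level helpers: touch exactly one counter on exactly one object. */
#define toy_stat_add_raw(part, field, addnd)	((part)->field += (addnd))
#define toy_stat_sub_raw(part, field, subnd)	((part)->field -= (subnd))

/* Public helpers: built on the raw ones, also update the parent if any. */
#define toy_stat_add(part, field, addnd) do {				\
	toy_stat_add_raw((part), field, (addnd));			\
	if ((part)->whole)						\
		toy_stat_add_raw((part)->whole, field, (addnd));	\
} while (0)

/*
 * Subtraction gets its own raw helper instead of being expressed as an
 * addition of a negated value.
 */
#define toy_stat_sub(part, field, subnd) do {				\
	toy_stat_sub_raw((part), field, (subnd));			\
	if ((part)->whole)						\
		toy_stat_sub_raw((part)->whole, field, (subnd));	\
} while (0)

/* Account 128 written sectors on a partition, then take 64 back. */
static unsigned long toy_demo(void)
{
	struct toy_part disk = { { 0 }, NULL };
	struct toy_part p1 = { { 0 }, &disk };

	toy_stat_add(&p1, sectors[TOY_WRITE], 128);
	toy_stat_sub(&p1, sectors[TOY_WRITE], 64);

	assert(p1.sectors[TOY_WRITE] == disk.sectors[TOY_WRITE]);
	return disk.sectors[TOY_WRITE];
}
```

In this sketch toy_demo() returns 64: the partition and its parent both end up with the net count, and the subtract path never has to negate its argument.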
