All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, dm-devel@redhat.com,
	Damien Le Moal <damien.lemoal@opensource.wdc.com>
Subject: Re: [PATCH 5/8] dm: always setup ->orig_bio in alloc_io
Date: Thu, 14 Apr 2022 11:57:54 +0800	[thread overview]
Message-ID: <YlebwjTKH2MU9tCD@T590> (raw)
In-Reply-To: <YleGKbZiHeBIJidI@redhat.com>

On Wed, Apr 13, 2022 at 10:25:45PM -0400, Mike Snitzer wrote:
> On Wed, Apr 13 2022 at  8:36P -0400,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > On Wed, Apr 13, 2022 at 01:58:54PM -0400, Mike Snitzer wrote:
> > > 
> > > The bigger issue with this patch is that you've caused
> > > dm_submit_bio_remap() to go back to accounting the entire original bio
> > > before any split occurs.  That is a problem because you'll end up
> > > accounting that bio for every split, so in split heavy workloads the
> > > IO accounting won't reflect when the IO is actually issued and we'll
> > > regress back to having very inaccurate and incorrect IO accounting for
> > > dm_submit_bio_remap() heavy targets (e.g. dm-crypt).
> > 
> > Good catch, but we know the length of mapped part in original bio before
> > calling __map_bio(), so io->sectors/io->offset_sector can be setup here,
> > something like the following delta change should address it:
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index db23efd6bbf6..06b554f3104b 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1558,6 +1558,13 @@ static int __split_and_process_bio(struct clone_info *ci)
> >  
> >  	len = min_t(sector_t, max_io_len(ti, ci->sector), ci->sector_count);
> >  	clone = alloc_tio(ci, ti, 0, &len, GFP_NOIO);
> > +
> > +	if (ci->sector_count > len) {
> > +		/* setup the mapped part for accounting */
> > +		dm_io_set_flag(ci->io, DM_IO_SPLITTED);
> > +		ci->io->sectors = len;
> > +		ci->io->sector_offset = bio_end_sector(ci->bio) - ci->sector;
> > +	}
> >  	__map_bio(clone);
> >  
> >  	ci->sector += len;
> > @@ -1603,11 +1610,6 @@ static void dm_split_and_process_bio(struct mapped_device *md,
> >  	if (error || !ci.sector_count)
> >  		goto out;
> >  
> > -	/* setup the mapped part for accounting */
> > -	dm_io_set_flag(ci.io, DM_IO_SPLITTED);
> > -	ci.io->sectors = bio_sectors(bio) - ci.sector_count;
> > -	ci.io->sector_offset = bio_end_sector(bio) - bio->bi_iter.bi_sector;
> > -
> >  	bio_trim(bio, ci.io->sectors, ci.sector_count);
> >  	trace_block_split(bio, bio->bi_iter.bi_sector);
> >  	bio_inc_remaining(bio);
> > 
> > -- 
> > Ming
> > 
> 
> Unfortunately we do need splitting after __map_bio() because a dm
> target's ->map can use dm_accept_partial_bio() to further reduce a
> bio's mapped part.
> 
> But I think dm_accept_partial_bio() could be trained to update
> tio->io->sectors?

->orig_bio is just for serving io accounting, but ->orig_bio isn't
passed to dm_accept_partial_bio(), and not gets updated after
dm_accept_partial_bio() is called.

If that is one issue, it must be one existed issue in dm io accounting
since ->orig_bio isn't updated when dm_accept_partial_bio() is called.

So do we have to update it?

> 
> dm_accept_partial_bio() has been around for a long time, it keeps
> growing BUG_ONs that are actually helpful to narrow its use to "normal
> IO", so it should be OK.
> 
> Running 'make check' in a built cryptsetup source tree should be a
> good test for DM target interface functionality.

Care to share the test tree?

> 
> But there aren't automated tests for IO accounting correctness yet.

I did verify io accounting by running dm-thin with blk-throttle, and the
observed throughput is same with expected setting. Running both small bs
and large bs, so non-split and split code path are covered.

Maybe you can add this kind of test into dm io accounting automated test.


Thanks,
Ming


WARNING: multiple messages have this Message-ID (diff)
From: Ming Lei <ming.lei@redhat.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, dm-devel@redhat.com,
	Damien Le Moal <damien.lemoal@opensource.wdc.com>
Subject: Re: [dm-devel] [PATCH 5/8] dm: always setup ->orig_bio in alloc_io
Date: Thu, 14 Apr 2022 11:57:54 +0800	[thread overview]
Message-ID: <YlebwjTKH2MU9tCD@T590> (raw)
In-Reply-To: <YleGKbZiHeBIJidI@redhat.com>

On Wed, Apr 13, 2022 at 10:25:45PM -0400, Mike Snitzer wrote:
> On Wed, Apr 13 2022 at  8:36P -0400,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > On Wed, Apr 13, 2022 at 01:58:54PM -0400, Mike Snitzer wrote:
> > > 
> > > The bigger issue with this patch is that you've caused
> > > dm_submit_bio_remap() to go back to accounting the entire original bio
> > > before any split occurs.  That is a problem because you'll end up
> > > accounting that bio for every split, so in split heavy workloads the
> > > IO accounting won't reflect when the IO is actually issued and we'll
> > > regress back to having very inaccurate and incorrect IO accounting for
> > > dm_submit_bio_remap() heavy targets (e.g. dm-crypt).
> > 
> > Good catch, but we know the length of mapped part in original bio before
> > calling __map_bio(), so io->sectors/io->offset_sector can be setup here,
> > something like the following delta change should address it:
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index db23efd6bbf6..06b554f3104b 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1558,6 +1558,13 @@ static int __split_and_process_bio(struct clone_info *ci)
> >  
> >  	len = min_t(sector_t, max_io_len(ti, ci->sector), ci->sector_count);
> >  	clone = alloc_tio(ci, ti, 0, &len, GFP_NOIO);
> > +
> > +	if (ci->sector_count > len) {
> > +		/* setup the mapped part for accounting */
> > +		dm_io_set_flag(ci->io, DM_IO_SPLITTED);
> > +		ci->io->sectors = len;
> > +		ci->io->sector_offset = bio_end_sector(ci->bio) - ci->sector;
> > +	}
> >  	__map_bio(clone);
> >  
> >  	ci->sector += len;
> > @@ -1603,11 +1610,6 @@ static void dm_split_and_process_bio(struct mapped_device *md,
> >  	if (error || !ci.sector_count)
> >  		goto out;
> >  
> > -	/* setup the mapped part for accounting */
> > -	dm_io_set_flag(ci.io, DM_IO_SPLITTED);
> > -	ci.io->sectors = bio_sectors(bio) - ci.sector_count;
> > -	ci.io->sector_offset = bio_end_sector(bio) - bio->bi_iter.bi_sector;
> > -
> >  	bio_trim(bio, ci.io->sectors, ci.sector_count);
> >  	trace_block_split(bio, bio->bi_iter.bi_sector);
> >  	bio_inc_remaining(bio);
> > 
> > -- 
> > Ming
> > 
> 
> Unfortunately we do need splitting after __map_bio() because a dm
> target's ->map can use dm_accept_partial_bio() to further reduce a
> bio's mapped part.
> 
> But I think dm_accept_partial_bio() could be trained to update
> tio->io->sectors?

->orig_bio is just for serving io accounting, but ->orig_bio isn't
passed to dm_accept_partial_bio(), and not gets updated after
dm_accept_partial_bio() is called.

If that is one issue, it must be one existed issue in dm io accounting
since ->orig_bio isn't updated when dm_accept_partial_bio() is called.

So do we have to update it?

> 
> dm_accept_partial_bio() has been around for a long time, it keeps
> growing BUG_ONs that are actually helpful to narrow its use to "normal
> IO", so it should be OK.
> 
> Running 'make check' in a built cryptsetup source tree should be a
> good test for DM target interface functionality.

Care to share the test tree?

> 
> But there aren't automated tests for IO accounting correctness yet.

I did verify io accounting by running dm-thin with blk-throttle, and the
observed throughput is same with expected setting. Running both small bs
and large bs, so non-split and split code path are covered.

Maybe you can add this kind of test into dm io accounting automated test.


Thanks,
Ming
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


  reply	other threads:[~2022-04-14  3:58 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-12  8:56 [dm-devel] [PATCH 0/8] dm: io accounting & polling improvement Ming Lei
2022-04-12  8:56 ` Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 1/8] block: replace disk based account with bdev's Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-16  5:57   ` Christoph Hellwig
2022-04-16  5:57     ` [dm-devel] " Christoph Hellwig
2022-04-12  8:56 ` [dm-devel] [PATCH 2/8] dm: don't pass bio to __dm_start_io_acct and dm_end_io_acct Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 3/8] dm: pass 'dm_io' instance to dm_io_acct directly Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12 20:28   ` [dm-devel] " Mike Snitzer
2022-04-12 20:28     ` Mike Snitzer
2022-04-13  1:43     ` Ming Lei
2022-04-13  1:43       ` [dm-devel] " Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 4/8] dm: switch to bdev based io accounting interface Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 5/8] dm: always setup ->orig_bio in alloc_io Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12 20:52   ` [dm-devel] " Mike Snitzer
2022-04-12 20:52     ` Mike Snitzer
2022-04-12 22:38     ` [dm-devel] " Damien Le Moal
2022-04-12 22:38       ` Damien Le Moal
2022-04-12 23:00       ` [dm-devel] " Mike Snitzer
2022-04-12 23:00         ` Mike Snitzer
2022-04-12 23:31         ` [dm-devel] " Damien Le Moal
2022-04-12 23:31           ` Damien Le Moal
2022-04-13  0:00         ` Damien Le Moal
2022-04-13  0:00           ` [dm-devel] " Damien Le Moal
2022-04-13  1:56     ` Ming Lei
2022-04-13  1:56       ` [dm-devel] " Ming Lei
2022-04-13  6:12       ` Mike Snitzer
2022-04-13  6:12         ` [dm-devel] " Mike Snitzer
2022-04-13 12:26         ` Ming Lei
2022-04-13 12:26           ` [dm-devel] " Ming Lei
2022-04-13 17:58           ` Mike Snitzer
2022-04-13 17:58             ` [dm-devel] " Mike Snitzer
2022-04-14  0:36             ` Ming Lei
2022-04-14  0:36               ` [dm-devel] " Ming Lei
2022-04-14  2:25               ` Mike Snitzer
2022-04-14  2:25                 ` [dm-devel] " Mike Snitzer
2022-04-14  3:57                 ` Ming Lei [this message]
2022-04-14  3:57                   ` Ming Lei
2022-04-14 17:45                   ` Mike Snitzer
2022-04-14 17:45                     ` [dm-devel] " Mike Snitzer
2022-04-15  0:14                     ` Ming Lei
2022-04-15  0:14                       ` [dm-devel] " Ming Lei
2022-04-15 21:06                       ` Mike Snitzer
2022-04-15 21:06                         ` [dm-devel] " Mike Snitzer
2022-04-17  2:22                         ` Ming Lei
2022-04-17  2:22                           ` [dm-devel] " Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 6/8] dm: don't grab target io reference in dm_zone_map_bio Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 7/8] dm: improve target io referencing Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 8/8] dm: put all polled io into one single list Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12 17:15 ` [PATCH 0/8] dm: io accounting & polling improvement Mike Snitzer
2022-04-12 17:15   ` [dm-devel] " Mike Snitzer
2022-04-16  5:58   ` Christoph Hellwig
2022-04-16  5:58     ` [dm-devel] " Christoph Hellwig
2022-04-17  1:23     ` Mike Snitzer
2022-04-17  1:23       ` [dm-devel] " Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YlebwjTKH2MU9tCD@T590 \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=damien.lemoal@opensource.wdc.com \
    --cc=dm-devel@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.