All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, dm-devel@redhat.com,
	Damien Le Moal <damien.lemoal@opensource.wdc.com>
Subject: Re: [PATCH 5/8] dm: always setup ->orig_bio in alloc_io
Date: Fri, 15 Apr 2022 17:06:55 -0400	[thread overview]
Message-ID: <Ylneb0NWsCab0HqI@redhat.com> (raw)
In-Reply-To: <Yli48LmLi7dEngLn@T590>

On Thu, Apr 14 2022 at  8:14P -0400,
Ming Lei <ming.lei@redhat.com> wrote:

> On Thu, Apr 14, 2022 at 01:45:33PM -0400, Mike Snitzer wrote:
> > On Wed, Apr 13 2022 at 11:57P -0400,
> > Ming Lei <ming.lei@redhat.com> wrote:
> > 
> > > On Wed, Apr 13, 2022 at 10:25:45PM -0400, Mike Snitzer wrote:
> > > > On Wed, Apr 13 2022 at  8:36P -0400,
> > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > 
> > > > > On Wed, Apr 13, 2022 at 01:58:54PM -0400, Mike Snitzer wrote:
> > > > > > 
> > > > > > The bigger issue with this patch is that you've caused
> > > > > > dm_submit_bio_remap() to go back to accounting the entire original bio
> > > > > > before any split occurs.  That is a problem because you'll end up
> > > > > > accounting that bio for every split, so in split heavy workloads the
> > > > > > IO accounting won't reflect when the IO is actually issued and we'll
> > > > > > regress back to having very inaccurate and incorrect IO accounting for
> > > > > > dm_submit_bio_remap() heavy targets (e.g. dm-crypt).
> > > > > 
> > > > > Good catch, but we know the length of mapped part in original bio before
> > > > > calling __map_bio(), so io->sectors/io->offset_sector can be setup here,
> > > > > something like the following delta change should address it:
> > > > > 
> > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > index db23efd6bbf6..06b554f3104b 100644
> > > > > --- a/drivers/md/dm.c
> > > > > +++ b/drivers/md/dm.c
> > > > > @@ -1558,6 +1558,13 @@ static int __split_and_process_bio(struct clone_info *ci)
> > > > >  
> > > > >  	len = min_t(sector_t, max_io_len(ti, ci->sector), ci->sector_count);
> > > > >  	clone = alloc_tio(ci, ti, 0, &len, GFP_NOIO);
> > > > > +
> > > > > +	if (ci->sector_count > len) {
> > > > > +		/* setup the mapped part for accounting */
> > > > > +		dm_io_set_flag(ci->io, DM_IO_SPLITTED);
> > > > > +		ci->io->sectors = len;
> > > > > +		ci->io->sector_offset = bio_end_sector(ci->bio) - ci->sector;
> > > > > +	}
> > > > >  	__map_bio(clone);
> > > > >  
> > > > >  	ci->sector += len;
> > > > > @@ -1603,11 +1610,6 @@ static void dm_split_and_process_bio(struct mapped_device *md,
> > > > >  	if (error || !ci.sector_count)
> > > > >  		goto out;
> > > > >  
> > > > > -	/* setup the mapped part for accounting */
> > > > > -	dm_io_set_flag(ci.io, DM_IO_SPLITTED);
> > > > > -	ci.io->sectors = bio_sectors(bio) - ci.sector_count;
> > > > > -	ci.io->sector_offset = bio_end_sector(bio) - bio->bi_iter.bi_sector;
> > > > > -
> > > > >  	bio_trim(bio, ci.io->sectors, ci.sector_count);
> > > > >  	trace_block_split(bio, bio->bi_iter.bi_sector);
> > > > >  	bio_inc_remaining(bio);
> > > > > 
> > > > > -- 
> > > > > Ming
> > > > > 
> > > > 
> > > > Unfortunately we do need splitting after __map_bio() because a dm
> > > > target's ->map can use dm_accept_partial_bio() to further reduce a
> > > > bio's mapped part.
> > > > 
> > > > But I think dm_accept_partial_bio() could be trained to update
> > > > tio->io->sectors?
> > > 
> > > ->orig_bio is just for serving io accounting, but ->orig_bio isn't
> > > passed to dm_accept_partial_bio(), and not gets updated after
> > > dm_accept_partial_bio() is called.
> > > 
> > > If that is one issue, it must be one existed issue in dm io accounting
> > > since ->orig_bio isn't updated when dm_accept_partial_bio() is called.
> > 
> > Recall that ->orig_bio is updated after the bio_split() at the bottom of
> > dm_split_and_process_bio().
> > 
> > That bio_split() is based on ci->sector_count, which is reduced as a
> > side-effect of dm_accept_partial_bio() reducing tio->len_ptr.  It is
> > pretty circuitous so I can absolutely understand why you didn't
> > immediately appreciate the interface.  The block comment above
> > dm_accept_partial_bio() does a pretty comprehensive job of explaining.
> 
> Go it now, thanks for the explanation.
> 
> As you mentioned, it can be addressed in dm_accept_partial_bio()
> by updating ti->io->sectors.

Yes, I rebased your patchset ontop of dm-5.19 and fixed up your
splitting like we discussed.  I'll be rebasing ontop of v5.18-rc3 once
it is released but please have a look at this 'dm-5.19-v2' branch:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-5.19-v2

And this commit in particular:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.19-v2&id=fe5a99da8b0d0518342f5cdb522a06b0f123ca09

Once I've verified with you that it looks OK I'll fold it into your
commit (at the same time I rebase on v5.18-rc3 early next week).

In general this rebase sucked.. but I wanted to layer your changes
ontop of earlier changes I had already staged for 5.19. I've at least
verified that DM's bio polling seems to work like usual and also
verified that cryptsetup's 'make check' passes.

No urgency on you reviewing before early next week.  Have a great
weekend.

Mike


WARNING: multiple messages have this Message-ID (diff)
From: Mike Snitzer <snitzer@redhat.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, dm-devel@redhat.com,
	Damien Le Moal <damien.lemoal@opensource.wdc.com>
Subject: Re: [dm-devel] [PATCH 5/8] dm: always setup ->orig_bio in alloc_io
Date: Fri, 15 Apr 2022 17:06:55 -0400	[thread overview]
Message-ID: <Ylneb0NWsCab0HqI@redhat.com> (raw)
In-Reply-To: <Yli48LmLi7dEngLn@T590>

On Thu, Apr 14 2022 at  8:14P -0400,
Ming Lei <ming.lei@redhat.com> wrote:

> On Thu, Apr 14, 2022 at 01:45:33PM -0400, Mike Snitzer wrote:
> > On Wed, Apr 13 2022 at 11:57P -0400,
> > Ming Lei <ming.lei@redhat.com> wrote:
> > 
> > > On Wed, Apr 13, 2022 at 10:25:45PM -0400, Mike Snitzer wrote:
> > > > On Wed, Apr 13 2022 at  8:36P -0400,
> > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > 
> > > > > On Wed, Apr 13, 2022 at 01:58:54PM -0400, Mike Snitzer wrote:
> > > > > > 
> > > > > > The bigger issue with this patch is that you've caused
> > > > > > dm_submit_bio_remap() to go back to accounting the entire original bio
> > > > > > before any split occurs.  That is a problem because you'll end up
> > > > > > accounting that bio for every split, so in split heavy workloads the
> > > > > > IO accounting won't reflect when the IO is actually issued and we'll
> > > > > > regress back to having very inaccurate and incorrect IO accounting for
> > > > > > dm_submit_bio_remap() heavy targets (e.g. dm-crypt).
> > > > > 
> > > > > Good catch, but we know the length of mapped part in original bio before
> > > > > calling __map_bio(), so io->sectors/io->offset_sector can be setup here,
> > > > > something like the following delta change should address it:
> > > > > 
> > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > index db23efd6bbf6..06b554f3104b 100644
> > > > > --- a/drivers/md/dm.c
> > > > > +++ b/drivers/md/dm.c
> > > > > @@ -1558,6 +1558,13 @@ static int __split_and_process_bio(struct clone_info *ci)
> > > > >  
> > > > >  	len = min_t(sector_t, max_io_len(ti, ci->sector), ci->sector_count);
> > > > >  	clone = alloc_tio(ci, ti, 0, &len, GFP_NOIO);
> > > > > +
> > > > > +	if (ci->sector_count > len) {
> > > > > +		/* setup the mapped part for accounting */
> > > > > +		dm_io_set_flag(ci->io, DM_IO_SPLITTED);
> > > > > +		ci->io->sectors = len;
> > > > > +		ci->io->sector_offset = bio_end_sector(ci->bio) - ci->sector;
> > > > > +	}
> > > > >  	__map_bio(clone);
> > > > >  
> > > > >  	ci->sector += len;
> > > > > @@ -1603,11 +1610,6 @@ static void dm_split_and_process_bio(struct mapped_device *md,
> > > > >  	if (error || !ci.sector_count)
> > > > >  		goto out;
> > > > >  
> > > > > -	/* setup the mapped part for accounting */
> > > > > -	dm_io_set_flag(ci.io, DM_IO_SPLITTED);
> > > > > -	ci.io->sectors = bio_sectors(bio) - ci.sector_count;
> > > > > -	ci.io->sector_offset = bio_end_sector(bio) - bio->bi_iter.bi_sector;
> > > > > -
> > > > >  	bio_trim(bio, ci.io->sectors, ci.sector_count);
> > > > >  	trace_block_split(bio, bio->bi_iter.bi_sector);
> > > > >  	bio_inc_remaining(bio);
> > > > > 
> > > > > -- 
> > > > > Ming
> > > > > 
> > > > 
> > > > Unfortunately we do need splitting after __map_bio() because a dm
> > > > target's ->map can use dm_accept_partial_bio() to further reduce a
> > > > bio's mapped part.
> > > > 
> > > > But I think dm_accept_partial_bio() could be trained to update
> > > > tio->io->sectors?
> > > 
> > > ->orig_bio is just for serving io accounting, but ->orig_bio isn't
> > > passed to dm_accept_partial_bio(), and not gets updated after
> > > dm_accept_partial_bio() is called.
> > > 
> > > If that is one issue, it must be one existed issue in dm io accounting
> > > since ->orig_bio isn't updated when dm_accept_partial_bio() is called.
> > 
> > Recall that ->orig_bio is updated after the bio_split() at the bottom of
> > dm_split_and_process_bio().
> > 
> > That bio_split() is based on ci->sector_count, which is reduced as a
> > side-effect of dm_accept_partial_bio() reducing tio->len_ptr.  It is
> > pretty circuitous so I can absolutely understand why you didn't
> > immediately appreciate the interface.  The block comment above
> > dm_accept_partial_bio() does a pretty comprehensive job of explaining.
> 
> Go it now, thanks for the explanation.
> 
> As you mentioned, it can be addressed in dm_accept_partial_bio()
> by updating ti->io->sectors.

Yes, I rebased your patchset ontop of dm-5.19 and fixed up your
splitting like we discussed.  I'll be rebasing ontop of v5.18-rc3 once
it is released but please have a look at this 'dm-5.19-v2' branch:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-5.19-v2

And this commit in particular:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.19-v2&id=fe5a99da8b0d0518342f5cdb522a06b0f123ca09

Once I've verified with you that it looks OK I'll fold it into your
commit (at the same time I rebase on v5.18-rc3 early next week).

In general this rebase sucked.. but I wanted to layer your changes
ontop of earlier changes I had already staged for 5.19. I've at least
verified that DM's bio polling seems to work like usual and also
verified that cryptsetup's 'make check' passes.

No urgency on you reviewing before early next week.  Have a great
weekend.

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


  reply	other threads:[~2022-04-15 21:07 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-12  8:56 [dm-devel] [PATCH 0/8] dm: io accounting & polling improvement Ming Lei
2022-04-12  8:56 ` Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 1/8] block: replace disk based account with bdev's Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-16  5:57   ` Christoph Hellwig
2022-04-16  5:57     ` [dm-devel] " Christoph Hellwig
2022-04-12  8:56 ` [dm-devel] [PATCH 2/8] dm: don't pass bio to __dm_start_io_acct and dm_end_io_acct Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 3/8] dm: pass 'dm_io' instance to dm_io_acct directly Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12 20:28   ` [dm-devel] " Mike Snitzer
2022-04-12 20:28     ` Mike Snitzer
2022-04-13  1:43     ` Ming Lei
2022-04-13  1:43       ` [dm-devel] " Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 4/8] dm: switch to bdev based io accounting interface Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 5/8] dm: always setup ->orig_bio in alloc_io Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12 20:52   ` [dm-devel] " Mike Snitzer
2022-04-12 20:52     ` Mike Snitzer
2022-04-12 22:38     ` [dm-devel] " Damien Le Moal
2022-04-12 22:38       ` Damien Le Moal
2022-04-12 23:00       ` [dm-devel] " Mike Snitzer
2022-04-12 23:00         ` Mike Snitzer
2022-04-12 23:31         ` [dm-devel] " Damien Le Moal
2022-04-12 23:31           ` Damien Le Moal
2022-04-13  0:00         ` Damien Le Moal
2022-04-13  0:00           ` [dm-devel] " Damien Le Moal
2022-04-13  1:56     ` Ming Lei
2022-04-13  1:56       ` [dm-devel] " Ming Lei
2022-04-13  6:12       ` Mike Snitzer
2022-04-13  6:12         ` [dm-devel] " Mike Snitzer
2022-04-13 12:26         ` Ming Lei
2022-04-13 12:26           ` [dm-devel] " Ming Lei
2022-04-13 17:58           ` Mike Snitzer
2022-04-13 17:58             ` [dm-devel] " Mike Snitzer
2022-04-14  0:36             ` Ming Lei
2022-04-14  0:36               ` [dm-devel] " Ming Lei
2022-04-14  2:25               ` Mike Snitzer
2022-04-14  2:25                 ` [dm-devel] " Mike Snitzer
2022-04-14  3:57                 ` Ming Lei
2022-04-14  3:57                   ` [dm-devel] " Ming Lei
2022-04-14 17:45                   ` Mike Snitzer
2022-04-14 17:45                     ` [dm-devel] " Mike Snitzer
2022-04-15  0:14                     ` Ming Lei
2022-04-15  0:14                       ` [dm-devel] " Ming Lei
2022-04-15 21:06                       ` Mike Snitzer [this message]
2022-04-15 21:06                         ` Mike Snitzer
2022-04-17  2:22                         ` Ming Lei
2022-04-17  2:22                           ` [dm-devel] " Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 6/8] dm: don't grab target io reference in dm_zone_map_bio Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 7/8] dm: improve target io referencing Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12  8:56 ` [dm-devel] [PATCH 8/8] dm: put all polled io into one single list Ming Lei
2022-04-12  8:56   ` Ming Lei
2022-04-12 17:15 ` [PATCH 0/8] dm: io accounting & polling improvement Mike Snitzer
2022-04-12 17:15   ` [dm-devel] " Mike Snitzer
2022-04-16  5:58   ` Christoph Hellwig
2022-04-16  5:58     ` [dm-devel] " Christoph Hellwig
2022-04-17  1:23     ` Mike Snitzer
2022-04-17  1:23       ` [dm-devel] " Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ylneb0NWsCab0HqI@redhat.com \
    --to=snitzer@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=damien.lemoal@opensource.wdc.com \
    --cc=dm-devel@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.