All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: John Dorminy <jdorminy@redhat.com>
Cc: linux-block <linux-block@vger.kernel.org>,
	device-mapper development <dm-devel@redhat.com>,
	Bruce Johnston <bjohnsto@redhat.com>
Subject: Re: block: revert to using min_not_zero() when stacking chunk_sectors
Date: Mon, 30 Nov 2020 18:24:17 -0500	[thread overview]
Message-ID: <20201130232417.GA12865@redhat.com> (raw)
In-Reply-To: <CAMeeMh8fb2JEBmuSuTP8ys6Xr+GpFqcUr5Py73W4wCQb1MCuAw@mail.gmail.com>

On Mon, Nov 30 2020 at  3:51pm -0500,
John Dorminy <jdorminy@redhat.com> wrote:

> I don't think this suffices, as it allows IOs that span max(a,b) chunk
> boundaries.
> 
> Chunk sectors is defined as "if set, it will prevent merging across
> chunk boundaries". Pulling the example from the last change:

If you're going to cherry pick a portion of a commit header please
reference the commit id and use quotes or indentation to make it clear
what is being referenced, etc.

> It is possible, albeit more unlikely, for a block device to have a non
> power-of-2 for chunk_sectors (e.g. 10+2 RAID6 with 128K chunk_sectors,
> which results in a full-stripe size of 1280K. This causes the RAID6's
> io_opt to be advertised as 1280K, and a stacked device _could_ then be
> made to use a blocksize, aka chunk_sectors, that matches non power-of-2
> io_opt of underlying RAID6 -- resulting in stacked device's
> chunk_sectors being a non power-of-2).

This was from the header for commit 07d098e6bba ("block: allow
'chunk_sectors' to be non-power-of-2")

> Suppose the stacked device had a block size/chunk_sectors of 256k.

Quite the tangent just to setup an a toy example of say: thinp with 256K
blocksize/chunk_sectors ontop of a RAID6 with a chunk_sectors of 128K
and stripesize of 1280K.

> Then, with this change, some IOs issued by the stacked device to the
> RAID beneath could span 1280k sector boundaries, and require further
> splitting still.
> I think combining as the GCD is better, since any IO
> of size gcd(a,b) definitely spans neither a a-chunk nor a b-chunk
> boundary.

To be clear, you are _not_ saying using lcm_not_zero() is correct.
You're saying that simply reverting block core back to using
min_not_zero() may not be as good as using gcd().

While that may be true (not sure yet) you've now muddied a conservative
fix (that reverts block core back to its longstanding use of
min_not_zero for chunk_sectors) in pursuit of addressing some different
concern than the case that you _really_ care about getting fixed
(I'm inferring based on your regression report):
4K chunk_sectors stacked on larger chunk_sectors, e.g. 256K

My patch fixes the case and doesn't try to innovate, it tries to get
block core back to sane chunk_sectors stacking (which I broke).

> But it's possible I'm misunderstanding the purpose of chunk_sectors,
> or there should be a check that the one of the two devices' chunk
> sizes divides the other.

Seriously not amused by your response, I now have to do damage control
because you have a concern that you really weren't able to communicate
very effectively.

But I got this far, so for your above toy example (stacking 128K and
256K chunk_sectors):
min_not_zero = 128K
gcd = 128K

SO please explain to me why gcd() is better at setting a chunk_sectors
that ensures IO doesn't span 1280K stripesize (nevermind that
chunk_sectors has no meaningful relation to io_opt to begin with!).

Mike


> 
> On Mon, Nov 30, 2020 at 12:18 PM Mike Snitzer <snitzer@redhat.com> wrote:
> >
> > chunk_sectors must reflect the most limited of all devices in the IO
> > stack.
> >
> > Otherwise malformed IO may result. E.g.: prior to this fix,
> > ->chunk_sectors = lcm_not_zero(8, 128) would result in
> > blk_max_size_offset() splitting IO at 128 sectors rather than the
> > required more restrictive 8 sectors.
> >
> > Fixes: 22ada802ede8 ("block: use lcm_not_zero() when stacking chunk_sectors")
> > Cc: stable@vger.kernel.org
> > Reported-by: John Dorminy <jdorminy@redhat.com>
> > Reported-by: Bruce Johnston <bjohnsto@redhat.com>
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  block/blk-settings.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/blk-settings.c b/block/blk-settings.c
> > index 9741d1d83e98..1d9decd4646e 100644
> > --- a/block/blk-settings.c
> > +++ b/block/blk-settings.c
> > @@ -547,7 +547,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
> >
> >         t->io_min = max(t->io_min, b->io_min);
> >         t->io_opt = lcm_not_zero(t->io_opt, b->io_opt);
> > -       t->chunk_sectors = lcm_not_zero(t->chunk_sectors, b->chunk_sectors);
> > +
> > +       if (b->chunk_sectors)
> > +               t->chunk_sectors = min_not_zero(t->chunk_sectors,
> > +                                               b->chunk_sectors);
> >
> >         /* Physical block size a multiple of the logical block size? */
> >         if (t->physical_block_size & (t->logical_block_size - 1)) {
> > --
> > 2.15.0
> >
> 


WARNING: multiple messages have this Message-ID (diff)
From: Mike Snitzer <snitzer@redhat.com>
To: John Dorminy <jdorminy@redhat.com>
Cc: linux-block <linux-block@vger.kernel.org>,
	device-mapper development <dm-devel@redhat.com>,
	Bruce Johnston <bjohnsto@redhat.com>
Subject: Re: [dm-devel] block: revert to using min_not_zero() when stacking chunk_sectors
Date: Mon, 30 Nov 2020 18:24:17 -0500	[thread overview]
Message-ID: <20201130232417.GA12865@redhat.com> (raw)
In-Reply-To: <CAMeeMh8fb2JEBmuSuTP8ys6Xr+GpFqcUr5Py73W4wCQb1MCuAw@mail.gmail.com>

On Mon, Nov 30 2020 at  3:51pm -0500,
John Dorminy <jdorminy@redhat.com> wrote:

> I don't think this suffices, as it allows IOs that span max(a,b) chunk
> boundaries.
> 
> Chunk sectors is defined as "if set, it will prevent merging across
> chunk boundaries". Pulling the example from the last change:

If you're going to cherry pick a portion of a commit header please
reference the commit id and use quotes or indentation to make it clear
what is being referenced, etc.

> It is possible, albeit more unlikely, for a block device to have a non
> power-of-2 for chunk_sectors (e.g. 10+2 RAID6 with 128K chunk_sectors,
> which results in a full-stripe size of 1280K. This causes the RAID6's
> io_opt to be advertised as 1280K, and a stacked device _could_ then be
> made to use a blocksize, aka chunk_sectors, that matches non power-of-2
> io_opt of underlying RAID6 -- resulting in stacked device's
> chunk_sectors being a non power-of-2).

This was from the header for commit 07d098e6bba ("block: allow
'chunk_sectors' to be non-power-of-2")

> Suppose the stacked device had a block size/chunk_sectors of 256k.

Quite the tangent just to setup an a toy example of say: thinp with 256K
blocksize/chunk_sectors ontop of a RAID6 with a chunk_sectors of 128K
and stripesize of 1280K.

> Then, with this change, some IOs issued by the stacked device to the
> RAID beneath could span 1280k sector boundaries, and require further
> splitting still.
> I think combining as the GCD is better, since any IO
> of size gcd(a,b) definitely spans neither a a-chunk nor a b-chunk
> boundary.

To be clear, you are _not_ saying using lcm_not_zero() is correct.
You're saying that simply reverting block core back to using
min_not_zero() may not be as good as using gcd().

While that may be true (not sure yet) you've now muddied a conservative
fix (that reverts block core back to its longstanding use of
min_not_zero for chunk_sectors) in pursuit of addressing some different
concern than the case that you _really_ care about getting fixed
(I'm inferring based on your regression report):
4K chunk_sectors stacked on larger chunk_sectors, e.g. 256K

My patch fixes the case and doesn't try to innovate, it tries to get
block core back to sane chunk_sectors stacking (which I broke).

> But it's possible I'm misunderstanding the purpose of chunk_sectors,
> or there should be a check that the one of the two devices' chunk
> sizes divides the other.

Seriously not amused by your response, I now have to do damage control
because you have a concern that you really weren't able to communicate
very effectively.

But I got this far, so for your above toy example (stacking 128K and
256K chunk_sectors):
min_not_zero = 128K
gcd = 128K

SO please explain to me why gcd() is better at setting a chunk_sectors
that ensures IO doesn't span 1280K stripesize (nevermind that
chunk_sectors has no meaningful relation to io_opt to begin with!).

Mike


> 
> On Mon, Nov 30, 2020 at 12:18 PM Mike Snitzer <snitzer@redhat.com> wrote:
> >
> > chunk_sectors must reflect the most limited of all devices in the IO
> > stack.
> >
> > Otherwise malformed IO may result. E.g.: prior to this fix,
> > ->chunk_sectors = lcm_not_zero(8, 128) would result in
> > blk_max_size_offset() splitting IO at 128 sectors rather than the
> > required more restrictive 8 sectors.
> >
> > Fixes: 22ada802ede8 ("block: use lcm_not_zero() when stacking chunk_sectors")
> > Cc: stable@vger.kernel.org
> > Reported-by: John Dorminy <jdorminy@redhat.com>
> > Reported-by: Bruce Johnston <bjohnsto@redhat.com>
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  block/blk-settings.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/blk-settings.c b/block/blk-settings.c
> > index 9741d1d83e98..1d9decd4646e 100644
> > --- a/block/blk-settings.c
> > +++ b/block/blk-settings.c
> > @@ -547,7 +547,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
> >
> >         t->io_min = max(t->io_min, b->io_min);
> >         t->io_opt = lcm_not_zero(t->io_opt, b->io_opt);
> > -       t->chunk_sectors = lcm_not_zero(t->chunk_sectors, b->chunk_sectors);
> > +
> > +       if (b->chunk_sectors)
> > +               t->chunk_sectors = min_not_zero(t->chunk_sectors,
> > +                                               b->chunk_sectors);
> >
> >         /* Physical block size a multiple of the logical block size? */
> >         if (t->physical_block_size & (t->logical_block_size - 1)) {
> > --
> > 2.15.0
> >
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


  reply	other threads:[~2020-11-30 23:25 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-30 17:18 [PATCH] block: revert to using min_not_zero() when stacking chunk_sectors Mike Snitzer
2020-11-30 17:18 ` [dm-devel] " Mike Snitzer
2020-11-30 20:51 ` John Dorminy
2020-11-30 20:51   ` [dm-devel] " John Dorminy
2020-11-30 23:24   ` Mike Snitzer [this message]
2020-11-30 23:24     ` [dm-devel] " Mike Snitzer
2020-12-01  0:21     ` John Dorminy
2020-12-01  0:21       ` [dm-devel] " John Dorminy
2020-12-01  2:12       ` Mike Snitzer
2020-12-01  2:12         ` [dm-devel] " Mike Snitzer
2020-12-01 16:07 ` [PATCH v2] block: use gcd() to fix chunk_sectors limit stacking Mike Snitzer
2020-12-01 16:07   ` [dm-devel] " Mike Snitzer
2020-12-01 17:43   ` John Dorminy
2020-12-01 17:43     ` [dm-devel] " John Dorminy
2020-12-01 17:53   ` Jens Axboe
2020-12-01 17:53     ` [dm-devel] " Jens Axboe
2020-12-01 18:02   ` Martin K. Petersen
2020-12-01 18:02     ` [dm-devel] " Martin K. Petersen
2020-12-02  3:38   ` [PATCH] dm: " Jeffle Xu
2020-12-02  3:38     ` [dm-devel] " Jeffle Xu
2020-12-02  3:38     ` Jeffle Xu
2020-12-02  3:38       ` [dm-devel] " Jeffle Xu
2020-12-02  3:57       ` JeffleXu
2020-12-02  3:57         ` [dm-devel] " JeffleXu
2020-12-02  5:03         ` Mike Snitzer
2020-12-02  5:03           ` [dm-devel] " Mike Snitzer
2020-12-02  5:14           ` Mike Snitzer
2020-12-02  5:14             ` [dm-devel] " Mike Snitzer
2020-12-02  6:31             ` JeffleXu
2020-12-02  6:31               ` [dm-devel] " JeffleXu
2020-12-02  6:35               ` JeffleXu
2020-12-02  6:35                 ` [dm-devel] " JeffleXu
2020-12-02  6:28           ` JeffleXu
2020-12-02  6:28             ` [dm-devel] " JeffleXu
2020-12-02  7:10           ` JeffleXu
2020-12-02  7:10             ` [dm-devel] " JeffleXu
2020-12-02 15:11             ` Mike Snitzer
2020-12-02 15:11               ` [dm-devel] " Mike Snitzer
2020-12-03  1:48               ` JeffleXu
2020-12-03  1:48                 ` JeffleXu
2020-12-03  3:26   ` [PATCH v2] block: " Ming Lei
2020-12-03  3:26     ` [dm-devel] " Ming Lei
2020-12-03 14:33     ` Mike Snitzer
2020-12-03 14:33       ` [dm-devel] " Mike Snitzer
2020-12-03 16:27       ` Keith Busch
2020-12-03 16:27         ` [dm-devel] " Keith Busch
2020-12-03 17:56         ` Mike Snitzer
2020-12-03 17:56           ` [dm-devel] " Mike Snitzer
2020-12-04  1:45         ` Ming Lei
2020-12-04  1:45           ` [dm-devel] " Ming Lei
2020-12-04  2:11           ` Mike Snitzer
2020-12-04  2:11             ` [dm-devel] " Mike Snitzer
2020-12-04  6:22             ` Damien Le Moal
2020-12-04  6:22               ` Damien Le Moal
2020-12-04  1:12       ` Ming Lei
2020-12-04  1:12         ` [dm-devel] " Ming Lei
2020-12-04  2:03         ` Mike Snitzer
2020-12-04  2:03           ` [dm-devel] " Mike Snitzer
2020-12-04  3:59           ` Ming Lei
2020-12-04  3:59             ` [dm-devel] " Ming Lei
2020-12-04 16:47             ` Mike Snitzer
2020-12-04 16:47               ` [dm-devel] " Mike Snitzer
2020-12-04 17:32               ` [RFC PATCH] dm: fix IO splitting [was: Re: [PATCH v2] block: use gcd() to fix chunk_sectors limit stacking] Mike Snitzer
2020-12-04 17:32                 ` [dm-devel] " Mike Snitzer
2020-12-04 17:49                 ` Mike Snitzer
2020-12-04 17:49                   ` [dm-devel] " Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201130232417.GA12865@redhat.com \
    --to=snitzer@redhat.com \
    --cc=bjohnsto@redhat.com \
    --cc=dm-devel@redhat.com \
    --cc=jdorminy@redhat.com \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.