dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Keith Busch <kbusch@kernel.org>
To: Ming Lei <ming.lei@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>,
	Vijayendra Suman <vijayendra.suman@oracle.com>,
	dm-devel@redhat.com, linux-block@vger.kernel.org
Subject: Re: [PATCH 3/3] block: allow 'chunk_sectors' to be non-power-of-2
Date: Sun, 13 Sep 2020 19:43:06 -0700	[thread overview]
Message-ID: <20200914024306.GA3657769@dhcp-10-100-145-180.wdl.wdc.com> (raw)
In-Reply-To: <20200912140630.GC210077@T590>

On Sat, Sep 12, 2020 at 10:06:30PM +0800, Ming Lei wrote:
> On Fri, Sep 11, 2020 at 05:53:38PM -0400, Mike Snitzer wrote:
> > It is possible for a block device to use a non power-of-2 for chunk
> > size which results in a full-stripe size that is also a non
> > power-of-2.
> > 
> > Update blk_queue_chunk_sectors() and blk_max_size_offset() to
> > accommodate drivers that need a non power-of-2 chunk_sectors.
> > 
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  block/blk-settings.c   | 10 ++++------
> >  include/linux/blkdev.h | 12 +++++++++---
> >  2 files changed, 13 insertions(+), 9 deletions(-)
> > 
> > diff --git a/block/blk-settings.c b/block/blk-settings.c
> > index b09642d5f15e..e40a162cc946 100644
> > --- a/block/blk-settings.c
> > +++ b/block/blk-settings.c
> > @@ -172,15 +172,13 @@ EXPORT_SYMBOL(blk_queue_max_hw_sectors);
> >   *
> >   * Description:
> >   *    If a driver doesn't want IOs to cross a given chunk size, it can set
> > - *    this limit and prevent merging across chunks. Note that the chunk size
> > - *    must currently be a power-of-2 in sectors. Also note that the block
> > - *    layer must accept a page worth of data at any offset. So if the
> > - *    crossing of chunks is a hard limitation in the driver, it must still be
> > - *    prepared to split single page bios.
> > + *    this limit and prevent merging across chunks. Note that the block layer
> > + *    must accept a page worth of data at any offset. So if the crossing of
> > + *    chunks is a hard limitation in the driver, it must still be prepared
> > + *    to split single page bios.
> >   **/
> >  void blk_queue_chunk_sectors(struct request_queue *q, unsigned int chunk_sectors)
> >  {
> > -	BUG_ON(!is_power_of_2(chunk_sectors));
> >  	q->limits.chunk_sectors = chunk_sectors;
> >  }
> >  EXPORT_SYMBOL(blk_queue_chunk_sectors);
> > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> > index 453a3d735d66..e72bcce22143 100644
> > --- a/include/linux/blkdev.h
> > +++ b/include/linux/blkdev.h
> > @@ -1059,11 +1059,17 @@ static inline unsigned int blk_queue_get_max_sectors(struct request_queue *q,
> >  static inline unsigned int blk_max_size_offset(struct request_queue *q,
> >  					       sector_t offset)
> >  {
> > -	if (!q->limits.chunk_sectors)
> > +	unsigned int chunk_sectors = q->limits.chunk_sectors;
> > +
> > +	if (!chunk_sectors)
> >  		return q->limits.max_sectors;
> >  
> > -	return min(q->limits.max_sectors, (unsigned int)(q->limits.chunk_sectors -
> > -			(offset & (q->limits.chunk_sectors - 1))));
> > +	if (is_power_of_2(chunk_sectors))
> > +		chunk_sectors -= (offset & (chunk_sectors - 1));
> > +	else
> > +		chunk_sectors -= sector_div(offset, chunk_sectors);
> > +
> > +	return min(q->limits.max_sectors, chunk_sectors);
> >  }
> >  
> >  static inline unsigned int blk_rq_get_max_sectors(struct request *rq,
> > -- 
> > 2.15.0
> > 
> 
> is_power_of_2() is cheap enough for fast path, so looks fine to support
> non-power-of-2 chunk sectors.
> 
> Maybe NVMe PCI can remove the power_of_2() limit too.

I'd need to see the justification for that. The boundary is just a
suggestion in NVMe. The majority of IO never crosses it so the
calculation is usually wasted CPU cycles. Crossing the boundary is going
to have to be very costly on the device side in order to justify the
host side per-IO overhead for a non-power-of-2 split. 

  reply	other threads:[~2020-09-14  2:43 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <529c2394-1b58-b9d8-d462-1f3de1b78ac8@oracle.com>
2020-09-10 14:24 ` Revert "dm: always call blk_queue_split() in dm_process_bio()" Mike Snitzer
2020-09-10 19:29   ` Vijayendra Suman
2020-09-15  1:33     ` Mike Snitzer
2020-09-15 17:03       ` Mike Snitzer
2020-09-16 14:56       ` Vijayendra Suman
2020-09-11 12:20   ` Ming Lei
2020-09-11 16:13     ` Mike Snitzer
2020-09-11 21:53       ` [PATCH 0/3] block: a few chunk_sectors fixes/improvements Mike Snitzer
2020-09-11 21:53         ` [PATCH 1/3] block: fix blk_rq_get_max_sectors() to flow more carefully Mike Snitzer
2020-09-12 13:52           ` Ming Lei
2020-09-14  0:43             ` Damien Le Moal
2020-09-14 14:52               ` Mike Snitzer
2020-09-14 23:28                 ` Damien Le Moal
2020-09-15  2:03               ` Ming Lei
2020-09-15  2:15                 ` Damien Le Moal
2020-09-14 14:49             ` Mike Snitzer
2020-09-15  1:50               ` Ming Lei
2020-09-14  0:46           ` Damien Le Moal
2020-09-14 15:03             ` Mike Snitzer
2020-09-15  1:09               ` Damien Le Moal
2020-09-15  4:21                 ` Damien Le Moal
2020-09-15  8:01                   ` Ming Lei
2020-09-11 21:53         ` [PATCH 2/3] block: use lcm_not_zero() when stacking chunk_sectors Mike Snitzer
2020-09-12 13:58           ` Ming Lei
2020-09-11 21:53         ` [PATCH 3/3] block: allow 'chunk_sectors' to be non-power-of-2 Mike Snitzer
2020-09-12 14:06           ` Ming Lei
2020-09-14  2:43             ` Keith Busch [this message]
2020-09-14  0:55           ` Damien Le Moal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200914024306.GA3657769@dhcp-10-100-145-180.wdl.wdc.com \
    --to=kbusch@kernel.org \
    --cc=dm-devel@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=snitzer@redhat.com \
    --cc=vijayendra.suman@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).