linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] block: remove artifical max_hw_sectors cap
@ 2014-09-06 23:08 Christoph Hellwig
  2014-09-23  5:59 ` Christoph Hellwig
  2014-10-01 13:08 ` Christoph Hellwig
  0 siblings, 2 replies; 7+ messages in thread
From: Christoph Hellwig @ 2014-09-06 23:08 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-scsi

Set max_sectors to the value the drivers provides as hardware limit by
default.  Linux had proper I/O throttling for a long time and doesn't
rely on a artifically small maximum I/O size anymore.  By not limiting
the I/O size by default we remove an annoying tuning step required for
most Linux installation.

Note that both the user, and if absolutely required the driver can still
impose a limit for FS requests below max_hw_sectors_kb.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-settings.c       | 4 +---
 drivers/block/aoe/aoeblk.c | 2 +-
 include/linux/blkdev.h     | 1 -
 3 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index f1a1795..f52c223 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -257,9 +257,7 @@ void blk_limits_max_hw_sectors(struct queue_limits *limits, unsigned int max_hw_
 		       __func__, max_hw_sectors);
 	}
 
-	limits->max_hw_sectors = max_hw_sectors;
-	limits->max_sectors = min_t(unsigned int, max_hw_sectors,
-				    BLK_DEF_MAX_SECTORS);
+	limits->max_sectors = limits->max_hw_sectors = max_hw_sectors;
 }
 EXPORT_SYMBOL(blk_limits_max_hw_sectors);
 
diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index dd73e1f..46c282f 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -395,7 +395,7 @@ aoeblk_gdalloc(void *vp)
 	WARN_ON(d->flags & DEVFL_TKILL);
 	WARN_ON(d->gd);
 	WARN_ON(d->flags & DEVFL_UP);
-	blk_queue_max_hw_sectors(q, BLK_DEF_MAX_SECTORS);
+	blk_queue_max_hw_sectors(q, 1024);
 	q->backing_dev_info.name = "aoe";
 	q->backing_dev_info.ra_pages = READ_AHEAD / PAGE_CACHE_SIZE;
 	d->bufpool = mp;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 518b465..7e3c172 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1192,7 +1192,6 @@ extern int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm);
 enum blk_default_limits {
 	BLK_MAX_SEGMENTS	= 128,
 	BLK_SAFE_MAX_SECTORS	= 255,
-	BLK_DEF_MAX_SECTORS	= 1024,
 	BLK_MAX_SEGMENT_SIZE	= 65536,
 	BLK_SEG_BOUNDARY_MASK	= 0xFFFFFFFFUL,
 };
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] block: remove artifical max_hw_sectors cap
  2014-09-06 23:08 [PATCH] block: remove artifical max_hw_sectors cap Christoph Hellwig
@ 2014-09-23  5:59 ` Christoph Hellwig
  2014-10-01 13:08 ` Christoph Hellwig
  1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2014-09-23  5:59 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-kernel, linux-scsi

ping?

On Sat, Sep 06, 2014 at 04:08:05PM -0700, Christoph Hellwig wrote:
> Set max_sectors to the value the drivers provides as hardware limit by
> default.  Linux had proper I/O throttling for a long time and doesn't
> rely on a artifically small maximum I/O size anymore.  By not limiting
> the I/O size by default we remove an annoying tuning step required for
> most Linux installation.
> 
> Note that both the user, and if absolutely required the driver can still
> impose a limit for FS requests below max_hw_sectors_kb.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/blk-settings.c       | 4 +---
>  drivers/block/aoe/aoeblk.c | 2 +-
>  include/linux/blkdev.h     | 1 -
>  3 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index f1a1795..f52c223 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -257,9 +257,7 @@ void blk_limits_max_hw_sectors(struct queue_limits *limits, unsigned int max_hw_
>  		       __func__, max_hw_sectors);
>  	}
>  
> -	limits->max_hw_sectors = max_hw_sectors;
> -	limits->max_sectors = min_t(unsigned int, max_hw_sectors,
> -				    BLK_DEF_MAX_SECTORS);
> +	limits->max_sectors = limits->max_hw_sectors = max_hw_sectors;
>  }
>  EXPORT_SYMBOL(blk_limits_max_hw_sectors);
>  
> diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
> index dd73e1f..46c282f 100644
> --- a/drivers/block/aoe/aoeblk.c
> +++ b/drivers/block/aoe/aoeblk.c
> @@ -395,7 +395,7 @@ aoeblk_gdalloc(void *vp)
>  	WARN_ON(d->flags & DEVFL_TKILL);
>  	WARN_ON(d->gd);
>  	WARN_ON(d->flags & DEVFL_UP);
> -	blk_queue_max_hw_sectors(q, BLK_DEF_MAX_SECTORS);
> +	blk_queue_max_hw_sectors(q, 1024);
>  	q->backing_dev_info.name = "aoe";
>  	q->backing_dev_info.ra_pages = READ_AHEAD / PAGE_CACHE_SIZE;
>  	d->bufpool = mp;
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 518b465..7e3c172 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1192,7 +1192,6 @@ extern int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm);
>  enum blk_default_limits {
>  	BLK_MAX_SEGMENTS	= 128,
>  	BLK_SAFE_MAX_SECTORS	= 255,
> -	BLK_DEF_MAX_SECTORS	= 1024,
>  	BLK_MAX_SEGMENT_SIZE	= 65536,
>  	BLK_SEG_BOUNDARY_MASK	= 0xFFFFFFFFUL,
>  };
> -- 
> 1.9.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
---end quoted text---

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] block: remove artifical max_hw_sectors cap
  2014-09-06 23:08 [PATCH] block: remove artifical max_hw_sectors cap Christoph Hellwig
  2014-09-23  5:59 ` Christoph Hellwig
@ 2014-10-01 13:08 ` Christoph Hellwig
  2014-10-01 18:59   ` Elliott, Robert (Server Storage)
  1 sibling, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2014-10-01 13:08 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel, linux-scsi, Wu Fengguang

As we still haven't made any progress on this let me explain why
the limit does not make sense:  It only applies to _FS request,
which basically have three use cases:

 - metadata I/O.  Generally small enough that the limit does not
   matter at all.
 - buffered reads/writes.  We already have a self-tuning algorithm
   that limits writeback size, and a readahead size tunable that
   caps read sizes.  Imposing another confusing limit that does
   not interact with the visible tunables here is not helpful
 - direct I/O.  Users should get something resembling their request
   as closely as possible on the write, and this is where our
   stupid limitation causes the most problems.

On Sat, Sep 06, 2014 at 04:08:05PM -0700, Christoph Hellwig wrote:
> Set max_sectors to the value the drivers provides as hardware limit by
> default.  Linux had proper I/O throttling for a long time and doesn't
> rely on a artifically small maximum I/O size anymore.  By not limiting
> the I/O size by default we remove an annoying tuning step required for
> most Linux installation.
> 
> Note that both the user, and if absolutely required the driver can still
> impose a limit for FS requests below max_hw_sectors_kb.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/blk-settings.c       | 4 +---
>  drivers/block/aoe/aoeblk.c | 2 +-
>  include/linux/blkdev.h     | 1 -
>  3 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index f1a1795..f52c223 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -257,9 +257,7 @@ void blk_limits_max_hw_sectors(struct queue_limits *limits, unsigned int max_hw_
>  		       __func__, max_hw_sectors);
>  	}
>  
> -	limits->max_hw_sectors = max_hw_sectors;
> -	limits->max_sectors = min_t(unsigned int, max_hw_sectors,
> -				    BLK_DEF_MAX_SECTORS);
> +	limits->max_sectors = limits->max_hw_sectors = max_hw_sectors;
>  }
>  EXPORT_SYMBOL(blk_limits_max_hw_sectors);
>  
> diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
> index dd73e1f..46c282f 100644
> --- a/drivers/block/aoe/aoeblk.c
> +++ b/drivers/block/aoe/aoeblk.c
> @@ -395,7 +395,7 @@ aoeblk_gdalloc(void *vp)
>  	WARN_ON(d->flags & DEVFL_TKILL);
>  	WARN_ON(d->gd);
>  	WARN_ON(d->flags & DEVFL_UP);
> -	blk_queue_max_hw_sectors(q, BLK_DEF_MAX_SECTORS);
> +	blk_queue_max_hw_sectors(q, 1024);
>  	q->backing_dev_info.name = "aoe";
>  	q->backing_dev_info.ra_pages = READ_AHEAD / PAGE_CACHE_SIZE;
>  	d->bufpool = mp;
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 518b465..7e3c172 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1192,7 +1192,6 @@ extern int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm);
>  enum blk_default_limits {
>  	BLK_MAX_SEGMENTS	= 128,
>  	BLK_SAFE_MAX_SECTORS	= 255,
> -	BLK_DEF_MAX_SECTORS	= 1024,
>  	BLK_MAX_SEGMENT_SIZE	= 65536,
>  	BLK_SEG_BOUNDARY_MASK	= 0xFFFFFFFFUL,
>  };
> -- 
> 1.9.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
---end quoted text---

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] block: remove artifical max_hw_sectors cap
  2014-10-01 13:08 ` Christoph Hellwig
@ 2014-10-01 18:59   ` Elliott, Robert (Server Storage)
  2014-10-01 21:12     ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Elliott, Robert (Server Storage) @ 2014-10-01 18:59 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, linux-kernel, linux-scsi, Wu Fengguang



> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> owner@vger.kernel.org] On Behalf Of Christoph Hellwig
> Sent: Wednesday, 01 October, 2014 8:08 AM
> To: Jens Axboe; linux-kernel@vger.kernel.org; linux-scsi@vger.kernel.org; Wu
> Fengguang
> Subject: Re: [PATCH] block: remove artifical max_hw_sectors cap
> 
> As we still haven't made any progress on this let me explain why
> the limit does not make sense:  It only applies to _FS request,
> which basically have three use cases:
> 
>  - metadata I/O.  Generally small enough that the limit does not
>    matter at all.
>  - buffered reads/writes.  We already have a self-tuning algorithm
>    that limits writeback size, and a readahead size tunable that
>    caps read sizes.  Imposing another confusing limit that does
>    not interact with the visible tunables here is not helpful
>  - direct I/O.  Users should get something resembling their request
>    as closely as possible on the write, and this is where our
>    stupid limitation causes the most problems.

One supporting example: A low limit interferes with creation of
full stripe writes for RAID controllers.



> On Sat, Sep 06, 2014 at 04:08:05PM -0700, Christoph Hellwig wrote:
> > Set max_sectors to the value the drivers provides as hardware limit by
> > default.  Linux had proper I/O throttling for a long time and doesn't
> > rely on a artifically small maximum I/O size anymore.  By not limiting
> > the I/O size by default we remove an annoying tuning step required for
> > most Linux installation.
> >
> > Note that both the user, and if absolutely required the driver can still
> > impose a limit for FS requests below max_hw_sectors_kb.
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  block/blk-settings.c       | 4 +---
> >  drivers/block/aoe/aoeblk.c | 2 +-
> >  include/linux/blkdev.h     | 1 -
> >  3 files changed, 2 insertions(+), 5 deletions(-)
> >
> > diff --git a/block/blk-settings.c b/block/blk-settings.c
> > index f1a1795..f52c223 100644
> > --- a/block/blk-settings.c
> > +++ b/block/blk-settings.c
> > @@ -257,9 +257,7 @@ void blk_limits_max_hw_sectors(struct queue_limits
> *limits, unsigned int max_hw_
> >  		       __func__, max_hw_sectors);
> >  	}
> >
> > -	limits->max_hw_sectors = max_hw_sectors;
> > -	limits->max_sectors = min_t(unsigned int, max_hw_sectors,
> > -				    BLK_DEF_MAX_SECTORS);
> > +	limits->max_sectors = limits->max_hw_sectors = max_hw_sectors;
> >  }
> >  EXPORT_SYMBOL(blk_limits_max_hw_sectors);

1. Documentation/block/biodoc.txt needs some updates:

        blk_queue_max_sectors(q, max_sectors)
                Sets two variables that limit the size of the request.

                - The request queue's max_sectors, which is a soft size in
                units of 512 byte sectors, and could be dynamically varied
                by the core kernel.

                - The request queue's max_hw_sectors, which is a hard limit
                and reflects the maximum size request a driver can handle
                in units of 512 byte sectors.

                The default for both max_sectors and max_hw_sectors is
                255. The upper limit of max_sectors is 1024.

There is no function with that name (it is now called
blk_queue_max_hw_sectors), the upper limit of max_sectors
is max_hw_sectors, and the default is misleading (255
is the default if the LLD doesn't provide max_hw_sectors).

2. Testing with hpsa and mpt3sas, this patch works as expected
for this setting.  I/O sizes are still limited by max_segments, 
which is expected.  Something else is still limiting I/O sizes
to 1 MiB, though; probably bio_get_nr_vecs enforcing a maximum
size per bio of BIO_MAX_PAGES 256 (which is 1 MiB with 4 KiB
pages).


Otherwise,
Reviewed-by: Robert Elliott <elliott@hp.com>
Tested-by: Robert Elliott <elliott@hp.com>

---
Rob Elliott    HP Server Storage




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] block: remove artifical max_hw_sectors cap
  2014-10-01 18:59   ` Elliott, Robert (Server Storage)
@ 2014-10-01 21:12     ` Christoph Hellwig
  2014-10-17 13:12       ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2014-10-01 21:12 UTC (permalink / raw)
  To: Elliott, Robert (Server Storage)
  Cc: Christoph Hellwig, Jens Axboe, linux-kernel, linux-scsi, Wu Fengguang

On Wed, Oct 01, 2014 at 06:59:34PM +0000, Elliott, Robert (Server Storage) wrote:
> One supporting example: A low limit interferes with creation of
> full stripe writes for RAID controllers.

Yes, both Linux mdraid and parity raids are often crippled by the
default.  The mdadm default of 512kb for each chunk combined with
the default limit very much guarantees a horrible out of the box
experience.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] block: remove artifical max_hw_sectors cap
  2014-10-01 21:12     ` Christoph Hellwig
@ 2014-10-17 13:12       ` Christoph Hellwig
  2014-10-21 20:02         ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2014-10-17 13:12 UTC (permalink / raw)
  To: Elliott, Robert (Server Storage)
  Cc: Christoph Hellwig, Jens Axboe, linux-kernel, linux-scsi, Wu Fengguang

On Wed, Oct 01, 2014 at 02:12:58PM -0700, Christoph Hellwig wrote:
> On Wed, Oct 01, 2014 at 06:59:34PM +0000, Elliott, Robert (Server Storage) wrote:
> > One supporting example: A low limit interferes with creation of
> > full stripe writes for RAID controllers.
> 
> Yes, both Linux mdraid and parity raids are often crippled by the
> default.  The mdadm default of 512kb for each chunk combined with
> the default limit very much guarantees a horrible out of the box
> experience.

Jens, can you apply this patch once the 3.19 queue opens?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] block: remove artifical max_hw_sectors cap
  2014-10-17 13:12       ` Christoph Hellwig
@ 2014-10-21 20:02         ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2014-10-21 20:02 UTC (permalink / raw)
  To: Christoph Hellwig, Elliott, Robert (Server Storage)
  Cc: linux-kernel, linux-scsi, Wu Fengguang

On 10/17/2014 07:12 AM, Christoph Hellwig wrote:
> On Wed, Oct 01, 2014 at 02:12:58PM -0700, Christoph Hellwig wrote:
>> On Wed, Oct 01, 2014 at 06:59:34PM +0000, Elliott, Robert (Server Storage) wrote:
>>> One supporting example: A low limit interferes with creation of
>>> full stripe writes for RAID controllers.
>>
>> Yes, both Linux mdraid and parity raids are often crippled by the
>> default.  The mdadm default of 512kb for each chunk combined with
>> the default limit very much guarantees a horrible out of the box
>> experience.
> 
> Jens, can you apply this patch once the 3.19 queue opens?

Yup, lets get it applied.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-10-21 20:02 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-06 23:08 [PATCH] block: remove artifical max_hw_sectors cap Christoph Hellwig
2014-09-23  5:59 ` Christoph Hellwig
2014-10-01 13:08 ` Christoph Hellwig
2014-10-01 18:59   ` Elliott, Robert (Server Storage)
2014-10-01 21:12     ` Christoph Hellwig
2014-10-17 13:12       ` Christoph Hellwig
2014-10-21 20:02         ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).