linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
       [not found] <d3f7fd44-9287-c7fa-ee95-c3b8a4d56c93@suse.de>
@ 2019-06-22 23:16 ` Eric Wheeler
  2019-06-23  0:41   ` Martin K. Petersen
                     ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Eric Wheeler @ 2019-06-22 23:16 UTC (permalink / raw)
  To: Coly Li
  Cc: linux-block, Eric Wheeler, Eric Wheeler, Jonathan Corbet,
	Kent Overstreet, open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE)

From: Eric Wheeler <git@linux.ewheeler.net>

While some drivers set queue_limits.io_opt (e.g., md raid5), there are
currently no SCSI/RAID controller drivers that do.  Previously stripe_size
and partial_stripes_expensive were read-only values and could not be
tuned by users (eg, for hardware RAID5/6).

This patch enables users to save the optimal IO size via sysfs through
the backing device attributes stripe_size and partial_stripes_expensive
into the bcache superblock.

Superblock changes are backwards-compatable:

*  partial_stripes_expensive: One bit was used in the superblock flags field

*  stripe_size: There are eight 64-bit "pad" fields for future use in
   the superblock which default to 0; from those, 32-bits are now used
   to save the stripe_size and load at device registration time.

Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>
---
 Documentation/admin-guide/bcache.rst | 21 +++++++++++++++++++++
 drivers/md/bcache/super.c            | 15 ++++++++++++++-
 drivers/md/bcache/sysfs.c            | 33 +++++++++++++++++++++++++++++++--
 include/uapi/linux/bcache.h          |  6 ++++--
 4 files changed, 70 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/bcache.rst b/Documentation/admin-guide/bcache.rst
index c0ce64d..ef82022 100644
--- a/Documentation/admin-guide/bcache.rst
+++ b/Documentation/admin-guide/bcache.rst
@@ -420,6 +420,12 @@ dirty_data
 label
   Name of underlying device.
 
+partial_stripes_expensive
+  Flag to bcache that partial or unaligned stripe_size'd
+  writes to the backing device are expensive (e.g., RAID5/6 incur
+  read-copy-write). Writing this sysfs attribute updates the superblock
+  and also takes effect immediately.  See also stripe_size, below.
+
 readahead
   Size of readahead that should be performed.  Defaults to 0.  If set to e.g.
   1M, it will round cache miss reads up to that size, but without overlapping
@@ -458,6 +464,21 @@ stop
   Write to this file to shut down the bcache device and close the backing
   device.
 
+stripe_size
+  The stripe size in bytes of the backing device for optimial
+  write performance (also known as the "stride width"). This is set
+  automatically when using a device driver sets blk_limits_io_opt
+  (e.g., md, rbd, skd, zram, virtio_blk).  No hardware RAID controller
+  sets blk_limits_io_opt as of 2019-06-15, so configure this to suit
+  your needs.  Note that you must unregister and re-register the backing
+  device after making a change to stripe_size.
+
+  Where N is the number of data disks,
+    RAID5: stripe_size = (N-1)*RAID_CHUNK_SIZE.
+    RAID6: stripe_size = (N-2)*RAID_CHUNK_SIZE.
+
+  See also partial_stripes_expensive, above.
+
 writeback_delay
   When dirty data is written to the cache and it previously did not contain
   any, waits some number of seconds before initiating writeback. Defaults to
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 1b63ac8..d0b9501 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -80,6 +80,7 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
 
 	sb->flags		= le64_to_cpu(s->flags);
 	sb->seq			= le64_to_cpu(s->seq);
+	sb->stripe_size		= le32_to_cpu(s->stripe_size);
 	sb->last_mount		= le32_to_cpu(s->last_mount);
 	sb->first_bucket	= le16_to_cpu(s->first_bucket);
 	sb->keys		= le16_to_cpu(s->keys);
@@ -221,6 +222,7 @@ static void __write_super(struct cache_sb *sb, struct bio *bio)
 
 	out->flags		= cpu_to_le64(sb->flags);
 	out->seq		= cpu_to_le64(sb->seq);
+	out->stripe_size	= cpu_to_le32(sb->stripe_size);
 
 	out->last_mount		= cpu_to_le32(sb->last_mount);
 	out->first_bucket	= cpu_to_le16(sb->first_bucket);
@@ -1258,7 +1260,18 @@ static int cached_dev_init(struct cached_dev *dc, unsigned int block_size)
 
 	dc->disk.stripe_size = q->limits.io_opt >> 9;
 
-	if (dc->disk.stripe_size)
+	if (dc->sb.stripe_size) {
+		if (dc->disk.stripe_size &&
+		    dc->disk.stripe_size != dc->sb.stripe_size) {
+			pr_warn("superblock stripe_size (%d) overrides bdev stripe_size (%d)\n",
+				(int)dc->sb.stripe_size,
+				(int)dc->disk.stripe_size);
+		}
+
+		dc->disk.stripe_size = dc->sb.stripe_size;
+		dc->partial_stripes_expensive =
+			(unsigned int)BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb);
+	} else if (dc->disk.stripe_size)
 		dc->partial_stripes_expensive =
 			q->limits.raid_partial_stripes_expensive;
 
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index bfb437f..4ebca52 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -111,8 +111,8 @@
 rw_attribute(writeback_rate_minimum);
 read_attribute(writeback_rate_debug);
 
-read_attribute(stripe_size);
-read_attribute(partial_stripes_expensive);
+rw_attribute(stripe_size);
+rw_attribute(partial_stripes_expensive);
 
 rw_attribute(synchronous);
 rw_attribute(journal_delay_ms);
@@ -343,6 +343,35 @@ static ssize_t bch_snprint_string_list(char *buf,
 		}
 	}
 
+	if (attr == &sysfs_stripe_size) {
+		int v = strtoul_or_return(buf);
+
+		if (v & 0x1FF) {
+			pr_err("stripe_size must be a muliple of 512-byte sectors");
+			return -EINVAL;
+		}
+
+		v >>= 9;
+
+		if (v != dc->sb.stripe_size) {
+			dc->sb.stripe_size = v;
+			pr_info("stripe_size=%d, re-register to take effect.",
+				v<<9);
+			bch_write_bdev_super(dc, NULL);
+		} else
+			pr_info("stripe_size is already set to %d.", v<<9);
+	}
+
+	if (attr == &sysfs_partial_stripes_expensive) {
+		int v = strtoul_or_return(buf);
+
+		if (v != BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb)) {
+			SET_BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb, v);
+			dc->partial_stripes_expensive = v;
+			bch_write_bdev_super(dc, NULL);
+		}
+	}
+
 	if (attr == &sysfs_stop_when_cache_set_failed) {
 		v = __sysfs_match_string(bch_stop_on_failure_modes, -1, buf);
 		if (v < 0)
diff --git a/include/uapi/linux/bcache.h b/include/uapi/linux/bcache.h
index 5d4f58e..ee60914 100644
--- a/include/uapi/linux/bcache.h
+++ b/include/uapi/linux/bcache.h
@@ -172,7 +172,9 @@ struct cache_sb {
 
 	__u64			flags;
 	__u64			seq;
-	__u64			pad[8];
+	__u32			stripe_size;
+	__u32			pad_u32;
+	__u64			pad_u64[7];
 
 	union {
 	struct {
@@ -230,7 +232,7 @@ static inline _Bool SB_IS_BDEV(const struct cache_sb *sb)
 #define BDEV_STATE_CLEAN		1U
 #define BDEV_STATE_DIRTY		2U
 #define BDEV_STATE_STALE		3U
-
+BITMASK(BDEV_PARTIAL_STRIPES_EXPENSIVE,	struct cache_sb, flags, 60, 1);
 /*
  * Magic numbers
  *
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2019-06-22 23:16 ` [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6 Eric Wheeler
@ 2019-06-23  0:41   ` Martin K. Petersen
  2019-06-24  6:57   ` Coly Li
  2019-06-24  7:05   ` Coly Li
  2 siblings, 0 replies; 17+ messages in thread
From: Martin K. Petersen @ 2019-06-23  0:41 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: Coly Li, linux-block, Eric Wheeler, Eric Wheeler,
	Jonathan Corbet, Kent Overstreet, open list:DOCUMENTATION,
	open list, open list:BCACHE (BLOCK LAYER CACHE)


Eric,

> While some drivers set queue_limits.io_opt (e.g., md raid5), there are
> currently no SCSI/RAID controller drivers that do.

That's not true. Lots of SCSI RAID devices report a stripe width.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2019-06-22 23:16 ` [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6 Eric Wheeler
  2019-06-23  0:41   ` Martin K. Petersen
@ 2019-06-24  6:57   ` Coly Li
  2019-06-24  7:05   ` Coly Li
  2 siblings, 0 replies; 17+ messages in thread
From: Coly Li @ 2019-06-24  6:57 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: linux-block, Eric Wheeler, Eric Wheeler, Jonathan Corbet,
	Kent Overstreet, open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE)

On 2019/6/23 7:16 上午, Eric Wheeler wrote:
> From: Eric Wheeler <git@linux.ewheeler.net>
> 
> While some drivers set queue_limits.io_opt (e.g., md raid5), there are
> currently no SCSI/RAID controller drivers that do.  Previously stripe_size
> and partial_stripes_expensive were read-only values and could not be
> tuned by users (eg, for hardware RAID5/6).
> 
> This patch enables users to save the optimal IO size via sysfs through
> the backing device attributes stripe_size and partial_stripes_expensive
> into the bcache superblock.
> 
> Superblock changes are backwards-compatable:
> 
> *  partial_stripes_expensive: One bit was used in the superblock flags field
> 
> *  stripe_size: There are eight 64-bit "pad" fields for future use in
>    the superblock which default to 0; from those, 32-bits are now used
>    to save the stripe_size and load at device registration time.
> 
> Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>

Hi Eric,

In general I am OK with this patch. Since Peter comments lots of SCSI
RAID devices reports a stripe width, could you please list the hardware
raid devices which don't list stripe size ? Then we can make decision
whether it is necessary to have such option enabled.

Another point is, this patch changes struct cache_sb, it is no problem
to change on-disk format. I plan to update the super block version soon,
to store more configuration persistently into super block. stripe_size
can be added to cache_sb with other on-disk changes.

Thanks.

Coly Li


> ---
>  Documentation/admin-guide/bcache.rst | 21 +++++++++++++++++++++
>  drivers/md/bcache/super.c            | 15 ++++++++++++++-
>  drivers/md/bcache/sysfs.c            | 33 +++++++++++++++++++++++++++++++--
>  include/uapi/linux/bcache.h          |  6 ++++--
>  4 files changed, 70 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/admin-guide/bcache.rst b/Documentation/admin-guide/bcache.rst
> index c0ce64d..ef82022 100644
> --- a/Documentation/admin-guide/bcache.rst
> +++ b/Documentation/admin-guide/bcache.rst
> @@ -420,6 +420,12 @@ dirty_data
>  label
>    Name of underlying device.
>  
> +partial_stripes_expensive
> +  Flag to bcache that partial or unaligned stripe_size'd
> +  writes to the backing device are expensive (e.g., RAID5/6 incur
> +  read-copy-write). Writing this sysfs attribute updates the superblock
> +  and also takes effect immediately.  See also stripe_size, below.
> +
>  readahead
>    Size of readahead that should be performed.  Defaults to 0.  If set to e.g.
>    1M, it will round cache miss reads up to that size, but without overlapping
> @@ -458,6 +464,21 @@ stop
>    Write to this file to shut down the bcache device and close the backing
>    device.
>  
> +stripe_size
> +  The stripe size in bytes of the backing device for optimial
> +  write performance (also known as the "stride width"). This is set
> +  automatically when using a device driver sets blk_limits_io_opt
> +  (e.g., md, rbd, skd, zram, virtio_blk).  No hardware RAID controller
> +  sets blk_limits_io_opt as of 2019-06-15, so configure this to suit
> +  your needs.  Note that you must unregister and re-register the backing
> +  device after making a change to stripe_size.
> +
> +  Where N is the number of data disks,
> +    RAID5: stripe_size = (N-1)*RAID_CHUNK_SIZE.
> +    RAID6: stripe_size = (N-2)*RAID_CHUNK_SIZE.
> +
> +  See also partial_stripes_expensive, above.
> +
>  writeback_delay
>    When dirty data is written to the cache and it previously did not contain
>    any, waits some number of seconds before initiating writeback. Defaults to
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 1b63ac8..d0b9501 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -80,6 +80,7 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
>  
>  	sb->flags		= le64_to_cpu(s->flags);
>  	sb->seq			= le64_to_cpu(s->seq);
> +	sb->stripe_size		= le32_to_cpu(s->stripe_size);
>  	sb->last_mount		= le32_to_cpu(s->last_mount);
>  	sb->first_bucket	= le16_to_cpu(s->first_bucket);
>  	sb->keys		= le16_to_cpu(s->keys);
> @@ -221,6 +222,7 @@ static void __write_super(struct cache_sb *sb, struct bio *bio)
>  
>  	out->flags		= cpu_to_le64(sb->flags);
>  	out->seq		= cpu_to_le64(sb->seq);
> +	out->stripe_size	= cpu_to_le32(sb->stripe_size);
>  
>  	out->last_mount		= cpu_to_le32(sb->last_mount);
>  	out->first_bucket	= cpu_to_le16(sb->first_bucket);
> @@ -1258,7 +1260,18 @@ static int cached_dev_init(struct cached_dev *dc, unsigned int block_size)
>  
>  	dc->disk.stripe_size = q->limits.io_opt >> 9;
>  
> -	if (dc->disk.stripe_size)
> +	if (dc->sb.stripe_size) {
> +		if (dc->disk.stripe_size &&
> +		    dc->disk.stripe_size != dc->sb.stripe_size) {
> +			pr_warn("superblock stripe_size (%d) overrides bdev stripe_size (%d)\n",
> +				(int)dc->sb.stripe_size,
> +				(int)dc->disk.stripe_size);
> +		}
> +
> +		dc->disk.stripe_size = dc->sb.stripe_size;
> +		dc->partial_stripes_expensive =
> +			(unsigned int)BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb);
> +	} else if (dc->disk.stripe_size)
>  		dc->partial_stripes_expensive =
>  			q->limits.raid_partial_stripes_expensive;
>  
> diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
> index bfb437f..4ebca52 100644
> --- a/drivers/md/bcache/sysfs.c
> +++ b/drivers/md/bcache/sysfs.c
> @@ -111,8 +111,8 @@
>  rw_attribute(writeback_rate_minimum);
>  read_attribute(writeback_rate_debug);
>  
> -read_attribute(stripe_size);
> -read_attribute(partial_stripes_expensive);
> +rw_attribute(stripe_size);
> +rw_attribute(partial_stripes_expensive);
>  
>  rw_attribute(synchronous);
>  rw_attribute(journal_delay_ms);
> @@ -343,6 +343,35 @@ static ssize_t bch_snprint_string_list(char *buf,
>  		}
>  	}
>  
> +	if (attr == &sysfs_stripe_size) {
> +		int v = strtoul_or_return(buf);
> +
> +		if (v & 0x1FF) {
> +			pr_err("stripe_size must be a muliple of 512-byte sectors");
> +			return -EINVAL;
> +		}
> +
> +		v >>= 9;
> +
> +		if (v != dc->sb.stripe_size) {
> +			dc->sb.stripe_size = v;
> +			pr_info("stripe_size=%d, re-register to take effect.",
> +				v<<9);
> +			bch_write_bdev_super(dc, NULL);
> +		} else
> +			pr_info("stripe_size is already set to %d.", v<<9);
> +	}
> +
> +	if (attr == &sysfs_partial_stripes_expensive) {
> +		int v = strtoul_or_return(buf);
> +
> +		if (v != BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb)) {
> +			SET_BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb, v);
> +			dc->partial_stripes_expensive = v;
> +			bch_write_bdev_super(dc, NULL);
> +		}
> +	}
> +
>  	if (attr == &sysfs_stop_when_cache_set_failed) {
>  		v = __sysfs_match_string(bch_stop_on_failure_modes, -1, buf);
>  		if (v < 0)
> diff --git a/include/uapi/linux/bcache.h b/include/uapi/linux/bcache.h
> index 5d4f58e..ee60914 100644
> --- a/include/uapi/linux/bcache.h
> +++ b/include/uapi/linux/bcache.h
> @@ -172,7 +172,9 @@ struct cache_sb {
>  
>  	__u64			flags;
>  	__u64			seq;
> -	__u64			pad[8];
> +	__u32			stripe_size;
> +	__u32			pad_u32;
> +	__u64			pad_u64[7];
>  
>  	union {
>  	struct {
> @@ -230,7 +232,7 @@ static inline _Bool SB_IS_BDEV(const struct cache_sb *sb)
>  #define BDEV_STATE_CLEAN		1U
>  #define BDEV_STATE_DIRTY		2U
>  #define BDEV_STATE_STALE		3U
> -
> +BITMASK(BDEV_PARTIAL_STRIPES_EXPENSIVE,	struct cache_sb, flags, 60, 1);
>  /*
>   * Magic numbers
>   *
> 


-- 

Coly Li

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2019-06-22 23:16 ` [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6 Eric Wheeler
  2019-06-23  0:41   ` Martin K. Petersen
  2019-06-24  6:57   ` Coly Li
@ 2019-06-24  7:05   ` Coly Li
  2019-06-24 18:14     ` Eric Wheeler
  2 siblings, 1 reply; 17+ messages in thread
From: Coly Li @ 2019-06-24  7:05 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: linux-block, Eric Wheeler, Eric Wheeler, Jonathan Corbet,
	Kent Overstreet, open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE)

On 2019/6/23 7:16 上午, Eric Wheeler wrote:
> From: Eric Wheeler <git@linux.ewheeler.net>
> 
> While some drivers set queue_limits.io_opt (e.g., md raid5), there are
> currently no SCSI/RAID controller drivers that do.  Previously stripe_size
> and partial_stripes_expensive were read-only values and could not be
> tuned by users (eg, for hardware RAID5/6).
> 
> This patch enables users to save the optimal IO size via sysfs through
> the backing device attributes stripe_size and partial_stripes_expensive
> into the bcache superblock.
> 
> Superblock changes are backwards-compatable:
> 
> *  partial_stripes_expensive: One bit was used in the superblock flags field
> 
> *  stripe_size: There are eight 64-bit "pad" fields for future use in
>    the superblock which default to 0; from those, 32-bits are now used
>    to save the stripe_size and load at device registration time.
> 
> Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>

Hi Eric,

In general I am OK with this patch. Since Peter comments lots of SCSI
RAID devices reports a stripe width, could you please list the hardware
raid devices which don't list stripe size ? Then we can make decision
whether it is necessary to have such option enabled.

Another point is, this patch changes struct cache_sb, it is no problem
to change on-disk format. I plan to update the super block version soon,
to store more configuration persistently into super block. stripe_size
can be added to cache_sb with other on-disk changes.

Thanks.

Coly Li


> ---
>  Documentation/admin-guide/bcache.rst | 21 +++++++++++++++++++++
>  drivers/md/bcache/super.c            | 15 ++++++++++++++-
>  drivers/md/bcache/sysfs.c            | 33 +++++++++++++++++++++++++++++++--
>  include/uapi/linux/bcache.h          |  6 ++++--
>  4 files changed, 70 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/admin-guide/bcache.rst b/Documentation/admin-guide/bcache.rst
> index c0ce64d..ef82022 100644
> --- a/Documentation/admin-guide/bcache.rst
> +++ b/Documentation/admin-guide/bcache.rst
> @@ -420,6 +420,12 @@ dirty_data
>  label
>    Name of underlying device.
>  
> +partial_stripes_expensive
> +  Flag to bcache that partial or unaligned stripe_size'd
> +  writes to the backing device are expensive (e.g., RAID5/6 incur
> +  read-copy-write). Writing this sysfs attribute updates the superblock
> +  and also takes effect immediately.  See also stripe_size, below.
> +
>  readahead
>    Size of readahead that should be performed.  Defaults to 0.  If set to e.g.
>    1M, it will round cache miss reads up to that size, but without overlapping
> @@ -458,6 +464,21 @@ stop
>    Write to this file to shut down the bcache device and close the backing
>    device.
>  
> +stripe_size
> +  The stripe size in bytes of the backing device for optimial
> +  write performance (also known as the "stride width"). This is set
> +  automatically when using a device driver sets blk_limits_io_opt
> +  (e.g., md, rbd, skd, zram, virtio_blk).  No hardware RAID controller
> +  sets blk_limits_io_opt as of 2019-06-15, so configure this to suit
> +  your needs.  Note that you must unregister and re-register the backing
> +  device after making a change to stripe_size.
> +
> +  Where N is the number of data disks,
> +    RAID5: stripe_size = (N-1)*RAID_CHUNK_SIZE.
> +    RAID6: stripe_size = (N-2)*RAID_CHUNK_SIZE.
> +
> +  See also partial_stripes_expensive, above.
> +
>  writeback_delay
>    When dirty data is written to the cache and it previously did not contain
>    any, waits some number of seconds before initiating writeback. Defaults to
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 1b63ac8..d0b9501 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -80,6 +80,7 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
>  
>  	sb->flags		= le64_to_cpu(s->flags);
>  	sb->seq			= le64_to_cpu(s->seq);
> +	sb->stripe_size		= le32_to_cpu(s->stripe_size);
>  	sb->last_mount		= le32_to_cpu(s->last_mount);
>  	sb->first_bucket	= le16_to_cpu(s->first_bucket);
>  	sb->keys		= le16_to_cpu(s->keys);
> @@ -221,6 +222,7 @@ static void __write_super(struct cache_sb *sb, struct bio *bio)
>  
>  	out->flags		= cpu_to_le64(sb->flags);
>  	out->seq		= cpu_to_le64(sb->seq);
> +	out->stripe_size	= cpu_to_le32(sb->stripe_size);
>  
>  	out->last_mount		= cpu_to_le32(sb->last_mount);
>  	out->first_bucket	= cpu_to_le16(sb->first_bucket);
> @@ -1258,7 +1260,18 @@ static int cached_dev_init(struct cached_dev *dc, unsigned int block_size)
>  
>  	dc->disk.stripe_size = q->limits.io_opt >> 9;
>  
> -	if (dc->disk.stripe_size)
> +	if (dc->sb.stripe_size) {
> +		if (dc->disk.stripe_size &&
> +		    dc->disk.stripe_size != dc->sb.stripe_size) {
> +			pr_warn("superblock stripe_size (%d) overrides bdev stripe_size (%d)\n",
> +				(int)dc->sb.stripe_size,
> +				(int)dc->disk.stripe_size);
> +		}
> +
> +		dc->disk.stripe_size = dc->sb.stripe_size;
> +		dc->partial_stripes_expensive =
> +			(unsigned int)BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb);
> +	} else if (dc->disk.stripe_size)
>  		dc->partial_stripes_expensive =
>  			q->limits.raid_partial_stripes_expensive;
>  
> diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
> index bfb437f..4ebca52 100644
> --- a/drivers/md/bcache/sysfs.c
> +++ b/drivers/md/bcache/sysfs.c
> @@ -111,8 +111,8 @@
>  rw_attribute(writeback_rate_minimum);
>  read_attribute(writeback_rate_debug);
>  
> -read_attribute(stripe_size);
> -read_attribute(partial_stripes_expensive);
> +rw_attribute(stripe_size);
> +rw_attribute(partial_stripes_expensive);
>  
>  rw_attribute(synchronous);
>  rw_attribute(journal_delay_ms);
> @@ -343,6 +343,35 @@ static ssize_t bch_snprint_string_list(char *buf,
>  		}
>  	}
>  
> +	if (attr == &sysfs_stripe_size) {
> +		int v = strtoul_or_return(buf);
> +
> +		if (v & 0x1FF) {
> +			pr_err("stripe_size must be a muliple of 512-byte sectors");
> +			return -EINVAL;
> +		}
> +
> +		v >>= 9;
> +
> +		if (v != dc->sb.stripe_size) {
> +			dc->sb.stripe_size = v;
> +			pr_info("stripe_size=%d, re-register to take effect.",
> +				v<<9);
> +			bch_write_bdev_super(dc, NULL);
> +		} else
> +			pr_info("stripe_size is already set to %d.", v<<9);
> +	}
> +
> +	if (attr == &sysfs_partial_stripes_expensive) {
> +		int v = strtoul_or_return(buf);
> +
> +		if (v != BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb)) {
> +			SET_BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb, v);
> +			dc->partial_stripes_expensive = v;
> +			bch_write_bdev_super(dc, NULL);
> +		}
> +	}
> +
>  	if (attr == &sysfs_stop_when_cache_set_failed) {
>  		v = __sysfs_match_string(bch_stop_on_failure_modes, -1, buf);
>  		if (v < 0)
> diff --git a/include/uapi/linux/bcache.h b/include/uapi/linux/bcache.h
> index 5d4f58e..ee60914 100644
> --- a/include/uapi/linux/bcache.h
> +++ b/include/uapi/linux/bcache.h
> @@ -172,7 +172,9 @@ struct cache_sb {
>  
>  	__u64			flags;
>  	__u64			seq;
> -	__u64			pad[8];
> +	__u32			stripe_size;
> +	__u32			pad_u32;
> +	__u64			pad_u64[7];
>  
>  	union {
>  	struct {
> @@ -230,7 +232,7 @@ static inline _Bool SB_IS_BDEV(const struct cache_sb *sb)
>  #define BDEV_STATE_CLEAN		1U
>  #define BDEV_STATE_DIRTY		2U
>  #define BDEV_STATE_STALE		3U
> -
> +BITMASK(BDEV_PARTIAL_STRIPES_EXPENSIVE,	struct cache_sb, flags, 60, 1);
>  /*
>   * Magic numbers
>   *
> 


-- 

Coly Li

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2019-06-24  7:05   ` Coly Li
@ 2019-06-24 18:14     ` Eric Wheeler
  2019-06-24 23:24       ` Martin K. Petersen
  2019-06-25  1:59       ` Coly Li
  0 siblings, 2 replies; 17+ messages in thread
From: Eric Wheeler @ 2019-06-24 18:14 UTC (permalink / raw)
  To: Coly Li
  Cc: linux-block, Jonathan Corbet, Kent Overstreet,
	open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE),
	Martin K. Petersen

[-- Attachment #1: Type: TEXT/PLAIN, Size: 8629 bytes --]

On Mon, 24 Jun 2019, Coly Li wrote:

> On 2019/6/23 7:16 上午, Eric Wheeler wrote:
> > From: Eric Wheeler <git@linux.ewheeler.net>
> > 
> > While some drivers set queue_limits.io_opt (e.g., md raid5), there are
> > currently no SCSI/RAID controller drivers that do.  Previously stripe_size
> > and partial_stripes_expensive were read-only values and could not be
> > tuned by users (eg, for hardware RAID5/6).
> > 
> > This patch enables users to save the optimal IO size via sysfs through
> > the backing device attributes stripe_size and partial_stripes_expensive
> > into the bcache superblock.
> > 
> > Superblock changes are backwards-compatable:
> > 
> > *  partial_stripes_expensive: One bit was used in the superblock flags field
> > 
> > *  stripe_size: There are eight 64-bit "pad" fields for future use in
> >    the superblock which default to 0; from those, 32-bits are now used
> >    to save the stripe_size and load at device registration time.
> > 
> > Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>
> 
> Hi Eric,
> 
> In general I am OK with this patch. Since Peter comments lots of SCSI
> RAID devices reports a stripe width, could you please list the hardware
> raid devices which don't list stripe size ? Then we can make decision
> whether it is necessary to have such option enabled.

Perhaps they do not set stripe_width using io_opt? I did a grep to see if 
any of them did, but I didn't see them. How is stripe_width indicated by 
RAID controllers? 

If they do set io_opt, then at least my Areca 1883 does not set io_opt as 
of 4.19.x. I also have a LSI MegaRAID 3108 which does not report io_opt as 
of 4.1.x, but that is an older kernel so maybe support has been added 
since then.

Martin,

Where would stripe_width be configured in the SCSI drivers? Is it visible 
through sysfs or debugfs so I can check my hardware support without 
hacking debugging the kernel?

> 
> Another point is, this patch changes struct cache_sb, it is no problem
> to change on-disk format. I plan to update the super block version soon,
> to store more configuration persistently into super block. stripe_size
> can be added to cache_sb with other on-disk changes.

Maybe bumping version makes sense, but even if you do not, this is safe to 
use on systems without bumping the version because the values are unused 
and default to 0.

--
Eric Wheeler

> 
> Thanks.
> 
> Coly Li
> 
> 
> > ---
> >  Documentation/admin-guide/bcache.rst | 21 +++++++++++++++++++++
> >  drivers/md/bcache/super.c            | 15 ++++++++++++++-
> >  drivers/md/bcache/sysfs.c            | 33 +++++++++++++++++++++++++++++++--
> >  include/uapi/linux/bcache.h          |  6 ++++--
> >  4 files changed, 70 insertions(+), 5 deletions(-)
> > 
> > diff --git a/Documentation/admin-guide/bcache.rst b/Documentation/admin-guide/bcache.rst
> > index c0ce64d..ef82022 100644
> > --- a/Documentation/admin-guide/bcache.rst
> > +++ b/Documentation/admin-guide/bcache.rst
> > @@ -420,6 +420,12 @@ dirty_data
> >  label
> >    Name of underlying device.
> >  
> > +partial_stripes_expensive
> > +  Flag to bcache that partial or unaligned stripe_size'd
> > +  writes to the backing device are expensive (e.g., RAID5/6 incur
> > +  read-copy-write). Writing this sysfs attribute updates the superblock
> > +  and also takes effect immediately.  See also stripe_size, below.
> > +
> >  readahead
> >    Size of readahead that should be performed.  Defaults to 0.  If set to e.g.
> >    1M, it will round cache miss reads up to that size, but without overlapping
> > @@ -458,6 +464,21 @@ stop
> >    Write to this file to shut down the bcache device and close the backing
> >    device.
> >  
> > +stripe_size
> > +  The stripe size in bytes of the backing device for optimial
> > +  write performance (also known as the "stride width"). This is set
> > +  automatically when using a device driver sets blk_limits_io_opt
> > +  (e.g., md, rbd, skd, zram, virtio_blk).  No hardware RAID controller
> > +  sets blk_limits_io_opt as of 2019-06-15, so configure this to suit
> > +  your needs.  Note that you must unregister and re-register the backing
> > +  device after making a change to stripe_size.
> > +
> > +  Where N is the number of data disks,
> > +    RAID5: stripe_size = (N-1)*RAID_CHUNK_SIZE.
> > +    RAID6: stripe_size = (N-2)*RAID_CHUNK_SIZE.
> > +
> > +  See also partial_stripes_expensive, above.
> > +
> >  writeback_delay
> >    When dirty data is written to the cache and it previously did not contain
> >    any, waits some number of seconds before initiating writeback. Defaults to
> > diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> > index 1b63ac8..d0b9501 100644
> > --- a/drivers/md/bcache/super.c
> > +++ b/drivers/md/bcache/super.c
> > @@ -80,6 +80,7 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
> >  
> >  	sb->flags		= le64_to_cpu(s->flags);
> >  	sb->seq			= le64_to_cpu(s->seq);
> > +	sb->stripe_size		= le32_to_cpu(s->stripe_size);
> >  	sb->last_mount		= le32_to_cpu(s->last_mount);
> >  	sb->first_bucket	= le16_to_cpu(s->first_bucket);
> >  	sb->keys		= le16_to_cpu(s->keys);
> > @@ -221,6 +222,7 @@ static void __write_super(struct cache_sb *sb, struct bio *bio)
> >  
> >  	out->flags		= cpu_to_le64(sb->flags);
> >  	out->seq		= cpu_to_le64(sb->seq);
> > +	out->stripe_size	= cpu_to_le32(sb->stripe_size);
> >  
> >  	out->last_mount		= cpu_to_le32(sb->last_mount);
> >  	out->first_bucket	= cpu_to_le16(sb->first_bucket);
> > @@ -1258,7 +1260,18 @@ static int cached_dev_init(struct cached_dev *dc, unsigned int block_size)
> >  
> >  	dc->disk.stripe_size = q->limits.io_opt >> 9;
> >  
> > -	if (dc->disk.stripe_size)
> > +	if (dc->sb.stripe_size) {
> > +		if (dc->disk.stripe_size &&
> > +		    dc->disk.stripe_size != dc->sb.stripe_size) {
> > +			pr_warn("superblock stripe_size (%d) overrides bdev stripe_size (%d)\n",
> > +				(int)dc->sb.stripe_size,
> > +				(int)dc->disk.stripe_size);
> > +		}
> > +
> > +		dc->disk.stripe_size = dc->sb.stripe_size;
> > +		dc->partial_stripes_expensive =
> > +			(unsigned int)BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb);
> > +	} else if (dc->disk.stripe_size)
> >  		dc->partial_stripes_expensive =
> >  			q->limits.raid_partial_stripes_expensive;
> >  
> > diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
> > index bfb437f..4ebca52 100644
> > --- a/drivers/md/bcache/sysfs.c
> > +++ b/drivers/md/bcache/sysfs.c
> > @@ -111,8 +111,8 @@
> >  rw_attribute(writeback_rate_minimum);
> >  read_attribute(writeback_rate_debug);
> >  
> > -read_attribute(stripe_size);
> > -read_attribute(partial_stripes_expensive);
> > +rw_attribute(stripe_size);
> > +rw_attribute(partial_stripes_expensive);
> >  
> >  rw_attribute(synchronous);
> >  rw_attribute(journal_delay_ms);
> > @@ -343,6 +343,35 @@ static ssize_t bch_snprint_string_list(char *buf,
> >  		}
> >  	}
> >  
> > +	if (attr == &sysfs_stripe_size) {
> > +		int v = strtoul_or_return(buf);
> > +
> > +		if (v & 0x1FF) {
> > +			pr_err("stripe_size must be a muliple of 512-byte sectors");
> > +			return -EINVAL;
> > +		}
> > +
> > +		v >>= 9;
> > +
> > +		if (v != dc->sb.stripe_size) {
> > +			dc->sb.stripe_size = v;
> > +			pr_info("stripe_size=%d, re-register to take effect.",
> > +				v<<9);
> > +			bch_write_bdev_super(dc, NULL);
> > +		} else
> > +			pr_info("stripe_size is already set to %d.", v<<9);
> > +	}
> > +
> > +	if (attr == &sysfs_partial_stripes_expensive) {
> > +		int v = strtoul_or_return(buf);
> > +
> > +		if (v != BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb)) {
> > +			SET_BDEV_PARTIAL_STRIPES_EXPENSIVE(&dc->sb, v);
> > +			dc->partial_stripes_expensive = v;
> > +			bch_write_bdev_super(dc, NULL);
> > +		}
> > +	}
> > +
> >  	if (attr == &sysfs_stop_when_cache_set_failed) {
> >  		v = __sysfs_match_string(bch_stop_on_failure_modes, -1, buf);
> >  		if (v < 0)
> > diff --git a/include/uapi/linux/bcache.h b/include/uapi/linux/bcache.h
> > index 5d4f58e..ee60914 100644
> > --- a/include/uapi/linux/bcache.h
> > +++ b/include/uapi/linux/bcache.h
> > @@ -172,7 +172,9 @@ struct cache_sb {
> >  
> >  	__u64			flags;
> >  	__u64			seq;
> > -	__u64			pad[8];
> > +	__u32			stripe_size;
> > +	__u32			pad_u32;
> > +	__u64			pad_u64[7];
> >  
> >  	union {
> >  	struct {
> > @@ -230,7 +232,7 @@ static inline _Bool SB_IS_BDEV(const struct cache_sb *sb)
> >  #define BDEV_STATE_CLEAN		1U
> >  #define BDEV_STATE_DIRTY		2U
> >  #define BDEV_STATE_STALE		3U
> > -
> > +BITMASK(BDEV_PARTIAL_STRIPES_EXPENSIVE,	struct cache_sb, flags, 60, 1);
> >  /*
> >   * Magic numbers
> >   *
> > 
> 
> 
> -- 
> 
> Coly Li
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2019-06-24 18:14     ` Eric Wheeler
@ 2019-06-24 23:24       ` Martin K. Petersen
  2019-06-26  0:23         ` Eric Wheeler
  2019-06-25  1:59       ` Coly Li
  1 sibling, 1 reply; 17+ messages in thread
From: Martin K. Petersen @ 2019-06-24 23:24 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: Coly Li, linux-block, Jonathan Corbet, Kent Overstreet,
	open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE),
	Martin K. Petersen


Eric,

> Perhaps they do not set stripe_width using io_opt? I did a grep to see
> if any of them did, but I didn't see them. How is stripe_width
> indicated by RAID controllers?

The values are reported in the Block Limits VPD page for each SCSI block
device and are thus set by the SCSI disk driver. IOW, the RAID
controller device drivers have nothing to do with this.

For RAID controllers specifically, the controller firmware will fill out
the VPD fields for each virtual SCSI disk when you configure a RAID
set. For pretty much everything else, the Block Limits come straight
from the device itself.

Also note that these values aren't specific to RAID controllers at
all. Most new SCSI devices, including disk drives and SSDs, will fill
out the Block Limits VPD page one way or the other. Even some USB
storage devices are providing this page.

> If they do set io_opt, then at least my Areca 1883 does not set io_opt
> as of 4.19.x. I also have a LSI MegaRAID 3108 which does not report
> io_opt as of 4.1.x, but that is an older kernel so maybe support has
> been added since then.

I have several MegaRAIDs that all report it. But it depends on the
controller firmware.

> Is it visible through sysfs or debugfs so I can check my hardware
> support without hacking debugging the kernel?

To print the block device topology:

  # lsblk -t

or look up io_opt in sysfs:

  # grep . /sys/block/sdX/queue/optimal_io_size

You can also query a SCSI device's Block Limits directly:

  # sg_vpd -p bl /dev/sdX

If you want to tinker, you can simulate a SCSI disk with your choice of
io_opt:

  # modprobe scsi_debug opt_blks=N

where N is the number of logical blocks to report as being the optimal
I/O size.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2019-06-24 18:14     ` Eric Wheeler
  2019-06-24 23:24       ` Martin K. Petersen
@ 2019-06-25  1:59       ` Coly Li
  2022-01-06  3:29         ` Eric Wheeler
  1 sibling, 1 reply; 17+ messages in thread
From: Coly Li @ 2019-06-25  1:59 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: linux-block, Jonathan Corbet, Kent Overstreet,
	open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE),
	Martin K. Petersen

On 2019/6/25 2:14 上午, Eric Wheeler wrote:
> On Mon, 24 Jun 2019, Coly Li wrote:
> 
>> On 2019/6/23 7:16 上午, Eric Wheeler wrote:
>>> From: Eric Wheeler <git@linux.ewheeler.net>
>>>
>>> While some drivers set queue_limits.io_opt (e.g., md raid5), there are
>>> currently no SCSI/RAID controller drivers that do.  Previously stripe_size
>>> and partial_stripes_expensive were read-only values and could not be
>>> tuned by users (eg, for hardware RAID5/6).
>>>
>>> This patch enables users to save the optimal IO size via sysfs through
>>> the backing device attributes stripe_size and partial_stripes_expensive
>>> into the bcache superblock.
>>>
>>> Superblock changes are backwards-compatable:
>>>
>>> *  partial_stripes_expensive: One bit was used in the superblock flags field
>>>
>>> *  stripe_size: There are eight 64-bit "pad" fields for future use in
>>>    the superblock which default to 0; from those, 32-bits are now used
>>>    to save the stripe_size and load at device registration time.
>>>
>>> Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>
>>
>> Hi Eric,
>>
>> In general I am OK with this patch. Since Peter comments lots of SCSI
>> RAID devices reports a stripe width, could you please list the hardware
>> raid devices which don't list stripe size ? Then we can make decision
>> whether it is necessary to have such option enabled.
> 
> Perhaps they do not set stripe_width using io_opt? I did a grep to see if 
> any of them did, but I didn't see them. How is stripe_width indicated by 
> RAID controllers? 
> 
> If they do set io_opt, then at least my Areca 1883 does not set io_opt as 
> of 4.19.x. I also have a LSI MegaRAID 3108 which does not report io_opt as 
> of 4.1.x, but that is an older kernel so maybe support has been added 
> since then.
> 
> Martin,
> 
> Where would stripe_width be configured in the SCSI drivers? Is it visible 
> through sysfs or debugfs so I can check my hardware support without 
> hacking debugging the kernel?
> 
>>
>> Another point is, this patch changes struct cache_sb, it is no problem
>> to change on-disk format. I plan to update the super block version soon,
>> to store more configuration persistently into super block. stripe_size
>> can be added to cache_sb with other on-disk changes.
> 

Hi Eric,

> Maybe bumping version makes sense, but even if you do not, this is safe to 
> use on systems without bumping the version because the values are unused 
> and default to 0.

Yes, I understand you, it works as you suggested. I need to think how to
organize all options in struct cache_sb, stripe_size will be arranged
then. And I will ask help to you for reviewing the changes of on-disk
format.

Thanks.

[snipped]

-- 

Coly Li

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2019-06-24 23:24       ` Martin K. Petersen
@ 2019-06-26  0:23         ` Eric Wheeler
  2019-06-26  2:50           ` Martin K. Petersen
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Wheeler @ 2019-06-26  0:23 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Coly Li, linux-block, Jonathan Corbet, Kent Overstreet,
	open list\:DOCUMENTATION, open list,
	open list\:BCACHE \(BLOCK LAYER CACHE\)

On Mon, 24 Jun 2019, Martin K. Petersen wrote:
> > Perhaps they do not set stripe_width using io_opt? I did a grep to see
> > if any of them did, but I didn't see them. How is stripe_width
> > indicated by RAID controllers?
> 
> The values are reported in the Block Limits VPD page for each SCSI block
> device and are thus set by the SCSI disk driver. IOW, the RAID
> controller device drivers have nothing to do with this.
> 
> For RAID controllers specifically, the controller firmware will fill out
> the VPD fields for each virtual SCSI disk when you configure a RAID
> set. For pretty much everything else, the Block Limits come straight
> from the device itself.
> 
> Also note that these values aren't specific to RAID controllers at
> all. Most new SCSI devices, including disk drives and SSDs, will fill
> out the Block Limits VPD page one way or the other. Even some USB
> storage devices are providing this page.

Thanks, that makes sense.  Interesting about USB.

> > If they do set io_opt, then at least my Areca 1883 does not set io_opt
> > as of 4.19.x. I also have a LSI MegaRAID 3108 which does not report
> > io_opt as of 4.1.x, but that is an older kernel so maybe support has
> > been added since then.
> 
> I have several MegaRAIDs that all report it. But it depends on the
> controller firmware.
> 
> > Is it visible through sysfs or debugfs so I can check my hardware
> > support without hacking debugging the kernel?
> 
> To print the block device topology:
> 
>   # lsblk -t
> 
> or look up io_opt in sysfs:
> 
>   # grep . /sys/block/sdX/queue/optimal_io_size
> 
> You can also query a SCSI device's Block Limits directly:
> 
>   # sg_vpd -p bl /dev/sdX

Perfect, thank you for that.  I've tried the following controllers that I 
have access to.  One worked (hspa/HP Gen8 Smart Array Controller), but the 
others I tried are not providing VPDs:

* LSI 2108 (Supermicro)
* LSI 3108 (Dell)
* Areca 1882
* Areca 1883
* Fibrechannel 8gbe connected to a Storwize 3700

~]# sg_vpd -p bl /dev/sdb
VPD page=0xb0
fetching VPD page failed

> If you want to tinker, you can simulate a SCSI disk with your choice of
> io_opt:
> 
>   # modprobe scsi_debug opt_blks=N
> 
> where N is the number of logical blocks to report as being the optimal
> I/O size.

Neat, thanks for the hint!

-Eric

> 
> -- 
> Martin K. Petersen	Oracle Linux Engineering
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2019-06-26  0:23         ` Eric Wheeler
@ 2019-06-26  2:50           ` Martin K. Petersen
  0 siblings, 0 replies; 17+ messages in thread
From: Martin K. Petersen @ 2019-06-26  2:50 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: Martin K. Petersen, Coly Li, linux-block, Jonathan Corbet,
	Kent Overstreet, open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE)


Eric,

> * LSI 2108 (Supermicro)
> * LSI 3108 (Dell)
> * Areca 1882
> * Areca 1883
> * Fibrechannel 8gbe connected to a Storwize 3700

I have a 3108 that provides the BL VPD. Surprised the 1883 doesn't.

As a rule of thumb, you need 12 Gbps SAS or 16 Gbps FC devices for the
VPD page to be present. The protocol feature is not tied to the
transport signaling speed in any way. But general support for the BL VPD
page roughly coincided with vendors introducing 12 Gbps SAS and 16 Gbps
FC products to the market.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2019-06-25  1:59       ` Coly Li
@ 2022-01-06  3:29         ` Eric Wheeler
  2022-01-06 16:17           ` Coly Li
  2022-01-08  0:21           ` Martin K. Petersen
  0 siblings, 2 replies; 17+ messages in thread
From: Eric Wheeler @ 2022-01-06  3:29 UTC (permalink / raw)
  To: Coly Li
  Cc: linux-block, Jonathan Corbet, Kent Overstreet,
	open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE),
	Martin K. Petersen

[-- Attachment #1: Type: text/plain, Size: 3656 bytes --]

On Tue, 25 Jun 2019, Coly Li wrote:
> On 2019/6/25 2:14 上午, Eric Wheeler wrote:
> > On Mon, 24 Jun 2019, Coly Li wrote:
> > 
> >> On 2019/6/23 7:16 上午, Eric Wheeler wrote:
> >>> From: Eric Wheeler <git@linux.ewheeler.net>
> >>>
> >>> While some drivers set queue_limits.io_opt (e.g., md raid5), there are
> >>> currently no SCSI/RAID controller drivers that do.  Previously stripe_size
> >>> and partial_stripes_expensive were read-only values and could not be
> >>> tuned by users (eg, for hardware RAID5/6).
> >>>
> >>> This patch enables users to save the optimal IO size via sysfs through
> >>> the backing device attributes stripe_size and partial_stripes_expensive
> >>> into the bcache superblock.
> >>>
> >>> Superblock changes are backwards-compatable:
> >>>
> >>> *  partial_stripes_expensive: One bit was used in the superblock flags field
> >>>
> >>> *  stripe_size: There are eight 64-bit "pad" fields for future use in
> >>>    the superblock which default to 0; from those, 32-bits are now used
> >>>    to save the stripe_size and load at device registration time.
> >>>
> >>> Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>
> >>
> >> Hi Eric,
> >>
> >> In general I am OK with this patch. Since Peter comments lots of SCSI
> >> RAID devices reports a stripe width, could you please list the hardware
> >> raid devices which don't list stripe size ? Then we can make decision
> >> whether it is necessary to have such option enabled.
> > 
> > Perhaps they do not set stripe_width using io_opt? I did a grep to see if 
> > any of them did, but I didn't see them. How is stripe_width indicated by 
> > RAID controllers? 
> > 
> > If they do set io_opt, then at least my Areca 1883 does not set io_opt as 
> > of 4.19.x. I also have a LSI MegaRAID 3108 which does not report io_opt as 
> > of 4.1.x, but that is an older kernel so maybe support has been added 
> > since then.
> > 
> > Martin,
> > 
> > Where would stripe_width be configured in the SCSI drivers? Is it visible 
> > through sysfs or debugfs so I can check my hardware support without 
> > hacking debugging the kernel?
> > 
> >>
> >> Another point is, this patch changes struct cache_sb, it is no problem
> >> to change on-disk format. I plan to update the super block version soon,
> >> to store more configuration persistently into super block. stripe_size
> >> can be added to cache_sb with other on-disk changes.
> > 
> 
> Hi Eric,
> 
> > Maybe bumping version makes sense, but even if you do not, this is safe to 
> > use on systems without bumping the version because the values are unused 
> > and default to 0.
> 
> Yes, I understand you, it works as you suggested. I need to think how to
> organize all options in struct cache_sb, stripe_size will be arranged
> then. And I will ask help to you for reviewing the changes of on-disk
> format.

Hi Coli,

Just checking in, its been a while and I didn't see any more discussion on 
the topic:

This would benefit users with older RAID controllers using RAID-5/6 that 
don't set io_opt.

Even new new RAID controlers that _do_ provide `io_opt` still do _not_ 
indicate partial_stripes_expensive (which is an mdraid feature, but Martin 
please correct me if I'm wrong here).  Thus, all hardware RAID-5/6 users 
could benefit by manually flagging partial_stripes_expensive to get burst 
writes out of bcache that fit their stride width.

This patch probably needs rebased and documentation updated about io_opt, 
but here is the original patch with documentation for your reference:
	https://lkml.org/lkml/2019/6/22/298

What do you think?

-Eric

> 
> Thanks.
> 
> [snipped]
> 
> -- 
> 
> Coly Li
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2022-01-06  3:29         ` Eric Wheeler
@ 2022-01-06 16:17           ` Coly Li
  2022-01-08  0:21           ` Martin K. Petersen
  1 sibling, 0 replies; 17+ messages in thread
From: Coly Li @ 2022-01-06 16:17 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: linux-block, Jonathan Corbet, Kent Overstreet,
	open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE),
	Martin K. Petersen

On 1/6/22 11:29 AM, Eric Wheeler wrote:
> On Tue, 25 Jun 2019, Coly Li wrote:
>> On 2019/6/25 2:14 上午, Eric Wheeler wrote:
>>> On Mon, 24 Jun 2019, Coly Li wrote:
>>>
>>>> On 2019/6/23 7:16 上午, Eric Wheeler wrote:
>>>>> From: Eric Wheeler <git@linux.ewheeler.net>
>>>>>
>>>>> While some drivers set queue_limits.io_opt (e.g., md raid5), there are
>>>>> currently no SCSI/RAID controller drivers that do.  Previously stripe_size
>>>>> and partial_stripes_expensive were read-only values and could not be
>>>>> tuned by users (eg, for hardware RAID5/6).
>>>>>
>>>>> This patch enables users to save the optimal IO size via sysfs through
>>>>> the backing device attributes stripe_size and partial_stripes_expensive
>>>>> into the bcache superblock.
>>>>>
>>>>> Superblock changes are backwards-compatable:
>>>>>
>>>>> *  partial_stripes_expensive: One bit was used in the superblock flags field
>>>>>
>>>>> *  stripe_size: There are eight 64-bit "pad" fields for future use in
>>>>>     the superblock which default to 0; from those, 32-bits are now used
>>>>>     to save the stripe_size and load at device registration time.
>>>>>
>>>>> Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>
>>>> Hi Eric,
>>>>
>>>> In general I am OK with this patch. Since Peter comments lots of SCSI
>>>> RAID devices reports a stripe width, could you please list the hardware
>>>> raid devices which don't list stripe size ? Then we can make decision
>>>> whether it is necessary to have such option enabled.
>>> Perhaps they do not set stripe_width using io_opt? I did a grep to see if
>>> any of them did, but I didn't see them. How is stripe_width indicated by
>>> RAID controllers?
>>>
>>> If they do set io_opt, then at least my Areca 1883 does not set io_opt as
>>> of 4.19.x. I also have a LSI MegaRAID 3108 which does not report io_opt as
>>> of 4.1.x, but that is an older kernel so maybe support has been added
>>> since then.
>>>
>>> Martin,
>>>
>>> Where would stripe_width be configured in the SCSI drivers? Is it visible
>>> through sysfs or debugfs so I can check my hardware support without
>>> hacking debugging the kernel?
>>>
>>>> Another point is, this patch changes struct cache_sb, it is no problem
>>>> to change on-disk format. I plan to update the super block version soon,
>>>> to store more configuration persistently into super block. stripe_size
>>>> can be added to cache_sb with other on-disk changes.
>> Hi Eric,
>>
>>> Maybe bumping version makes sense, but even if you do not, this is safe to
>>> use on systems without bumping the version because the values are unused
>>> and default to 0.
>> Yes, I understand you, it works as you suggested. I need to think how to
>> organize all options in struct cache_sb, stripe_size will be arranged
>> then. And I will ask help to you for reviewing the changes of on-disk
>> format.
> Hi Coli,
>
> Just checking in, its been a while and I didn't see any more discussion on
> the topic:

Hi Eric,

Thank you for reminding me. The persistent on-disk options were that 
much as I thought, so using a reserved space from the on-disk super 
block is fine.

> This would benefit users with older RAID controllers using RAID-5/6 that
> don't set io_opt.
>
> Even new new RAID controlers that _do_ provide `io_opt` still do _not_
> indicate partial_stripes_expensive (which is an mdraid feature, but Martin
> please correct me if I'm wrong here).  Thus, all hardware RAID-5/6 users
> could benefit by manually flagging partial_stripes_expensive to get burst
> writes out of bcache that fit their stride width.

Yeah, I agree with you.

> This patch probably needs rebased and documentation updated about io_opt,
> but here is the original patch with documentation for your reference:
> 	https://lkml.org/lkml/2019/6/22/298
>
> What do you think?

Yes please rebase the patch with latest mainline kernel and let's start 
the review.

Thank you.

Coly Li

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2022-01-06  3:29         ` Eric Wheeler
  2022-01-06 16:17           ` Coly Li
@ 2022-01-08  0:21           ` Martin K. Petersen
  2022-01-08  4:54             ` Eric Wheeler
  1 sibling, 1 reply; 17+ messages in thread
From: Martin K. Petersen @ 2022-01-08  0:21 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: Coly Li, linux-block, Jonathan Corbet, Kent Overstreet,
	open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE),
	Martin K. Petersen


Eric,

> Even new new RAID controlers that _do_ provide `io_opt` still do _not_ 
> indicate partial_stripes_expensive (which is an mdraid feature, but Martin 
> please correct me if I'm wrong here).

partial_stripes_expensive is a bcache thing, I am not sure why it needs
a separate flag. It is implied, although I guess one could argue that
RAID0 is a special case since partial writes are not as painful as with
parity RAID.

The SCSI spec states that submitting an I/O that is smaller than io_min
"may incur delays in processing the command". And similarly, submitting
a command larger than io_opt "may incur delays in processing the
command".

IOW, the spec says "don't write less than an aligned multiple of the
stripe chunk size" and "don't write more than an aligned full
stripe". That leaves "aligned multiples of the stripe chunk size but
less than the full stripe width" unaccounted for. And I guess that's
what the bcache flag is trying to capture.

SCSI doesn't go into details about RAID levels and other implementation
details which is why the wording is deliberately vague. But obviously
the expectation is that partial stripe writes are slower than full.

In my book any component in the stack that sees either io_min or io_opt
should try very hard to send I/Os that are aligned multiples of those
values. I am not opposed to letting users manually twiddle the
settings. But I do think that we should aim for the stack doing the
right thing when it sees io_opt reported on a device.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2022-01-08  0:21           ` Martin K. Petersen
@ 2022-01-08  4:54             ` Eric Wheeler
  2022-01-08 21:51               ` Eric Wheeler
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Wheeler @ 2022-01-08  4:54 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Coly Li, linux-block, Jonathan Corbet, Kent Overstreet,
	open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE)

On Fri, 7 Jan 2022, Martin K. Petersen wrote:
> Eric,
> 
> > Even new new RAID controlers that _do_ provide `io_opt` still do _not_ 
> > indicate partial_stripes_expensive (which is an mdraid feature, but Martin 
> > please correct me if I'm wrong here).
> 
> partial_stripes_expensive is a bcache thing, I am not sure why it needs
> a separate flag. It is implied, although I guess one could argue that
> RAID0 is a special case since partial writes are not as painful as with
> parity RAID.

I'm guessing bcache used did some optimization for 
queue->limits.raid_partial_stripes_expensive because md raid5 code sets 
this flag.  At least when using Linux md as the RAID5 implementation it 
gets configured automatically:
   raid5.c:       mddev->queue->limits.raid_partial_stripes_expensive = 1;

https://elixir.bootlin.com/linux/latest/source/drivers/md/raid5.c#L7729

Interestingly only bcache uses it, but md does set it.

> The SCSI spec states that submitting an I/O that is smaller than io_min
> "may incur delays in processing the command". And similarly, submitting
> a command larger than io_opt "may incur delays in processing the
> command".
> 
> IOW, the spec says "don't write less than an aligned multiple of the
> stripe chunk size" and "don't write more than an aligned full
> stripe". That leaves "aligned multiples of the stripe chunk size but
> less than the full stripe width" unaccounted for. And I guess that's
> what the bcache flag is trying to capture.

Maybe any time io_opt is provided then partial_stripes_expensive should be 
flagged too and any code to the contrary should be removed?

Question: Does anyone have a reason to keep partial_stripes_expensive in 
the kernel at all?

> SCSI doesn't go into details about RAID levels and other implementation
> details which is why the wording is deliberately vague. But obviously
> the expectation is that partial stripe writes are slower than full.
> 
> In my book any component in the stack that sees either io_min or io_opt
> should try very hard to send I/Os that are aligned multiples of those
> values. I am not opposed to letting users manually twiddle the
> settings. But I do think that we should aim for the stack doing the
> right thing when it sees io_opt reported on a device.

Agreed, thanks for the feedback!

-Eric


> 
> -- 
> Martin K. Petersen	Oracle Linux Engineering
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2022-01-08  4:54             ` Eric Wheeler
@ 2022-01-08 21:51               ` Eric Wheeler
  2022-01-10 16:14                 ` Martin K. Petersen
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Wheeler @ 2022-01-08 21:51 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Martin K. Petersen, Coly Li, linux-block, Jonathan Corbet,
	open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE)

On Fri, 7 Jan 2022, Eric Wheeler wrote:
> On Fri, 7 Jan 2022, Martin K. Petersen wrote:
> > Eric,
> > 
> > > Even new new RAID controlers that _do_ provide `io_opt` still do _not_ 
> > > indicate partial_stripes_expensive (which is an mdraid feature, but Martin 
> > > please correct me if I'm wrong here).
> > 
> > partial_stripes_expensive is a bcache thing, I am not sure why it needs
> > a separate flag. It is implied, although I guess one could argue that
> > RAID0 is a special case since partial writes are not as painful as with
> > parity RAID.
> 
> I'm guessing bcache used did some optimization for 
> queue->limits.raid_partial_stripes_expensive because md raid5 code sets 
> this flag.  At least when using Linux md as the RAID5 implementation it 
> gets configured automatically:
>    raid5.c:       mddev->queue->limits.raid_partial_stripes_expensive = 1;
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/md/raid5.c#L7729
> 
> Interestingly only bcache uses it, but md does set it.

Ok so `git blame` shows that Kent added this to md/raid5.c in 
c78afc6261b (Kent Overstreet 2013-07-11 22:39:53 -0700 7526)
	mddev->queue->limits.raid_partial_stripes_expensive = 1;

    bcache/md: Use raid stripe size
    
    Now that we've got code for raid5/6 stripe awareness, bcache just needs
    to know about the stripes and when writing partial stripes is expensive
    - we probably don't want to enable this optimization for raid1 or 10,
    even though they have stripes. So add a flag to queue_limits.

Kent, Martin:

Do you think we should leave the md-specific 
raid_partial_stripes_expensive setting and require users of RAID 
controllers to set the bit themselves in bcache---or---remove all 
raid_partial_stripes_expensive code and always treat writes as "expensive" 
when `opt_io` is defined?

--
Eric Wheeler


> 
> > The SCSI spec states that submitting an I/O that is smaller than io_min
> > "may incur delays in processing the command". And similarly, submitting
> > a command larger than io_opt "may incur delays in processing the
> > command".
> > 
> > IOW, the spec says "don't write less than an aligned multiple of the
> > stripe chunk size" and "don't write more than an aligned full
> > stripe". That leaves "aligned multiples of the stripe chunk size but
> > less than the full stripe width" unaccounted for. And I guess that's
> > what the bcache flag is trying to capture.
> 
> Maybe any time io_opt is provided then partial_stripes_expensive should be 
> flagged too and any code to the contrary should be removed?
> 
> Question: Does anyone have a reason to keep partial_stripes_expensive in 
> the kernel at all?
> 
> > SCSI doesn't go into details about RAID levels and other implementation
> > details which is why the wording is deliberately vague. But obviously
> > the expectation is that partial stripe writes are slower than full.
> > 
> > In my book any component in the stack that sees either io_min or io_opt
> > should try very hard to send I/Os that are aligned multiples of those
> > values. I am not opposed to letting users manually twiddle the
> > settings. But I do think that we should aim for the stack doing the
> > right thing when it sees io_opt reported on a device.
> 
> Agreed, thanks for the feedback!
> 
> -Eric
> 
> 
> > 
> > -- 
> > Martin K. Petersen	Oracle Linux Engineering
> > 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2022-01-08 21:51               ` Eric Wheeler
@ 2022-01-10 16:14                 ` Martin K. Petersen
  2022-01-10 23:30                   ` Eric Wheeler
  0 siblings, 1 reply; 17+ messages in thread
From: Martin K. Petersen @ 2022-01-10 16:14 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: Kent Overstreet, Martin K. Petersen, Coly Li, linux-block,
	Jonathan Corbet, open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE)


Eric,

> Ok so `git blame` shows that Kent added this to md/raid5.c in 
> c78afc6261b (Kent Overstreet 2013-07-11 22:39:53 -0700 7526)

Yep.

> Do you think we should leave the md-specific
> raid_partial_stripes_expensive setting and require users of RAID
> controllers to set the bit themselves in bcache---or---remove all
> raid_partial_stripes_expensive code and always treat writes as
> "expensive" when `opt_io` is defined?

I'd prefer the latter since that was the very intent of exporting the
device topology in an abstract and protocol-independent fashion.
However, I don't know enough about bcache internals to know whether it
is always the right choice, what the trade-offs are, etc.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2022-01-10 16:14                 ` Martin K. Petersen
@ 2022-01-10 23:30                   ` Eric Wheeler
  2022-01-11  2:55                     ` Martin K. Petersen
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Wheeler @ 2022-01-10 23:30 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Kent Overstreet, Coly Li, linux-block, Jonathan Corbet,
	open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE)

On Mon, 10 Jan 2022, Martin K. Petersen wrote:
> > Ok so `git blame` shows that Kent added this to md/raid5.c in 
> > c78afc6261b (Kent Overstreet 2013-07-11 22:39:53 -0700 7526)
> 
> Yep.
> 
> > Do you think we should leave the md-specific
> > raid_partial_stripes_expensive setting and require users of RAID
> > controllers to set the bit themselves in bcache---or---remove all
> > raid_partial_stripes_expensive code and always treat writes as
> > "expensive" when `opt_io` is defined?
> 
> I'd prefer the latter since that was the very intent of exporting the
> device topology in an abstract and protocol-independent fashion.
> However, I don't know enough about bcache internals to know whether it
> is always the right choice, what the trade-offs are, etc.

I've not looked in detail, but I don't think its too difficult.

Do you know if io_opt can be configured on scsi devices that do not have 
io_opt provided by the controller?  If so, or if it would be easy to add, 
then configuring io_opt at the scsi layer is probably a better option so 
subsystems besides bcache would benefit.

What do you think?

--
Eric Wheeler


> 
> -- 
> Martin K. Petersen	Oracle Linux Engineering
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6
  2022-01-10 23:30                   ` Eric Wheeler
@ 2022-01-11  2:55                     ` Martin K. Petersen
  0 siblings, 0 replies; 17+ messages in thread
From: Martin K. Petersen @ 2022-01-11  2:55 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: Martin K. Petersen, Kent Overstreet, Coly Li, linux-block,
	Jonathan Corbet, open list:DOCUMENTATION, open list,
	open list:BCACHE (BLOCK LAYER CACHE)


Eric,

> Do you know if io_opt can be configured on scsi devices that do not
> have io_opt provided by the controller?  If so, or if it would be easy
> to add, then configuring io_opt at the scsi layer is probably a better
> option so subsystems besides bcache would benefit.

There is currently no way to do this in the SCSI sysfs interface.
However, io_opt is a queue_limits parameter. The appropriate location
for modification is blk-sysfs.c. We already allow overriding other block
device properties such as rotational, stable_writes, discards, max I/O
size, etc. I don't see any particular reason why we couldn't also permit
overriding the optimal I/O size.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-01-11  2:55 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <d3f7fd44-9287-c7fa-ee95-c3b8a4d56c93@suse.de>
2019-06-22 23:16 ` [PATCH] bcache: make stripe_size configurable and persistent for hardware raid5/6 Eric Wheeler
2019-06-23  0:41   ` Martin K. Petersen
2019-06-24  6:57   ` Coly Li
2019-06-24  7:05   ` Coly Li
2019-06-24 18:14     ` Eric Wheeler
2019-06-24 23:24       ` Martin K. Petersen
2019-06-26  0:23         ` Eric Wheeler
2019-06-26  2:50           ` Martin K. Petersen
2019-06-25  1:59       ` Coly Li
2022-01-06  3:29         ` Eric Wheeler
2022-01-06 16:17           ` Coly Li
2022-01-08  0:21           ` Martin K. Petersen
2022-01-08  4:54             ` Eric Wheeler
2022-01-08 21:51               ` Eric Wheeler
2022-01-10 16:14                 ` Martin K. Petersen
2022-01-10 23:30                   ` Eric Wheeler
2022-01-11  2:55                     ` Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).