linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/8] block: add zone write granularity limit
@ 2021-01-28  4:47 Damien Le Moal
  2021-01-28  4:47 ` [PATCH v4 1/8] block: document zone_append_max_bytes attribute Damien Le Moal
                   ` (9 more replies)
  0 siblings, 10 replies; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  4:47 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

The first patch of this series adds missing documentation of the
zone_append_max_bytes sysfs attribute.

The following 3 patches are cleanup and preparatory patches for the
introduction of the zone write granularity limit. The goal of these
patches is to have all code setting a device queue zoned model to use
the helper function blk_queue_set_zoned(). The nvme driver, null_blk
driver and the partition code are modified to do so.

The fourth patch in this series introduces the zone write granularity
queue limit to indicate the alignment constraint for write operations
into sequential zones of zoned block devices. This limit is always set
by default to the device logical block size. The following patch
documents this new limit.

The last 2 patches introduce the blk_queue_clear_zone_settings()
function and modify the SCSI sd driver to clear the zone related queue
limits and resources of a host-aware zoned disk that is changed to a
regular disk due to the presence of partitions.

Changes from v3:
* Added pathces 2, 3, 4, 7 and 8
* Addressed Christoph's comments on patch 5

Changes from v2:
* Added patch 3 for zonefs
* Addressed Christoph's comments on patch 1 and added the limit
  initialization for zoned nullblk

Changes from v1:
* Fixed typo in patch 2

Damien Le Moal (8):
  block: document zone_append_max_bytes attribute
  nvme: cleanup zone information initialization
  nullb: use blk_queue_set_zoned() to setup zoned devices
  block: use blk_queue_set_zoned in add_partition()
  block: introduce zone_write_granularity limit
  zonefs: use zone write granularity as block size
  block: introduce blk_queue_clear_zone_settings()
  sd_zbc: clear zone resources for non-zoned case

Damien Le Moal (8):
  block: document zone_append_max_bytes attribute
  nvme: cleanup zone information initialization
  nullb: use blk_queue_set_zoned() to setup zoned devices
  block: use blk_queue_set_zoned in add_partition()
  block: introduce zone_write_granularity limit
  zonefs: use zone write granularity as block size
  block: introduce blk_queue_clear_zone_settings()
  sd_zbc: clear zone resources for non-zoned case

 Documentation/block/queue-sysfs.rst | 13 +++++++++
 block/blk-settings.c                | 39 +++++++++++++++++++++++++-
 block/blk-sysfs.c                   |  8 ++++++
 block/blk-zoned.c                   | 17 ++++++++++++
 block/blk.h                         |  2 ++
 block/partitions/core.c             |  2 +-
 drivers/block/null_blk/zoned.c      |  8 +++---
 drivers/nvme/host/core.c            | 11 ++++----
 drivers/nvme/host/zns.c             | 11 ++------
 drivers/scsi/sd_zbc.c               | 43 ++++++++++++++++++++++++++---
 fs/zonefs/super.c                   |  9 +++---
 include/linux/blkdev.h              | 15 ++++++++++
 12 files changed, 150 insertions(+), 28 deletions(-)

-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v4 1/8] block: document zone_append_max_bytes attribute
  2021-01-28  4:47 [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
@ 2021-01-28  4:47 ` Damien Le Moal
  2021-01-28 10:20   ` Johannes Thumshirn
  2021-01-28  4:47 ` [PATCH v4 2/8] nvme: cleanup zone information initialization Damien Le Moal
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  4:47 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

The description of the zone_append_max_bytes sysfs queue attribute is
missing from Documentation/block/queue-sysfs.rst. Add it.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 Documentation/block/queue-sysfs.rst | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/block/queue-sysfs.rst b/Documentation/block/queue-sysfs.rst
index 2638d3446b79..edc6e6960b96 100644
--- a/Documentation/block/queue-sysfs.rst
+++ b/Documentation/block/queue-sysfs.rst
@@ -261,6 +261,12 @@ For block drivers that support REQ_OP_WRITE_ZEROES, the maximum number of
 bytes that can be zeroed at once. The value 0 means that REQ_OP_WRITE_ZEROES
 is not supported.
 
+zone_append_max_bytes (RO)
+--------------------------
+This is the maximum number of bytes that can be written to a sequential
+zone of a zoned block device using a zone append write operation
+(REQ_OP_ZONE_APPEND). This value is always 0 for regular block devices.
+
 zoned (RO)
 ----------
 This indicates if the device is a zoned block device and the zone model of the
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 2/8] nvme: cleanup zone information initialization
  2021-01-28  4:47 [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
  2021-01-28  4:47 ` [PATCH v4 1/8] block: document zone_append_max_bytes attribute Damien Le Moal
@ 2021-01-28  4:47 ` Damien Le Moal
  2021-01-28  9:17   ` Christoph Hellwig
  2021-01-28  4:47 ` [PATCH v4 3/8] nullb: use blk_queue_set_zoned() to setup zoned devices Damien Le Moal
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  4:47 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

For a zoned namespace, in nvme_update_ns_info(), call
nvme_update_zone_info() after executing nvme_update_disk_info() so that
the namespace queue logical and physical block size limits are set.
This allows setting the namespace queue max_zone_append_sectors limit
in nvme_update_zone_info() instead of nvme_revalidate_zones(),
simplifying this function. Also use blk_queue_set_zoned() to set the
namespace zoned model.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/nvme/host/core.c | 11 ++++++-----
 drivers/nvme/host/zns.c  | 11 +++--------
 2 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1a3cdc6b1036..81a1c7f6223f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2176,17 +2176,18 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_id_ns *id)
 	ns->lba_shift = id->lbaf[lbaf].ds;
 	nvme_set_queue_limits(ns->ctrl, ns->queue);
 
+	ret = nvme_configure_metadata(ns, id);
+	if (ret)
+		goto out_unfreeze;
+	nvme_set_chunk_sectors(ns, id);
+	nvme_update_disk_info(ns->disk, ns, id);
+
 	if (ns->head->ids.csi == NVME_CSI_ZNS) {
 		ret = nvme_update_zone_info(ns, lbaf);
 		if (ret)
 			goto out_unfreeze;
 	}
 
-	ret = nvme_configure_metadata(ns, id);
-	if (ret)
-		goto out_unfreeze;
-	nvme_set_chunk_sectors(ns, id);
-	nvme_update_disk_info(ns->disk, ns, id);
 	blk_mq_unfreeze_queue(ns->disk->queue);
 
 	if (blk_queue_is_zoned(ns->queue)) {
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index 1dfe9a3500e3..c7e3ec561ba0 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -9,13 +9,7 @@
 
 int nvme_revalidate_zones(struct nvme_ns *ns)
 {
-	struct request_queue *q = ns->queue;
-	int ret;
-
-	ret = blk_revalidate_disk_zones(ns->disk, NULL);
-	if (!ret)
-		blk_queue_max_zone_append_sectors(q, ns->ctrl->max_zone_append);
-	return ret;
+	return blk_revalidate_disk_zones(ns->disk, NULL);
 }
 
 static int nvme_set_max_append(struct nvme_ctrl *ctrl)
@@ -109,10 +103,11 @@ int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf)
 		goto free_data;
 	}
 
-	q->limits.zoned = BLK_ZONED_HM;
+	blk_queue_set_zoned(ns->disk, BLK_ZONED_HM);
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 	blk_queue_max_open_zones(q, le32_to_cpu(id->mor) + 1);
 	blk_queue_max_active_zones(q, le32_to_cpu(id->mar) + 1);
+	blk_queue_max_zone_append_sectors(q, ns->ctrl->max_zone_append);
 free_data:
 	kfree(id);
 	return status;
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 3/8] nullb: use blk_queue_set_zoned() to setup zoned devices
  2021-01-28  4:47 [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
  2021-01-28  4:47 ` [PATCH v4 1/8] block: document zone_append_max_bytes attribute Damien Le Moal
  2021-01-28  4:47 ` [PATCH v4 2/8] nvme: cleanup zone information initialization Damien Le Moal
@ 2021-01-28  4:47 ` Damien Le Moal
  2021-01-28  5:12   ` Chaitanya Kulkarni
                     ` (2 more replies)
  2021-01-28  4:47 ` [PATCH v4 4/8] block: use blk_queue_set_zoned in add_partition() Damien Le Moal
                   ` (6 subsequent siblings)
  9 siblings, 3 replies; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  4:47 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

Use blk_queue_set_zoned() to set a nullb device zone model instead of
directly assigning the device queue zoned limit. This initialization of
the devicve zoned model as well as the setup of the queue flag
QUEUE_FLAG_ZONE_RESETALL and of the device queue elevator feature are
moved from null_init_zoned_dev() to null_register_zoned_dev() so that
the initialization of the queue limits is done when the gendisk of the
nullb device is available.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/block/null_blk/zoned.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/block/null_blk/zoned.c b/drivers/block/null_blk/zoned.c
index 148b871f263b..78cae8703dcf 100644
--- a/drivers/block/null_blk/zoned.c
+++ b/drivers/block/null_blk/zoned.c
@@ -146,10 +146,6 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
 		sector += dev->zone_size_sects;
 	}
 
-	q->limits.zoned = BLK_ZONED_HM;
-	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
-	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
-
 	return 0;
 }
 
@@ -158,6 +154,10 @@ int null_register_zoned_dev(struct nullb *nullb)
 	struct nullb_device *dev = nullb->dev;
 	struct request_queue *q = nullb->q;
 
+	blk_queue_set_zoned(nullb->disk, BLK_ZONED_HM);
+	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
+	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
+
 	if (queue_is_mq(q)) {
 		int ret = blk_revalidate_disk_zones(nullb->disk, NULL);
 
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 4/8] block: use blk_queue_set_zoned in add_partition()
  2021-01-28  4:47 [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
                   ` (2 preceding siblings ...)
  2021-01-28  4:47 ` [PATCH v4 3/8] nullb: use blk_queue_set_zoned() to setup zoned devices Damien Le Moal
@ 2021-01-28  4:47 ` Damien Le Moal
  2021-01-28  5:16   ` Chaitanya Kulkarni
                     ` (2 more replies)
  2021-01-28  4:47 ` [PATCH v4 5/8] block: introduce zone_write_granularity limit Damien Le Moal
                   ` (5 subsequent siblings)
  9 siblings, 3 replies; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  4:47 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

When changing the zoned model of host-aware zoned block devices, use
blk_queue_set_zoned() instead of directly assigning the gendisk queue
zoned limit.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/partitions/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/partitions/core.c b/block/partitions/core.c
index b1cdf88f96e2..d6094203116a 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -334,7 +334,7 @@ static struct block_device *add_partition(struct gendisk *disk, int partno,
 	case BLK_ZONED_HA:
 		pr_info("%s: disabling host aware zoned block device support due to partitions\n",
 			disk->disk_name);
-		disk->queue->limits.zoned = BLK_ZONED_NONE;
+		blk_queue_set_zoned(disk, BLK_ZONED_NONE);
 		break;
 	case BLK_ZONED_NONE:
 		break;
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 5/8] block: introduce zone_write_granularity limit
  2021-01-28  4:47 [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
                   ` (3 preceding siblings ...)
  2021-01-28  4:47 ` [PATCH v4 4/8] block: use blk_queue_set_zoned in add_partition() Damien Le Moal
@ 2021-01-28  4:47 ` Damien Le Moal
  2021-01-28  9:19   ` Christoph Hellwig
                     ` (2 more replies)
  2021-01-28  4:47 ` [PATCH v4 6/8] zonefs: use zone write granularity as block size Damien Le Moal
                   ` (4 subsequent siblings)
  9 siblings, 3 replies; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  4:47 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

Per ZBC and ZAC specifications, host-managed SMR hard-disks mandate that
all writes into sequential write required zones be aligned to the device
physical block size. However, NVMe ZNS does not have this constraint and
allows write operations into sequential zones to be aligned to the
device logical block size. This inconsistency does not help with
software portability across device types.

To solve this, introduce the zone_write_granularity queue limit to
indicate the alignment constraint, in bytes, of write operations into
zones of a zoned block device. This new limit is exported as a
read-only sysfs queue attribute and the helper
blk_queue_zone_write_granularity() introduced for drivers to set this
limit.

The function blk_queue_set_zoned() is modified to set this new limit to
the device logical block size by default. NVMe ZNS devices as well as
zoned nullb devices use this default value as is. The scsi disk driver
is modified to execute the blk_queue_zone_write_granularity() helper to
set the zone write granularity of host-managed SMR disks to the disk
physical block size.

The accessor functions queue_zone_write_granularity() and
bdev_zone_write_granularity() are also introduced.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 Documentation/block/queue-sysfs.rst |  7 ++++++
 block/blk-settings.c                | 37 ++++++++++++++++++++++++++++-
 block/blk-sysfs.c                   |  8 +++++++
 drivers/scsi/sd_zbc.c               |  8 +++++++
 include/linux/blkdev.h              | 15 ++++++++++++
 5 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/Documentation/block/queue-sysfs.rst b/Documentation/block/queue-sysfs.rst
index edc6e6960b96..4dc7f0d499a8 100644
--- a/Documentation/block/queue-sysfs.rst
+++ b/Documentation/block/queue-sysfs.rst
@@ -279,4 +279,11 @@ devices are described in the ZBC (Zoned Block Commands) and ZAC
 do not support zone commands, they will be treated as regular block devices
 and zoned will report "none".
 
+zone_write_granularity (RO)
+---------------------------
+This indicates the alignment constraint, in bytes, for write operations in
+sequential zones of zoned block devices (devices with a zoned attributed
+that reports "host-managed" or "host-aware"). This value is always 0 for
+regular block devices.
+
 Jens Axboe <jens.axboe@oracle.com>, February 2009
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 4c974340f1a9..a1e66165adcf 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -60,6 +60,7 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->io_opt = 0;
 	lim->misaligned = 0;
 	lim->zoned = BLK_ZONED_NONE;
+	lim->zone_write_granularity = 0;
 }
 EXPORT_SYMBOL(blk_set_default_limits);
 
@@ -366,6 +367,28 @@ void blk_queue_physical_block_size(struct request_queue *q, unsigned int size)
 }
 EXPORT_SYMBOL(blk_queue_physical_block_size);
 
+/**
+ * blk_queue_zone_write_granularity - set zone write granularity for the queue
+ * @q:  the request queue for the zoned device
+ * @size:  the zone write granularity size, in bytes
+ *
+ * Description:
+ *   This should be set to the lowest possible size allowing to write in
+ *   sequential zones of a zoned block device.
+ */
+void blk_queue_zone_write_granularity(struct request_queue *q,
+				      unsigned int size)
+{
+	if (WARN_ON_ONCE(!blk_queue_is_zoned(q)))
+		return;
+
+	q->limits.zone_write_granularity = size;
+
+	if (q->limits.zone_write_granularity < q->limits.logical_block_size)
+		q->limits.zone_write_granularity = q->limits.logical_block_size;
+}
+EXPORT_SYMBOL_GPL(blk_queue_zone_write_granularity);
+
 /**
  * blk_queue_alignment_offset - set physical block alignment offset
  * @q:	the request queue for the device
@@ -631,6 +654,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 			t->discard_granularity;
 	}
 
+	t->zone_write_granularity = max(t->zone_write_granularity,
+					b->zone_write_granularity);
 	t->zoned = max(t->zoned, b->zoned);
 	return ret;
 }
@@ -847,6 +872,8 @@ EXPORT_SYMBOL_GPL(blk_queue_can_use_dma_map_merging);
  */
 void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
 {
+	struct request_queue *q = disk->queue;
+
 	switch (model) {
 	case BLK_ZONED_HM:
 		/*
@@ -875,7 +902,15 @@ void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
 		break;
 	}
 
-	disk->queue->limits.zoned = model;
+	q->limits.zoned = model;
+	if (model != BLK_ZONED_NONE) {
+		/*
+		 * Set the zone write granularity to the device logical block
+		 * size by default. The driver can change this value if needed.
+		 */
+		blk_queue_zone_write_granularity(q,
+						queue_logical_block_size(q));
+	}
 }
 EXPORT_SYMBOL_GPL(blk_queue_set_zoned);
 
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index b513f1683af0..ae39c7f3d83d 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -219,6 +219,12 @@ static ssize_t queue_write_zeroes_max_show(struct request_queue *q, char *page)
 		(unsigned long long)q->limits.max_write_zeroes_sectors << 9);
 }
 
+static ssize_t queue_zone_write_granularity_show(struct request_queue *q,
+						 char *page)
+{
+	return queue_var_show(queue_zone_write_granularity(q), page);
+}
+
 static ssize_t queue_zone_append_max_show(struct request_queue *q, char *page)
 {
 	unsigned long long max_sectors = q->limits.max_zone_append_sectors;
@@ -585,6 +591,7 @@ QUEUE_RO_ENTRY(queue_discard_zeroes_data, "discard_zeroes_data");
 QUEUE_RO_ENTRY(queue_write_same_max, "write_same_max_bytes");
 QUEUE_RO_ENTRY(queue_write_zeroes_max, "write_zeroes_max_bytes");
 QUEUE_RO_ENTRY(queue_zone_append_max, "zone_append_max_bytes");
+QUEUE_RO_ENTRY(queue_zone_write_granularity, "zone_write_granularity");
 
 QUEUE_RO_ENTRY(queue_zoned, "zoned");
 QUEUE_RO_ENTRY(queue_nr_zones, "nr_zones");
@@ -639,6 +646,7 @@ static struct attribute *queue_attrs[] = {
 	&queue_write_same_max_entry.attr,
 	&queue_write_zeroes_max_entry.attr,
 	&queue_zone_append_max_entry.attr,
+	&queue_zone_write_granularity_entry.attr,
 	&queue_nonrot_entry.attr,
 	&queue_zoned_entry.attr,
 	&queue_nr_zones_entry.attr,
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index cf07b7f93579..8293b29584b3 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -789,6 +789,14 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 	blk_queue_max_active_zones(q, 0);
 	nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks);
 
+	/*
+	 * Per ZBC and ZAC specifications, writes in sequential write required
+	 * zones of host-managed devices must be aligned to the device physical
+	 * block size.
+	 */
+	if (blk_queue_zoned_model(q) == BLK_ZONED_HM)
+		blk_queue_zone_write_granularity(q, sdkp->physical_block_size);
+
 	/* READ16/WRITE16 is mandatory for ZBC disks */
 	sdkp->device->use_16_for_rw = 1;
 	sdkp->device->use_10_for_rw = 0;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 0dea268bd61b..9149f4a5adb3 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -337,6 +337,7 @@ struct queue_limits {
 	unsigned int		max_zone_append_sectors;
 	unsigned int		discard_granularity;
 	unsigned int		discard_alignment;
+	unsigned int		zone_write_granularity;
 
 	unsigned short		max_segments;
 	unsigned short		max_integrity_segments;
@@ -1160,6 +1161,8 @@ extern void blk_queue_logical_block_size(struct request_queue *, unsigned int);
 extern void blk_queue_max_zone_append_sectors(struct request_queue *q,
 		unsigned int max_zone_append_sectors);
 extern void blk_queue_physical_block_size(struct request_queue *, unsigned int);
+void blk_queue_zone_write_granularity(struct request_queue *q,
+				      unsigned int size);
 extern void blk_queue_alignment_offset(struct request_queue *q,
 				       unsigned int alignment);
 void blk_queue_update_readahead(struct request_queue *q);
@@ -1473,6 +1476,18 @@ static inline int bdev_io_opt(struct block_device *bdev)
 	return queue_io_opt(bdev_get_queue(bdev));
 }
 
+static inline unsigned int
+queue_zone_write_granularity(const struct request_queue *q)
+{
+	return q->limits.zone_write_granularity;
+}
+
+static inline unsigned int
+bdev_zone_write_granularity(struct block_device *bdev)
+{
+	return queue_zone_write_granularity(bdev_get_queue(bdev));
+}
+
 static inline int queue_alignment_offset(const struct request_queue *q)
 {
 	if (q->limits.misaligned)
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 6/8] zonefs: use zone write granularity as block size
  2021-01-28  4:47 [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
                   ` (4 preceding siblings ...)
  2021-01-28  4:47 ` [PATCH v4 5/8] block: introduce zone_write_granularity limit Damien Le Moal
@ 2021-01-28  4:47 ` Damien Le Moal
  2021-01-28  5:17   ` Chaitanya Kulkarni
  2021-01-28 11:33   ` Johannes Thumshirn
  2021-01-28  4:47 ` [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings() Damien Le Moal
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  4:47 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

Zoned block devices have different granularity constraints for write
operations into sequential zones. E.g. ZBC and ZAC devices require that
writes be aligned to the device physical block size while NVMe ZNS
devices allow logical block size aligned write operations. To correctly
handle such difference, use the device zone write granularity limit to
set the block size of a zonefs volume, thus allowing the smallest
possible write unit for all zoned device types.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 fs/zonefs/super.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index ab68e27bb322..b9fb55b250ae 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -1581,12 +1581,11 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_time_gran	= 1;
 
 	/*
-	 * The block size is set to the device physical sector size to ensure
-	 * that write operations on 512e devices (512B logical block and 4KB
-	 * physical block) are always aligned to the device physical blocks,
-	 * as mandated by the ZBC/ZAC specifications.
+	 * The block size is set to the device zone write granularity to ensure
+	 * that write operations are always aligned according to the device
+	 * interface constraints.
 	 */
-	sb_set_blocksize(sb, bdev_physical_block_size(sb->s_bdev));
+	sb_set_blocksize(sb, bdev_zone_write_granularity(sb->s_bdev));
 	sbi->s_zone_sectors_shift = ilog2(bdev_zone_sectors(sb->s_bdev));
 	sbi->s_uid = GLOBAL_ROOT_UID;
 	sbi->s_gid = GLOBAL_ROOT_GID;
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings()
  2021-01-28  4:47 [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
                   ` (5 preceding siblings ...)
  2021-01-28  4:47 ` [PATCH v4 6/8] zonefs: use zone write granularity as block size Damien Le Moal
@ 2021-01-28  4:47 ` Damien Le Moal
  2021-01-28  5:26   ` Chaitanya Kulkarni
                     ` (2 more replies)
  2021-01-28  4:47 ` [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case Damien Le Moal
                   ` (2 subsequent siblings)
  9 siblings, 3 replies; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  4:47 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

Introduce the internal function blk_queue_clear_zone_settings() to
cleanup all limits and resources related to zoned block devices. This
new function is called from blk_queue_set_zoned() when a disk zoned
model is set to BLK_ZONED_NONE. This particular case can happens when a
partition is created on a host-aware scsi disk.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/blk-settings.c |  2 ++
 block/blk-zoned.c    | 17 +++++++++++++++++
 block/blk.h          |  2 ++
 3 files changed, 21 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index a1e66165adcf..7dd8be314ac6 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -910,6 +910,8 @@ void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
 		 */
 		blk_queue_zone_write_granularity(q,
 						queue_logical_block_size(q));
+	} else {
+		blk_queue_clear_zone_settings(q);
 	}
 }
 EXPORT_SYMBOL_GPL(blk_queue_set_zoned);
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 7a68b6e4300c..833978c02e60 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -549,3 +549,20 @@ int blk_revalidate_disk_zones(struct gendisk *disk,
 	return ret;
 }
 EXPORT_SYMBOL_GPL(blk_revalidate_disk_zones);
+
+void blk_queue_clear_zone_settings(struct request_queue *q)
+{
+	blk_mq_freeze_queue(q);
+
+	blk_queue_free_zone_bitmaps(q);
+	blk_queue_flag_clear(QUEUE_FLAG_ZONE_RESETALL, q);
+	q->required_elevator_features &= ~ELEVATOR_F_ZBD_SEQ_WRITE;
+	q->nr_zones = 0;
+	q->max_open_zones = 0;
+	q->max_active_zones = 0;
+	q->limits.chunk_sectors = 0;
+	q->limits.zone_write_granularity = 0;
+	q->limits.max_zone_append_sectors = 0;
+
+	blk_mq_unfreeze_queue(q);
+}
diff --git a/block/blk.h b/block/blk.h
index 0198335c5838..977d79a0d99a 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -333,8 +333,10 @@ struct bio *blk_next_bio(struct bio *bio, unsigned int nr_pages, gfp_t gfp);
 
 #ifdef CONFIG_BLK_DEV_ZONED
 void blk_queue_free_zone_bitmaps(struct request_queue *q);
+void blk_queue_clear_zone_settings(struct request_queue *q);
 #else
 static inline void blk_queue_free_zone_bitmaps(struct request_queue *q) {}
+static inline void blk_queue_clear_zone_settings(struct request_queue *q) {}
 #endif
 
 int blk_alloc_devt(struct block_device *part, dev_t *devt);
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case
  2021-01-28  4:47 [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
                   ` (6 preceding siblings ...)
  2021-01-28  4:47 ` [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings() Damien Le Moal
@ 2021-01-28  4:47 ` Damien Le Moal
  2021-01-28  5:38   ` Chaitanya Kulkarni
                     ` (3 more replies)
  2021-02-04  8:47 ` [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
  2021-02-10 14:45 ` Jens Axboe
  9 siblings, 4 replies; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  4:47 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

For host-aware ZBC disk, setting the device zoned model to BLK_ZONED_HA
using blk_queue_set_zoned() in sd_read_block_characteristics() may
result in the block device effective zoned model to be "none"
(BLK_ZONED_NONE) if partitions are present on the device. In this case,
sd_zbc_read_zones() should not setup the zone related queue limits for
the disk so that the device limits and configuration is consistent with
a regular disk and resources not uselessly allocated (e.g. the zone
write pointer tracking array for zone append emulation).

Furthermore, if the disk zoned model changes at run time due to the
creation of a partition by the user, the zone related resources can be
released.

Fix both problems by introducing the function sd_zbc_clear_zone_info()
to reset the scsi disk zone information and free resources and by
returning early in sd_zbc_read_zones() for a block device that has a
zoned model equal to BLK_ZONED_NONE.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/sd_zbc.c | 37 ++++++++++++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 8293b29584b3..03adb39293c2 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -665,12 +665,28 @@ static int sd_zbc_init_disk(struct scsi_disk *sdkp)
 	return 0;
 }
 
-void sd_zbc_release_disk(struct scsi_disk *sdkp)
+static void sd_zbc_clear_zone_info(struct scsi_disk *sdkp)
 {
+	/* Serialize against revalidate zones */
+	mutex_lock(&sdkp->rev_mutex);
+
 	kvfree(sdkp->zones_wp_offset);
 	sdkp->zones_wp_offset = NULL;
 	kfree(sdkp->zone_wp_update_buf);
 	sdkp->zone_wp_update_buf = NULL;
+
+	sdkp->nr_zones = 0;
+	sdkp->rev_nr_zones = 0;
+	sdkp->zone_blocks = 0;
+	sdkp->rev_zone_blocks = 0;
+
+	mutex_unlock(&sdkp->rev_mutex);
+}
+
+void sd_zbc_release_disk(struct scsi_disk *sdkp)
+{
+	if (sd_is_zoned(sdkp))
+		sd_zbc_clear_zone_info(sdkp);
 }
 
 static void sd_zbc_revalidate_zones_cb(struct gendisk *disk)
@@ -769,6 +785,21 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 		 */
 		return 0;
 
+	/* READ16/WRITE16 is mandatory for ZBC disks */
+	sdkp->device->use_16_for_rw = 1;
+	sdkp->device->use_10_for_rw = 0;
+
+	if (!blk_queue_is_zoned(q)) {
+		/*
+		 * This can happen for a host aware disk with partitions.
+		 * The block device zone information was already cleared
+		 * by blk_queue_set_zoned(). Only clear the scsi disk zone
+		 * information and exit early.
+		 */
+		sd_zbc_clear_zone_info(sdkp);
+		return 0;
+	}
+
 	/* Check zoned block device characteristics (unconstrained reads) */
 	ret = sd_zbc_check_zoned_characteristics(sdkp, buf);
 	if (ret)
@@ -797,10 +828,6 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 	if (blk_queue_zoned_model(q) == BLK_ZONED_HM)
 		blk_queue_zone_write_granularity(q, sdkp->physical_block_size);
 
-	/* READ16/WRITE16 is mandatory for ZBC disks */
-	sdkp->device->use_16_for_rw = 1;
-	sdkp->device->use_10_for_rw = 0;
-
 	sdkp->rev_nr_zones = nr_zones;
 	sdkp->rev_zone_blocks = zone_blocks;
 
-- 
2.29.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 3/8] nullb: use blk_queue_set_zoned() to setup zoned devices
  2021-01-28  4:47 ` [PATCH v4 3/8] nullb: use blk_queue_set_zoned() to setup zoned devices Damien Le Moal
@ 2021-01-28  5:12   ` Chaitanya Kulkarni
  2021-01-28  9:17   ` Christoph Hellwig
  2021-01-28 11:17   ` Johannes Thumshirn
  2 siblings, 0 replies; 38+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-28  5:12 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, linux-nvme, Martin K . Petersen, linux-scsi,
	Christoph Hellwig

On 1/27/21 20:47, Damien Le Moal wrote:
> Use blk_queue_set_zoned() to set a nullb device zone model instead of
> directly assigning the device queue zoned limit. This initialization of
> the devicve zoned model as well as the setup of the queue flag
> QUEUE_FLAG_ZONE_RESETALL and of the device queue elevator feature are
> moved from null_init_zoned_dev() to null_register_zoned_dev() so that
> the initialization of the queue limits is done when the gendisk of the
> nullb device is available.
>
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Looks good.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 4/8] block: use blk_queue_set_zoned in add_partition()
  2021-01-28  4:47 ` [PATCH v4 4/8] block: use blk_queue_set_zoned in add_partition() Damien Le Moal
@ 2021-01-28  5:16   ` Chaitanya Kulkarni
  2021-01-28  9:17   ` Christoph Hellwig
  2021-01-28 11:28   ` Johannes Thumshirn
  2 siblings, 0 replies; 38+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-28  5:16 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, linux-nvme, Martin K . Petersen, linux-scsi,
	Christoph Hellwig

On 1/27/21 20:47, Damien Le Moal wrote:
> When changing the zoned model of host-aware zoned block devices, use
> blk_queue_set_zoned() instead of directly assigning the gendisk queue
> zoned limit.
>
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Looks good.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 6/8] zonefs: use zone write granularity as block size
  2021-01-28  4:47 ` [PATCH v4 6/8] zonefs: use zone write granularity as block size Damien Le Moal
@ 2021-01-28  5:17   ` Chaitanya Kulkarni
  2021-01-28 11:33   ` Johannes Thumshirn
  1 sibling, 0 replies; 38+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-28  5:17 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, linux-nvme, Martin K . Petersen, linux-scsi,
	Christoph Hellwig

On 1/27/21 20:47, Damien Le Moal wrote:
> Zoned block devices have different granularity constraints for write
> operations into sequential zones. E.g. ZBC and ZAC devices require that
> writes be aligned to the device physical block size while NVMe ZNS
> devices allow logical block size aligned write operations. To correctly
> handle such difference, use the device zone write granularity limit to
> set the block size of a zonefs volume, thus allowing the smallest
> possible write unit for all zoned device types.
>
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Looks good.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings()
  2021-01-28  4:47 ` [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings() Damien Le Moal
@ 2021-01-28  5:26   ` Chaitanya Kulkarni
  2021-01-28  9:21   ` Christoph Hellwig
  2021-01-28 11:43   ` Johannes Thumshirn
  2 siblings, 0 replies; 38+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-28  5:26 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, linux-nvme, Martin K . Petersen, linux-scsi,
	Christoph Hellwig

On 1/27/21 20:47, Damien Le Moal wrote:
> Introduce the internal function blk_queue_clear_zone_settings() to
> cleanup all limits and resources related to zoned block devices. This
> new function is called from blk_queue_set_zoned() when a disk zoned
> model is set to BLK_ZONED_NONE. This particular case can happens when a
> partition is created on a host-aware scsi disk.
>
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
nit:- can be done at the time of applying the patch.
In the last line of the commit log :-

1. s/happens/happen/.
2. article 'a' on the same line with the noun 'partition'.

see it make sense otherwise ignore :-

Introduce the internal function blk_queue_clear_zone_settings() to
cleanup all limits and resources related to zoned block devices. This
new function is called from blk_queue_set_zoned() when a disk zoned
model is set to BLK_ZONED_NONE. This particular case can happen when
apartition is created on a host-aware scsi disk.

Either case LGTM,

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case
  2021-01-28  4:47 ` [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case Damien Le Moal
@ 2021-01-28  5:38   ` Chaitanya Kulkarni
  2021-01-28  5:40     ` Damien Le Moal
  2021-01-28  9:24   ` Christoph Hellwig
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 38+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-28  5:38 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, linux-nvme, Martin K . Petersen, linux-scsi,
	Christoph Hellwig

On 1/27/21 20:47, Damien Le Moal wrote:
> -void sd_zbc_release_disk(struct scsi_disk *sdkp)
> +static void sd_zbc_clear_zone_info(struct scsi_disk *sdkp)
>  {
> +	/* Serialize against revalidate zones */
> +	mutex_lock(&sdkp->rev_mutex);
> +
>  	kvfree(sdkp->zones_wp_offset);
>  	sdkp->zones_wp_offset = NULL;
>  	kfree(sdkp->zone_wp_update_buf);
>  	sdkp->zone_wp_update_buf = NULL;
> +
> +	sdkp->nr_zones = 0;
> +	sdkp->rev_nr_zones = 0;
> +	sdkp->zone_blocks = 0;
> +	sdkp->rev_zone_blocks = 0;
> +
> +	mutex_unlock(&sdkp->rev_mutex);
> +}
> +
> +void sd_zbc_release_disk(struct scsi_disk *sdkp)
> +{
> +	if (sd_is_zoned(sdkp))
> +		sd_zbc_clear_zone_info(sdkp);
>  }
>  
If I'm not wrong there is only one caller for sd_zbc_clear_zone_info().
Is there any reason why sd_zbc_clear_zone_info() is notopen coded with
a meaningful comment in sd_zbc_release_disk() ? e.g. :-

void sd_zbc_release_disk(struct scsi_disk *sdkp)
{
	if (!sd_is_zoned(sdkp))
		return; 
	/* Serialize against revalidate zones */
	mutex_lock(&sdkp->rev_mutex);

 	kvfree(sdkp->zones_wp_offset);
 	sdkp->zones_wp_offset = NULL;
 	kfree(sdkp->zone_wp_update_buf);
 	sdkp->zone_wp_update_buf = NULL;

	/* clear zone info */
	sdkp->nr_zones = 0;
	sdkp->rev_nr_zones = 0;
	sdkp->zone_blocks = 0;
	sdkp->rev_zone_blocks = 0;

	mutex_unlock(&sdkp->rev_mutex);
 }


unless I miss something, in either case LGTM.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case
  2021-01-28  5:38   ` Chaitanya Kulkarni
@ 2021-01-28  5:40     ` Damien Le Moal
  0 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  5:40 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-block, Jens Axboe
  Cc: Keith Busch, linux-nvme, Martin K . Petersen, linux-scsi,
	Christoph Hellwig

On 2021/01/28 14:38, Chaitanya Kulkarni wrote:
> On 1/27/21 20:47, Damien Le Moal wrote:
>> -void sd_zbc_release_disk(struct scsi_disk *sdkp)
>> +static void sd_zbc_clear_zone_info(struct scsi_disk *sdkp)
>>  {
>> +	/* Serialize against revalidate zones */
>> +	mutex_lock(&sdkp->rev_mutex);
>> +
>>  	kvfree(sdkp->zones_wp_offset);
>>  	sdkp->zones_wp_offset = NULL;
>>  	kfree(sdkp->zone_wp_update_buf);
>>  	sdkp->zone_wp_update_buf = NULL;
>> +
>> +	sdkp->nr_zones = 0;
>> +	sdkp->rev_nr_zones = 0;
>> +	sdkp->zone_blocks = 0;
>> +	sdkp->rev_zone_blocks = 0;
>> +
>> +	mutex_unlock(&sdkp->rev_mutex);
>> +}
>> +
>> +void sd_zbc_release_disk(struct scsi_disk *sdkp)
>> +{
>> +	if (sd_is_zoned(sdkp))
>> +		sd_zbc_clear_zone_info(sdkp);
>>  }
>>  
> If I'm not wrong there is only one caller for sd_zbc_clear_zone_info().
> Is there any reason why sd_zbc_clear_zone_info() is notopen coded with
> a meaningful comment in sd_zbc_release_disk() ? e.g. :-

There are 2 call sites: sd_zbc_read_zones() and sd_zbc_release_disk().

> 
> void sd_zbc_release_disk(struct scsi_disk *sdkp)
> {
> 	if (!sd_is_zoned(sdkp))
> 		return; 
> 	/* Serialize against revalidate zones */
> 	mutex_lock(&sdkp->rev_mutex);
> 
>  	kvfree(sdkp->zones_wp_offset);
>  	sdkp->zones_wp_offset = NULL;
>  	kfree(sdkp->zone_wp_update_buf);
>  	sdkp->zone_wp_update_buf = NULL;
> 
> 	/* clear zone info */
> 	sdkp->nr_zones = 0;
> 	sdkp->rev_nr_zones = 0;
> 	sdkp->zone_blocks = 0;
> 	sdkp->rev_zone_blocks = 0;
> 
> 	mutex_unlock(&sdkp->rev_mutex);
>  }
> 
> 
> unless I miss something, in either case LGTM.
> 
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> 
> 


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 2/8] nvme: cleanup zone information initialization
  2021-01-28  4:47 ` [PATCH v4 2/8] nvme: cleanup zone information initialization Damien Le Moal
@ 2021-01-28  9:17   ` Christoph Hellwig
  2021-01-28  9:27     ` Damien Le Moal
  0 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2021-01-28  9:17 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Jens Axboe, Keith Busch, linux-scsi, Martin K . Petersen,
	Chaitanya Kulkarni, linux-nvme, linux-block, Christoph Hellwig

>  int nvme_revalidate_zones(struct nvme_ns *ns)
>  {
> -	struct request_queue *q = ns->queue;
> -	int ret;
> -
> -	ret = blk_revalidate_disk_zones(ns->disk, NULL);
> -	if (!ret)
> -		blk_queue_max_zone_append_sectors(q, ns->ctrl->max_zone_append);
> -	return ret;
> +	return blk_revalidate_disk_zones(ns->disk, NULL);

We can just kill off nvme_revalidate_zones now and open code it in
the caller as the stub is no needed now that blk_queue_is_zoned always
return false for the !CONFIG_BLK_DEV_ZONED case.

Otherwise this look great, nice cleanup:

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 3/8] nullb: use blk_queue_set_zoned() to setup zoned devices
  2021-01-28  4:47 ` [PATCH v4 3/8] nullb: use blk_queue_set_zoned() to setup zoned devices Damien Le Moal
  2021-01-28  5:12   ` Chaitanya Kulkarni
@ 2021-01-28  9:17   ` Christoph Hellwig
  2021-01-28 11:17   ` Johannes Thumshirn
  2 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2021-01-28  9:17 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Jens Axboe, Keith Busch, linux-scsi, Martin K . Petersen,
	Chaitanya Kulkarni, linux-nvme, linux-block, Christoph Hellwig

On Thu, Jan 28, 2021 at 01:47:28PM +0900, Damien Le Moal wrote:
> Use blk_queue_set_zoned() to set a nullb device zone model instead of
> directly assigning the device queue zoned limit. This initialization of
> the devicve zoned model as well as the setup of the queue flag
> QUEUE_FLAG_ZONE_RESETALL and of the device queue elevator feature are
> moved from null_init_zoned_dev() to null_register_zoned_dev() so that
> the initialization of the queue limits is done when the gendisk of the
> nullb device is available.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 4/8] block: use blk_queue_set_zoned in add_partition()
  2021-01-28  4:47 ` [PATCH v4 4/8] block: use blk_queue_set_zoned in add_partition() Damien Le Moal
  2021-01-28  5:16   ` Chaitanya Kulkarni
@ 2021-01-28  9:17   ` Christoph Hellwig
  2021-01-28 11:28   ` Johannes Thumshirn
  2 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2021-01-28  9:17 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Jens Axboe, Keith Busch, linux-scsi, Martin K . Petersen,
	Chaitanya Kulkarni, linux-nvme, linux-block, Christoph Hellwig

On Thu, Jan 28, 2021 at 01:47:29PM +0900, Damien Le Moal wrote:
> When changing the zoned model of host-aware zoned block devices, use
> blk_queue_set_zoned() instead of directly assigning the gendisk queue
> zoned limit.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/8] block: introduce zone_write_granularity limit
  2021-01-28  4:47 ` [PATCH v4 5/8] block: introduce zone_write_granularity limit Damien Le Moal
@ 2021-01-28  9:19   ` Christoph Hellwig
  2021-01-28 11:32   ` Johannes Thumshirn
  2021-02-05  2:54   ` Martin K. Petersen
  2 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2021-01-28  9:19 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Jens Axboe, Keith Busch, linux-scsi, Martin K . Petersen,
	Chaitanya Kulkarni, linux-nvme, linux-block, Christoph Hellwig

> +	t->zone_write_granularity = max(t->zone_write_granularity,
> +					b->zone_write_granularity);
>  	t->zoned = max(t->zoned, b->zoned);

Totally superficial nit:  it would read a little nicer if
zone_write_granularity is assigned after the zoned value.

Otherwise this looks perfect.

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings()
  2021-01-28  4:47 ` [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings() Damien Le Moal
  2021-01-28  5:26   ` Chaitanya Kulkarni
@ 2021-01-28  9:21   ` Christoph Hellwig
  2021-01-28  9:32     ` Damien Le Moal
  2021-01-28 11:43   ` Johannes Thumshirn
  2 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2021-01-28  9:21 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Jens Axboe, Keith Busch, linux-scsi, Martin K . Petersen,
	Chaitanya Kulkarni, linux-nvme, linux-block, Christoph Hellwig

On Thu, Jan 28, 2021 at 01:47:32PM +0900, Damien Le Moal wrote:
> Introduce the internal function blk_queue_clear_zone_settings() to
> cleanup all limits and resources related to zoned block devices. This
> new function is called from blk_queue_set_zoned() when a disk zoned
> model is set to BLK_ZONED_NONE. This particular case can happens when a
> partition is created on a host-aware scsi disk.

Shouldn't we just do all this work when blk_queue_set_zoned is called
with a BLK_ZONED_NONE argument?  That seems like the more obvious API
to me.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case
  2021-01-28  4:47 ` [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case Damien Le Moal
  2021-01-28  5:38   ` Chaitanya Kulkarni
@ 2021-01-28  9:24   ` Christoph Hellwig
  2021-01-28  9:36     ` Damien Le Moal
  2021-01-28 11:48   ` Johannes Thumshirn
  2021-02-05  2:56   ` Martin K. Petersen
  3 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2021-01-28  9:24 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Jens Axboe, Keith Busch, linux-scsi, Martin K . Petersen,
	Chaitanya Kulkarni, linux-nvme, linux-block, Christoph Hellwig

On Thu, Jan 28, 2021 at 01:47:33PM +0900, Damien Le Moal wrote:
> For host-aware ZBC disk, setting the device zoned model to BLK_ZONED_HA
> using blk_queue_set_zoned() in sd_read_block_characteristics() may
> result in the block device effective zoned model to be "none"
> (BLK_ZONED_NONE) if partitions are present on the device. In this case,
> sd_zbc_read_zones() should not setup the zone related queue limits for
> the disk so that the device limits and configuration is consistent with
> a regular disk and resources not uselessly allocated (e.g. the zone
> write pointer tracking array for zone append emulation).
> 
> Furthermore, if the disk zoned model changes at run time due to the
> creation of a partition by the user, the zone related resources can be
> released.
> 
> Fix both problems by introducing the function sd_zbc_clear_zone_info()
> to reset the scsi disk zone information and free resources and by
> returning early in sd_zbc_read_zones() for a block device that has a
> zoned model equal to BLK_ZONED_NONE.

So creating the partition doesn't even call into the driver, which
means we'll leak the info for now.  But I guess the next revalidate
will simply clean it up, so it is not a major issue.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 2/8] nvme: cleanup zone information initialization
  2021-01-28  9:17   ` Christoph Hellwig
@ 2021-01-28  9:27     ` Damien Le Moal
  2021-01-28  9:32       ` Christoph Hellwig
  0 siblings, 1 reply; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  9:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K . Petersen, Keith Busch, Chaitanya Kulkarni,
	linux-nvme, linux-block, linux-scsi

On 2021/01/28 18:17, Christoph Hellwig wrote:
>>  int nvme_revalidate_zones(struct nvme_ns *ns)
>>  {
>> -	struct request_queue *q = ns->queue;
>> -	int ret;
>> -
>> -	ret = blk_revalidate_disk_zones(ns->disk, NULL);
>> -	if (!ret)
>> -		blk_queue_max_zone_append_sectors(q, ns->ctrl->max_zone_append);
>> -	return ret;
>> +	return blk_revalidate_disk_zones(ns->disk, NULL);
> 
> We can just kill off nvme_revalidate_zones now and open code it in
> the caller as the stub is no needed now that blk_queue_is_zoned always
> return false for the !CONFIG_BLK_DEV_ZONED case.

I tried that first, but it did not work. I end up with
blk_revalidate_disk_zones() undefined error with !CONFIG_BLK_DEV_ZONED.
This is because blk_queue_is_zoned() is *not* stubbed for !CONFIG_BLK_DEV_ZONED.
It will simply always return 0/none in that case. We would need to have:

if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) && blk_queue_is_zoned()) {
	ret = blk_revalidate_disk_zones(ns->disk, NULL);
	...

Or stub blk_queue_is_zoned()...

> 
> Otherwise this look great, nice cleanup:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings()
  2021-01-28  9:21   ` Christoph Hellwig
@ 2021-01-28  9:32     ` Damien Le Moal
  2021-01-28  9:33       ` Christoph Hellwig
  0 siblings, 1 reply; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  9:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K . Petersen, Keith Busch, Chaitanya Kulkarni,
	linux-nvme, linux-block, linux-scsi

On 2021/01/28 18:21, Christoph Hellwig wrote:
> On Thu, Jan 28, 2021 at 01:47:32PM +0900, Damien Le Moal wrote:
>> Introduce the internal function blk_queue_clear_zone_settings() to
>> cleanup all limits and resources related to zoned block devices. This
>> new function is called from blk_queue_set_zoned() when a disk zoned
>> model is set to BLK_ZONED_NONE. This particular case can happens when a
>> partition is created on a host-aware scsi disk.
> 
> Shouldn't we just do all this work when blk_queue_set_zoned is called
> with a BLK_ZONED_NONE argument?  That seems like the more obvious API
> to me.

That is what I did. blk_queue_set_zoned() calls blk_queue_clear_zone_settings()
for BLK_ZONED_NONE case. I simply did not open code the cleanups in that
functions because it is simpler to stub only blk_queue_clear_zone_settings()
rather than having conditionals in blk_queue_set_zoned(). That also puts the
cleanup function together with the code that allocates most resources in
blk-zoned.c. Easier to not overlook something.



-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 2/8] nvme: cleanup zone information initialization
  2021-01-28  9:27     ` Damien Le Moal
@ 2021-01-28  9:32       ` Christoph Hellwig
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2021-01-28  9:32 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Jens Axboe, Keith Busch, Martin K . Petersen, linux-scsi,
	Chaitanya Kulkarni, linux-nvme, linux-block, Christoph Hellwig

On Thu, Jan 28, 2021 at 09:27:41AM +0000, Damien Le Moal wrote:
> 
> I tried that first, but it did not work. I end up with
> blk_revalidate_disk_zones() undefined error with !CONFIG_BLK_DEV_ZONED.
> This is because blk_queue_is_zoned() is *not* stubbed for !CONFIG_BLK_DEV_ZONED.
> It will simply always return 0/none in that case. We would need to have:

Hmm.  blk_queue_is_zoned is calls blk_queue_zoned_model, which
always returns BLK_ZONED_NONE for !CONFIG_BLK_DEV_ZONED, and I thought we
rely on that elsewhere in nvme.  That is a fairly recent change from me,
though.

> 
> if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) && blk_queue_is_zoned()) {
> 	ret = blk_revalidate_disk_zones(ns->disk, NULL);
> 	...
> 
> Or stub blk_queue_is_zoned()...

If the above really doesn't work we should properly stub it out.

Anyway, I think we can go with your current patch:

Reviewed-by: Christoph Hellwig <hch@lst.de>

and refine all the zoned stubs later.  I have a few more ideas for
improvements there anyway.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings()
  2021-01-28  9:32     ` Damien Le Moal
@ 2021-01-28  9:33       ` Christoph Hellwig
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2021-01-28  9:33 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Jens Axboe, Keith Busch, Martin K . Petersen, linux-scsi,
	Chaitanya Kulkarni, linux-nvme, linux-block, Christoph Hellwig

On Thu, Jan 28, 2021 at 09:32:22AM +0000, Damien Le Moal wrote:
> On 2021/01/28 18:21, Christoph Hellwig wrote:
> > On Thu, Jan 28, 2021 at 01:47:32PM +0900, Damien Le Moal wrote:
> >> Introduce the internal function blk_queue_clear_zone_settings() to
> >> cleanup all limits and resources related to zoned block devices. This
> >> new function is called from blk_queue_set_zoned() when a disk zoned
> >> model is set to BLK_ZONED_NONE. This particular case can happens when a
> >> partition is created on a host-aware scsi disk.
> > 
> > Shouldn't we just do all this work when blk_queue_set_zoned is called
> > with a BLK_ZONED_NONE argument?  That seems like the more obvious API
> > to me.
> 
> That is what I did. blk_queue_set_zoned() calls blk_queue_clear_zone_settings()
> for BLK_ZONED_NONE case. I simply did not open code the cleanups in that
> functions because it is simpler to stub only blk_queue_clear_zone_settings()
> rather than having conditionals in blk_queue_set_zoned(). That also puts the
> cleanup function together with the code that allocates most resources in
> blk-zoned.c. Easier to not overlook something.

Ok, looks good to me then:

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case
  2021-01-28  9:24   ` Christoph Hellwig
@ 2021-01-28  9:36     ` Damien Le Moal
  0 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2021-01-28  9:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K . Petersen, Keith Busch, Chaitanya Kulkarni,
	linux-nvme, linux-block, linux-scsi

On 2021/01/28 18:24, Christoph Hellwig wrote:
> On Thu, Jan 28, 2021 at 01:47:33PM +0900, Damien Le Moal wrote:
>> For host-aware ZBC disk, setting the device zoned model to BLK_ZONED_HA
>> using blk_queue_set_zoned() in sd_read_block_characteristics() may
>> result in the block device effective zoned model to be "none"
>> (BLK_ZONED_NONE) if partitions are present on the device. In this case,
>> sd_zbc_read_zones() should not setup the zone related queue limits for
>> the disk so that the device limits and configuration is consistent with
>> a regular disk and resources not uselessly allocated (e.g. the zone
>> write pointer tracking array for zone append emulation).
>>
>> Furthermore, if the disk zoned model changes at run time due to the
>> creation of a partition by the user, the zone related resources can be
>> released.
>>
>> Fix both problems by introducing the function sd_zbc_clear_zone_info()
>> to reset the scsi disk zone information and free resources and by
>> returning early in sd_zbc_read_zones() for a block device that has a
>> zoned model equal to BLK_ZONED_NONE.
> 
> So creating the partition doesn't even call into the driver, which
> means we'll leak the info for now.  But I guess the next revalidate
> will simply clean it up, so it is not a major issue.

Exactly. But the leak is only for the sd level resources now.
blk_queue_set_zoned() cleans up everything else.
The super annoying thing is that deleting all partitions leaves that disk in
regular mode instead of returning it to zoned mode. Need to wait for a
revalidate or for a manual rescan for that to happen. I wonder if we should not
trigger a revalidate, always, to avoid that the user sees the device type
suddenly changing long after the partitions were deleted...
Adding such revalidate for partition creation would solve the "leak" problem (or
rather the lack of cleaning) too.

> 
> Looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 1/8] block: document zone_append_max_bytes attribute
  2021-01-28  4:47 ` [PATCH v4 1/8] block: document zone_append_max_bytes attribute Damien Le Moal
@ 2021-01-28 10:20   ` Johannes Thumshirn
  0 siblings, 0 replies; 38+ messages in thread
From: Johannes Thumshirn @ 2021-01-28 10:20 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 3/8] nullb: use blk_queue_set_zoned() to setup zoned devices
  2021-01-28  4:47 ` [PATCH v4 3/8] nullb: use blk_queue_set_zoned() to setup zoned devices Damien Le Moal
  2021-01-28  5:12   ` Chaitanya Kulkarni
  2021-01-28  9:17   ` Christoph Hellwig
@ 2021-01-28 11:17   ` Johannes Thumshirn
  2 siblings, 0 replies; 38+ messages in thread
From: Johannes Thumshirn @ 2021-01-28 11:17 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 4/8] block: use blk_queue_set_zoned in add_partition()
  2021-01-28  4:47 ` [PATCH v4 4/8] block: use blk_queue_set_zoned in add_partition() Damien Le Moal
  2021-01-28  5:16   ` Chaitanya Kulkarni
  2021-01-28  9:17   ` Christoph Hellwig
@ 2021-01-28 11:28   ` Johannes Thumshirn
  2 siblings, 0 replies; 38+ messages in thread
From: Johannes Thumshirn @ 2021-01-28 11:28 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/8] block: introduce zone_write_granularity limit
  2021-01-28  4:47 ` [PATCH v4 5/8] block: introduce zone_write_granularity limit Damien Le Moal
  2021-01-28  9:19   ` Christoph Hellwig
@ 2021-01-28 11:32   ` Johannes Thumshirn
  2021-02-05  2:54   ` Martin K. Petersen
  2 siblings, 0 replies; 38+ messages in thread
From: Johannes Thumshirn @ 2021-01-28 11:32 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 6/8] zonefs: use zone write granularity as block size
  2021-01-28  4:47 ` [PATCH v4 6/8] zonefs: use zone write granularity as block size Damien Le Moal
  2021-01-28  5:17   ` Chaitanya Kulkarni
@ 2021-01-28 11:33   ` Johannes Thumshirn
  1 sibling, 0 replies; 38+ messages in thread
From: Johannes Thumshirn @ 2021-01-28 11:33 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

Looks good,
Reviewed-by: Johannes Thumshirn <johanness.thumshirn@wdc.com>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings()
  2021-01-28  4:47 ` [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings() Damien Le Moal
  2021-01-28  5:26   ` Chaitanya Kulkarni
  2021-01-28  9:21   ` Christoph Hellwig
@ 2021-01-28 11:43   ` Johannes Thumshirn
  2 siblings, 0 replies; 38+ messages in thread
From: Johannes Thumshirn @ 2021-01-28 11:43 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case
  2021-01-28  4:47 ` [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case Damien Le Moal
  2021-01-28  5:38   ` Chaitanya Kulkarni
  2021-01-28  9:24   ` Christoph Hellwig
@ 2021-01-28 11:48   ` Johannes Thumshirn
  2021-02-05  2:56   ` Martin K. Petersen
  3 siblings, 0 replies; 38+ messages in thread
From: Johannes Thumshirn @ 2021-01-28 11:48 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@edc.com>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 0/8] block: add zone write granularity limit
  2021-01-28  4:47 [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
                   ` (7 preceding siblings ...)
  2021-01-28  4:47 ` [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case Damien Le Moal
@ 2021-02-04  8:47 ` Damien Le Moal
  2021-02-08 17:17   ` Christoph Hellwig
  2021-02-10 14:45 ` Jens Axboe
  9 siblings, 1 reply; 38+ messages in thread
From: Damien Le Moal @ 2021-02-04  8:47 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Keith Busch, linux-nvme, Martin K . Petersen, linux-scsi,
	Christoph Hellwig

On 2021/01/28 13:50, Damien Le Moal wrote:
> The first patch of this series adds missing documentation of the
> zone_append_max_bytes sysfs attribute.
> 
> The following 3 patches are cleanup and preparatory patches for the
> introduction of the zone write granularity limit. The goal of these
> patches is to have all code setting a device queue zoned model to use
> the helper function blk_queue_set_zoned(). The nvme driver, null_blk
> driver and the partition code are modified to do so.
> 
> The fourth patch in this series introduces the zone write granularity
> queue limit to indicate the alignment constraint for write operations
> into sequential zones of zoned block devices. This limit is always set
> by default to the device logical block size. The following patch
> documents this new limit.
> 
> The last 2 patches introduce the blk_queue_clear_zone_settings()
> function and modify the SCSI sd driver to clear the zone related queue
> limits and resources of a host-aware zoned disk that is changed to a
> regular disk due to the presence of partitions.

Hi Jens,

Any comment on this series ?

Martin,

The scsi bits (patch 5 and 8) may need your ack.

Thanks !

> 
> Changes from v3:
> * Added pathces 2, 3, 4, 7 and 8
> * Addressed Christoph's comments on patch 5
> 
> Changes from v2:
> * Added patch 3 for zonefs
> * Addressed Christoph's comments on patch 1 and added the limit
>   initialization for zoned nullblk
> 
> Changes from v1:
> * Fixed typo in patch 2
> 
> Damien Le Moal (8):
>   block: document zone_append_max_bytes attribute
>   nvme: cleanup zone information initialization
>   nullb: use blk_queue_set_zoned() to setup zoned devices
>   block: use blk_queue_set_zoned in add_partition()
>   block: introduce zone_write_granularity limit
>   zonefs: use zone write granularity as block size
>   block: introduce blk_queue_clear_zone_settings()
>   sd_zbc: clear zone resources for non-zoned case
> 
> Damien Le Moal (8):
>   block: document zone_append_max_bytes attribute
>   nvme: cleanup zone information initialization
>   nullb: use blk_queue_set_zoned() to setup zoned devices
>   block: use blk_queue_set_zoned in add_partition()
>   block: introduce zone_write_granularity limit
>   zonefs: use zone write granularity as block size
>   block: introduce blk_queue_clear_zone_settings()
>   sd_zbc: clear zone resources for non-zoned case
> 
>  Documentation/block/queue-sysfs.rst | 13 +++++++++
>  block/blk-settings.c                | 39 +++++++++++++++++++++++++-
>  block/blk-sysfs.c                   |  8 ++++++
>  block/blk-zoned.c                   | 17 ++++++++++++
>  block/blk.h                         |  2 ++
>  block/partitions/core.c             |  2 +-
>  drivers/block/null_blk/zoned.c      |  8 +++---
>  drivers/nvme/host/core.c            | 11 ++++----
>  drivers/nvme/host/zns.c             | 11 ++------
>  drivers/scsi/sd_zbc.c               | 43 ++++++++++++++++++++++++++---
>  fs/zonefs/super.c                   |  9 +++---
>  include/linux/blkdev.h              | 15 ++++++++++
>  12 files changed, 150 insertions(+), 28 deletions(-)
> 


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/8] block: introduce zone_write_granularity limit
  2021-01-28  4:47 ` [PATCH v4 5/8] block: introduce zone_write_granularity limit Damien Le Moal
  2021-01-28  9:19   ` Christoph Hellwig
  2021-01-28 11:32   ` Johannes Thumshirn
@ 2021-02-05  2:54   ` Martin K. Petersen
  2 siblings, 0 replies; 38+ messages in thread
From: Martin K. Petersen @ 2021-02-05  2:54 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Jens Axboe, Keith Busch, linux-scsi, Martin K . Petersen,
	Chaitanya Kulkarni, linux-nvme, linux-block, Christoph Hellwig


Damien,

> Per ZBC and ZAC specifications, host-managed SMR hard-disks mandate that
> all writes into sequential write required zones be aligned to the device
> physical block size. However, NVMe ZNS does not have this constraint and
> allows write operations into sequential zones to be aligned to the
> device logical block size. This inconsistency does not help with
> software portability across device types.
>
> To solve this, introduce the zone_write_granularity queue limit to
> indicate the alignment constraint, in bytes, of write operations into
> zones of a zoned block device. This new limit is exported as a
> read-only sysfs queue attribute and the helper
> blk_queue_zone_write_granularity() introduced for drivers to set this
> limit.
>
> The function blk_queue_set_zoned() is modified to set this new limit to
> the device logical block size by default. NVMe ZNS devices as well as
> zoned nullb devices use this default value as is. The scsi disk driver
> is modified to execute the blk_queue_zone_write_granularity() helper to
> set the zone write granularity of host-managed SMR disks to the disk
> physical block size.
>
> The accessor functions queue_zone_write_granularity() and
> bdev_zone_write_granularity() are also introduced.

Looks fine.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case
  2021-01-28  4:47 ` [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case Damien Le Moal
                     ` (2 preceding siblings ...)
  2021-01-28 11:48   ` Johannes Thumshirn
@ 2021-02-05  2:56   ` Martin K. Petersen
  3 siblings, 0 replies; 38+ messages in thread
From: Martin K. Petersen @ 2021-02-05  2:56 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Jens Axboe, Keith Busch, linux-scsi, Martin K . Petersen,
	Chaitanya Kulkarni, linux-nvme, linux-block, Christoph Hellwig


Damien,

> For host-aware ZBC disk, setting the device zoned model to
> BLK_ZONED_HA using blk_queue_set_zoned() in
> sd_read_block_characteristics() may result in the block device
> effective zoned model to be "none" (BLK_ZONED_NONE) if partitions are
> present on the device. In this case, sd_zbc_read_zones() should not
> setup the zone related queue limits for the disk so that the device
> limits and configuration is consistent with a regular disk and
> resources not uselessly allocated (e.g. the zone write pointer
> tracking array for zone append emulation).
>
> Furthermore, if the disk zoned model changes at run time due to the
> creation of a partition by the user, the zone related resources can be
> released.
>
> Fix both problems by introducing the function sd_zbc_clear_zone_info()
> to reset the scsi disk zone information and free resources and by
> returning early in sd_zbc_read_zones() for a block device that has a
> zoned model equal to BLK_ZONED_NONE.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 0/8] block: add zone write granularity limit
  2021-02-04  8:47 ` [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
@ 2021-02-08 17:17   ` Christoph Hellwig
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2021-02-08 17:17 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Jens Axboe, Keith Busch, linux-scsi, Martin K . Petersen,
	linux-nvme, linux-block, Christoph Hellwig

On Thu, Feb 04, 2021 at 08:47:51AM +0000, Damien Le Moal wrote:
> Any comment on this series ?
> 
> Martin,
> 
> The scsi bits (patch 5 and 8) may need your ack.

ping?  Jens, having this series really helps with some of the
ongoing zoned device work.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 0/8] block: add zone write granularity limit
  2021-01-28  4:47 [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
                   ` (8 preceding siblings ...)
  2021-02-04  8:47 ` [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
@ 2021-02-10 14:45 ` Jens Axboe
  9 siblings, 0 replies; 38+ messages in thread
From: Jens Axboe @ 2021-02-10 14:45 UTC (permalink / raw)
  To: Damien Le Moal, linux-block
  Cc: Keith Busch, Chaitanya Kulkarni, linux-nvme, linux-scsi,
	Martin K . Petersen, Christoph Hellwig

On 1/27/21 9:47 PM, Damien Le Moal wrote:
> The first patch of this series adds missing documentation of the
> zone_append_max_bytes sysfs attribute.
> 
> The following 3 patches are cleanup and preparatory patches for the
> introduction of the zone write granularity limit. The goal of these
> patches is to have all code setting a device queue zoned model to use
> the helper function blk_queue_set_zoned(). The nvme driver, null_blk
> driver and the partition code are modified to do so.
> 
> The fourth patch in this series introduces the zone write granularity
> queue limit to indicate the alignment constraint for write operations
> into sequential zones of zoned block devices. This limit is always set
> by default to the device logical block size. The following patch
> documents this new limit.
> 
> The last 2 patches introduce the blk_queue_clear_zone_settings()
> function and modify the SCSI sd driver to clear the zone related queue
> limits and resources of a host-aware zoned disk that is changed to a
> regular disk due to the presence of partitions.

Applied, thanks.

-- 
Jens Axboe


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2021-02-10 14:45 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-28  4:47 [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
2021-01-28  4:47 ` [PATCH v4 1/8] block: document zone_append_max_bytes attribute Damien Le Moal
2021-01-28 10:20   ` Johannes Thumshirn
2021-01-28  4:47 ` [PATCH v4 2/8] nvme: cleanup zone information initialization Damien Le Moal
2021-01-28  9:17   ` Christoph Hellwig
2021-01-28  9:27     ` Damien Le Moal
2021-01-28  9:32       ` Christoph Hellwig
2021-01-28  4:47 ` [PATCH v4 3/8] nullb: use blk_queue_set_zoned() to setup zoned devices Damien Le Moal
2021-01-28  5:12   ` Chaitanya Kulkarni
2021-01-28  9:17   ` Christoph Hellwig
2021-01-28 11:17   ` Johannes Thumshirn
2021-01-28  4:47 ` [PATCH v4 4/8] block: use blk_queue_set_zoned in add_partition() Damien Le Moal
2021-01-28  5:16   ` Chaitanya Kulkarni
2021-01-28  9:17   ` Christoph Hellwig
2021-01-28 11:28   ` Johannes Thumshirn
2021-01-28  4:47 ` [PATCH v4 5/8] block: introduce zone_write_granularity limit Damien Le Moal
2021-01-28  9:19   ` Christoph Hellwig
2021-01-28 11:32   ` Johannes Thumshirn
2021-02-05  2:54   ` Martin K. Petersen
2021-01-28  4:47 ` [PATCH v4 6/8] zonefs: use zone write granularity as block size Damien Le Moal
2021-01-28  5:17   ` Chaitanya Kulkarni
2021-01-28 11:33   ` Johannes Thumshirn
2021-01-28  4:47 ` [PATCH v4 7/8] block: introduce blk_queue_clear_zone_settings() Damien Le Moal
2021-01-28  5:26   ` Chaitanya Kulkarni
2021-01-28  9:21   ` Christoph Hellwig
2021-01-28  9:32     ` Damien Le Moal
2021-01-28  9:33       ` Christoph Hellwig
2021-01-28 11:43   ` Johannes Thumshirn
2021-01-28  4:47 ` [PATCH v4 8/8] sd_zbc: clear zone resources for non-zoned case Damien Le Moal
2021-01-28  5:38   ` Chaitanya Kulkarni
2021-01-28  5:40     ` Damien Le Moal
2021-01-28  9:24   ` Christoph Hellwig
2021-01-28  9:36     ` Damien Le Moal
2021-01-28 11:48   ` Johannes Thumshirn
2021-02-05  2:56   ` Martin K. Petersen
2021-02-04  8:47 ` [PATCH v4 0/8] block: add zone write granularity limit Damien Le Moal
2021-02-08 17:17   ` Christoph Hellwig
2021-02-10 14:45 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).