All of lore.kernel.org
 help / color / mirror / Atom feed
* [V2 PATCH 0/4] ZNS: Extra features for current patche
@ 2020-07-02  6:54 ` Javier González
  0 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  6:54 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-block, hch, kbusch, sagi, axboe, Damien.LeMoal, mb,
	Javier González

From: Javier González <javier.gonz@samsung.com>

This patchset extends zoned device functionality on top of the existing
ZNS patchset that Keith sent.

Changes since V1
  - Remove new IOCTLs for zone management and zone values. I believe we
    will need the zone management IOCTL to support new ZNS features in
    the near future, but we will wait until we upstream these too. In
    the meantime, feedback on the IOCTL is welcome. We drop the zone
    values IOCTL completely - Niklas send a patch to expose mar / mor
    through sysfs already - we will extend on this if needed.

  - Reimplement zone offline transition to be a IOCTL in itself, as the
    rest of the existing zone transitions. In the process, implement a
    way for drivers to report zone feature support in the newly added
    feature flags. Refactor support for capacity to adopt this model.
    Here, note that we have maintained behaviour for the scsi driver,
    even though zone_size == zone_capacity

  - Reimplement the zone count consistency check to do a correct zone
    calculation. Move this logic to the initialization logic, instead of
    doing it each time we report zones.

Thanks,
Javier

--- V1 cover letter ---

Patches 1-5 are zoned block interface and IOCTL additions to expose ZNS
values to user-space. One major change is the addition of a new zone
management IOCTL that allows to extend zone management commands with
flags. I recall a conversation in the mailing list from early this year
where a similar approach was proposed by Matias, but never made it
upstream. We extended the IOCTL here to align with the comments in that
thread. Here, we are happy to get sign-offs by anyone that contributed
to the thread - just comment here or on the patch.

Patch 6 is nvme-only and adds an extra check to the ZNS report code to
ensure consistency on the zone count.

The patches apply on top of Jens' block-5.8 + Keith's V3 ZNS patches.

Thanks,
Javier


Javier González (4):
  block: Add zone flags to queue zone prop.
  block: add support for zone offline transition
  nvme: Add consistency check for zone count
  block: add attributes to zone report

 block/blk-core.c               |  2 ++
 block/blk-zoned.c              | 11 +++++++--
 drivers/block/null_blk_zoned.c |  2 ++
 drivers/nvme/host/core.c       |  5 ++++-
 drivers/nvme/host/nvme.h       |  6 +++--
 drivers/nvme/host/zns.c        | 41 +++++++++++++++++++++++++++++++++-
 drivers/scsi/sd.c              |  2 ++
 drivers/scsi/sd_zbc.c          |  8 +++++--
 include/linux/blk_types.h      |  3 +++
 include/linux/blkdev.h         |  4 +++-
 include/uapi/linux/blkzoned.h  | 29 +++++++++++++++++++++++-
 11 files changed, 103 insertions(+), 10 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [V2 PATCH 0/4] ZNS: Extra features for current patche
@ 2020-07-02  6:54 ` Javier González
  0 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  6:54 UTC (permalink / raw)
  To: linux-nvme
  Cc: axboe, Damien.LeMoal, sagi, linux-block, kbusch,
	Javier González, mb, hch

From: Javier González <javier.gonz@samsung.com>

This patchset extends zoned device functionality on top of the existing
ZNS patchset that Keith sent.

Changes since V1
  - Remove new IOCTLs for zone management and zone values. I believe we
    will need the zone management IOCTL to support new ZNS features in
    the near future, but we will wait until we upstream these too. In
    the meantime, feedback on the IOCTL is welcome. We drop the zone
    values IOCTL completely - Niklas send a patch to expose mar / mor
    through sysfs already - we will extend on this if needed.

  - Reimplement zone offline transition to be a IOCTL in itself, as the
    rest of the existing zone transitions. In the process, implement a
    way for drivers to report zone feature support in the newly added
    feature flags. Refactor support for capacity to adopt this model.
    Here, note that we have maintained behaviour for the scsi driver,
    even though zone_size == zone_capacity

  - Reimplement the zone count consistency check to do a correct zone
    calculation. Move this logic to the initialization logic, instead of
    doing it each time we report zones.

Thanks,
Javier

--- V1 cover letter ---

Patches 1-5 are zoned block interface and IOCTL additions to expose ZNS
values to user-space. One major change is the addition of a new zone
management IOCTL that allows to extend zone management commands with
flags. I recall a conversation in the mailing list from early this year
where a similar approach was proposed by Matias, but never made it
upstream. We extended the IOCTL here to align with the comments in that
thread. Here, we are happy to get sign-offs by anyone that contributed
to the thread - just comment here or on the patch.

Patch 6 is nvme-only and adds an extra check to the ZNS report code to
ensure consistency on the zone count.

The patches apply on top of Jens' block-5.8 + Keith's V3 ZNS patches.

Thanks,
Javier


Javier González (4):
  block: Add zone flags to queue zone prop.
  block: add support for zone offline transition
  nvme: Add consistency check for zone count
  block: add attributes to zone report

 block/blk-core.c               |  2 ++
 block/blk-zoned.c              | 11 +++++++--
 drivers/block/null_blk_zoned.c |  2 ++
 drivers/nvme/host/core.c       |  5 ++++-
 drivers/nvme/host/nvme.h       |  6 +++--
 drivers/nvme/host/zns.c        | 41 +++++++++++++++++++++++++++++++++-
 drivers/scsi/sd.c              |  2 ++
 drivers/scsi/sd_zbc.c          |  8 +++++--
 include/linux/blk_types.h      |  3 +++
 include/linux/blkdev.h         |  4 +++-
 include/uapi/linux/blkzoned.h  | 29 +++++++++++++++++++++++-
 11 files changed, 103 insertions(+), 10 deletions(-)

-- 
2.17.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/4] block: Add zone flags to queue zone prop.
  2020-07-02  6:54 ` Javier González
@ 2020-07-02  6:54   ` Javier González
  -1 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  6:54 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-block, hch, kbusch, sagi, axboe, Damien.LeMoal, mb,
	Javier González, SelvaKumar S, Kanchan Joshi, Nitesh Shetty

From: Javier González <javier.gonz@samsung.com>

As the zoned block device will have to deal with features that are
optional for the backend device, add a flag field to inform the block
layer about supported features. This builds on top of
blk_zone_report_flags and extendes to the zone report code.

Signed-off-by: Javier González <javier.gonz@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
---
 block/blk-zoned.c              | 3 ++-
 drivers/block/null_blk_zoned.c | 2 ++
 drivers/nvme/host/zns.c        | 1 +
 drivers/scsi/sd.c              | 2 ++
 include/linux/blkdev.h         | 3 +++
 5 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 81152a260354..0f156e96e48f 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
 		return ret;
 
 	rep.nr_zones = ret;
-	rep.flags = BLK_ZONE_REP_CAPACITY;
+	rep.flags = q->zone_flags;
+
 	if (copy_to_user(argp, &rep, sizeof(struct blk_zone_report)))
 		return -EFAULT;
 	return 0;
diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
index b05832eb21b2..957c2103f240 100644
--- a/drivers/block/null_blk_zoned.c
+++ b/drivers/block/null_blk_zoned.c
@@ -78,6 +78,8 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
 	}
 
 	q->limits.zoned = BLK_ZONED_HM;
+	q->zone_flags = BLK_ZONE_REP_CAPACITY;
+
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
 
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index 0642d3c54e8f..888264261ba3 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -81,6 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
 	}
 
 	q->limits.zoned = BLK_ZONED_HM;
+	q->zone_flags = BLK_ZONE_REP_CAPACITY;
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 free_data:
 	kfree(id);
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d90fefffe31b..b9c920bace28 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2967,6 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 	if (sdkp->device->type == TYPE_ZBC) {
 		/* Host-managed */
 		q->limits.zoned = BLK_ZONED_HM;
+		q->zone_flags = BLK_ZONE_REP_CAPACITY;
 	} else {
 		sdkp->zoned = (buffer[8] >> 4) & 3;
 		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
@@ -2983,6 +2984,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 					  "Drive-managed SMR disk\n");
 		}
 	}
+
 	if (blk_queue_is_zoned(q) && sdkp->first_scan)
 		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
 		      q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8fd900998b4e..3f2e3425fa53 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -512,12 +512,15 @@ struct request_queue {
 	 * Stacking drivers (device mappers) may or may not initialize
 	 * these fields.
 	 *
+	 * Flags represent features as described by blk_zone_report_flags in blkzoned.h
+	 *
 	 * Reads of this information must be protected with blk_queue_enter() /
 	 * blk_queue_exit(). Modifying this information is only allowed while
 	 * no requests are being processed. See also blk_mq_freeze_queue() and
 	 * blk_mq_unfreeze_queue().
 	 */
 	unsigned int		nr_zones;
+	unsigned int		zone_flags;
 	unsigned long		*conv_zones_bitmap;
 	unsigned long		*seq_zones_wlock;
 #endif /* CONFIG_BLK_DEV_ZONED */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 1/4] block: Add zone flags to queue zone prop.
@ 2020-07-02  6:54   ` Javier González
  0 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  6:54 UTC (permalink / raw)
  To: linux-nvme
  Cc: axboe, Damien.LeMoal, SelvaKumar S, sagi, Kanchan Joshi,
	Nitesh Shetty, linux-block, kbusch, Javier González, mb,
	hch

From: Javier González <javier.gonz@samsung.com>

As the zoned block device will have to deal with features that are
optional for the backend device, add a flag field to inform the block
layer about supported features. This builds on top of
blk_zone_report_flags and extendes to the zone report code.

Signed-off-by: Javier González <javier.gonz@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
---
 block/blk-zoned.c              | 3 ++-
 drivers/block/null_blk_zoned.c | 2 ++
 drivers/nvme/host/zns.c        | 1 +
 drivers/scsi/sd.c              | 2 ++
 include/linux/blkdev.h         | 3 +++
 5 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 81152a260354..0f156e96e48f 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
 		return ret;
 
 	rep.nr_zones = ret;
-	rep.flags = BLK_ZONE_REP_CAPACITY;
+	rep.flags = q->zone_flags;
+
 	if (copy_to_user(argp, &rep, sizeof(struct blk_zone_report)))
 		return -EFAULT;
 	return 0;
diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
index b05832eb21b2..957c2103f240 100644
--- a/drivers/block/null_blk_zoned.c
+++ b/drivers/block/null_blk_zoned.c
@@ -78,6 +78,8 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
 	}
 
 	q->limits.zoned = BLK_ZONED_HM;
+	q->zone_flags = BLK_ZONE_REP_CAPACITY;
+
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
 
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index 0642d3c54e8f..888264261ba3 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -81,6 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
 	}
 
 	q->limits.zoned = BLK_ZONED_HM;
+	q->zone_flags = BLK_ZONE_REP_CAPACITY;
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 free_data:
 	kfree(id);
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d90fefffe31b..b9c920bace28 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2967,6 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 	if (sdkp->device->type == TYPE_ZBC) {
 		/* Host-managed */
 		q->limits.zoned = BLK_ZONED_HM;
+		q->zone_flags = BLK_ZONE_REP_CAPACITY;
 	} else {
 		sdkp->zoned = (buffer[8] >> 4) & 3;
 		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
@@ -2983,6 +2984,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 					  "Drive-managed SMR disk\n");
 		}
 	}
+
 	if (blk_queue_is_zoned(q) && sdkp->first_scan)
 		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
 		      q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8fd900998b4e..3f2e3425fa53 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -512,12 +512,15 @@ struct request_queue {
 	 * Stacking drivers (device mappers) may or may not initialize
 	 * these fields.
 	 *
+	 * Flags represent features as described by blk_zone_report_flags in blkzoned.h
+	 *
 	 * Reads of this information must be protected with blk_queue_enter() /
 	 * blk_queue_exit(). Modifying this information is only allowed while
 	 * no requests are being processed. See also blk_mq_freeze_queue() and
 	 * blk_mq_unfreeze_queue().
 	 */
 	unsigned int		nr_zones;
+	unsigned int		zone_flags;
 	unsigned long		*conv_zones_bitmap;
 	unsigned long		*seq_zones_wlock;
 #endif /* CONFIG_BLK_DEV_ZONED */
-- 
2.17.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 2/4] block: add support for zone offline transition
  2020-07-02  6:54 ` Javier González
@ 2020-07-02  6:54   ` Javier González
  -1 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  6:54 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-block, hch, kbusch, sagi, axboe, Damien.LeMoal, mb,
	Javier González, SelvaKumar S, Kanchan Joshi, Nitesh Shetty

From: Javier González <javier.gonz@samsung.com>

Add support for offline transition on the zoned block device. Use the
existing feature flags for the underlying driver to report support for
the feature, as currently this transition is only supported in ZNS and
not in ZAC/ZBC

Signed-off-by: Javier González <javier.gonz@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
---
 block/blk-core.c              | 2 ++
 block/blk-zoned.c             | 8 +++++++-
 drivers/nvme/host/core.c      | 3 +++
 drivers/nvme/host/zns.c       | 2 +-
 include/linux/blk_types.h     | 3 +++
 include/linux/blkdev.h        | 1 -
 include/uapi/linux/blkzoned.h | 3 +++
 7 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 03252af8c82c..589cbdacc5ec 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -140,6 +140,7 @@ static const char *const blk_op_name[] = {
 	REQ_OP_NAME(ZONE_CLOSE),
 	REQ_OP_NAME(ZONE_FINISH),
 	REQ_OP_NAME(ZONE_APPEND),
+	REQ_OP_NAME(ZONE_OFFLINE),
 	REQ_OP_NAME(WRITE_SAME),
 	REQ_OP_NAME(WRITE_ZEROES),
 	REQ_OP_NAME(SCSI_IN),
@@ -1030,6 +1031,7 @@ generic_make_request_checks(struct bio *bio)
 	case REQ_OP_ZONE_OPEN:
 	case REQ_OP_ZONE_CLOSE:
 	case REQ_OP_ZONE_FINISH:
+	case REQ_OP_ZONE_OFFLINE:
 		if (!blk_queue_is_zoned(q))
 			goto not_supported;
 		break;
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 0f156e96e48f..b97f67f462b4 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -320,7 +320,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
 }
 
 /*
- * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctl processing.
+ * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE, BLKFINISHZONE and BLKOFFLINEZONE
+ * ioctl processing.
  * Called from blkdev_ioctl.
  */
 int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
@@ -363,6 +364,11 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
 	case BLKFINISHZONE:
 		op = REQ_OP_ZONE_FINISH;
 		break;
+	case BLKOFFLINEZONE:
+		if (!(q->zone_flags & BLK_ZONE_REP_OFFLINE))
+			return -EINVAL;
+		op = REQ_OP_ZONE_OFFLINE;
+		break;
 	default:
 		return -ENOTTY;
 	}
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index e5f754889234..1f5c7fc3d2c9 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -776,6 +776,9 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
 	case REQ_OP_ZONE_FINISH:
 		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_FINISH);
 		break;
+	case REQ_OP_ZONE_OFFLINE:
+		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_OFFLINE);
+		break;
 	case REQ_OP_WRITE_ZEROES:
 		ret = nvme_setup_write_zeroes(ns, req, cmd);
 		break;
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index 888264261ba3..b34d2ed13825 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -81,7 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
 	}
 
 	q->limits.zoned = BLK_ZONED_HM;
-	q->zone_flags = BLK_ZONE_REP_CAPACITY;
+	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 free_data:
 	kfree(id);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index ccb895f911b1..c0123c643e2f 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -316,6 +316,8 @@ enum req_opf {
 	REQ_OP_ZONE_FINISH	= 12,
 	/* write data at the current zone write pointer */
 	REQ_OP_ZONE_APPEND	= 13,
+	/* Transition a zone to offline */
+	REQ_OP_ZONE_OFFLINE	= 14,
 
 	/* SCSI passthrough using struct scsi_request */
 	REQ_OP_SCSI_IN		= 32,
@@ -455,6 +457,7 @@ static inline bool op_is_zone_mgmt(enum req_opf op)
 	case REQ_OP_ZONE_OPEN:
 	case REQ_OP_ZONE_CLOSE:
 	case REQ_OP_ZONE_FINISH:
+	case REQ_OP_ZONE_OFFLINE:
 		return true;
 	default:
 		return false;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 3f2e3425fa53..e489b646486d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -370,7 +370,6 @@ extern int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
 				     unsigned int cmd, unsigned long arg);
 extern int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
 				  unsigned int cmd, unsigned long arg);
-
 #else /* CONFIG_BLK_DEV_ZONED */
 
 static inline unsigned int blkdev_nr_zones(struct gendisk *disk)
diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
index 42c3366cc25f..e5adf4a9f4b0 100644
--- a/include/uapi/linux/blkzoned.h
+++ b/include/uapi/linux/blkzoned.h
@@ -77,9 +77,11 @@ enum blk_zone_cond {
  * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
  *
  * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
+ * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.
  */
 enum blk_zone_report_flags {
 	BLK_ZONE_REP_CAPACITY	= (1 << 0),
+	BLK_ZONE_REP_OFFLINE	= (1 << 1),
 };
 
 /**
@@ -166,5 +168,6 @@ struct blk_zone_range {
 #define BLKOPENZONE	_IOW(0x12, 134, struct blk_zone_range)
 #define BLKCLOSEZONE	_IOW(0x12, 135, struct blk_zone_range)
 #define BLKFINISHZONE	_IOW(0x12, 136, struct blk_zone_range)
+#define BLKOFFLINEZONE	_IOW(0x12, 137, struct blk_zone_range)
 
 #endif /* _UAPI_BLKZONED_H */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 2/4] block: add support for zone offline transition
@ 2020-07-02  6:54   ` Javier González
  0 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  6:54 UTC (permalink / raw)
  To: linux-nvme
  Cc: axboe, Damien.LeMoal, SelvaKumar S, sagi, Kanchan Joshi,
	Nitesh Shetty, linux-block, kbusch, Javier González, mb,
	hch

From: Javier González <javier.gonz@samsung.com>

Add support for offline transition on the zoned block device. Use the
existing feature flags for the underlying driver to report support for
the feature, as currently this transition is only supported in ZNS and
not in ZAC/ZBC

Signed-off-by: Javier González <javier.gonz@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
---
 block/blk-core.c              | 2 ++
 block/blk-zoned.c             | 8 +++++++-
 drivers/nvme/host/core.c      | 3 +++
 drivers/nvme/host/zns.c       | 2 +-
 include/linux/blk_types.h     | 3 +++
 include/linux/blkdev.h        | 1 -
 include/uapi/linux/blkzoned.h | 3 +++
 7 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 03252af8c82c..589cbdacc5ec 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -140,6 +140,7 @@ static const char *const blk_op_name[] = {
 	REQ_OP_NAME(ZONE_CLOSE),
 	REQ_OP_NAME(ZONE_FINISH),
 	REQ_OP_NAME(ZONE_APPEND),
+	REQ_OP_NAME(ZONE_OFFLINE),
 	REQ_OP_NAME(WRITE_SAME),
 	REQ_OP_NAME(WRITE_ZEROES),
 	REQ_OP_NAME(SCSI_IN),
@@ -1030,6 +1031,7 @@ generic_make_request_checks(struct bio *bio)
 	case REQ_OP_ZONE_OPEN:
 	case REQ_OP_ZONE_CLOSE:
 	case REQ_OP_ZONE_FINISH:
+	case REQ_OP_ZONE_OFFLINE:
 		if (!blk_queue_is_zoned(q))
 			goto not_supported;
 		break;
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 0f156e96e48f..b97f67f462b4 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -320,7 +320,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
 }
 
 /*
- * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctl processing.
+ * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE, BLKFINISHZONE and BLKOFFLINEZONE
+ * ioctl processing.
  * Called from blkdev_ioctl.
  */
 int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
@@ -363,6 +364,11 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
 	case BLKFINISHZONE:
 		op = REQ_OP_ZONE_FINISH;
 		break;
+	case BLKOFFLINEZONE:
+		if (!(q->zone_flags & BLK_ZONE_REP_OFFLINE))
+			return -EINVAL;
+		op = REQ_OP_ZONE_OFFLINE;
+		break;
 	default:
 		return -ENOTTY;
 	}
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index e5f754889234..1f5c7fc3d2c9 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -776,6 +776,9 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
 	case REQ_OP_ZONE_FINISH:
 		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_FINISH);
 		break;
+	case REQ_OP_ZONE_OFFLINE:
+		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_OFFLINE);
+		break;
 	case REQ_OP_WRITE_ZEROES:
 		ret = nvme_setup_write_zeroes(ns, req, cmd);
 		break;
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index 888264261ba3..b34d2ed13825 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -81,7 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
 	}
 
 	q->limits.zoned = BLK_ZONED_HM;
-	q->zone_flags = BLK_ZONE_REP_CAPACITY;
+	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 free_data:
 	kfree(id);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index ccb895f911b1..c0123c643e2f 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -316,6 +316,8 @@ enum req_opf {
 	REQ_OP_ZONE_FINISH	= 12,
 	/* write data at the current zone write pointer */
 	REQ_OP_ZONE_APPEND	= 13,
+	/* Transition a zone to offline */
+	REQ_OP_ZONE_OFFLINE	= 14,
 
 	/* SCSI passthrough using struct scsi_request */
 	REQ_OP_SCSI_IN		= 32,
@@ -455,6 +457,7 @@ static inline bool op_is_zone_mgmt(enum req_opf op)
 	case REQ_OP_ZONE_OPEN:
 	case REQ_OP_ZONE_CLOSE:
 	case REQ_OP_ZONE_FINISH:
+	case REQ_OP_ZONE_OFFLINE:
 		return true;
 	default:
 		return false;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 3f2e3425fa53..e489b646486d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -370,7 +370,6 @@ extern int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
 				     unsigned int cmd, unsigned long arg);
 extern int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
 				  unsigned int cmd, unsigned long arg);
-
 #else /* CONFIG_BLK_DEV_ZONED */
 
 static inline unsigned int blkdev_nr_zones(struct gendisk *disk)
diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
index 42c3366cc25f..e5adf4a9f4b0 100644
--- a/include/uapi/linux/blkzoned.h
+++ b/include/uapi/linux/blkzoned.h
@@ -77,9 +77,11 @@ enum blk_zone_cond {
  * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
  *
  * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
+ * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.
  */
 enum blk_zone_report_flags {
 	BLK_ZONE_REP_CAPACITY	= (1 << 0),
+	BLK_ZONE_REP_OFFLINE	= (1 << 1),
 };
 
 /**
@@ -166,5 +168,6 @@ struct blk_zone_range {
 #define BLKOPENZONE	_IOW(0x12, 134, struct blk_zone_range)
 #define BLKCLOSEZONE	_IOW(0x12, 135, struct blk_zone_range)
 #define BLKFINISHZONE	_IOW(0x12, 136, struct blk_zone_range)
+#define BLKOFFLINEZONE	_IOW(0x12, 137, struct blk_zone_range)
 
 #endif /* _UAPI_BLKZONED_H */
-- 
2.17.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 3/4] nvme: Add consistency check for zone count
  2020-07-02  6:54 ` Javier González
@ 2020-07-02  6:54   ` Javier González
  -1 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  6:54 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-block, hch, kbusch, sagi, axboe, Damien.LeMoal, mb,
	Javier González, SelvaKumar S, Kanchan Joshi, Nitesh Shetty

From: Javier González <javier.gonz@samsung.com>

Since the number of zones is calculated through the reported device
capacity and the ZNS specification allows to report the total number of
zones in the device, add an extra check to guarantee consistency between
the device and the kernel.

Also, fix a checkpatch warning: unsigned -> unsigned int in the process

Signed-off-by: Javier González <javier.gonz@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
---
 drivers/nvme/host/core.c |  2 +-
 drivers/nvme/host/nvme.h |  6 ++++--
 drivers/nvme/host/zns.c  | 39 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1f5c7fc3d2c9..8b9c69172931 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1994,7 +1994,7 @@ static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
 	case NVME_CSI_NVM:
 		break;
 	case NVME_CSI_ZNS:
-		ret = nvme_update_zone_info(disk, ns, lbaf);
+		ret = nvme_update_zone_info(disk, ns, lbaf, le64_to_cpu(id->nsze));
 		if (ret)
 			return ret;
 		break;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 662f95fbd909..ef80e0b4df56 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -407,6 +407,7 @@ struct nvme_ns {
 	u32 sws;
 	u8 pi_type;
 #ifdef CONFIG_BLK_DEV_ZONED
+	u64 nr_zones;
 	u64 zsze;
 #endif
 	unsigned long features;
@@ -700,7 +701,7 @@ static inline void nvme_mpath_start_freeze(struct nvme_subsystem *subsys)
 
 #ifdef CONFIG_BLK_DEV_ZONED
 int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
-			  unsigned lbaf);
+			  unsigned int lbaf, sector_t nsze);
 
 int nvme_report_zones(struct gendisk *disk, sector_t sector,
 		      unsigned int nr_zones, report_zones_cb cb, void *data);
@@ -720,7 +721,8 @@ static inline blk_status_t nvme_setup_zone_mgmt_send(struct nvme_ns *ns,
 
 static inline int nvme_update_zone_info(struct gendisk *disk,
 					struct nvme_ns *ns,
-					unsigned lbaf)
+					unsigned lbaf
+					sector_t nsze)
 {
 	dev_warn(ns->ctrl->device,
 		 "Please enable CONFIG_BLK_DEV_ZONED to support ZNS devices\n");
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index b34d2ed13825..d785d179a343 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -32,13 +32,36 @@ static int nvme_set_max_append(struct nvme_ctrl *ctrl)
 	return 0;
 }
 
+static u64 nvme_zns_nr_zones(struct nvme_ns *ns)
+{
+	struct nvme_command c = { };
+	struct nvme_zone_report report;
+	int buflen = sizeof(struct nvme_zone_report);
+	int ret;
+
+	c.zmr.opcode = nvme_cmd_zone_mgmt_recv;
+	c.zmr.nsid = cpu_to_le32(ns->head->ns_id);
+	c.zmr.slba = cpu_to_le64(0);
+	c.zmr.numd = cpu_to_le32(nvme_bytes_to_numd(buflen));
+	c.zmr.zra = NVME_ZRA_ZONE_REPORT;
+	c.zmr.zrasf = NVME_ZRASF_ZONE_REPORT_ALL;
+	c.zmr.pr = 0;
+
+	ret = nvme_submit_sync_cmd(ns->queue, &c, &report, buflen);
+	if (ret)
+		return ret;
+
+	return le64_to_cpu(report.nr_zones);
+}
+
 int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
-			  unsigned lbaf)
+			  unsigned int lbaf, sector_t nsze)
 {
 //	struct nvme_effects_log *log = ns->head->effects;
 	struct request_queue *q = disk->queue;
 	struct nvme_command c = { };
 	struct nvme_id_ns_zns *id;
+	sector_t cap_nr_zones;
 	int status;
 
 	/* Driver requires zone append support */
@@ -80,6 +103,20 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
 		goto free_data;
 	}
 
+	ns->nr_zones = nvme_zns_nr_zones(ns);
+	if (!ns->nr_zones) {
+		status = -EINVAL;
+		goto free_data;
+	}
+
+	cap_nr_zones = nvme_lba_to_sect(ns, le64_to_cpu(nsze)) >> ilog2(ns->zsze);
+	if (ns->nr_zones != cap_nr_zones) {
+		dev_err(ns->ctrl->device, "inconsistent zone count: %llu/%llu\n",
+			ns->nr_zones, cap_nr_zones);
+		status = -EINVAL;
+		goto free_data;
+	}
+
 	q->limits.zoned = BLK_ZONED_HM;
 	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 3/4] nvme: Add consistency check for zone count
@ 2020-07-02  6:54   ` Javier González
  0 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  6:54 UTC (permalink / raw)
  To: linux-nvme
  Cc: axboe, Damien.LeMoal, SelvaKumar S, sagi, Kanchan Joshi,
	Nitesh Shetty, linux-block, kbusch, Javier González, mb,
	hch

From: Javier González <javier.gonz@samsung.com>

Since the number of zones is calculated through the reported device
capacity and the ZNS specification allows to report the total number of
zones in the device, add an extra check to guarantee consistency between
the device and the kernel.

Also, fix a checkpatch warning: unsigned -> unsigned int in the process

Signed-off-by: Javier González <javier.gonz@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
---
 drivers/nvme/host/core.c |  2 +-
 drivers/nvme/host/nvme.h |  6 ++++--
 drivers/nvme/host/zns.c  | 39 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1f5c7fc3d2c9..8b9c69172931 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1994,7 +1994,7 @@ static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
 	case NVME_CSI_NVM:
 		break;
 	case NVME_CSI_ZNS:
-		ret = nvme_update_zone_info(disk, ns, lbaf);
+		ret = nvme_update_zone_info(disk, ns, lbaf, le64_to_cpu(id->nsze));
 		if (ret)
 			return ret;
 		break;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 662f95fbd909..ef80e0b4df56 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -407,6 +407,7 @@ struct nvme_ns {
 	u32 sws;
 	u8 pi_type;
 #ifdef CONFIG_BLK_DEV_ZONED
+	u64 nr_zones;
 	u64 zsze;
 #endif
 	unsigned long features;
@@ -700,7 +701,7 @@ static inline void nvme_mpath_start_freeze(struct nvme_subsystem *subsys)
 
 #ifdef CONFIG_BLK_DEV_ZONED
 int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
-			  unsigned lbaf);
+			  unsigned int lbaf, sector_t nsze);
 
 int nvme_report_zones(struct gendisk *disk, sector_t sector,
 		      unsigned int nr_zones, report_zones_cb cb, void *data);
@@ -720,7 +721,8 @@ static inline blk_status_t nvme_setup_zone_mgmt_send(struct nvme_ns *ns,
 
 static inline int nvme_update_zone_info(struct gendisk *disk,
 					struct nvme_ns *ns,
-					unsigned lbaf)
+					unsigned lbaf
+					sector_t nsze)
 {
 	dev_warn(ns->ctrl->device,
 		 "Please enable CONFIG_BLK_DEV_ZONED to support ZNS devices\n");
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index b34d2ed13825..d785d179a343 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -32,13 +32,36 @@ static int nvme_set_max_append(struct nvme_ctrl *ctrl)
 	return 0;
 }
 
+static u64 nvme_zns_nr_zones(struct nvme_ns *ns)
+{
+	struct nvme_command c = { };
+	struct nvme_zone_report report;
+	int buflen = sizeof(struct nvme_zone_report);
+	int ret;
+
+	c.zmr.opcode = nvme_cmd_zone_mgmt_recv;
+	c.zmr.nsid = cpu_to_le32(ns->head->ns_id);
+	c.zmr.slba = cpu_to_le64(0);
+	c.zmr.numd = cpu_to_le32(nvme_bytes_to_numd(buflen));
+	c.zmr.zra = NVME_ZRA_ZONE_REPORT;
+	c.zmr.zrasf = NVME_ZRASF_ZONE_REPORT_ALL;
+	c.zmr.pr = 0;
+
+	ret = nvme_submit_sync_cmd(ns->queue, &c, &report, buflen);
+	if (ret)
+		return ret;
+
+	return le64_to_cpu(report.nr_zones);
+}
+
 int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
-			  unsigned lbaf)
+			  unsigned int lbaf, sector_t nsze)
 {
 //	struct nvme_effects_log *log = ns->head->effects;
 	struct request_queue *q = disk->queue;
 	struct nvme_command c = { };
 	struct nvme_id_ns_zns *id;
+	sector_t cap_nr_zones;
 	int status;
 
 	/* Driver requires zone append support */
@@ -80,6 +103,20 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
 		goto free_data;
 	}
 
+	ns->nr_zones = nvme_zns_nr_zones(ns);
+	if (!ns->nr_zones) {
+		status = -EINVAL;
+		goto free_data;
+	}
+
+	cap_nr_zones = nvme_lba_to_sect(ns, le64_to_cpu(nsze)) >> ilog2(ns->zsze);
+	if (ns->nr_zones != cap_nr_zones) {
+		dev_err(ns->ctrl->device, "inconsistent zone count: %llu/%llu\n",
+			ns->nr_zones, cap_nr_zones);
+		status = -EINVAL;
+		goto free_data;
+	}
+
 	q->limits.zoned = BLK_ZONED_HM;
 	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
-- 
2.17.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 4/4] block: add attributes to zone report
  2020-07-02  6:54 ` Javier González
@ 2020-07-02  6:54   ` Javier González
  -1 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  6:54 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-block, hch, kbusch, sagi, axboe, Damien.LeMoal, mb,
	Javier González, SelvaKumar S, Kanchan Joshi, Nitesh Shetty

From: Javier González <javier.gonz@samsung.com>

Add zone attributes field to the blk_zone structure. Use ZNS attributes
as base for zoned block devices and add the current atributes in
ZAC/ZBC.

Signed-off-by: Javier González <javier.gonz@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
---
 drivers/nvme/host/zns.c       |  3 ++-
 drivers/scsi/sd.c             |  2 +-
 drivers/scsi/sd_zbc.c         |  8 ++++++--
 include/uapi/linux/blkzoned.h | 26 +++++++++++++++++++++++++-
 4 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index d785d179a343..749382a14968 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -118,7 +118,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
 	}
 
 	q->limits.zoned = BLK_ZONED_HM;
-	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
+	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 free_data:
 	kfree(id);
@@ -197,6 +197,7 @@ static int nvme_zone_parse_entry(struct nvme_ns *ns,
 	zone.capacity = nvme_lba_to_sect(ns, le64_to_cpu(entry->zcap));
 	zone.start = nvme_lba_to_sect(ns, le64_to_cpu(entry->zslba));
 	zone.wp = nvme_lba_to_sect(ns, le64_to_cpu(entry->wp));
+	zone.attr = entry->za;
 
 	return cb(&zone, idx, data);
 }
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index b9c920bace28..63270598aa76 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2967,7 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 	if (sdkp->device->type == TYPE_ZBC) {
 		/* Host-managed */
 		q->limits.zoned = BLK_ZONED_HM;
-		q->zone_flags = BLK_ZONE_REP_CAPACITY;
+		q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_ATTR;
 	} else {
 		sdkp->zoned = (buffer[8] >> 4) & 3;
 		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 183a20720da9..51c7f82b59c5 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -53,10 +53,14 @@ static int sd_zbc_parse_report(struct scsi_disk *sdkp, u8 *buf,
 
 	zone.type = buf[0] & 0x0f;
 	zone.cond = (buf[1] >> 4) & 0xf;
-	if (buf[1] & 0x01)
+	if (buf[1] & 0x01) {
+		zone.attr |= BLK_ZONE_ATTR_NRW;
 		zone.reset = 1;
-	if (buf[1] & 0x02)
+	}
+	if (buf[1] & 0x02) {
+		zone.attr |= BLK_ZONE_ATTR_NSQ;
 		zone.non_seq = 1;
+	}
 
 	zone.len = logical_to_sectors(sdp, get_unaligned_be64(&buf[8]));
 	zone.capacity = zone.len;
diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
index e5adf4a9f4b0..315617827acd 100644
--- a/include/uapi/linux/blkzoned.h
+++ b/include/uapi/linux/blkzoned.h
@@ -78,10 +78,33 @@ enum blk_zone_cond {
  *
  * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
  * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.
+ * @BLK_ZONE_REP_ATTR: Zone attributes reported.
  */
 enum blk_zone_report_flags {
 	BLK_ZONE_REP_CAPACITY	= (1 << 0),
 	BLK_ZONE_REP_OFFLINE	= (1 << 1),
+	BLK_ZONE_REP_ATTR	= (1 << 2),
+};
+
+/**
+ * enum blk_zone_attr - Zone Attributes
+ *
+ * Attributes of the zone. Reported in struct blk_zone -> attr
+ *
+ * @BLK_ZONE_ATTR_ZFC: Zone Finished by Controller due to a zone active excursion
+ * @BLK_ZONE_ATTR_FZR: Finish Zone Recommended required by controller
+ * @BLK_ZONE_ATTR_RZR: Reset Zone Recommended required by controller
+ * @BLK_ZONE_ATTR_NSQ: Non Sequential zone
+ * @BLK_ZONE_ATTR_NRW: Need Reset Write Pointer required by controller
+ * @BLK_ZONE_ATTR_ZDEV: Zone Descriptor Extension Valid in zone report
+ */
+enum blk_zone_attr {
+	BLK_ZONE_ATTR_ZFC	= 1 << 0,
+	BLK_ZONE_ATTR_FZR	= 1 << 1,
+	BLK_ZONE_ATTR_RZR	= 1 << 2,
+	BLK_ZONE_ATTR_NSQ	= 1 << 3,
+	BLK_ZONE_ATTR_NRW	= 1 << 4,
+	BLK_ZONE_ATTR_ZDEV	= 1 << 7,
 };
 
 /**
@@ -110,7 +133,8 @@ struct blk_zone {
 	__u8	cond;		/* Zone condition */
 	__u8	non_seq;	/* Non-sequential write resources active */
 	__u8	reset;		/* Reset write pointer recommended */
-	__u8	resv[4];
+	__u8	attr;		/* Zone attributes */
+	__u8	resv[3];
 	__u64	capacity;	/* Zone capacity in number of sectors */
 	__u8	reserved[24];
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 4/4] block: add attributes to zone report
@ 2020-07-02  6:54   ` Javier González
  0 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  6:54 UTC (permalink / raw)
  To: linux-nvme
  Cc: axboe, Damien.LeMoal, SelvaKumar S, sagi, Kanchan Joshi,
	Nitesh Shetty, linux-block, kbusch, Javier González, mb,
	hch

From: Javier González <javier.gonz@samsung.com>

Add zone attributes field to the blk_zone structure. Use ZNS attributes
as base for zoned block devices and add the current atributes in
ZAC/ZBC.

Signed-off-by: Javier González <javier.gonz@samsung.com>
Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
---
 drivers/nvme/host/zns.c       |  3 ++-
 drivers/scsi/sd.c             |  2 +-
 drivers/scsi/sd_zbc.c         |  8 ++++++--
 include/uapi/linux/blkzoned.h | 26 +++++++++++++++++++++++++-
 4 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index d785d179a343..749382a14968 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -118,7 +118,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
 	}
 
 	q->limits.zoned = BLK_ZONED_HM;
-	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
+	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 free_data:
 	kfree(id);
@@ -197,6 +197,7 @@ static int nvme_zone_parse_entry(struct nvme_ns *ns,
 	zone.capacity = nvme_lba_to_sect(ns, le64_to_cpu(entry->zcap));
 	zone.start = nvme_lba_to_sect(ns, le64_to_cpu(entry->zslba));
 	zone.wp = nvme_lba_to_sect(ns, le64_to_cpu(entry->wp));
+	zone.attr = entry->za;
 
 	return cb(&zone, idx, data);
 }
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index b9c920bace28..63270598aa76 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2967,7 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 	if (sdkp->device->type == TYPE_ZBC) {
 		/* Host-managed */
 		q->limits.zoned = BLK_ZONED_HM;
-		q->zone_flags = BLK_ZONE_REP_CAPACITY;
+		q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_ATTR;
 	} else {
 		sdkp->zoned = (buffer[8] >> 4) & 3;
 		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 183a20720da9..51c7f82b59c5 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -53,10 +53,14 @@ static int sd_zbc_parse_report(struct scsi_disk *sdkp, u8 *buf,
 
 	zone.type = buf[0] & 0x0f;
 	zone.cond = (buf[1] >> 4) & 0xf;
-	if (buf[1] & 0x01)
+	if (buf[1] & 0x01) {
+		zone.attr |= BLK_ZONE_ATTR_NRW;
 		zone.reset = 1;
-	if (buf[1] & 0x02)
+	}
+	if (buf[1] & 0x02) {
+		zone.attr |= BLK_ZONE_ATTR_NSQ;
 		zone.non_seq = 1;
+	}
 
 	zone.len = logical_to_sectors(sdp, get_unaligned_be64(&buf[8]));
 	zone.capacity = zone.len;
diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
index e5adf4a9f4b0..315617827acd 100644
--- a/include/uapi/linux/blkzoned.h
+++ b/include/uapi/linux/blkzoned.h
@@ -78,10 +78,33 @@ enum blk_zone_cond {
  *
  * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
  * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.
+ * @BLK_ZONE_REP_ATTR: Zone attributes reported.
  */
 enum blk_zone_report_flags {
 	BLK_ZONE_REP_CAPACITY	= (1 << 0),
 	BLK_ZONE_REP_OFFLINE	= (1 << 1),
+	BLK_ZONE_REP_ATTR	= (1 << 2),
+};
+
+/**
+ * enum blk_zone_attr - Zone Attributes
+ *
+ * Attributes of the zone. Reported in struct blk_zone -> attr
+ *
+ * @BLK_ZONE_ATTR_ZFC: Zone Finished by Controller due to a zone active excursion
+ * @BLK_ZONE_ATTR_FZR: Finish Zone Recommended required by controller
+ * @BLK_ZONE_ATTR_RZR: Reset Zone Recommended required by controller
+ * @BLK_ZONE_ATTR_NSQ: Non Sequential zone
+ * @BLK_ZONE_ATTR_NRW: Need Reset Write Pointer required by controller
+ * @BLK_ZONE_ATTR_ZDEV: Zone Descriptor Extension Valid in zone report
+ */
+enum blk_zone_attr {
+	BLK_ZONE_ATTR_ZFC	= 1 << 0,
+	BLK_ZONE_ATTR_FZR	= 1 << 1,
+	BLK_ZONE_ATTR_RZR	= 1 << 2,
+	BLK_ZONE_ATTR_NSQ	= 1 << 3,
+	BLK_ZONE_ATTR_NRW	= 1 << 4,
+	BLK_ZONE_ATTR_ZDEV	= 1 << 7,
 };
 
 /**
@@ -110,7 +133,8 @@ struct blk_zone {
 	__u8	cond;		/* Zone condition */
 	__u8	non_seq;	/* Non-sequential write resources active */
 	__u8	reset;		/* Reset write pointer recommended */
-	__u8	resv[4];
+	__u8	attr;		/* Zone attributes */
+	__u8	resv[3];
 	__u64	capacity;	/* Zone capacity in number of sectors */
 	__u8	reserved[24];
 };
-- 
2.17.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
  2020-07-02  6:54   ` Javier González
@ 2020-07-02  7:54     ` Damien Le Moal
  -1 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  7:54 UTC (permalink / raw)
  To: Javier González, linux-nvme
  Cc: linux-block, hch, kbusch, sagi, axboe, mb, Javier González,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 2020/07/02 15:55, Javier González wrote:
> From: Javier González <javier.gonz@samsung.com>
> 
> As the zoned block device will have to deal with features that are
> optional for the backend device, add a flag field to inform the block
> layer about supported features. This builds on top of
> blk_zone_report_flags and extendes to the zone report code.
> 
> Signed-off-by: Javier González <javier.gonz@samsung.com>
> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
> ---
>  block/blk-zoned.c              | 3 ++-
>  drivers/block/null_blk_zoned.c | 2 ++
>  drivers/nvme/host/zns.c        | 1 +
>  drivers/scsi/sd.c              | 2 ++
>  include/linux/blkdev.h         | 3 +++
>  5 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index 81152a260354..0f156e96e48f 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>  		return ret;
>  
>  	rep.nr_zones = ret;
> -	rep.flags = BLK_ZONE_REP_CAPACITY;
> +	rep.flags = q->zone_flags;

zone_flags seem to be a fairly generic flags field while rep.flags is only about
the reported descriptors structure. So you may want to define a
BLK_ZONE_REP_FLAGS_MASK and have:

	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;

to avoid flags meaningless for the user being set.

In any case, since *all* zoned block devices now report capacity, I do not
really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
considering that you set the flag for all zoned device types, including scsi
which does not have zone capacity. This makes q->zone_flags rather confusing
instead of clearly defining the device features as you mentioned in the commit
message.

I think it may be better to just drop this patch, and if needed, introduce the
zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).

> +
>  	if (copy_to_user(argp, &rep, sizeof(struct blk_zone_report)))
>  		return -EFAULT;
>  	return 0;
> diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
> index b05832eb21b2..957c2103f240 100644
> --- a/drivers/block/null_blk_zoned.c
> +++ b/drivers/block/null_blk_zoned.c
> @@ -78,6 +78,8 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
>  	}
>  
>  	q->limits.zoned = BLK_ZONED_HM;
> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
> +
>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>  	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
>  
> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
> index 0642d3c54e8f..888264261ba3 100644
> --- a/drivers/nvme/host/zns.c
> +++ b/drivers/nvme/host/zns.c
> @@ -81,6 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>  	}
>  
>  	q->limits.zoned = BLK_ZONED_HM;
> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>  free_data:
>  	kfree(id);
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index d90fefffe31b..b9c920bace28 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -2967,6 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>  	if (sdkp->device->type == TYPE_ZBC) {
>  		/* Host-managed */
>  		q->limits.zoned = BLK_ZONED_HM;
> +		q->zone_flags = BLK_ZONE_REP_CAPACITY;
>  	} else {
>  		sdkp->zoned = (buffer[8] >> 4) & 3;
>  		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
> @@ -2983,6 +2984,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>  					  "Drive-managed SMR disk\n");
>  		}
>  	}
> +
>  	if (blk_queue_is_zoned(q) && sdkp->first_scan)
>  		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
>  		      q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 8fd900998b4e..3f2e3425fa53 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -512,12 +512,15 @@ struct request_queue {
>  	 * Stacking drivers (device mappers) may or may not initialize
>  	 * these fields.
>  	 *
> +	 * Flags represent features as described by blk_zone_report_flags in blkzoned.h
> +	 *
>  	 * Reads of this information must be protected with blk_queue_enter() /
>  	 * blk_queue_exit(). Modifying this information is only allowed while
>  	 * no requests are being processed. See also blk_mq_freeze_queue() and
>  	 * blk_mq_unfreeze_queue().
>  	 */
>  	unsigned int		nr_zones;
> +	unsigned int		zone_flags;
>  	unsigned long		*conv_zones_bitmap;
>  	unsigned long		*seq_zones_wlock;
>  #endif /* CONFIG_BLK_DEV_ZONED */
> 

And you are missing device-mapper support. DM target devices have a request
queue that would need to set the zone_flags too.


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
@ 2020-07-02  7:54     ` Damien Le Moal
  0 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  7:54 UTC (permalink / raw)
  To: Javier González, linux-nvme
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, Nitesh Shetty,
	linux-block, kbusch, Javier González, mb, hch

On 2020/07/02 15:55, Javier González wrote:
> From: Javier González <javier.gonz@samsung.com>
> 
> As the zoned block device will have to deal with features that are
> optional for the backend device, add a flag field to inform the block
> layer about supported features. This builds on top of
> blk_zone_report_flags and extendes to the zone report code.
> 
> Signed-off-by: Javier González <javier.gonz@samsung.com>
> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
> ---
>  block/blk-zoned.c              | 3 ++-
>  drivers/block/null_blk_zoned.c | 2 ++
>  drivers/nvme/host/zns.c        | 1 +
>  drivers/scsi/sd.c              | 2 ++
>  include/linux/blkdev.h         | 3 +++
>  5 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index 81152a260354..0f156e96e48f 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>  		return ret;
>  
>  	rep.nr_zones = ret;
> -	rep.flags = BLK_ZONE_REP_CAPACITY;
> +	rep.flags = q->zone_flags;

zone_flags seem to be a fairly generic flags field while rep.flags is only about
the reported descriptors structure. So you may want to define a
BLK_ZONE_REP_FLAGS_MASK and have:

	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;

to avoid flags meaningless for the user being set.

In any case, since *all* zoned block devices now report capacity, I do not
really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
considering that you set the flag for all zoned device types, including scsi
which does not have zone capacity. This makes q->zone_flags rather confusing
instead of clearly defining the device features as you mentioned in the commit
message.

I think it may be better to just drop this patch, and if needed, introduce the
zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).

> +
>  	if (copy_to_user(argp, &rep, sizeof(struct blk_zone_report)))
>  		return -EFAULT;
>  	return 0;
> diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
> index b05832eb21b2..957c2103f240 100644
> --- a/drivers/block/null_blk_zoned.c
> +++ b/drivers/block/null_blk_zoned.c
> @@ -78,6 +78,8 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
>  	}
>  
>  	q->limits.zoned = BLK_ZONED_HM;
> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
> +
>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>  	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
>  
> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
> index 0642d3c54e8f..888264261ba3 100644
> --- a/drivers/nvme/host/zns.c
> +++ b/drivers/nvme/host/zns.c
> @@ -81,6 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>  	}
>  
>  	q->limits.zoned = BLK_ZONED_HM;
> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>  free_data:
>  	kfree(id);
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index d90fefffe31b..b9c920bace28 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -2967,6 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>  	if (sdkp->device->type == TYPE_ZBC) {
>  		/* Host-managed */
>  		q->limits.zoned = BLK_ZONED_HM;
> +		q->zone_flags = BLK_ZONE_REP_CAPACITY;
>  	} else {
>  		sdkp->zoned = (buffer[8] >> 4) & 3;
>  		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
> @@ -2983,6 +2984,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>  					  "Drive-managed SMR disk\n");
>  		}
>  	}
> +
>  	if (blk_queue_is_zoned(q) && sdkp->first_scan)
>  		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
>  		      q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 8fd900998b4e..3f2e3425fa53 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -512,12 +512,15 @@ struct request_queue {
>  	 * Stacking drivers (device mappers) may or may not initialize
>  	 * these fields.
>  	 *
> +	 * Flags represent features as described by blk_zone_report_flags in blkzoned.h
> +	 *
>  	 * Reads of this information must be protected with blk_queue_enter() /
>  	 * blk_queue_exit(). Modifying this information is only allowed while
>  	 * no requests are being processed. See also blk_mq_freeze_queue() and
>  	 * blk_mq_unfreeze_queue().
>  	 */
>  	unsigned int		nr_zones;
> +	unsigned int		zone_flags;
>  	unsigned long		*conv_zones_bitmap;
>  	unsigned long		*seq_zones_wlock;
>  #endif /* CONFIG_BLK_DEV_ZONED */
> 

And you are missing device-mapper support. DM target devices have a request
queue that would need to set the zone_flags too.


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] block: add support for zone offline transition
  2020-07-02  6:54   ` Javier González
@ 2020-07-02  8:10     ` Damien Le Moal
  -1 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  8:10 UTC (permalink / raw)
  To: Javier González, linux-nvme
  Cc: linux-block, hch, kbusch, sagi, axboe, mb, Javier González,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 2020/07/02 15:55, Javier González wrote:
> From: Javier González <javier.gonz@samsung.com>
> 
> Add support for offline transition on the zoned block device. Use the
> existing feature flags for the underlying driver to report support for
> the feature, as currently this transition is only supported in ZNS and
> not in ZAC/ZBC
> 
> Signed-off-by: Javier González <javier.gonz@samsung.com>
> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
> ---
>  block/blk-core.c              | 2 ++
>  block/blk-zoned.c             | 8 +++++++-
>  drivers/nvme/host/core.c      | 3 +++
>  drivers/nvme/host/zns.c       | 2 +-
>  include/linux/blk_types.h     | 3 +++
>  include/linux/blkdev.h        | 1 -
>  include/uapi/linux/blkzoned.h | 3 +++
>  7 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 03252af8c82c..589cbdacc5ec 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -140,6 +140,7 @@ static const char *const blk_op_name[] = {
>  	REQ_OP_NAME(ZONE_CLOSE),
>  	REQ_OP_NAME(ZONE_FINISH),
>  	REQ_OP_NAME(ZONE_APPEND),
> +	REQ_OP_NAME(ZONE_OFFLINE),
>  	REQ_OP_NAME(WRITE_SAME),
>  	REQ_OP_NAME(WRITE_ZEROES),
>  	REQ_OP_NAME(SCSI_IN),
> @@ -1030,6 +1031,7 @@ generic_make_request_checks(struct bio *bio)
>  	case REQ_OP_ZONE_OPEN:
>  	case REQ_OP_ZONE_CLOSE:
>  	case REQ_OP_ZONE_FINISH:
> +	case REQ_OP_ZONE_OFFLINE:
>  		if (!blk_queue_is_zoned(q))
>  			goto not_supported;
>  		break;
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index 0f156e96e48f..b97f67f462b4 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -320,7 +320,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>  }
>  
>  /*
> - * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctl processing.
> + * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE, BLKFINISHZONE and BLKOFFLINEZONE
> + * ioctl processing.
>   * Called from blkdev_ioctl.
>   */
>  int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
> @@ -363,6 +364,11 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>  	case BLKFINISHZONE:
>  		op = REQ_OP_ZONE_FINISH;
>  		break;
> +	case BLKOFFLINEZONE:
> +		if (!(q->zone_flags & BLK_ZONE_REP_OFFLINE))
> +			return -EINVAL;

return -ENOTTY here.

That is the error returned for regular block devices when a zone ioctl is
received, indicating the lack of support for these ioctls. Since this is also a
lack  of support by the device here too, we may as well keep the same error
code. Returning -EINVAL should be reserved for cases where the device can accept
the ioctl but start sector or number of sectors is invalid.


> +		op = REQ_OP_ZONE_OFFLINE;
> +		break;
>  	default:
>  		return -ENOTTY;
>  	}
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index e5f754889234..1f5c7fc3d2c9 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -776,6 +776,9 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
>  	case REQ_OP_ZONE_FINISH:
>  		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_FINISH);
>  		break;
> +	case REQ_OP_ZONE_OFFLINE:
> +		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_OFFLINE);
> +		break;
>  	case REQ_OP_WRITE_ZEROES:
>  		ret = nvme_setup_write_zeroes(ns, req, cmd);
>  		break;
> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
> index 888264261ba3..b34d2ed13825 100644
> --- a/drivers/nvme/host/zns.c
> +++ b/drivers/nvme/host/zns.c
> @@ -81,7 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>  	}
>  
>  	q->limits.zoned = BLK_ZONED_HM;
> -	q->zone_flags = BLK_ZONE_REP_CAPACITY;
> +	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;

The name BLK_ZONE_REP_OFFLINE is not ideal.  This flag is not about if offline
condition will be reported or not. It is about the drive supporting an explicit
offlining zone operation.

>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>  free_data:
>  	kfree(id);
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index ccb895f911b1..c0123c643e2f 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -316,6 +316,8 @@ enum req_opf {
>  	REQ_OP_ZONE_FINISH	= 12,
>  	/* write data at the current zone write pointer */
>  	REQ_OP_ZONE_APPEND	= 13,
> +	/* Transition a zone to offline */
> +	REQ_OP_ZONE_OFFLINE	= 14,
>  
>  	/* SCSI passthrough using struct scsi_request */
>  	REQ_OP_SCSI_IN		= 32,
> @@ -455,6 +457,7 @@ static inline bool op_is_zone_mgmt(enum req_opf op)
>  	case REQ_OP_ZONE_OPEN:
>  	case REQ_OP_ZONE_CLOSE:
>  	case REQ_OP_ZONE_FINISH:
> +	case REQ_OP_ZONE_OFFLINE:
>  		return true;
>  	default:
>  		return false;
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 3f2e3425fa53..e489b646486d 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -370,7 +370,6 @@ extern int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>  				     unsigned int cmd, unsigned long arg);
>  extern int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>  				  unsigned int cmd, unsigned long arg);
> -
>  #else /* CONFIG_BLK_DEV_ZONED */
>  
>  static inline unsigned int blkdev_nr_zones(struct gendisk *disk)
> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
> index 42c3366cc25f..e5adf4a9f4b0 100644
> --- a/include/uapi/linux/blkzoned.h
> +++ b/include/uapi/linux/blkzoned.h
> @@ -77,9 +77,11 @@ enum blk_zone_cond {
>   * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
>   *
>   * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
> + * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.

The device supports explicit zone offline transition

Since the implicit transition by the device may happen, even on SMR disks.

But I am not sure this flags is very useful. Or rather, isn't it out of place
here ? Device features are normally reported through sysfs (e.g. discard, etc).
It is certainly confusing and not matching the user doc for rep.flag which
states that the flags are about the zone descriptors, not what the device can
do. So at the very least, the comments need to change.

The other thing is that the implementation does not consider device mapper case
again: if a DM target is built on one or more ZNS drives all supporting zone
offline, then the target should be allowed to report zone offline support too,
no ? dm-linear and dm-flakey certainly should be allowed to do that. Exporting a
"zone_offline" (or something like named that) sysfs limit would allow that to be
supported easily through limit stacking and avoid the need for the report flag.

Happy to here others opinion about this one though.

>   */
>  enum blk_zone_report_flags {
>  	BLK_ZONE_REP_CAPACITY	= (1 << 0),
> +	BLK_ZONE_REP_OFFLINE	= (1 << 1),
>  };
>  
>  /**
> @@ -166,5 +168,6 @@ struct blk_zone_range {
>  #define BLKOPENZONE	_IOW(0x12, 134, struct blk_zone_range)
>  #define BLKCLOSEZONE	_IOW(0x12, 135, struct blk_zone_range)
>  #define BLKFINISHZONE	_IOW(0x12, 136, struct blk_zone_range)
> +#define BLKOFFLINEZONE	_IOW(0x12, 137, struct blk_zone_range)
>  
>  #endif /* _UAPI_BLKZONED_H */
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] block: add support for zone offline transition
@ 2020-07-02  8:10     ` Damien Le Moal
  0 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  8:10 UTC (permalink / raw)
  To: Javier González, linux-nvme
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, Nitesh Shetty,
	linux-block, kbusch, Javier González, mb, hch

On 2020/07/02 15:55, Javier González wrote:
> From: Javier González <javier.gonz@samsung.com>
> 
> Add support for offline transition on the zoned block device. Use the
> existing feature flags for the underlying driver to report support for
> the feature, as currently this transition is only supported in ZNS and
> not in ZAC/ZBC
> 
> Signed-off-by: Javier González <javier.gonz@samsung.com>
> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
> ---
>  block/blk-core.c              | 2 ++
>  block/blk-zoned.c             | 8 +++++++-
>  drivers/nvme/host/core.c      | 3 +++
>  drivers/nvme/host/zns.c       | 2 +-
>  include/linux/blk_types.h     | 3 +++
>  include/linux/blkdev.h        | 1 -
>  include/uapi/linux/blkzoned.h | 3 +++
>  7 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 03252af8c82c..589cbdacc5ec 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -140,6 +140,7 @@ static const char *const blk_op_name[] = {
>  	REQ_OP_NAME(ZONE_CLOSE),
>  	REQ_OP_NAME(ZONE_FINISH),
>  	REQ_OP_NAME(ZONE_APPEND),
> +	REQ_OP_NAME(ZONE_OFFLINE),
>  	REQ_OP_NAME(WRITE_SAME),
>  	REQ_OP_NAME(WRITE_ZEROES),
>  	REQ_OP_NAME(SCSI_IN),
> @@ -1030,6 +1031,7 @@ generic_make_request_checks(struct bio *bio)
>  	case REQ_OP_ZONE_OPEN:
>  	case REQ_OP_ZONE_CLOSE:
>  	case REQ_OP_ZONE_FINISH:
> +	case REQ_OP_ZONE_OFFLINE:
>  		if (!blk_queue_is_zoned(q))
>  			goto not_supported;
>  		break;
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index 0f156e96e48f..b97f67f462b4 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -320,7 +320,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>  }
>  
>  /*
> - * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctl processing.
> + * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE, BLKFINISHZONE and BLKOFFLINEZONE
> + * ioctl processing.
>   * Called from blkdev_ioctl.
>   */
>  int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
> @@ -363,6 +364,11 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>  	case BLKFINISHZONE:
>  		op = REQ_OP_ZONE_FINISH;
>  		break;
> +	case BLKOFFLINEZONE:
> +		if (!(q->zone_flags & BLK_ZONE_REP_OFFLINE))
> +			return -EINVAL;

return -ENOTTY here.

That is the error returned for regular block devices when a zone ioctl is
received, indicating the lack of support for these ioctls. Since this is also a
lack  of support by the device here too, we may as well keep the same error
code. Returning -EINVAL should be reserved for cases where the device can accept
the ioctl but start sector or number of sectors is invalid.


> +		op = REQ_OP_ZONE_OFFLINE;
> +		break;
>  	default:
>  		return -ENOTTY;
>  	}
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index e5f754889234..1f5c7fc3d2c9 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -776,6 +776,9 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
>  	case REQ_OP_ZONE_FINISH:
>  		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_FINISH);
>  		break;
> +	case REQ_OP_ZONE_OFFLINE:
> +		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_OFFLINE);
> +		break;
>  	case REQ_OP_WRITE_ZEROES:
>  		ret = nvme_setup_write_zeroes(ns, req, cmd);
>  		break;
> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
> index 888264261ba3..b34d2ed13825 100644
> --- a/drivers/nvme/host/zns.c
> +++ b/drivers/nvme/host/zns.c
> @@ -81,7 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>  	}
>  
>  	q->limits.zoned = BLK_ZONED_HM;
> -	q->zone_flags = BLK_ZONE_REP_CAPACITY;
> +	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;

The name BLK_ZONE_REP_OFFLINE is not ideal.  This flag is not about if offline
condition will be reported or not. It is about the drive supporting an explicit
offlining zone operation.

>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>  free_data:
>  	kfree(id);
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index ccb895f911b1..c0123c643e2f 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -316,6 +316,8 @@ enum req_opf {
>  	REQ_OP_ZONE_FINISH	= 12,
>  	/* write data at the current zone write pointer */
>  	REQ_OP_ZONE_APPEND	= 13,
> +	/* Transition a zone to offline */
> +	REQ_OP_ZONE_OFFLINE	= 14,
>  
>  	/* SCSI passthrough using struct scsi_request */
>  	REQ_OP_SCSI_IN		= 32,
> @@ -455,6 +457,7 @@ static inline bool op_is_zone_mgmt(enum req_opf op)
>  	case REQ_OP_ZONE_OPEN:
>  	case REQ_OP_ZONE_CLOSE:
>  	case REQ_OP_ZONE_FINISH:
> +	case REQ_OP_ZONE_OFFLINE:
>  		return true;
>  	default:
>  		return false;
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 3f2e3425fa53..e489b646486d 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -370,7 +370,6 @@ extern int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>  				     unsigned int cmd, unsigned long arg);
>  extern int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>  				  unsigned int cmd, unsigned long arg);
> -
>  #else /* CONFIG_BLK_DEV_ZONED */
>  
>  static inline unsigned int blkdev_nr_zones(struct gendisk *disk)
> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
> index 42c3366cc25f..e5adf4a9f4b0 100644
> --- a/include/uapi/linux/blkzoned.h
> +++ b/include/uapi/linux/blkzoned.h
> @@ -77,9 +77,11 @@ enum blk_zone_cond {
>   * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
>   *
>   * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
> + * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.

The device supports explicit zone offline transition

Since the implicit transition by the device may happen, even on SMR disks.

But I am not sure this flags is very useful. Or rather, isn't it out of place
here ? Device features are normally reported through sysfs (e.g. discard, etc).
It is certainly confusing and not matching the user doc for rep.flag which
states that the flags are about the zone descriptors, not what the device can
do. So at the very least, the comments need to change.

The other thing is that the implementation does not consider device mapper case
again: if a DM target is built on one or more ZNS drives all supporting zone
offline, then the target should be allowed to report zone offline support too,
no ? dm-linear and dm-flakey certainly should be allowed to do that. Exporting a
"zone_offline" (or something like named that) sysfs limit would allow that to be
supported easily through limit stacking and avoid the need for the report flag.

Happy to here others opinion about this one though.

>   */
>  enum blk_zone_report_flags {
>  	BLK_ZONE_REP_CAPACITY	= (1 << 0),
> +	BLK_ZONE_REP_OFFLINE	= (1 << 1),
>  };
>  
>  /**
> @@ -166,5 +168,6 @@ struct blk_zone_range {
>  #define BLKOPENZONE	_IOW(0x12, 134, struct blk_zone_range)
>  #define BLKCLOSEZONE	_IOW(0x12, 135, struct blk_zone_range)
>  #define BLKFINISHZONE	_IOW(0x12, 136, struct blk_zone_range)
> +#define BLKOFFLINEZONE	_IOW(0x12, 137, struct blk_zone_range)
>  
>  #endif /* _UAPI_BLKZONED_H */
> 


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/4] nvme: Add consistency check for zone count
  2020-07-02  6:54   ` Javier González
@ 2020-07-02  8:16     ` Damien Le Moal
  -1 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  8:16 UTC (permalink / raw)
  To: Javier González, linux-nvme
  Cc: linux-block, hch, kbusch, sagi, axboe, mb, Javier González,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 2020/07/02 15:55, Javier González wrote:
> From: Javier González <javier.gonz@samsung.com>
> 
> Since the number of zones is calculated through the reported device
> capacity and the ZNS specification allows to report the total number of
> zones in the device, add an extra check to guarantee consistency between
> the device and the kernel.
> 
> Also, fix a checkpatch warning: unsigned -> unsigned int in the process
> 
> Signed-off-by: Javier González <javier.gonz@samsung.com>
> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
> ---
>  drivers/nvme/host/core.c |  2 +-
>  drivers/nvme/host/nvme.h |  6 ++++--
>  drivers/nvme/host/zns.c  | 39 ++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 43 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 1f5c7fc3d2c9..8b9c69172931 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1994,7 +1994,7 @@ static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
>  	case NVME_CSI_NVM:
>  		break;
>  	case NVME_CSI_ZNS:
> -		ret = nvme_update_zone_info(disk, ns, lbaf);
> +		ret = nvme_update_zone_info(disk, ns, lbaf, le64_to_cpu(id->nsze));
>  		if (ret)
>  			return ret;
>  		break;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 662f95fbd909..ef80e0b4df56 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -407,6 +407,7 @@ struct nvme_ns {
>  	u32 sws;
>  	u8 pi_type;
>  #ifdef CONFIG_BLK_DEV_ZONED
> +	u64 nr_zones;
>  	u64 zsze;
>  #endif
>  	unsigned long features;
> @@ -700,7 +701,7 @@ static inline void nvme_mpath_start_freeze(struct nvme_subsystem *subsys)
>  
>  #ifdef CONFIG_BLK_DEV_ZONED
>  int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
> -			  unsigned lbaf);
> +			  unsigned int lbaf, sector_t nsze);
>  
>  int nvme_report_zones(struct gendisk *disk, sector_t sector,
>  		      unsigned int nr_zones, report_zones_cb cb, void *data);
> @@ -720,7 +721,8 @@ static inline blk_status_t nvme_setup_zone_mgmt_send(struct nvme_ns *ns,
>  
>  static inline int nvme_update_zone_info(struct gendisk *disk,
>  					struct nvme_ns *ns,
> -					unsigned lbaf)
> +					unsigned lbaf

missing a comma here. Does this even compiles ?

> +					sector_t nsze)
>  {
>  	dev_warn(ns->ctrl->device,
>  		 "Please enable CONFIG_BLK_DEV_ZONED to support ZNS devices\n");
> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
> index b34d2ed13825..d785d179a343 100644
> --- a/drivers/nvme/host/zns.c
> +++ b/drivers/nvme/host/zns.c
> @@ -32,13 +32,36 @@ static int nvme_set_max_append(struct nvme_ctrl *ctrl)
>  	return 0;
>  }
>  
> +static u64 nvme_zns_nr_zones(struct nvme_ns *ns)
> +{
> +	struct nvme_command c = { };
> +	struct nvme_zone_report report;
> +	int buflen = sizeof(struct nvme_zone_report);
> +	int ret;
> +
> +	c.zmr.opcode = nvme_cmd_zone_mgmt_recv;
> +	c.zmr.nsid = cpu_to_le32(ns->head->ns_id);
> +	c.zmr.slba = cpu_to_le64(0);
> +	c.zmr.numd = cpu_to_le32(nvme_bytes_to_numd(buflen));
> +	c.zmr.zra = NVME_ZRA_ZONE_REPORT;
> +	c.zmr.zrasf = NVME_ZRASF_ZONE_REPORT_ALL;
> +	c.zmr.pr = 0;
> +
> +	ret = nvme_submit_sync_cmd(ns->queue, &c, &report, buflen);
> +	if (ret)
> +		return ret;
> +
> +	return le64_to_cpu(report.nr_zones);
> +}
> +
>  int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
> -			  unsigned lbaf)
> +			  unsigned int lbaf, sector_t nsze)
>  {
>  //	struct nvme_effects_log *log = ns->head->effects;
>  	struct request_queue *q = disk->queue;
>  	struct nvme_command c = { };
>  	struct nvme_id_ns_zns *id;
> +	sector_t cap_nr_zones;
>  	int status;
>  
>  	/* Driver requires zone append support */
> @@ -80,6 +103,20 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>  		goto free_data;
>  	}
>  
> +	ns->nr_zones = nvme_zns_nr_zones(ns);
> +	if (!ns->nr_zones) {
> +		status = -EINVAL;
> +		goto free_data;
> +	}
> +
> +	cap_nr_zones = nvme_lba_to_sect(ns, le64_to_cpu(nsze)) >> ilog2(ns->zsze);
> +	if (ns->nr_zones != cap_nr_zones) {
> +		dev_err(ns->ctrl->device, "inconsistent zone count: %llu/%llu\n",
> +			ns->nr_zones, cap_nr_zones);
> +		status = -EINVAL;
> +		goto free_data;
> +	}
> +

I am not opposed to this, but I am not sure if this adds anything to the checks
in blk_revalidate_disk_zones() as that does a full zone report that is checked
against capacity. This seems more like a test of the drive conformance, checking
that the report header nr_zones is correct...

>  	q->limits.zoned = BLK_ZONED_HM;
>  	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/4] nvme: Add consistency check for zone count
@ 2020-07-02  8:16     ` Damien Le Moal
  0 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  8:16 UTC (permalink / raw)
  To: Javier González, linux-nvme
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, Nitesh Shetty,
	linux-block, kbusch, Javier González, mb, hch

On 2020/07/02 15:55, Javier González wrote:
> From: Javier González <javier.gonz@samsung.com>
> 
> Since the number of zones is calculated through the reported device
> capacity and the ZNS specification allows to report the total number of
> zones in the device, add an extra check to guarantee consistency between
> the device and the kernel.
> 
> Also, fix a checkpatch warning: unsigned -> unsigned int in the process
> 
> Signed-off-by: Javier González <javier.gonz@samsung.com>
> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
> ---
>  drivers/nvme/host/core.c |  2 +-
>  drivers/nvme/host/nvme.h |  6 ++++--
>  drivers/nvme/host/zns.c  | 39 ++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 43 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 1f5c7fc3d2c9..8b9c69172931 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1994,7 +1994,7 @@ static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
>  	case NVME_CSI_NVM:
>  		break;
>  	case NVME_CSI_ZNS:
> -		ret = nvme_update_zone_info(disk, ns, lbaf);
> +		ret = nvme_update_zone_info(disk, ns, lbaf, le64_to_cpu(id->nsze));
>  		if (ret)
>  			return ret;
>  		break;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 662f95fbd909..ef80e0b4df56 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -407,6 +407,7 @@ struct nvme_ns {
>  	u32 sws;
>  	u8 pi_type;
>  #ifdef CONFIG_BLK_DEV_ZONED
> +	u64 nr_zones;
>  	u64 zsze;
>  #endif
>  	unsigned long features;
> @@ -700,7 +701,7 @@ static inline void nvme_mpath_start_freeze(struct nvme_subsystem *subsys)
>  
>  #ifdef CONFIG_BLK_DEV_ZONED
>  int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
> -			  unsigned lbaf);
> +			  unsigned int lbaf, sector_t nsze);
>  
>  int nvme_report_zones(struct gendisk *disk, sector_t sector,
>  		      unsigned int nr_zones, report_zones_cb cb, void *data);
> @@ -720,7 +721,8 @@ static inline blk_status_t nvme_setup_zone_mgmt_send(struct nvme_ns *ns,
>  
>  static inline int nvme_update_zone_info(struct gendisk *disk,
>  					struct nvme_ns *ns,
> -					unsigned lbaf)
> +					unsigned lbaf

missing a comma here. Does this even compiles ?

> +					sector_t nsze)
>  {
>  	dev_warn(ns->ctrl->device,
>  		 "Please enable CONFIG_BLK_DEV_ZONED to support ZNS devices\n");
> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
> index b34d2ed13825..d785d179a343 100644
> --- a/drivers/nvme/host/zns.c
> +++ b/drivers/nvme/host/zns.c
> @@ -32,13 +32,36 @@ static int nvme_set_max_append(struct nvme_ctrl *ctrl)
>  	return 0;
>  }
>  
> +static u64 nvme_zns_nr_zones(struct nvme_ns *ns)
> +{
> +	struct nvme_command c = { };
> +	struct nvme_zone_report report;
> +	int buflen = sizeof(struct nvme_zone_report);
> +	int ret;
> +
> +	c.zmr.opcode = nvme_cmd_zone_mgmt_recv;
> +	c.zmr.nsid = cpu_to_le32(ns->head->ns_id);
> +	c.zmr.slba = cpu_to_le64(0);
> +	c.zmr.numd = cpu_to_le32(nvme_bytes_to_numd(buflen));
> +	c.zmr.zra = NVME_ZRA_ZONE_REPORT;
> +	c.zmr.zrasf = NVME_ZRASF_ZONE_REPORT_ALL;
> +	c.zmr.pr = 0;
> +
> +	ret = nvme_submit_sync_cmd(ns->queue, &c, &report, buflen);
> +	if (ret)
> +		return ret;
> +
> +	return le64_to_cpu(report.nr_zones);
> +}
> +
>  int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
> -			  unsigned lbaf)
> +			  unsigned int lbaf, sector_t nsze)
>  {
>  //	struct nvme_effects_log *log = ns->head->effects;
>  	struct request_queue *q = disk->queue;
>  	struct nvme_command c = { };
>  	struct nvme_id_ns_zns *id;
> +	sector_t cap_nr_zones;
>  	int status;
>  
>  	/* Driver requires zone append support */
> @@ -80,6 +103,20 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>  		goto free_data;
>  	}
>  
> +	ns->nr_zones = nvme_zns_nr_zones(ns);
> +	if (!ns->nr_zones) {
> +		status = -EINVAL;
> +		goto free_data;
> +	}
> +
> +	cap_nr_zones = nvme_lba_to_sect(ns, le64_to_cpu(nsze)) >> ilog2(ns->zsze);
> +	if (ns->nr_zones != cap_nr_zones) {
> +		dev_err(ns->ctrl->device, "inconsistent zone count: %llu/%llu\n",
> +			ns->nr_zones, cap_nr_zones);
> +		status = -EINVAL;
> +		goto free_data;
> +	}
> +

I am not opposed to this, but I am not sure if this adds anything to the checks
in blk_revalidate_disk_zones() as that does a full zone report that is checked
against capacity. This seems more like a test of the drive conformance, checking
that the report header nr_zones is correct...

>  	q->limits.zoned = BLK_ZONED_HM;
>  	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
> 


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/4] nvme: Add consistency check for zone count
  2020-07-02  6:54   ` Javier González
@ 2020-07-02  8:19     ` Johannes Thumshirn
  -1 siblings, 0 replies; 38+ messages in thread
From: Johannes Thumshirn @ 2020-07-02  8:19 UTC (permalink / raw)
  To: Javier González, linux-nvme
  Cc: linux-block, hch, kbusch, sagi, axboe, Damien Le Moal, mb,
	Javier González, SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 02/07/2020 08:55, Javier González wrote:
>  int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
> -			  unsigned lbaf)
> +			  unsigned int lbaf, sector_t nsze)
>  {
>  //	struct nvme_effects_log *log = ns->head->effects;
>  	struct request_queue *q = disk->queue;


Please no C++ style comments and no commenting out of code in an official submission.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/4] nvme: Add consistency check for zone count
@ 2020-07-02  8:19     ` Johannes Thumshirn
  0 siblings, 0 replies; 38+ messages in thread
From: Johannes Thumshirn @ 2020-07-02  8:19 UTC (permalink / raw)
  To: Javier González, linux-nvme
  Cc: axboe, Damien Le Moal, SelvaKumar S, sagi, Kanchan Joshi,
	Nitesh Shetty, linux-block, kbusch, Javier González, mb,
	hch

On 02/07/2020 08:55, Javier González wrote:
>  int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
> -			  unsigned lbaf)
> +			  unsigned int lbaf, sector_t nsze)
>  {
>  //	struct nvme_effects_log *log = ns->head->effects;
>  	struct request_queue *q = disk->queue;


Please no C++ style comments and no commenting out of code in an official submission.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] block: add attributes to zone report
  2020-07-02  6:54   ` Javier González
@ 2020-07-02  8:27     ` Damien Le Moal
  -1 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  8:27 UTC (permalink / raw)
  To: Javier González, linux-nvme
  Cc: linux-block, hch, kbusch, sagi, axboe, mb, Javier González,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 2020/07/02 15:55, Javier González wrote:
> From: Javier González <javier.gonz@samsung.com>
> 
> Add zone attributes field to the blk_zone structure. Use ZNS attributes
> as base for zoned block devices and add the current atributes in
> ZAC/ZBC.
> 
> Signed-off-by: Javier González <javier.gonz@samsung.com>
> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
> ---
>  drivers/nvme/host/zns.c       |  3 ++-
>  drivers/scsi/sd.c             |  2 +-
>  drivers/scsi/sd_zbc.c         |  8 ++++++--
>  include/uapi/linux/blkzoned.h | 26 +++++++++++++++++++++++++-
>  4 files changed, 34 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
> index d785d179a343..749382a14968 100644
> --- a/drivers/nvme/host/zns.c
> +++ b/drivers/nvme/host/zns.c
> @@ -118,7 +118,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>  	}
>  
>  	q->limits.zoned = BLK_ZONED_HM;
> -	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
> +	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;

Adding BLK_ZONE_REP_CAPACITY a second time ? You meant adding BLK_ZONE_REP_ATTR,
no ?

>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>  free_data:
>  	kfree(id);
> @@ -197,6 +197,7 @@ static int nvme_zone_parse_entry(struct nvme_ns *ns,
>  	zone.capacity = nvme_lba_to_sect(ns, le64_to_cpu(entry->zcap));
>  	zone.start = nvme_lba_to_sect(ns, le64_to_cpu(entry->zslba));
>  	zone.wp = nvme_lba_to_sect(ns, le64_to_cpu(entry->wp));
> +	zone.attr = entry->za;

It may be a good idea to limit this to the user level defined attributes.

>  
>  	return cb(&zone, idx, data);
>  }
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index b9c920bace28..63270598aa76 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -2967,7 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>  	if (sdkp->device->type == TYPE_ZBC) {
>  		/* Host-managed */
>  		q->limits.zoned = BLK_ZONED_HM;
> -		q->zone_flags = BLK_ZONE_REP_CAPACITY;
> +		q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_ATTR;
>  	} else {
>  		sdkp->zoned = (buffer[8] >> 4) & 3;
>  		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> index 183a20720da9..51c7f82b59c5 100644
> --- a/drivers/scsi/sd_zbc.c
> +++ b/drivers/scsi/sd_zbc.c
> @@ -53,10 +53,14 @@ static int sd_zbc_parse_report(struct scsi_disk *sdkp, u8 *buf,
>  
>  	zone.type = buf[0] & 0x0f;
>  	zone.cond = (buf[1] >> 4) & 0xf;
> -	if (buf[1] & 0x01)
> +	if (buf[1] & 0x01) {
> +		zone.attr |= BLK_ZONE_ATTR_NRW;
>  		zone.reset = 1;
> -	if (buf[1] & 0x02)
> +	}
> +	if (buf[1] & 0x02) {
> +		zone.attr |= BLK_ZONE_ATTR_NSQ;
>  		zone.non_seq = 1;
> +	}
>  
>  	zone.len = logical_to_sectors(sdp, get_unaligned_be64(&buf[8]));
>  	zone.capacity = zone.len;
> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
> index e5adf4a9f4b0..315617827acd 100644
> --- a/include/uapi/linux/blkzoned.h
> +++ b/include/uapi/linux/blkzoned.h
> @@ -78,10 +78,33 @@ enum blk_zone_cond {
>   *
>   * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
>   * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.
> + * @BLK_ZONE_REP_ATTR: Zone attributes reported.
>   */
>  enum blk_zone_report_flags {
>  	BLK_ZONE_REP_CAPACITY	= (1 << 0),
>  	BLK_ZONE_REP_OFFLINE	= (1 << 1),
> +	BLK_ZONE_REP_ATTR	= (1 << 2),
> +};
> +
> +/**
> + * enum blk_zone_attr - Zone Attributes
> + *
> + * Attributes of the zone. Reported in struct blk_zone -> attr
> + *
> + * @BLK_ZONE_ATTR_ZFC: Zone Finished by Controller due to a zone active excursion

ZNS drives that have zone active excursion are not supported for now, so this
attribute can never be returned.

> + * @BLK_ZONE_ATTR_FZR: Finish Zone Recommended required by controller

"Recommended required" is a rather odd wording. If it is required then it is not
recommended, it is mandatory. I would simplify this to

Finish zone recommended

> + * @BLK_ZONE_ATTR_RZR: Reset Zone Recommended required by controller

Same comment as above.

> + * @BLK_ZONE_ATTR_NSQ: Non Sequential zone

This is the same as the non_seq field, so please keep the same definition:

Non-sequential write resources active

> + * @BLK_ZONE_ATTR_NRW: Need Reset Write Pointer required by controller

This is the same as the reset field, so please keep the same definition:

"Reset write pointer recommended"

> + * @BLK_ZONE_ATTR_ZDEV: Zone Descriptor Extension Valid in zone report
> + */
> +enum blk_zone_attr {
> +	BLK_ZONE_ATTR_ZFC	= 1 << 0,
> +	BLK_ZONE_ATTR_FZR	= 1 << 1,
> +	BLK_ZONE_ATTR_RZR	= 1 << 2,
> +	BLK_ZONE_ATTR_NSQ	= 1 << 3,
> +	BLK_ZONE_ATTR_NRW	= 1 << 4,
> +	BLK_ZONE_ATTR_ZDEV	= 1 << 7,

What is the purpose of BLK_ZONE_ATTR_ZDEV ?

>  };
>  
>  /**
> @@ -110,7 +133,8 @@ struct blk_zone {
>  	__u8	cond;		/* Zone condition */
>  	__u8	non_seq;	/* Non-sequential write resources active */
>  	__u8	reset;		/* Reset write pointer recommended */
> -	__u8	resv[4];
> +	__u8	attr;		/* Zone attributes */
> +	__u8	resv[3];
>  	__u64	capacity;	/* Zone capacity in number of sectors */
>  	__u8	reserved[24];
>  };
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] block: add attributes to zone report
@ 2020-07-02  8:27     ` Damien Le Moal
  0 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  8:27 UTC (permalink / raw)
  To: Javier González, linux-nvme
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, Nitesh Shetty,
	linux-block, kbusch, Javier González, mb, hch

On 2020/07/02 15:55, Javier González wrote:
> From: Javier González <javier.gonz@samsung.com>
> 
> Add zone attributes field to the blk_zone structure. Use ZNS attributes
> as base for zoned block devices and add the current atributes in
> ZAC/ZBC.
> 
> Signed-off-by: Javier González <javier.gonz@samsung.com>
> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
> ---
>  drivers/nvme/host/zns.c       |  3 ++-
>  drivers/scsi/sd.c             |  2 +-
>  drivers/scsi/sd_zbc.c         |  8 ++++++--
>  include/uapi/linux/blkzoned.h | 26 +++++++++++++++++++++++++-
>  4 files changed, 34 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
> index d785d179a343..749382a14968 100644
> --- a/drivers/nvme/host/zns.c
> +++ b/drivers/nvme/host/zns.c
> @@ -118,7 +118,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>  	}
>  
>  	q->limits.zoned = BLK_ZONED_HM;
> -	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
> +	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;

Adding BLK_ZONE_REP_CAPACITY a second time ? You meant adding BLK_ZONE_REP_ATTR,
no ?

>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>  free_data:
>  	kfree(id);
> @@ -197,6 +197,7 @@ static int nvme_zone_parse_entry(struct nvme_ns *ns,
>  	zone.capacity = nvme_lba_to_sect(ns, le64_to_cpu(entry->zcap));
>  	zone.start = nvme_lba_to_sect(ns, le64_to_cpu(entry->zslba));
>  	zone.wp = nvme_lba_to_sect(ns, le64_to_cpu(entry->wp));
> +	zone.attr = entry->za;

It may be a good idea to limit this to the user level defined attributes.

>  
>  	return cb(&zone, idx, data);
>  }
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index b9c920bace28..63270598aa76 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -2967,7 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>  	if (sdkp->device->type == TYPE_ZBC) {
>  		/* Host-managed */
>  		q->limits.zoned = BLK_ZONED_HM;
> -		q->zone_flags = BLK_ZONE_REP_CAPACITY;
> +		q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_ATTR;
>  	} else {
>  		sdkp->zoned = (buffer[8] >> 4) & 3;
>  		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> index 183a20720da9..51c7f82b59c5 100644
> --- a/drivers/scsi/sd_zbc.c
> +++ b/drivers/scsi/sd_zbc.c
> @@ -53,10 +53,14 @@ static int sd_zbc_parse_report(struct scsi_disk *sdkp, u8 *buf,
>  
>  	zone.type = buf[0] & 0x0f;
>  	zone.cond = (buf[1] >> 4) & 0xf;
> -	if (buf[1] & 0x01)
> +	if (buf[1] & 0x01) {
> +		zone.attr |= BLK_ZONE_ATTR_NRW;
>  		zone.reset = 1;
> -	if (buf[1] & 0x02)
> +	}
> +	if (buf[1] & 0x02) {
> +		zone.attr |= BLK_ZONE_ATTR_NSQ;
>  		zone.non_seq = 1;
> +	}
>  
>  	zone.len = logical_to_sectors(sdp, get_unaligned_be64(&buf[8]));
>  	zone.capacity = zone.len;
> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
> index e5adf4a9f4b0..315617827acd 100644
> --- a/include/uapi/linux/blkzoned.h
> +++ b/include/uapi/linux/blkzoned.h
> @@ -78,10 +78,33 @@ enum blk_zone_cond {
>   *
>   * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
>   * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.
> + * @BLK_ZONE_REP_ATTR: Zone attributes reported.
>   */
>  enum blk_zone_report_flags {
>  	BLK_ZONE_REP_CAPACITY	= (1 << 0),
>  	BLK_ZONE_REP_OFFLINE	= (1 << 1),
> +	BLK_ZONE_REP_ATTR	= (1 << 2),
> +};
> +
> +/**
> + * enum blk_zone_attr - Zone Attributes
> + *
> + * Attributes of the zone. Reported in struct blk_zone -> attr
> + *
> + * @BLK_ZONE_ATTR_ZFC: Zone Finished by Controller due to a zone active excursion

ZNS drives that have zone active excursion are not supported for now, so this
attribute can never be returned.

> + * @BLK_ZONE_ATTR_FZR: Finish Zone Recommended required by controller

"Recommended required" is a rather odd wording. If it is required then it is not
recommended, it is mandatory. I would simplify this to

Finish zone recommended

> + * @BLK_ZONE_ATTR_RZR: Reset Zone Recommended required by controller

Same comment as above.

> + * @BLK_ZONE_ATTR_NSQ: Non Sequential zone

This is the same as the non_seq field, so please keep the same definition:

Non-sequential write resources active

> + * @BLK_ZONE_ATTR_NRW: Need Reset Write Pointer required by controller

This is the same as the reset field, so please keep the same definition:

"Reset write pointer recommended"

> + * @BLK_ZONE_ATTR_ZDEV: Zone Descriptor Extension Valid in zone report
> + */
> +enum blk_zone_attr {
> +	BLK_ZONE_ATTR_ZFC	= 1 << 0,
> +	BLK_ZONE_ATTR_FZR	= 1 << 1,
> +	BLK_ZONE_ATTR_RZR	= 1 << 2,
> +	BLK_ZONE_ATTR_NSQ	= 1 << 3,
> +	BLK_ZONE_ATTR_NRW	= 1 << 4,
> +	BLK_ZONE_ATTR_ZDEV	= 1 << 7,

What is the purpose of BLK_ZONE_ATTR_ZDEV ?

>  };
>  
>  /**
> @@ -110,7 +133,8 @@ struct blk_zone {
>  	__u8	cond;		/* Zone condition */
>  	__u8	non_seq;	/* Non-sequential write resources active */
>  	__u8	reset;		/* Reset write pointer recommended */
> -	__u8	resv[4];
> +	__u8	attr;		/* Zone attributes */
> +	__u8	resv[3];
>  	__u64	capacity;	/* Zone capacity in number of sectors */
>  	__u8	reserved[24];
>  };
> 


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/4] nvme: Add consistency check for zone count
  2020-07-02  8:19     ` Johannes Thumshirn
@ 2020-07-02  8:27       ` Javier González
  -1 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  8:27 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: linux-nvme, linux-block, hch, kbusch, sagi, axboe,
	Damien Le Moal, mb, SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 02.07.2020 08:19, Johannes Thumshirn wrote:
>On 02/07/2020 08:55, Javier González wrote:
>>  int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>> -			  unsigned lbaf)
>> +			  unsigned int lbaf, sector_t nsze)
>>  {
>>  //	struct nvme_effects_log *log = ns->head->effects;
>>  	struct request_queue *q = disk->queue;
>
>
>Please no C++ style comments and no commenting out of code in an official submission.

Sorry. I see some debug code leaked here.

Javier

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/4] nvme: Add consistency check for zone count
@ 2020-07-02  8:27       ` Javier González
  0 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  8:27 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: axboe, Damien Le Moal, SelvaKumar S, sagi, Kanchan Joshi,
	linux-nvme, Nitesh Shetty, linux-block, kbusch, mb, hch

On 02.07.2020 08:19, Johannes Thumshirn wrote:
>On 02/07/2020 08:55, Javier González wrote:
>>  int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>> -			  unsigned lbaf)
>> +			  unsigned int lbaf, sector_t nsze)
>>  {
>>  //	struct nvme_effects_log *log = ns->head->effects;
>>  	struct request_queue *q = disk->queue;
>
>
>Please no C++ style comments and no commenting out of code in an official submission.

Sorry. I see some debug code leaked here.

Javier

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
  2020-07-02  7:54     ` Damien Le Moal
@ 2020-07-02  8:34       ` Javier González
  -1 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  8:34 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-nvme, linux-block, hch, kbusch, sagi, axboe, mb,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 02.07.2020 07:54, Damien Le Moal wrote:
>On 2020/07/02 15:55, Javier González wrote:
>> From: Javier González <javier.gonz@samsung.com>
>>
>> As the zoned block device will have to deal with features that are
>> optional for the backend device, add a flag field to inform the block
>> layer about supported features. This builds on top of
>> blk_zone_report_flags and extendes to the zone report code.
>>
>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>> ---
>>  block/blk-zoned.c              | 3 ++-
>>  drivers/block/null_blk_zoned.c | 2 ++
>>  drivers/nvme/host/zns.c        | 1 +
>>  drivers/scsi/sd.c              | 2 ++
>>  include/linux/blkdev.h         | 3 +++
>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>> index 81152a260354..0f156e96e48f 100644
>> --- a/block/blk-zoned.c
>> +++ b/block/blk-zoned.c
>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>  		return ret;
>>
>>  	rep.nr_zones = ret;
>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>> +	rep.flags = q->zone_flags;
>
>zone_flags seem to be a fairly generic flags field while rep.flags is only about
>the reported descriptors structure. So you may want to define a
>BLK_ZONE_REP_FLAGS_MASK and have:
>
>	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>
>to avoid flags meaningless for the user being set.
>
>In any case, since *all* zoned block devices now report capacity, I do not
>really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>considering that you set the flag for all zoned device types, including scsi
>which does not have zone capacity. This makes q->zone_flags rather confusing
>instead of clearly defining the device features as you mentioned in the commit
>message.
>
>I think it may be better to just drop this patch, and if needed, introduce the
>zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).

I am using this as a way to pass the OFFLINE support flag to the block
layer. I used this too for the attributes. Are you thinking of a
different way to send this?

I believe this fits too for capacity, but we can just pass it in
all report as before if you prefer.

>
>> +
>>  	if (copy_to_user(argp, &rep, sizeof(struct blk_zone_report)))
>>  		return -EFAULT;
>>  	return 0;
>> diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
>> index b05832eb21b2..957c2103f240 100644
>> --- a/drivers/block/null_blk_zoned.c
>> +++ b/drivers/block/null_blk_zoned.c
>> @@ -78,6 +78,8 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
>>  	}
>>
>>  	q->limits.zoned = BLK_ZONED_HM;
>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>> +
>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>  	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
>>
>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>> index 0642d3c54e8f..888264261ba3 100644
>> --- a/drivers/nvme/host/zns.c
>> +++ b/drivers/nvme/host/zns.c
>> @@ -81,6 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>>  	}
>>
>>  	q->limits.zoned = BLK_ZONED_HM;
>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>  free_data:
>>  	kfree(id);
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index d90fefffe31b..b9c920bace28 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -2967,6 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>  	if (sdkp->device->type == TYPE_ZBC) {
>>  		/* Host-managed */
>>  		q->limits.zoned = BLK_ZONED_HM;
>> +		q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>  	} else {
>>  		sdkp->zoned = (buffer[8] >> 4) & 3;
>>  		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
>> @@ -2983,6 +2984,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>  					  "Drive-managed SMR disk\n");
>>  		}
>>  	}
>> +
>>  	if (blk_queue_is_zoned(q) && sdkp->first_scan)
>>  		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
>>  		      q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> index 8fd900998b4e..3f2e3425fa53 100644
>> --- a/include/linux/blkdev.h
>> +++ b/include/linux/blkdev.h
>> @@ -512,12 +512,15 @@ struct request_queue {
>>  	 * Stacking drivers (device mappers) may or may not initialize
>>  	 * these fields.
>>  	 *
>> +	 * Flags represent features as described by blk_zone_report_flags in blkzoned.h
>> +	 *
>>  	 * Reads of this information must be protected with blk_queue_enter() /
>>  	 * blk_queue_exit(). Modifying this information is only allowed while
>>  	 * no requests are being processed. See also blk_mq_freeze_queue() and
>>  	 * blk_mq_unfreeze_queue().
>>  	 */
>>  	unsigned int		nr_zones;
>> +	unsigned int		zone_flags;
>>  	unsigned long		*conv_zones_bitmap;
>>  	unsigned long		*seq_zones_wlock;
>>  #endif /* CONFIG_BLK_DEV_ZONED */
>>
>
>And you are missing device-mapper support. DM target devices have a request
>queue that would need to set the zone_flags too.

Ok. I looked at it and I thought that this would be inherited by the
underlying device. I will add it in V3.

Javier


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
@ 2020-07-02  8:34       ` Javier González
  0 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  8:34 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, linux-nvme,
	Nitesh Shetty, linux-block, kbusch, mb, hch

On 02.07.2020 07:54, Damien Le Moal wrote:
>On 2020/07/02 15:55, Javier González wrote:
>> From: Javier González <javier.gonz@samsung.com>
>>
>> As the zoned block device will have to deal with features that are
>> optional for the backend device, add a flag field to inform the block
>> layer about supported features. This builds on top of
>> blk_zone_report_flags and extendes to the zone report code.
>>
>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>> ---
>>  block/blk-zoned.c              | 3 ++-
>>  drivers/block/null_blk_zoned.c | 2 ++
>>  drivers/nvme/host/zns.c        | 1 +
>>  drivers/scsi/sd.c              | 2 ++
>>  include/linux/blkdev.h         | 3 +++
>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>> index 81152a260354..0f156e96e48f 100644
>> --- a/block/blk-zoned.c
>> +++ b/block/blk-zoned.c
>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>  		return ret;
>>
>>  	rep.nr_zones = ret;
>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>> +	rep.flags = q->zone_flags;
>
>zone_flags seem to be a fairly generic flags field while rep.flags is only about
>the reported descriptors structure. So you may want to define a
>BLK_ZONE_REP_FLAGS_MASK and have:
>
>	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>
>to avoid flags meaningless for the user being set.
>
>In any case, since *all* zoned block devices now report capacity, I do not
>really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>considering that you set the flag for all zoned device types, including scsi
>which does not have zone capacity. This makes q->zone_flags rather confusing
>instead of clearly defining the device features as you mentioned in the commit
>message.
>
>I think it may be better to just drop this patch, and if needed, introduce the
>zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).

I am using this as a way to pass the OFFLINE support flag to the block
layer. I used this too for the attributes. Are you thinking of a
different way to send this?

I believe this fits too for capacity, but we can just pass it in
all report as before if you prefer.

>
>> +
>>  	if (copy_to_user(argp, &rep, sizeof(struct blk_zone_report)))
>>  		return -EFAULT;
>>  	return 0;
>> diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
>> index b05832eb21b2..957c2103f240 100644
>> --- a/drivers/block/null_blk_zoned.c
>> +++ b/drivers/block/null_blk_zoned.c
>> @@ -78,6 +78,8 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
>>  	}
>>
>>  	q->limits.zoned = BLK_ZONED_HM;
>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>> +
>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>  	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
>>
>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>> index 0642d3c54e8f..888264261ba3 100644
>> --- a/drivers/nvme/host/zns.c
>> +++ b/drivers/nvme/host/zns.c
>> @@ -81,6 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>>  	}
>>
>>  	q->limits.zoned = BLK_ZONED_HM;
>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>  free_data:
>>  	kfree(id);
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index d90fefffe31b..b9c920bace28 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -2967,6 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>  	if (sdkp->device->type == TYPE_ZBC) {
>>  		/* Host-managed */
>>  		q->limits.zoned = BLK_ZONED_HM;
>> +		q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>  	} else {
>>  		sdkp->zoned = (buffer[8] >> 4) & 3;
>>  		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
>> @@ -2983,6 +2984,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>  					  "Drive-managed SMR disk\n");
>>  		}
>>  	}
>> +
>>  	if (blk_queue_is_zoned(q) && sdkp->first_scan)
>>  		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
>>  		      q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> index 8fd900998b4e..3f2e3425fa53 100644
>> --- a/include/linux/blkdev.h
>> +++ b/include/linux/blkdev.h
>> @@ -512,12 +512,15 @@ struct request_queue {
>>  	 * Stacking drivers (device mappers) may or may not initialize
>>  	 * these fields.
>>  	 *
>> +	 * Flags represent features as described by blk_zone_report_flags in blkzoned.h
>> +	 *
>>  	 * Reads of this information must be protected with blk_queue_enter() /
>>  	 * blk_queue_exit(). Modifying this information is only allowed while
>>  	 * no requests are being processed. See also blk_mq_freeze_queue() and
>>  	 * blk_mq_unfreeze_queue().
>>  	 */
>>  	unsigned int		nr_zones;
>> +	unsigned int		zone_flags;
>>  	unsigned long		*conv_zones_bitmap;
>>  	unsigned long		*seq_zones_wlock;
>>  #endif /* CONFIG_BLK_DEV_ZONED */
>>
>
>And you are missing device-mapper support. DM target devices have a request
>queue that would need to set the zone_flags too.

Ok. I looked at it and I thought that this would be inherited by the
underlying device. I will add it in V3.

Javier


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] block: add support for zone offline transition
  2020-07-02  8:10     ` Damien Le Moal
@ 2020-07-02  8:39       ` Javier González
  -1 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  8:39 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-nvme, linux-block, hch, kbusch, sagi, axboe, mb,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 02.07.2020 08:10, Damien Le Moal wrote:
>On 2020/07/02 15:55, Javier González wrote:
>> From: Javier González <javier.gonz@samsung.com>
>>
>> Add support for offline transition on the zoned block device. Use the
>> existing feature flags for the underlying driver to report support for
>> the feature, as currently this transition is only supported in ZNS and
>> not in ZAC/ZBC
>>
>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>> ---
>>  block/blk-core.c              | 2 ++
>>  block/blk-zoned.c             | 8 +++++++-
>>  drivers/nvme/host/core.c      | 3 +++
>>  drivers/nvme/host/zns.c       | 2 +-
>>  include/linux/blk_types.h     | 3 +++
>>  include/linux/blkdev.h        | 1 -
>>  include/uapi/linux/blkzoned.h | 3 +++
>>  7 files changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 03252af8c82c..589cbdacc5ec 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -140,6 +140,7 @@ static const char *const blk_op_name[] = {
>>  	REQ_OP_NAME(ZONE_CLOSE),
>>  	REQ_OP_NAME(ZONE_FINISH),
>>  	REQ_OP_NAME(ZONE_APPEND),
>> +	REQ_OP_NAME(ZONE_OFFLINE),
>>  	REQ_OP_NAME(WRITE_SAME),
>>  	REQ_OP_NAME(WRITE_ZEROES),
>>  	REQ_OP_NAME(SCSI_IN),
>> @@ -1030,6 +1031,7 @@ generic_make_request_checks(struct bio *bio)
>>  	case REQ_OP_ZONE_OPEN:
>>  	case REQ_OP_ZONE_CLOSE:
>>  	case REQ_OP_ZONE_FINISH:
>> +	case REQ_OP_ZONE_OFFLINE:
>>  		if (!blk_queue_is_zoned(q))
>>  			goto not_supported;
>>  		break;
>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>> index 0f156e96e48f..b97f67f462b4 100644
>> --- a/block/blk-zoned.c
>> +++ b/block/blk-zoned.c
>> @@ -320,7 +320,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>  }
>>
>>  /*
>> - * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctl processing.
>> + * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE, BLKFINISHZONE and BLKOFFLINEZONE
>> + * ioctl processing.
>>   * Called from blkdev_ioctl.
>>   */
>>  int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>> @@ -363,6 +364,11 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>  	case BLKFINISHZONE:
>>  		op = REQ_OP_ZONE_FINISH;
>>  		break;
>> +	case BLKOFFLINEZONE:
>> +		if (!(q->zone_flags & BLK_ZONE_REP_OFFLINE))
>> +			return -EINVAL;
>
>return -ENOTTY here.
>
>That is the error returned for regular block devices when a zone ioctl is
>received, indicating the lack of support for these ioctls. Since this is also a
>lack  of support by the device here too, we may as well keep the same error
>code. Returning -EINVAL should be reserved for cases where the device can accept
>the ioctl but start sector or number of sectors is invalid.

Ok.

>
>
>> +		op = REQ_OP_ZONE_OFFLINE;
>> +		break;
>>  	default:
>>  		return -ENOTTY;
>>  	}
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index e5f754889234..1f5c7fc3d2c9 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -776,6 +776,9 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
>>  	case REQ_OP_ZONE_FINISH:
>>  		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_FINISH);
>>  		break;
>> +	case REQ_OP_ZONE_OFFLINE:
>> +		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_OFFLINE);
>> +		break;
>>  	case REQ_OP_WRITE_ZEROES:
>>  		ret = nvme_setup_write_zeroes(ns, req, cmd);
>>  		break;
>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>> index 888264261ba3..b34d2ed13825 100644
>> --- a/drivers/nvme/host/zns.c
>> +++ b/drivers/nvme/host/zns.c
>> @@ -81,7 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>>  	}
>>
>>  	q->limits.zoned = BLK_ZONED_HM;
>> -	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
>
>The name BLK_ZONE_REP_OFFLINE is not ideal.  This flag is not about if offline
>condition will be reported or not. It is about the drive supporting an explicit
>offlining zone operation.

I wanted to follow the same convention. I can change the name in the
same enum for the report flags.

>
>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>  free_data:
>>  	kfree(id);
>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>> index ccb895f911b1..c0123c643e2f 100644
>> --- a/include/linux/blk_types.h
>> +++ b/include/linux/blk_types.h
>> @@ -316,6 +316,8 @@ enum req_opf {
>>  	REQ_OP_ZONE_FINISH	= 12,
>>  	/* write data at the current zone write pointer */
>>  	REQ_OP_ZONE_APPEND	= 13,
>> +	/* Transition a zone to offline */
>> +	REQ_OP_ZONE_OFFLINE	= 14,
>>
>>  	/* SCSI passthrough using struct scsi_request */
>>  	REQ_OP_SCSI_IN		= 32,
>> @@ -455,6 +457,7 @@ static inline bool op_is_zone_mgmt(enum req_opf op)
>>  	case REQ_OP_ZONE_OPEN:
>>  	case REQ_OP_ZONE_CLOSE:
>>  	case REQ_OP_ZONE_FINISH:
>> +	case REQ_OP_ZONE_OFFLINE:
>>  		return true;
>>  	default:
>>  		return false;
>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> index 3f2e3425fa53..e489b646486d 100644
>> --- a/include/linux/blkdev.h
>> +++ b/include/linux/blkdev.h
>> @@ -370,7 +370,6 @@ extern int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>  				     unsigned int cmd, unsigned long arg);
>>  extern int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>  				  unsigned int cmd, unsigned long arg);
>> -
>>  #else /* CONFIG_BLK_DEV_ZONED */
>>
>>  static inline unsigned int blkdev_nr_zones(struct gendisk *disk)
>> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
>> index 42c3366cc25f..e5adf4a9f4b0 100644
>> --- a/include/uapi/linux/blkzoned.h
>> +++ b/include/uapi/linux/blkzoned.h
>> @@ -77,9 +77,11 @@ enum blk_zone_cond {
>>   * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
>>   *
>>   * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
>> + * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.
>
>The device supports explicit zone offline transition
>
>Since the implicit transition by the device may happen, even on SMR disks.
>
>But I am not sure this flags is very useful. Or rather, isn't it out of place
>here ? Device features are normally reported through sysfs (e.g. discard, etc).
>It is certainly confusing and not matching the user doc for rep.flag which
>states that the flags are about the zone descriptors, not what the device can
>do. So at the very least, the comments need to change.
>
>The other thing is that the implementation does not consider device mapper case
>again: if a DM target is built on one or more ZNS drives all supporting zone
>offline, then the target should be allowed to report zone offline support too,
>no ? dm-linear and dm-flakey certainly should be allowed to do that. Exporting a
>"zone_offline" (or something like named that) sysfs limit would allow that to be
>supported easily through limit stacking and avoid the need for the report flag.

I can add that too. I left it out as I did not add any implementation on
top of it for the device mapper itself. If this is the way that it makes
sense, then we can add it to the different device mappers later on.

>
>Happy to here others opinion about this one though.
>
>>   */
>>  enum blk_zone_report_flags {
>>  	BLK_ZONE_REP_CAPACITY	= (1 << 0),
>> +	BLK_ZONE_REP_OFFLINE	= (1 << 1),
>>  };
>>
>>  /**
>> @@ -166,5 +168,6 @@ struct blk_zone_range {
>>  #define BLKOPENZONE	_IOW(0x12, 134, struct blk_zone_range)
>>  #define BLKCLOSEZONE	_IOW(0x12, 135, struct blk_zone_range)
>>  #define BLKFINISHZONE	_IOW(0x12, 136, struct blk_zone_range)
>> +#define BLKOFFLINEZONE	_IOW(0x12, 137, struct blk_zone_range)
>>
>>  #endif /* _UAPI_BLKZONED_H */
>>
>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] block: add support for zone offline transition
@ 2020-07-02  8:39       ` Javier González
  0 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02  8:39 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, linux-nvme,
	Nitesh Shetty, linux-block, kbusch, mb, hch

On 02.07.2020 08:10, Damien Le Moal wrote:
>On 2020/07/02 15:55, Javier González wrote:
>> From: Javier González <javier.gonz@samsung.com>
>>
>> Add support for offline transition on the zoned block device. Use the
>> existing feature flags for the underlying driver to report support for
>> the feature, as currently this transition is only supported in ZNS and
>> not in ZAC/ZBC
>>
>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>> ---
>>  block/blk-core.c              | 2 ++
>>  block/blk-zoned.c             | 8 +++++++-
>>  drivers/nvme/host/core.c      | 3 +++
>>  drivers/nvme/host/zns.c       | 2 +-
>>  include/linux/blk_types.h     | 3 +++
>>  include/linux/blkdev.h        | 1 -
>>  include/uapi/linux/blkzoned.h | 3 +++
>>  7 files changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 03252af8c82c..589cbdacc5ec 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -140,6 +140,7 @@ static const char *const blk_op_name[] = {
>>  	REQ_OP_NAME(ZONE_CLOSE),
>>  	REQ_OP_NAME(ZONE_FINISH),
>>  	REQ_OP_NAME(ZONE_APPEND),
>> +	REQ_OP_NAME(ZONE_OFFLINE),
>>  	REQ_OP_NAME(WRITE_SAME),
>>  	REQ_OP_NAME(WRITE_ZEROES),
>>  	REQ_OP_NAME(SCSI_IN),
>> @@ -1030,6 +1031,7 @@ generic_make_request_checks(struct bio *bio)
>>  	case REQ_OP_ZONE_OPEN:
>>  	case REQ_OP_ZONE_CLOSE:
>>  	case REQ_OP_ZONE_FINISH:
>> +	case REQ_OP_ZONE_OFFLINE:
>>  		if (!blk_queue_is_zoned(q))
>>  			goto not_supported;
>>  		break;
>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>> index 0f156e96e48f..b97f67f462b4 100644
>> --- a/block/blk-zoned.c
>> +++ b/block/blk-zoned.c
>> @@ -320,7 +320,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>  }
>>
>>  /*
>> - * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctl processing.
>> + * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE, BLKFINISHZONE and BLKOFFLINEZONE
>> + * ioctl processing.
>>   * Called from blkdev_ioctl.
>>   */
>>  int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>> @@ -363,6 +364,11 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>  	case BLKFINISHZONE:
>>  		op = REQ_OP_ZONE_FINISH;
>>  		break;
>> +	case BLKOFFLINEZONE:
>> +		if (!(q->zone_flags & BLK_ZONE_REP_OFFLINE))
>> +			return -EINVAL;
>
>return -ENOTTY here.
>
>That is the error returned for regular block devices when a zone ioctl is
>received, indicating the lack of support for these ioctls. Since this is also a
>lack  of support by the device here too, we may as well keep the same error
>code. Returning -EINVAL should be reserved for cases where the device can accept
>the ioctl but start sector or number of sectors is invalid.

Ok.

>
>
>> +		op = REQ_OP_ZONE_OFFLINE;
>> +		break;
>>  	default:
>>  		return -ENOTTY;
>>  	}
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index e5f754889234..1f5c7fc3d2c9 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -776,6 +776,9 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
>>  	case REQ_OP_ZONE_FINISH:
>>  		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_FINISH);
>>  		break;
>> +	case REQ_OP_ZONE_OFFLINE:
>> +		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_OFFLINE);
>> +		break;
>>  	case REQ_OP_WRITE_ZEROES:
>>  		ret = nvme_setup_write_zeroes(ns, req, cmd);
>>  		break;
>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>> index 888264261ba3..b34d2ed13825 100644
>> --- a/drivers/nvme/host/zns.c
>> +++ b/drivers/nvme/host/zns.c
>> @@ -81,7 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>>  	}
>>
>>  	q->limits.zoned = BLK_ZONED_HM;
>> -	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
>
>The name BLK_ZONE_REP_OFFLINE is not ideal.  This flag is not about if offline
>condition will be reported or not. It is about the drive supporting an explicit
>offlining zone operation.

I wanted to follow the same convention. I can change the name in the
same enum for the report flags.

>
>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>  free_data:
>>  	kfree(id);
>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>> index ccb895f911b1..c0123c643e2f 100644
>> --- a/include/linux/blk_types.h
>> +++ b/include/linux/blk_types.h
>> @@ -316,6 +316,8 @@ enum req_opf {
>>  	REQ_OP_ZONE_FINISH	= 12,
>>  	/* write data at the current zone write pointer */
>>  	REQ_OP_ZONE_APPEND	= 13,
>> +	/* Transition a zone to offline */
>> +	REQ_OP_ZONE_OFFLINE	= 14,
>>
>>  	/* SCSI passthrough using struct scsi_request */
>>  	REQ_OP_SCSI_IN		= 32,
>> @@ -455,6 +457,7 @@ static inline bool op_is_zone_mgmt(enum req_opf op)
>>  	case REQ_OP_ZONE_OPEN:
>>  	case REQ_OP_ZONE_CLOSE:
>>  	case REQ_OP_ZONE_FINISH:
>> +	case REQ_OP_ZONE_OFFLINE:
>>  		return true;
>>  	default:
>>  		return false;
>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> index 3f2e3425fa53..e489b646486d 100644
>> --- a/include/linux/blkdev.h
>> +++ b/include/linux/blkdev.h
>> @@ -370,7 +370,6 @@ extern int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>  				     unsigned int cmd, unsigned long arg);
>>  extern int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>  				  unsigned int cmd, unsigned long arg);
>> -
>>  #else /* CONFIG_BLK_DEV_ZONED */
>>
>>  static inline unsigned int blkdev_nr_zones(struct gendisk *disk)
>> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
>> index 42c3366cc25f..e5adf4a9f4b0 100644
>> --- a/include/uapi/linux/blkzoned.h
>> +++ b/include/uapi/linux/blkzoned.h
>> @@ -77,9 +77,11 @@ enum blk_zone_cond {
>>   * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
>>   *
>>   * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
>> + * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.
>
>The device supports explicit zone offline transition
>
>Since the implicit transition by the device may happen, even on SMR disks.
>
>But I am not sure this flags is very useful. Or rather, isn't it out of place
>here ? Device features are normally reported through sysfs (e.g. discard, etc).
>It is certainly confusing and not matching the user doc for rep.flag which
>states that the flags are about the zone descriptors, not what the device can
>do. So at the very least, the comments need to change.
>
>The other thing is that the implementation does not consider device mapper case
>again: if a DM target is built on one or more ZNS drives all supporting zone
>offline, then the target should be allowed to report zone offline support too,
>no ? dm-linear and dm-flakey certainly should be allowed to do that. Exporting a
>"zone_offline" (or something like named that) sysfs limit would allow that to be
>supported easily through limit stacking and avoid the need for the report flag.

I can add that too. I left it out as I did not add any implementation on
top of it for the device mapper itself. If this is the way that it makes
sense, then we can add it to the different device mappers later on.

>
>Happy to here others opinion about this one though.
>
>>   */
>>  enum blk_zone_report_flags {
>>  	BLK_ZONE_REP_CAPACITY	= (1 << 0),
>> +	BLK_ZONE_REP_OFFLINE	= (1 << 1),
>>  };
>>
>>  /**
>> @@ -166,5 +168,6 @@ struct blk_zone_range {
>>  #define BLKOPENZONE	_IOW(0x12, 134, struct blk_zone_range)
>>  #define BLKCLOSEZONE	_IOW(0x12, 135, struct blk_zone_range)
>>  #define BLKFINISHZONE	_IOW(0x12, 136, struct blk_zone_range)
>> +#define BLKOFFLINEZONE	_IOW(0x12, 137, struct blk_zone_range)
>>
>>  #endif /* _UAPI_BLKZONED_H */
>>
>


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
  2020-07-02  8:34       ` Javier González
@ 2020-07-02  8:49         ` Damien Le Moal
  -1 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  8:49 UTC (permalink / raw)
  To: Javier González
  Cc: linux-nvme, linux-block, hch, kbusch, sagi, axboe, mb,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 2020/07/02 17:34, Javier González wrote:
> On 02.07.2020 07:54, Damien Le Moal wrote:
>> On 2020/07/02 15:55, Javier González wrote:
>>> From: Javier González <javier.gonz@samsung.com>
>>>
>>> As the zoned block device will have to deal with features that are
>>> optional for the backend device, add a flag field to inform the block
>>> layer about supported features. This builds on top of
>>> blk_zone_report_flags and extendes to the zone report code.
>>>
>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>> ---
>>>  block/blk-zoned.c              | 3 ++-
>>>  drivers/block/null_blk_zoned.c | 2 ++
>>>  drivers/nvme/host/zns.c        | 1 +
>>>  drivers/scsi/sd.c              | 2 ++
>>>  include/linux/blkdev.h         | 3 +++
>>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>> index 81152a260354..0f156e96e48f 100644
>>> --- a/block/blk-zoned.c
>>> +++ b/block/blk-zoned.c
>>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>  		return ret;
>>>
>>>  	rep.nr_zones = ret;
>>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>>> +	rep.flags = q->zone_flags;
>>
>> zone_flags seem to be a fairly generic flags field while rep.flags is only about
>> the reported descriptors structure. So you may want to define a
>> BLK_ZONE_REP_FLAGS_MASK and have:
>>
>> 	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>>
>> to avoid flags meaningless for the user being set.
>>
>> In any case, since *all* zoned block devices now report capacity, I do not
>> really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>> considering that you set the flag for all zoned device types, including scsi
>> which does not have zone capacity. This makes q->zone_flags rather confusing
>> instead of clearly defining the device features as you mentioned in the commit
>> message.
>>
>> I think it may be better to just drop this patch, and if needed, introduce the
>> zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).
> 
> I am using this as a way to pass the OFFLINE support flag to the block
> layer. I used this too for the attributes. Are you thinking of a
> different way to send this?
> 
> I believe this fits too for capacity, but we can just pass it in
> all report as before if you prefer.

The point is that this patch as is does nothing and is needed as a preparatory
patch if we want to have the offline flag set in the report. But:
1) As commented in the offline ioctl patch, I am not sure the flag makes a lot
of sense. sysfs or nothing at all may be OK as well. When we introduced the new
open/close/finish ioctls, we did not add flags to signal that the device
supports them. Granted, it was for nullblk and scsi, and both had the support.
But running an application using these on an old kernel, and you will get
-ENOTTY, meaning, not supported. So simply introducing the offline ioctl without
any flag would be OK I think.

Since DM support for this would be nice too and we now are in a situation where
not all devices support the  ioctl, instead of a report flag (a little out of
place), a queue limit exported through sysfs may be a cleaner way to both
support DM correctly (stacked limits) and signal the support to the user. If you
choose this approach, then this patch is not needed. The zoned_flags, or a
regular queue flag like for RESET_ALL can be added in the offline ioctl patch.

> 
>>
>>> +
>>>  	if (copy_to_user(argp, &rep, sizeof(struct blk_zone_report)))
>>>  		return -EFAULT;
>>>  	return 0;
>>> diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
>>> index b05832eb21b2..957c2103f240 100644
>>> --- a/drivers/block/null_blk_zoned.c
>>> +++ b/drivers/block/null_blk_zoned.c
>>> @@ -78,6 +78,8 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
>>>  	}
>>>
>>>  	q->limits.zoned = BLK_ZONED_HM;
>>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>> +
>>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>>  	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
>>>
>>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>>> index 0642d3c54e8f..888264261ba3 100644
>>> --- a/drivers/nvme/host/zns.c
>>> +++ b/drivers/nvme/host/zns.c
>>> @@ -81,6 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>>>  	}
>>>
>>>  	q->limits.zoned = BLK_ZONED_HM;
>>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>>  free_data:
>>>  	kfree(id);
>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>>> index d90fefffe31b..b9c920bace28 100644
>>> --- a/drivers/scsi/sd.c
>>> +++ b/drivers/scsi/sd.c
>>> @@ -2967,6 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>>  	if (sdkp->device->type == TYPE_ZBC) {
>>>  		/* Host-managed */
>>>  		q->limits.zoned = BLK_ZONED_HM;
>>> +		q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>>  	} else {
>>>  		sdkp->zoned = (buffer[8] >> 4) & 3;
>>>  		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
>>> @@ -2983,6 +2984,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>>  					  "Drive-managed SMR disk\n");
>>>  		}
>>>  	}
>>> +
>>>  	if (blk_queue_is_zoned(q) && sdkp->first_scan)
>>>  		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
>>>  		      q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>> index 8fd900998b4e..3f2e3425fa53 100644
>>> --- a/include/linux/blkdev.h
>>> +++ b/include/linux/blkdev.h
>>> @@ -512,12 +512,15 @@ struct request_queue {
>>>  	 * Stacking drivers (device mappers) may or may not initialize
>>>  	 * these fields.
>>>  	 *
>>> +	 * Flags represent features as described by blk_zone_report_flags in blkzoned.h
>>> +	 *
>>>  	 * Reads of this information must be protected with blk_queue_enter() /
>>>  	 * blk_queue_exit(). Modifying this information is only allowed while
>>>  	 * no requests are being processed. See also blk_mq_freeze_queue() and
>>>  	 * blk_mq_unfreeze_queue().
>>>  	 */
>>>  	unsigned int		nr_zones;
>>> +	unsigned int		zone_flags;
>>>  	unsigned long		*conv_zones_bitmap;
>>>  	unsigned long		*seq_zones_wlock;
>>>  #endif /* CONFIG_BLK_DEV_ZONED */
>>>
>>
>> And you are missing device-mapper support. DM target devices have a request
>> queue that would need to set the zone_flags too.
> 
> Ok. I looked at it and I thought that this would be inherited by the
> underlying device. I will add it in V3.
> 
> Javier
> 
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
@ 2020-07-02  8:49         ` Damien Le Moal
  0 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  8:49 UTC (permalink / raw)
  To: Javier González
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, linux-nvme,
	Nitesh Shetty, linux-block, kbusch, mb, hch

On 2020/07/02 17:34, Javier González wrote:
> On 02.07.2020 07:54, Damien Le Moal wrote:
>> On 2020/07/02 15:55, Javier González wrote:
>>> From: Javier González <javier.gonz@samsung.com>
>>>
>>> As the zoned block device will have to deal with features that are
>>> optional for the backend device, add a flag field to inform the block
>>> layer about supported features. This builds on top of
>>> blk_zone_report_flags and extendes to the zone report code.
>>>
>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>> ---
>>>  block/blk-zoned.c              | 3 ++-
>>>  drivers/block/null_blk_zoned.c | 2 ++
>>>  drivers/nvme/host/zns.c        | 1 +
>>>  drivers/scsi/sd.c              | 2 ++
>>>  include/linux/blkdev.h         | 3 +++
>>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>> index 81152a260354..0f156e96e48f 100644
>>> --- a/block/blk-zoned.c
>>> +++ b/block/blk-zoned.c
>>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>  		return ret;
>>>
>>>  	rep.nr_zones = ret;
>>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>>> +	rep.flags = q->zone_flags;
>>
>> zone_flags seem to be a fairly generic flags field while rep.flags is only about
>> the reported descriptors structure. So you may want to define a
>> BLK_ZONE_REP_FLAGS_MASK and have:
>>
>> 	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>>
>> to avoid flags meaningless for the user being set.
>>
>> In any case, since *all* zoned block devices now report capacity, I do not
>> really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>> considering that you set the flag for all zoned device types, including scsi
>> which does not have zone capacity. This makes q->zone_flags rather confusing
>> instead of clearly defining the device features as you mentioned in the commit
>> message.
>>
>> I think it may be better to just drop this patch, and if needed, introduce the
>> zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).
> 
> I am using this as a way to pass the OFFLINE support flag to the block
> layer. I used this too for the attributes. Are you thinking of a
> different way to send this?
> 
> I believe this fits too for capacity, but we can just pass it in
> all report as before if you prefer.

The point is that this patch as is does nothing and is needed as a preparatory
patch if we want to have the offline flag set in the report. But:
1) As commented in the offline ioctl patch, I am not sure the flag makes a lot
of sense. sysfs or nothing at all may be OK as well. When we introduced the new
open/close/finish ioctls, we did not add flags to signal that the device
supports them. Granted, it was for nullblk and scsi, and both had the support.
But running an application using these on an old kernel, and you will get
-ENOTTY, meaning, not supported. So simply introducing the offline ioctl without
any flag would be OK I think.

Since DM support for this would be nice too and we now are in a situation where
not all devices support the  ioctl, instead of a report flag (a little out of
place), a queue limit exported through sysfs may be a cleaner way to both
support DM correctly (stacked limits) and signal the support to the user. If you
choose this approach, then this patch is not needed. The zoned_flags, or a
regular queue flag like for RESET_ALL can be added in the offline ioctl patch.

> 
>>
>>> +
>>>  	if (copy_to_user(argp, &rep, sizeof(struct blk_zone_report)))
>>>  		return -EFAULT;
>>>  	return 0;
>>> diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
>>> index b05832eb21b2..957c2103f240 100644
>>> --- a/drivers/block/null_blk_zoned.c
>>> +++ b/drivers/block/null_blk_zoned.c
>>> @@ -78,6 +78,8 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
>>>  	}
>>>
>>>  	q->limits.zoned = BLK_ZONED_HM;
>>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>> +
>>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>>  	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
>>>
>>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>>> index 0642d3c54e8f..888264261ba3 100644
>>> --- a/drivers/nvme/host/zns.c
>>> +++ b/drivers/nvme/host/zns.c
>>> @@ -81,6 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>>>  	}
>>>
>>>  	q->limits.zoned = BLK_ZONED_HM;
>>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>>  free_data:
>>>  	kfree(id);
>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>>> index d90fefffe31b..b9c920bace28 100644
>>> --- a/drivers/scsi/sd.c
>>> +++ b/drivers/scsi/sd.c
>>> @@ -2967,6 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>>  	if (sdkp->device->type == TYPE_ZBC) {
>>>  		/* Host-managed */
>>>  		q->limits.zoned = BLK_ZONED_HM;
>>> +		q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>>  	} else {
>>>  		sdkp->zoned = (buffer[8] >> 4) & 3;
>>>  		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
>>> @@ -2983,6 +2984,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>>  					  "Drive-managed SMR disk\n");
>>>  		}
>>>  	}
>>> +
>>>  	if (blk_queue_is_zoned(q) && sdkp->first_scan)
>>>  		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
>>>  		      q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>> index 8fd900998b4e..3f2e3425fa53 100644
>>> --- a/include/linux/blkdev.h
>>> +++ b/include/linux/blkdev.h
>>> @@ -512,12 +512,15 @@ struct request_queue {
>>>  	 * Stacking drivers (device mappers) may or may not initialize
>>>  	 * these fields.
>>>  	 *
>>> +	 * Flags represent features as described by blk_zone_report_flags in blkzoned.h
>>> +	 *
>>>  	 * Reads of this information must be protected with blk_queue_enter() /
>>>  	 * blk_queue_exit(). Modifying this information is only allowed while
>>>  	 * no requests are being processed. See also blk_mq_freeze_queue() and
>>>  	 * blk_mq_unfreeze_queue().
>>>  	 */
>>>  	unsigned int		nr_zones;
>>> +	unsigned int		zone_flags;
>>>  	unsigned long		*conv_zones_bitmap;
>>>  	unsigned long		*seq_zones_wlock;
>>>  #endif /* CONFIG_BLK_DEV_ZONED */
>>>
>>
>> And you are missing device-mapper support. DM target devices have a request
>> queue that would need to set the zone_flags too.
> 
> Ok. I looked at it and I thought that this would be inherited by the
> underlying device. I will add it in V3.
> 
> Javier
> 
> 


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] block: add support for zone offline transition
  2020-07-02  8:39       ` Javier González
@ 2020-07-02  8:52         ` Damien Le Moal
  -1 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  8:52 UTC (permalink / raw)
  To: Javier González
  Cc: linux-nvme, linux-block, hch, kbusch, sagi, axboe, mb,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 2020/07/02 17:40, Javier González wrote:
> On 02.07.2020 08:10, Damien Le Moal wrote:
>> On 2020/07/02 15:55, Javier González wrote:
>>> From: Javier González <javier.gonz@samsung.com>
>>>
>>> Add support for offline transition on the zoned block device. Use the
>>> existing feature flags for the underlying driver to report support for
>>> the feature, as currently this transition is only supported in ZNS and
>>> not in ZAC/ZBC
>>>
>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>> ---
>>>  block/blk-core.c              | 2 ++
>>>  block/blk-zoned.c             | 8 +++++++-
>>>  drivers/nvme/host/core.c      | 3 +++
>>>  drivers/nvme/host/zns.c       | 2 +-
>>>  include/linux/blk_types.h     | 3 +++
>>>  include/linux/blkdev.h        | 1 -
>>>  include/uapi/linux/blkzoned.h | 3 +++
>>>  7 files changed, 19 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index 03252af8c82c..589cbdacc5ec 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -140,6 +140,7 @@ static const char *const blk_op_name[] = {
>>>  	REQ_OP_NAME(ZONE_CLOSE),
>>>  	REQ_OP_NAME(ZONE_FINISH),
>>>  	REQ_OP_NAME(ZONE_APPEND),
>>> +	REQ_OP_NAME(ZONE_OFFLINE),
>>>  	REQ_OP_NAME(WRITE_SAME),
>>>  	REQ_OP_NAME(WRITE_ZEROES),
>>>  	REQ_OP_NAME(SCSI_IN),
>>> @@ -1030,6 +1031,7 @@ generic_make_request_checks(struct bio *bio)
>>>  	case REQ_OP_ZONE_OPEN:
>>>  	case REQ_OP_ZONE_CLOSE:
>>>  	case REQ_OP_ZONE_FINISH:
>>> +	case REQ_OP_ZONE_OFFLINE:
>>>  		if (!blk_queue_is_zoned(q))
>>>  			goto not_supported;
>>>  		break;
>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>> index 0f156e96e48f..b97f67f462b4 100644
>>> --- a/block/blk-zoned.c
>>> +++ b/block/blk-zoned.c
>>> @@ -320,7 +320,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>  }
>>>
>>>  /*
>>> - * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctl processing.
>>> + * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE, BLKFINISHZONE and BLKOFFLINEZONE
>>> + * ioctl processing.
>>>   * Called from blkdev_ioctl.
>>>   */
>>>  int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>> @@ -363,6 +364,11 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>>  	case BLKFINISHZONE:
>>>  		op = REQ_OP_ZONE_FINISH;
>>>  		break;
>>> +	case BLKOFFLINEZONE:
>>> +		if (!(q->zone_flags & BLK_ZONE_REP_OFFLINE))
>>> +			return -EINVAL;
>>
>> return -ENOTTY here.
>>
>> That is the error returned for regular block devices when a zone ioctl is
>> received, indicating the lack of support for these ioctls. Since this is also a
>> lack  of support by the device here too, we may as well keep the same error
>> code. Returning -EINVAL should be reserved for cases where the device can accept
>> the ioctl but start sector or number of sectors is invalid.
> 
> Ok.
> 
>>
>>
>>> +		op = REQ_OP_ZONE_OFFLINE;
>>> +		break;
>>>  	default:
>>>  		return -ENOTTY;
>>>  	}
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index e5f754889234..1f5c7fc3d2c9 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -776,6 +776,9 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
>>>  	case REQ_OP_ZONE_FINISH:
>>>  		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_FINISH);
>>>  		break;
>>> +	case REQ_OP_ZONE_OFFLINE:
>>> +		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_OFFLINE);
>>> +		break;
>>>  	case REQ_OP_WRITE_ZEROES:
>>>  		ret = nvme_setup_write_zeroes(ns, req, cmd);
>>>  		break;
>>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>>> index 888264261ba3..b34d2ed13825 100644
>>> --- a/drivers/nvme/host/zns.c
>>> +++ b/drivers/nvme/host/zns.c
>>> @@ -81,7 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>>>  	}
>>>
>>>  	q->limits.zoned = BLK_ZONED_HM;
>>> -	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
>>
>> The name BLK_ZONE_REP_OFFLINE is not ideal.  This flag is not about if offline
>> condition will be reported or not. It is about the drive supporting an explicit
>> offlining zone operation.
> 
> I wanted to follow the same convention. I can change the name in the
> same enum for the report flags.
> 
>>
>>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>>  free_data:
>>>  	kfree(id);
>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>> index ccb895f911b1..c0123c643e2f 100644
>>> --- a/include/linux/blk_types.h
>>> +++ b/include/linux/blk_types.h
>>> @@ -316,6 +316,8 @@ enum req_opf {
>>>  	REQ_OP_ZONE_FINISH	= 12,
>>>  	/* write data at the current zone write pointer */
>>>  	REQ_OP_ZONE_APPEND	= 13,
>>> +	/* Transition a zone to offline */
>>> +	REQ_OP_ZONE_OFFLINE	= 14,
>>>
>>>  	/* SCSI passthrough using struct scsi_request */
>>>  	REQ_OP_SCSI_IN		= 32,
>>> @@ -455,6 +457,7 @@ static inline bool op_is_zone_mgmt(enum req_opf op)
>>>  	case REQ_OP_ZONE_OPEN:
>>>  	case REQ_OP_ZONE_CLOSE:
>>>  	case REQ_OP_ZONE_FINISH:
>>> +	case REQ_OP_ZONE_OFFLINE:
>>>  		return true;
>>>  	default:
>>>  		return false;
>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>> index 3f2e3425fa53..e489b646486d 100644
>>> --- a/include/linux/blkdev.h
>>> +++ b/include/linux/blkdev.h
>>> @@ -370,7 +370,6 @@ extern int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>  				     unsigned int cmd, unsigned long arg);
>>>  extern int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>>  				  unsigned int cmd, unsigned long arg);
>>> -
>>>  #else /* CONFIG_BLK_DEV_ZONED */
>>>
>>>  static inline unsigned int blkdev_nr_zones(struct gendisk *disk)
>>> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
>>> index 42c3366cc25f..e5adf4a9f4b0 100644
>>> --- a/include/uapi/linux/blkzoned.h
>>> +++ b/include/uapi/linux/blkzoned.h
>>> @@ -77,9 +77,11 @@ enum blk_zone_cond {
>>>   * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
>>>   *
>>>   * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
>>> + * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.
>>
>> The device supports explicit zone offline transition
>>
>> Since the implicit transition by the device may happen, even on SMR disks.
>>
>> But I am not sure this flags is very useful. Or rather, isn't it out of place
>> here ? Device features are normally reported through sysfs (e.g. discard, etc).
>> It is certainly confusing and not matching the user doc for rep.flag which
>> states that the flags are about the zone descriptors, not what the device can
>> do. So at the very least, the comments need to change.
>>
>> The other thing is that the implementation does not consider device mapper case
>> again: if a DM target is built on one or more ZNS drives all supporting zone
>> offline, then the target should be allowed to report zone offline support too,
>> no ? dm-linear and dm-flakey certainly should be allowed to do that. Exporting a
>> "zone_offline" (or something like named that) sysfs limit would allow that to be
>> supported easily through limit stacking and avoid the need for the report flag.
> 
> I can add that too. I left it out as I did not add any implementation on
> top of it for the device mapper itself. If this is the way that it makes
> sense, then we can add it to the different device mappers later on.

If the capability is advertised as a queue limit and the generic limit stacking
function adjusted, you get DM support for free and will not have to add anything
to targets supporting zoned block devices.

> 
>>
>> Happy to here others opinion about this one though.
>>
>>>   */
>>>  enum blk_zone_report_flags {
>>>  	BLK_ZONE_REP_CAPACITY	= (1 << 0),
>>> +	BLK_ZONE_REP_OFFLINE	= (1 << 1),
>>>  };
>>>
>>>  /**
>>> @@ -166,5 +168,6 @@ struct blk_zone_range {
>>>  #define BLKOPENZONE	_IOW(0x12, 134, struct blk_zone_range)
>>>  #define BLKCLOSEZONE	_IOW(0x12, 135, struct blk_zone_range)
>>>  #define BLKFINISHZONE	_IOW(0x12, 136, struct blk_zone_range)
>>> +#define BLKOFFLINEZONE	_IOW(0x12, 137, struct blk_zone_range)
>>>
>>>  #endif /* _UAPI_BLKZONED_H */
>>>
>>
> 
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] block: add support for zone offline transition
@ 2020-07-02  8:52         ` Damien Le Moal
  0 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-02  8:52 UTC (permalink / raw)
  To: Javier González
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, linux-nvme,
	Nitesh Shetty, linux-block, kbusch, mb, hch

On 2020/07/02 17:40, Javier González wrote:
> On 02.07.2020 08:10, Damien Le Moal wrote:
>> On 2020/07/02 15:55, Javier González wrote:
>>> From: Javier González <javier.gonz@samsung.com>
>>>
>>> Add support for offline transition on the zoned block device. Use the
>>> existing feature flags for the underlying driver to report support for
>>> the feature, as currently this transition is only supported in ZNS and
>>> not in ZAC/ZBC
>>>
>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>> ---
>>>  block/blk-core.c              | 2 ++
>>>  block/blk-zoned.c             | 8 +++++++-
>>>  drivers/nvme/host/core.c      | 3 +++
>>>  drivers/nvme/host/zns.c       | 2 +-
>>>  include/linux/blk_types.h     | 3 +++
>>>  include/linux/blkdev.h        | 1 -
>>>  include/uapi/linux/blkzoned.h | 3 +++
>>>  7 files changed, 19 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index 03252af8c82c..589cbdacc5ec 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -140,6 +140,7 @@ static const char *const blk_op_name[] = {
>>>  	REQ_OP_NAME(ZONE_CLOSE),
>>>  	REQ_OP_NAME(ZONE_FINISH),
>>>  	REQ_OP_NAME(ZONE_APPEND),
>>> +	REQ_OP_NAME(ZONE_OFFLINE),
>>>  	REQ_OP_NAME(WRITE_SAME),
>>>  	REQ_OP_NAME(WRITE_ZEROES),
>>>  	REQ_OP_NAME(SCSI_IN),
>>> @@ -1030,6 +1031,7 @@ generic_make_request_checks(struct bio *bio)
>>>  	case REQ_OP_ZONE_OPEN:
>>>  	case REQ_OP_ZONE_CLOSE:
>>>  	case REQ_OP_ZONE_FINISH:
>>> +	case REQ_OP_ZONE_OFFLINE:
>>>  		if (!blk_queue_is_zoned(q))
>>>  			goto not_supported;
>>>  		break;
>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>> index 0f156e96e48f..b97f67f462b4 100644
>>> --- a/block/blk-zoned.c
>>> +++ b/block/blk-zoned.c
>>> @@ -320,7 +320,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>  }
>>>
>>>  /*
>>> - * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctl processing.
>>> + * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE, BLKFINISHZONE and BLKOFFLINEZONE
>>> + * ioctl processing.
>>>   * Called from blkdev_ioctl.
>>>   */
>>>  int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>> @@ -363,6 +364,11 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>>  	case BLKFINISHZONE:
>>>  		op = REQ_OP_ZONE_FINISH;
>>>  		break;
>>> +	case BLKOFFLINEZONE:
>>> +		if (!(q->zone_flags & BLK_ZONE_REP_OFFLINE))
>>> +			return -EINVAL;
>>
>> return -ENOTTY here.
>>
>> That is the error returned for regular block devices when a zone ioctl is
>> received, indicating the lack of support for these ioctls. Since this is also a
>> lack  of support by the device here too, we may as well keep the same error
>> code. Returning -EINVAL should be reserved for cases where the device can accept
>> the ioctl but start sector or number of sectors is invalid.
> 
> Ok.
> 
>>
>>
>>> +		op = REQ_OP_ZONE_OFFLINE;
>>> +		break;
>>>  	default:
>>>  		return -ENOTTY;
>>>  	}
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index e5f754889234..1f5c7fc3d2c9 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -776,6 +776,9 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
>>>  	case REQ_OP_ZONE_FINISH:
>>>  		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_FINISH);
>>>  		break;
>>> +	case REQ_OP_ZONE_OFFLINE:
>>> +		ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_OFFLINE);
>>> +		break;
>>>  	case REQ_OP_WRITE_ZEROES:
>>>  		ret = nvme_setup_write_zeroes(ns, req, cmd);
>>>  		break;
>>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>>> index 888264261ba3..b34d2ed13825 100644
>>> --- a/drivers/nvme/host/zns.c
>>> +++ b/drivers/nvme/host/zns.c
>>> @@ -81,7 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>>>  	}
>>>
>>>  	q->limits.zoned = BLK_ZONED_HM;
>>> -	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY | BLK_ZONE_REP_OFFLINE;
>>
>> The name BLK_ZONE_REP_OFFLINE is not ideal.  This flag is not about if offline
>> condition will be reported or not. It is about the drive supporting an explicit
>> offlining zone operation.
> 
> I wanted to follow the same convention. I can change the name in the
> same enum for the report flags.
> 
>>
>>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>>  free_data:
>>>  	kfree(id);
>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>> index ccb895f911b1..c0123c643e2f 100644
>>> --- a/include/linux/blk_types.h
>>> +++ b/include/linux/blk_types.h
>>> @@ -316,6 +316,8 @@ enum req_opf {
>>>  	REQ_OP_ZONE_FINISH	= 12,
>>>  	/* write data at the current zone write pointer */
>>>  	REQ_OP_ZONE_APPEND	= 13,
>>> +	/* Transition a zone to offline */
>>> +	REQ_OP_ZONE_OFFLINE	= 14,
>>>
>>>  	/* SCSI passthrough using struct scsi_request */
>>>  	REQ_OP_SCSI_IN		= 32,
>>> @@ -455,6 +457,7 @@ static inline bool op_is_zone_mgmt(enum req_opf op)
>>>  	case REQ_OP_ZONE_OPEN:
>>>  	case REQ_OP_ZONE_CLOSE:
>>>  	case REQ_OP_ZONE_FINISH:
>>> +	case REQ_OP_ZONE_OFFLINE:
>>>  		return true;
>>>  	default:
>>>  		return false;
>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>> index 3f2e3425fa53..e489b646486d 100644
>>> --- a/include/linux/blkdev.h
>>> +++ b/include/linux/blkdev.h
>>> @@ -370,7 +370,6 @@ extern int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>  				     unsigned int cmd, unsigned long arg);
>>>  extern int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>>  				  unsigned int cmd, unsigned long arg);
>>> -
>>>  #else /* CONFIG_BLK_DEV_ZONED */
>>>
>>>  static inline unsigned int blkdev_nr_zones(struct gendisk *disk)
>>> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
>>> index 42c3366cc25f..e5adf4a9f4b0 100644
>>> --- a/include/uapi/linux/blkzoned.h
>>> +++ b/include/uapi/linux/blkzoned.h
>>> @@ -77,9 +77,11 @@ enum blk_zone_cond {
>>>   * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
>>>   *
>>>   * @BLK_ZONE_REP_CAPACITY: Zone descriptor has capacity field.
>>> + * @BLK_ZONE_REP_OFFLINE : Zone device supports offline transition.
>>
>> The device supports explicit zone offline transition
>>
>> Since the implicit transition by the device may happen, even on SMR disks.
>>
>> But I am not sure this flags is very useful. Or rather, isn't it out of place
>> here ? Device features are normally reported through sysfs (e.g. discard, etc).
>> It is certainly confusing and not matching the user doc for rep.flag which
>> states that the flags are about the zone descriptors, not what the device can
>> do. So at the very least, the comments need to change.
>>
>> The other thing is that the implementation does not consider device mapper case
>> again: if a DM target is built on one or more ZNS drives all supporting zone
>> offline, then the target should be allowed to report zone offline support too,
>> no ? dm-linear and dm-flakey certainly should be allowed to do that. Exporting a
>> "zone_offline" (or something like named that) sysfs limit would allow that to be
>> supported easily through limit stacking and avoid the need for the report flag.
> 
> I can add that too. I left it out as I did not add any implementation on
> top of it for the device mapper itself. If this is the way that it makes
> sense, then we can add it to the different device mappers later on.

If the capability is advertised as a queue limit and the generic limit stacking
function adjusted, you get DM support for free and will not have to add anything
to targets supporting zoned block devices.

> 
>>
>> Happy to here others opinion about this one though.
>>
>>>   */
>>>  enum blk_zone_report_flags {
>>>  	BLK_ZONE_REP_CAPACITY	= (1 << 0),
>>> +	BLK_ZONE_REP_OFFLINE	= (1 << 1),
>>>  };
>>>
>>>  /**
>>> @@ -166,5 +168,6 @@ struct blk_zone_range {
>>>  #define BLKOPENZONE	_IOW(0x12, 134, struct blk_zone_range)
>>>  #define BLKCLOSEZONE	_IOW(0x12, 135, struct blk_zone_range)
>>>  #define BLKFINISHZONE	_IOW(0x12, 136, struct blk_zone_range)
>>> +#define BLKOFFLINEZONE	_IOW(0x12, 137, struct blk_zone_range)
>>>
>>>  #endif /* _UAPI_BLKZONED_H */
>>>
>>
> 
> 


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
  2020-07-02  8:49         ` Damien Le Moal
@ 2020-07-02 10:27           ` Javier González
  -1 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02 10:27 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-nvme, linux-block, hch, kbusch, sagi, axboe, mb,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 02.07.2020 08:49, Damien Le Moal wrote:
>On 2020/07/02 17:34, Javier González wrote:
>> On 02.07.2020 07:54, Damien Le Moal wrote:
>>> On 2020/07/02 15:55, Javier González wrote:
>>>> From: Javier González <javier.gonz@samsung.com>
>>>>
>>>> As the zoned block device will have to deal with features that are
>>>> optional for the backend device, add a flag field to inform the block
>>>> layer about supported features. This builds on top of
>>>> blk_zone_report_flags and extendes to the zone report code.
>>>>
>>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>>> ---
>>>>  block/blk-zoned.c              | 3 ++-
>>>>  drivers/block/null_blk_zoned.c | 2 ++
>>>>  drivers/nvme/host/zns.c        | 1 +
>>>>  drivers/scsi/sd.c              | 2 ++
>>>>  include/linux/blkdev.h         | 3 +++
>>>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>> index 81152a260354..0f156e96e48f 100644
>>>> --- a/block/blk-zoned.c
>>>> +++ b/block/blk-zoned.c
>>>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>>  		return ret;
>>>>
>>>>  	rep.nr_zones = ret;
>>>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>>>> +	rep.flags = q->zone_flags;
>>>
>>> zone_flags seem to be a fairly generic flags field while rep.flags is only about
>>> the reported descriptors structure. So you may want to define a
>>> BLK_ZONE_REP_FLAGS_MASK and have:
>>>
>>> 	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>>>
>>> to avoid flags meaningless for the user being set.
>>>
>>> In any case, since *all* zoned block devices now report capacity, I do not
>>> really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>>> considering that you set the flag for all zoned device types, including scsi
>>> which does not have zone capacity. This makes q->zone_flags rather confusing
>>> instead of clearly defining the device features as you mentioned in the commit
>>> message.
>>>
>>> I think it may be better to just drop this patch, and if needed, introduce the
>>> zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).
>>
>> I am using this as a way to pass the OFFLINE support flag to the block
>> layer. I used this too for the attributes. Are you thinking of a
>> different way to send this?
>>
>> I believe this fits too for capacity, but we can just pass it in
>> all report as before if you prefer.
>
>The point is that this patch as is does nothing and is needed as a preparatory
>patch if we want to have the offline flag set in the report. But:
>1) As commented in the offline ioctl patch, I am not sure the flag makes a lot
>of sense. sysfs or nothing at all may be OK as well. When we introduced the new
>open/close/finish ioctls, we did not add flags to signal that the device
>supports them. Granted, it was for nullblk and scsi, and both had the support.
>But running an application using these on an old kernel, and you will get
>-ENOTTY, meaning, not supported. So simply introducing the offline ioctl without
>any flag would be OK I think.

I see. My understanding after some comments from Christoph was that we
should use these bits to signal any optional features / capabilities
that would depend on the underlying driver, just as it is done with the
capacity flag today.

If not for the offline transition, for the attributes, I see it exactly
as the same use case as capacity, where we signal that a new field is
reported in the report structure.

Am I missing something here?

>
>Since DM support for this would be nice too and we now are in a situation where
>not all devices support the  ioctl, instead of a report flag (a little out of
>place), a queue limit exported through sysfs may be a cleaner way to both
>support DM correctly (stacked limits) and signal the support to the user. If you
>choose this approach, then this patch is not needed. The zoned_flags, or a
>regular queue flag like for RESET_ALL can be added in the offline ioctl patch.

I see. If we can reach an agreement that the above is a bad
understanding on my side, I will be happy to translate this into a sysfs
entry. If it is OK, I'll give it this week in the mailing list and send
a V4 next week.

>
>>
>>>
>>>> +
>>>>  	if (copy_to_user(argp, &rep, sizeof(struct blk_zone_report)))
>>>>  		return -EFAULT;
>>>>  	return 0;
>>>> diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
>>>> index b05832eb21b2..957c2103f240 100644
>>>> --- a/drivers/block/null_blk_zoned.c
>>>> +++ b/drivers/block/null_blk_zoned.c
>>>> @@ -78,6 +78,8 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
>>>>  	}
>>>>
>>>>  	q->limits.zoned = BLK_ZONED_HM;
>>>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>>> +
>>>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>>>  	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
>>>>
>>>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>>>> index 0642d3c54e8f..888264261ba3 100644
>>>> --- a/drivers/nvme/host/zns.c
>>>> +++ b/drivers/nvme/host/zns.c
>>>> @@ -81,6 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>>>>  	}
>>>>
>>>>  	q->limits.zoned = BLK_ZONED_HM;
>>>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>>>  free_data:
>>>>  	kfree(id);
>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>>>> index d90fefffe31b..b9c920bace28 100644
>>>> --- a/drivers/scsi/sd.c
>>>> +++ b/drivers/scsi/sd.c
>>>> @@ -2967,6 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>>>  	if (sdkp->device->type == TYPE_ZBC) {
>>>>  		/* Host-managed */
>>>>  		q->limits.zoned = BLK_ZONED_HM;
>>>> +		q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>>>  	} else {
>>>>  		sdkp->zoned = (buffer[8] >> 4) & 3;
>>>>  		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
>>>> @@ -2983,6 +2984,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>>>  					  "Drive-managed SMR disk\n");
>>>>  		}
>>>>  	}
>>>> +
>>>>  	if (blk_queue_is_zoned(q) && sdkp->first_scan)
>>>>  		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
>>>>  		      q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
>>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>>> index 8fd900998b4e..3f2e3425fa53 100644
>>>> --- a/include/linux/blkdev.h
>>>> +++ b/include/linux/blkdev.h
>>>> @@ -512,12 +512,15 @@ struct request_queue {
>>>>  	 * Stacking drivers (device mappers) may or may not initialize
>>>>  	 * these fields.
>>>>  	 *
>>>> +	 * Flags represent features as described by blk_zone_report_flags in blkzoned.h
>>>> +	 *
>>>>  	 * Reads of this information must be protected with blk_queue_enter() /
>>>>  	 * blk_queue_exit(). Modifying this information is only allowed while
>>>>  	 * no requests are being processed. See also blk_mq_freeze_queue() and
>>>>  	 * blk_mq_unfreeze_queue().
>>>>  	 */
>>>>  	unsigned int		nr_zones;
>>>> +	unsigned int		zone_flags;
>>>>  	unsigned long		*conv_zones_bitmap;
>>>>  	unsigned long		*seq_zones_wlock;
>>>>  #endif /* CONFIG_BLK_DEV_ZONED */
>>>>
>>>
>>> And you are missing device-mapper support. DM target devices have a request
>>> queue that would need to set the zone_flags too.

Yes. As mentioned, I did not want to introduce more changes to this
series, just fix the mistake I made with some added debug code.

>>
>> Ok. I looked at it and I thought that this would be inherited by the
>> underlying device. I will add it in V3.
>>
>> Javier
>>
>>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
@ 2020-07-02 10:27           ` Javier González
  0 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-02 10:27 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, linux-nvme,
	Nitesh Shetty, linux-block, kbusch, mb, hch

On 02.07.2020 08:49, Damien Le Moal wrote:
>On 2020/07/02 17:34, Javier González wrote:
>> On 02.07.2020 07:54, Damien Le Moal wrote:
>>> On 2020/07/02 15:55, Javier González wrote:
>>>> From: Javier González <javier.gonz@samsung.com>
>>>>
>>>> As the zoned block device will have to deal with features that are
>>>> optional for the backend device, add a flag field to inform the block
>>>> layer about supported features. This builds on top of
>>>> blk_zone_report_flags and extendes to the zone report code.
>>>>
>>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>>> ---
>>>>  block/blk-zoned.c              | 3 ++-
>>>>  drivers/block/null_blk_zoned.c | 2 ++
>>>>  drivers/nvme/host/zns.c        | 1 +
>>>>  drivers/scsi/sd.c              | 2 ++
>>>>  include/linux/blkdev.h         | 3 +++
>>>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>> index 81152a260354..0f156e96e48f 100644
>>>> --- a/block/blk-zoned.c
>>>> +++ b/block/blk-zoned.c
>>>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>>  		return ret;
>>>>
>>>>  	rep.nr_zones = ret;
>>>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>>>> +	rep.flags = q->zone_flags;
>>>
>>> zone_flags seem to be a fairly generic flags field while rep.flags is only about
>>> the reported descriptors structure. So you may want to define a
>>> BLK_ZONE_REP_FLAGS_MASK and have:
>>>
>>> 	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>>>
>>> to avoid flags meaningless for the user being set.
>>>
>>> In any case, since *all* zoned block devices now report capacity, I do not
>>> really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>>> considering that you set the flag for all zoned device types, including scsi
>>> which does not have zone capacity. This makes q->zone_flags rather confusing
>>> instead of clearly defining the device features as you mentioned in the commit
>>> message.
>>>
>>> I think it may be better to just drop this patch, and if needed, introduce the
>>> zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).
>>
>> I am using this as a way to pass the OFFLINE support flag to the block
>> layer. I used this too for the attributes. Are you thinking of a
>> different way to send this?
>>
>> I believe this fits too for capacity, but we can just pass it in
>> all report as before if you prefer.
>
>The point is that this patch as is does nothing and is needed as a preparatory
>patch if we want to have the offline flag set in the report. But:
>1) As commented in the offline ioctl patch, I am not sure the flag makes a lot
>of sense. sysfs or nothing at all may be OK as well. When we introduced the new
>open/close/finish ioctls, we did not add flags to signal that the device
>supports them. Granted, it was for nullblk and scsi, and both had the support.
>But running an application using these on an old kernel, and you will get
>-ENOTTY, meaning, not supported. So simply introducing the offline ioctl without
>any flag would be OK I think.

I see. My understanding after some comments from Christoph was that we
should use these bits to signal any optional features / capabilities
that would depend on the underlying driver, just as it is done with the
capacity flag today.

If not for the offline transition, for the attributes, I see it exactly
as the same use case as capacity, where we signal that a new field is
reported in the report structure.

Am I missing something here?

>
>Since DM support for this would be nice too and we now are in a situation where
>not all devices support the  ioctl, instead of a report flag (a little out of
>place), a queue limit exported through sysfs may be a cleaner way to both
>support DM correctly (stacked limits) and signal the support to the user. If you
>choose this approach, then this patch is not needed. The zoned_flags, or a
>regular queue flag like for RESET_ALL can be added in the offline ioctl patch.

I see. If we can reach an agreement that the above is a bad
understanding on my side, I will be happy to translate this into a sysfs
entry. If it is OK, I'll give it this week in the mailing list and send
a V4 next week.

>
>>
>>>
>>>> +
>>>>  	if (copy_to_user(argp, &rep, sizeof(struct blk_zone_report)))
>>>>  		return -EFAULT;
>>>>  	return 0;
>>>> diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
>>>> index b05832eb21b2..957c2103f240 100644
>>>> --- a/drivers/block/null_blk_zoned.c
>>>> +++ b/drivers/block/null_blk_zoned.c
>>>> @@ -78,6 +78,8 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
>>>>  	}
>>>>
>>>>  	q->limits.zoned = BLK_ZONED_HM;
>>>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>>> +
>>>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>>>  	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
>>>>
>>>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>>>> index 0642d3c54e8f..888264261ba3 100644
>>>> --- a/drivers/nvme/host/zns.c
>>>> +++ b/drivers/nvme/host/zns.c
>>>> @@ -81,6 +81,7 @@ int nvme_update_zone_info(struct gendisk *disk, struct nvme_ns *ns,
>>>>  	}
>>>>
>>>>  	q->limits.zoned = BLK_ZONED_HM;
>>>> +	q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>>>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>>>>  free_data:
>>>>  	kfree(id);
>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>>>> index d90fefffe31b..b9c920bace28 100644
>>>> --- a/drivers/scsi/sd.c
>>>> +++ b/drivers/scsi/sd.c
>>>> @@ -2967,6 +2967,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>>>  	if (sdkp->device->type == TYPE_ZBC) {
>>>>  		/* Host-managed */
>>>>  		q->limits.zoned = BLK_ZONED_HM;
>>>> +		q->zone_flags = BLK_ZONE_REP_CAPACITY;
>>>>  	} else {
>>>>  		sdkp->zoned = (buffer[8] >> 4) & 3;
>>>>  		if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
>>>> @@ -2983,6 +2984,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>>>>  					  "Drive-managed SMR disk\n");
>>>>  		}
>>>>  	}
>>>> +
>>>>  	if (blk_queue_is_zoned(q) && sdkp->first_scan)
>>>>  		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
>>>>  		      q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
>>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>>> index 8fd900998b4e..3f2e3425fa53 100644
>>>> --- a/include/linux/blkdev.h
>>>> +++ b/include/linux/blkdev.h
>>>> @@ -512,12 +512,15 @@ struct request_queue {
>>>>  	 * Stacking drivers (device mappers) may or may not initialize
>>>>  	 * these fields.
>>>>  	 *
>>>> +	 * Flags represent features as described by blk_zone_report_flags in blkzoned.h
>>>> +	 *
>>>>  	 * Reads of this information must be protected with blk_queue_enter() /
>>>>  	 * blk_queue_exit(). Modifying this information is only allowed while
>>>>  	 * no requests are being processed. See also blk_mq_freeze_queue() and
>>>>  	 * blk_mq_unfreeze_queue().
>>>>  	 */
>>>>  	unsigned int		nr_zones;
>>>> +	unsigned int		zone_flags;
>>>>  	unsigned long		*conv_zones_bitmap;
>>>>  	unsigned long		*seq_zones_wlock;
>>>>  #endif /* CONFIG_BLK_DEV_ZONED */
>>>>
>>>
>>> And you are missing device-mapper support. DM target devices have a request
>>> queue that would need to set the zone_flags too.

Yes. As mentioned, I did not want to introduce more changes to this
series, just fix the mistake I made with some added debug code.

>>
>> Ok. I looked at it and I thought that this would be inherited by the
>> underlying device. I will add it in V3.
>>
>> Javier
>>
>>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
  2020-07-02 10:27           ` Javier González
@ 2020-07-03  5:20             ` Damien Le Moal
  -1 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-03  5:20 UTC (permalink / raw)
  To: Javier González
  Cc: linux-nvme, linux-block, hch, kbusch, sagi, axboe, mb,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 2020/07/02 19:27, Javier González wrote:
> On 02.07.2020 08:49, Damien Le Moal wrote:
>> On 2020/07/02 17:34, Javier González wrote:
>>> On 02.07.2020 07:54, Damien Le Moal wrote:
>>>> On 2020/07/02 15:55, Javier González wrote:
>>>>> From: Javier González <javier.gonz@samsung.com>
>>>>>
>>>>> As the zoned block device will have to deal with features that are
>>>>> optional for the backend device, add a flag field to inform the block
>>>>> layer about supported features. This builds on top of
>>>>> blk_zone_report_flags and extendes to the zone report code.
>>>>>
>>>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>>>> ---
>>>>>  block/blk-zoned.c              | 3 ++-
>>>>>  drivers/block/null_blk_zoned.c | 2 ++
>>>>>  drivers/nvme/host/zns.c        | 1 +
>>>>>  drivers/scsi/sd.c              | 2 ++
>>>>>  include/linux/blkdev.h         | 3 +++
>>>>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>>> index 81152a260354..0f156e96e48f 100644
>>>>> --- a/block/blk-zoned.c
>>>>> +++ b/block/blk-zoned.c
>>>>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>>>  		return ret;
>>>>>
>>>>>  	rep.nr_zones = ret;
>>>>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>>>>> +	rep.flags = q->zone_flags;
>>>>
>>>> zone_flags seem to be a fairly generic flags field while rep.flags is only about
>>>> the reported descriptors structure. So you may want to define a
>>>> BLK_ZONE_REP_FLAGS_MASK and have:
>>>>
>>>> 	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>>>>
>>>> to avoid flags meaningless for the user being set.
>>>>
>>>> In any case, since *all* zoned block devices now report capacity, I do not
>>>> really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>>>> considering that you set the flag for all zoned device types, including scsi
>>>> which does not have zone capacity. This makes q->zone_flags rather confusing
>>>> instead of clearly defining the device features as you mentioned in the commit
>>>> message.
>>>>
>>>> I think it may be better to just drop this patch, and if needed, introduce the
>>>> zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).
>>>
>>> I am using this as a way to pass the OFFLINE support flag to the block
>>> layer. I used this too for the attributes. Are you thinking of a
>>> different way to send this?
>>>
>>> I believe this fits too for capacity, but we can just pass it in
>>> all report as before if you prefer.
>>
>> The point is that this patch as is does nothing and is needed as a preparatory
>> patch if we want to have the offline flag set in the report. But:
>> 1) As commented in the offline ioctl patch, I am not sure the flag makes a lot
>> of sense. sysfs or nothing at all may be OK as well. When we introduced the new
>> open/close/finish ioctls, we did not add flags to signal that the device
>> supports them. Granted, it was for nullblk and scsi, and both had the support.
>> But running an application using these on an old kernel, and you will get
>> -ENOTTY, meaning, not supported. So simply introducing the offline ioctl without
>> any flag would be OK I think.
> 
> I see. My understanding after some comments from Christoph was that we
> should use these bits to signal any optional features / capabilities
> that would depend on the underlying driver, just as it is done with the
> capacity flag today.
> 
> If not for the offline transition, for the attributes, I see it exactly
> as the same use case as capacity, where we signal that a new field is
> reported in the report structure.
> 
> Am I missing something here?

Using the report flags for reporting supported operations/features is fine, but
as already mentioned, then document that by changing the description of enum
blk_zone_report_flags. Right now, it still says:

 * enum blk_zone_report_flags - Feature flags of reported zone descriptors.

Which kind of really point to things the zone descriptor itself has compared to
earlier versions of the descriptor rather than what the device can do or not.

And as I argued already, using a flag is fine only for letting a user know about
a supported feature, but that is far from ideal to have device-mapper easily
discover what a target can support or not. For that, stacked queue limits are
much simpler. Which leads to exporting that limit in sysfs rather than using a
flag for the user interface.

>> Since DM support for this would be nice too and we now are in a situation where
>> not all devices support the  ioctl, instead of a report flag (a little out of
>> place), a queue limit exported through sysfs may be a cleaner way to both
>> support DM correctly (stacked limits) and signal the support to the user. If you
>> choose this approach, then this patch is not needed. The zoned_flags, or a
>> regular queue flag like for RESET_ALL can be added in the offline ioctl patch.
> 
> I see. If we can reach an agreement that the above is a bad
> understanding on my side, I will be happy to translate this into a sysfs
> entry. If it is OK, I'll give it this week in the mailing list and send
> a V4 next week.

It is all about device mapper support... How are you planning to do it using a
queue flag without said flags being stackable as queue limits ?

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
@ 2020-07-03  5:20             ` Damien Le Moal
  0 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-03  5:20 UTC (permalink / raw)
  To: Javier González
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, linux-nvme,
	Nitesh Shetty, linux-block, kbusch, mb, hch

On 2020/07/02 19:27, Javier González wrote:
> On 02.07.2020 08:49, Damien Le Moal wrote:
>> On 2020/07/02 17:34, Javier González wrote:
>>> On 02.07.2020 07:54, Damien Le Moal wrote:
>>>> On 2020/07/02 15:55, Javier González wrote:
>>>>> From: Javier González <javier.gonz@samsung.com>
>>>>>
>>>>> As the zoned block device will have to deal with features that are
>>>>> optional for the backend device, add a flag field to inform the block
>>>>> layer about supported features. This builds on top of
>>>>> blk_zone_report_flags and extendes to the zone report code.
>>>>>
>>>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>>>> ---
>>>>>  block/blk-zoned.c              | 3 ++-
>>>>>  drivers/block/null_blk_zoned.c | 2 ++
>>>>>  drivers/nvme/host/zns.c        | 1 +
>>>>>  drivers/scsi/sd.c              | 2 ++
>>>>>  include/linux/blkdev.h         | 3 +++
>>>>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>>> index 81152a260354..0f156e96e48f 100644
>>>>> --- a/block/blk-zoned.c
>>>>> +++ b/block/blk-zoned.c
>>>>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>>>  		return ret;
>>>>>
>>>>>  	rep.nr_zones = ret;
>>>>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>>>>> +	rep.flags = q->zone_flags;
>>>>
>>>> zone_flags seem to be a fairly generic flags field while rep.flags is only about
>>>> the reported descriptors structure. So you may want to define a
>>>> BLK_ZONE_REP_FLAGS_MASK and have:
>>>>
>>>> 	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>>>>
>>>> to avoid flags meaningless for the user being set.
>>>>
>>>> In any case, since *all* zoned block devices now report capacity, I do not
>>>> really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>>>> considering that you set the flag for all zoned device types, including scsi
>>>> which does not have zone capacity. This makes q->zone_flags rather confusing
>>>> instead of clearly defining the device features as you mentioned in the commit
>>>> message.
>>>>
>>>> I think it may be better to just drop this patch, and if needed, introduce the
>>>> zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).
>>>
>>> I am using this as a way to pass the OFFLINE support flag to the block
>>> layer. I used this too for the attributes. Are you thinking of a
>>> different way to send this?
>>>
>>> I believe this fits too for capacity, but we can just pass it in
>>> all report as before if you prefer.
>>
>> The point is that this patch as is does nothing and is needed as a preparatory
>> patch if we want to have the offline flag set in the report. But:
>> 1) As commented in the offline ioctl patch, I am not sure the flag makes a lot
>> of sense. sysfs or nothing at all may be OK as well. When we introduced the new
>> open/close/finish ioctls, we did not add flags to signal that the device
>> supports them. Granted, it was for nullblk and scsi, and both had the support.
>> But running an application using these on an old kernel, and you will get
>> -ENOTTY, meaning, not supported. So simply introducing the offline ioctl without
>> any flag would be OK I think.
> 
> I see. My understanding after some comments from Christoph was that we
> should use these bits to signal any optional features / capabilities
> that would depend on the underlying driver, just as it is done with the
> capacity flag today.
> 
> If not for the offline transition, for the attributes, I see it exactly
> as the same use case as capacity, where we signal that a new field is
> reported in the report structure.
> 
> Am I missing something here?

Using the report flags for reporting supported operations/features is fine, but
as already mentioned, then document that by changing the description of enum
blk_zone_report_flags. Right now, it still says:

 * enum blk_zone_report_flags - Feature flags of reported zone descriptors.

Which kind of really point to things the zone descriptor itself has compared to
earlier versions of the descriptor rather than what the device can do or not.

And as I argued already, using a flag is fine only for letting a user know about
a supported feature, but that is far from ideal to have device-mapper easily
discover what a target can support or not. For that, stacked queue limits are
much simpler. Which leads to exporting that limit in sysfs rather than using a
flag for the user interface.

>> Since DM support for this would be nice too and we now are in a situation where
>> not all devices support the  ioctl, instead of a report flag (a little out of
>> place), a queue limit exported through sysfs may be a cleaner way to both
>> support DM correctly (stacked limits) and signal the support to the user. If you
>> choose this approach, then this patch is not needed. The zoned_flags, or a
>> regular queue flag like for RESET_ALL can be added in the offline ioctl patch.
> 
> I see. If we can reach an agreement that the above is a bad
> understanding on my side, I will be happy to translate this into a sysfs
> entry. If it is OK, I'll give it this week in the mailing list and send
> a V4 next week.

It is all about device mapper support... How are you planning to do it using a
queue flag without said flags being stackable as queue limits ?

-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
  2020-07-03  5:20             ` Damien Le Moal
@ 2020-07-03  6:28               ` Javier González
  -1 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-03  6:28 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-nvme, linux-block, hch, kbusch, sagi, axboe, mb,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 03.07.2020 05:20, Damien Le Moal wrote:
>On 2020/07/02 19:27, Javier González wrote:
>> On 02.07.2020 08:49, Damien Le Moal wrote:
>>> On 2020/07/02 17:34, Javier González wrote:
>>>> On 02.07.2020 07:54, Damien Le Moal wrote:
>>>>> On 2020/07/02 15:55, Javier González wrote:
>>>>>> From: Javier González <javier.gonz@samsung.com>
>>>>>>
>>>>>> As the zoned block device will have to deal with features that are
>>>>>> optional for the backend device, add a flag field to inform the block
>>>>>> layer about supported features. This builds on top of
>>>>>> blk_zone_report_flags and extendes to the zone report code.
>>>>>>
>>>>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>>>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>>>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>>>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>>>>> ---
>>>>>>  block/blk-zoned.c              | 3 ++-
>>>>>>  drivers/block/null_blk_zoned.c | 2 ++
>>>>>>  drivers/nvme/host/zns.c        | 1 +
>>>>>>  drivers/scsi/sd.c              | 2 ++
>>>>>>  include/linux/blkdev.h         | 3 +++
>>>>>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>>>> index 81152a260354..0f156e96e48f 100644
>>>>>> --- a/block/blk-zoned.c
>>>>>> +++ b/block/blk-zoned.c
>>>>>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>>>>  		return ret;
>>>>>>
>>>>>>  	rep.nr_zones = ret;
>>>>>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>>>>>> +	rep.flags = q->zone_flags;
>>>>>
>>>>> zone_flags seem to be a fairly generic flags field while rep.flags is only about
>>>>> the reported descriptors structure. So you may want to define a
>>>>> BLK_ZONE_REP_FLAGS_MASK and have:
>>>>>
>>>>> 	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>>>>>
>>>>> to avoid flags meaningless for the user being set.
>>>>>
>>>>> In any case, since *all* zoned block devices now report capacity, I do not
>>>>> really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>>>>> considering that you set the flag for all zoned device types, including scsi
>>>>> which does not have zone capacity. This makes q->zone_flags rather confusing
>>>>> instead of clearly defining the device features as you mentioned in the commit
>>>>> message.
>>>>>
>>>>> I think it may be better to just drop this patch, and if needed, introduce the
>>>>> zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).
>>>>
>>>> I am using this as a way to pass the OFFLINE support flag to the block
>>>> layer. I used this too for the attributes. Are you thinking of a
>>>> different way to send this?
>>>>
>>>> I believe this fits too for capacity, but we can just pass it in
>>>> all report as before if you prefer.
>>>
>>> The point is that this patch as is does nothing and is needed as a preparatory
>>> patch if we want to have the offline flag set in the report. But:
>>> 1) As commented in the offline ioctl patch, I am not sure the flag makes a lot
>>> of sense. sysfs or nothing at all may be OK as well. When we introduced the new
>>> open/close/finish ioctls, we did not add flags to signal that the device
>>> supports them. Granted, it was for nullblk and scsi, and both had the support.
>>> But running an application using these on an old kernel, and you will get
>>> -ENOTTY, meaning, not supported. So simply introducing the offline ioctl without
>>> any flag would be OK I think.
>>
>> I see. My understanding after some comments from Christoph was that we
>> should use these bits to signal any optional features / capabilities
>> that would depend on the underlying driver, just as it is done with the
>> capacity flag today.
>>
>> If not for the offline transition, for the attributes, I see it exactly
>> as the same use case as capacity, where we signal that a new field is
>> reported in the report structure.
>>
>> Am I missing something here?
>
>Using the report flags for reporting supported operations/features is fine, but
>as already mentioned, then document that by changing the description of enum
>blk_zone_report_flags. Right now, it still says:
>
> * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
>
>Which kind of really point to things the zone descriptor itself has compared to
>earlier versions of the descriptor rather than what the device can do or not.

I see how the offline transition collides with this model. But can we
agree that the zone attributes is something that fits here?

>
>And as I argued already, using a flag is fine only for letting a user know about
>a supported feature, but that is far from ideal to have device-mapper easily
>discover what a target can support or not. For that, stacked queue limits are
>much simpler. Which leads to exporting that limit in sysfs rather than using a
>flag for the user interface.

See below

>
>>> Since DM support for this would be nice too and we now are in a situation where
>>> not all devices support the  ioctl, instead of a report flag (a little out of
>>> place), a queue limit exported through sysfs may be a cleaner way to both
>>> support DM correctly (stacked limits) and signal the support to the user. If you
>>> choose this approach, then this patch is not needed. The zoned_flags, or a
>>> regular queue flag like for RESET_ALL can be added in the offline ioctl patch.
>>
>> I see. If we can reach an agreement that the above is a bad
>> understanding on my side, I will be happy to translate this into a sysfs
>> entry. If it is OK, I'll give it this week in the mailing list and send
>> a V4 next week.
>
>It is all about device mapper support... How are you planning to do it using a
>queue flag without said flags being stackable as queue limits ?

I guess what I am struggle with here is that you see the capacity
feature as a different extension than the attributes and the offline
transition.

The way I see it, there are things the device supports and things that
the driver supports. ZAC/ZBC does not support a size != capacity, but
the driver sets these values to be the same. One thing I struggle with
is that saying that we support capacity and setting size = capacity puts
the burden in the user to check these values across all zones to
determine if they can support the driver or not. I think it would be
clear to have a feature explicitly stating that capacity != size.

For the attributes, this is very similar - some apply, some don't, and
we only report the ones that make sense for each driver. I do not see
how device mappers are treated differently here than in the existing
capacity features.

For the offline transition, I see that the current patches do not set
flags properly and we would need to inherit the underlying device if we
want to do it this way. I can understand from your comments that this is
not a good way of doing it. I see in one of your comments from V1 that
we could deal with this transition in the scsi driver and keep it in
memory (i.e., device is still read-only, but driver transitions to
offline) - can you comment on this?

Javier

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
@ 2020-07-03  6:28               ` Javier González
  0 siblings, 0 replies; 38+ messages in thread
From: Javier González @ 2020-07-03  6:28 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, linux-nvme,
	Nitesh Shetty, linux-block, kbusch, mb, hch

On 03.07.2020 05:20, Damien Le Moal wrote:
>On 2020/07/02 19:27, Javier González wrote:
>> On 02.07.2020 08:49, Damien Le Moal wrote:
>>> On 2020/07/02 17:34, Javier González wrote:
>>>> On 02.07.2020 07:54, Damien Le Moal wrote:
>>>>> On 2020/07/02 15:55, Javier González wrote:
>>>>>> From: Javier González <javier.gonz@samsung.com>
>>>>>>
>>>>>> As the zoned block device will have to deal with features that are
>>>>>> optional for the backend device, add a flag field to inform the block
>>>>>> layer about supported features. This builds on top of
>>>>>> blk_zone_report_flags and extendes to the zone report code.
>>>>>>
>>>>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>>>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>>>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>>>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>>>>> ---
>>>>>>  block/blk-zoned.c              | 3 ++-
>>>>>>  drivers/block/null_blk_zoned.c | 2 ++
>>>>>>  drivers/nvme/host/zns.c        | 1 +
>>>>>>  drivers/scsi/sd.c              | 2 ++
>>>>>>  include/linux/blkdev.h         | 3 +++
>>>>>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>>>> index 81152a260354..0f156e96e48f 100644
>>>>>> --- a/block/blk-zoned.c
>>>>>> +++ b/block/blk-zoned.c
>>>>>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>>>>  		return ret;
>>>>>>
>>>>>>  	rep.nr_zones = ret;
>>>>>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>>>>>> +	rep.flags = q->zone_flags;
>>>>>
>>>>> zone_flags seem to be a fairly generic flags field while rep.flags is only about
>>>>> the reported descriptors structure. So you may want to define a
>>>>> BLK_ZONE_REP_FLAGS_MASK and have:
>>>>>
>>>>> 	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>>>>>
>>>>> to avoid flags meaningless for the user being set.
>>>>>
>>>>> In any case, since *all* zoned block devices now report capacity, I do not
>>>>> really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>>>>> considering that you set the flag for all zoned device types, including scsi
>>>>> which does not have zone capacity. This makes q->zone_flags rather confusing
>>>>> instead of clearly defining the device features as you mentioned in the commit
>>>>> message.
>>>>>
>>>>> I think it may be better to just drop this patch, and if needed, introduce the
>>>>> zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).
>>>>
>>>> I am using this as a way to pass the OFFLINE support flag to the block
>>>> layer. I used this too for the attributes. Are you thinking of a
>>>> different way to send this?
>>>>
>>>> I believe this fits too for capacity, but we can just pass it in
>>>> all report as before if you prefer.
>>>
>>> The point is that this patch as is does nothing and is needed as a preparatory
>>> patch if we want to have the offline flag set in the report. But:
>>> 1) As commented in the offline ioctl patch, I am not sure the flag makes a lot
>>> of sense. sysfs or nothing at all may be OK as well. When we introduced the new
>>> open/close/finish ioctls, we did not add flags to signal that the device
>>> supports them. Granted, it was for nullblk and scsi, and both had the support.
>>> But running an application using these on an old kernel, and you will get
>>> -ENOTTY, meaning, not supported. So simply introducing the offline ioctl without
>>> any flag would be OK I think.
>>
>> I see. My understanding after some comments from Christoph was that we
>> should use these bits to signal any optional features / capabilities
>> that would depend on the underlying driver, just as it is done with the
>> capacity flag today.
>>
>> If not for the offline transition, for the attributes, I see it exactly
>> as the same use case as capacity, where we signal that a new field is
>> reported in the report structure.
>>
>> Am I missing something here?
>
>Using the report flags for reporting supported operations/features is fine, but
>as already mentioned, then document that by changing the description of enum
>blk_zone_report_flags. Right now, it still says:
>
> * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
>
>Which kind of really point to things the zone descriptor itself has compared to
>earlier versions of the descriptor rather than what the device can do or not.

I see how the offline transition collides with this model. But can we
agree that the zone attributes is something that fits here?

>
>And as I argued already, using a flag is fine only for letting a user know about
>a supported feature, but that is far from ideal to have device-mapper easily
>discover what a target can support or not. For that, stacked queue limits are
>much simpler. Which leads to exporting that limit in sysfs rather than using a
>flag for the user interface.

See below

>
>>> Since DM support for this would be nice too and we now are in a situation where
>>> not all devices support the  ioctl, instead of a report flag (a little out of
>>> place), a queue limit exported through sysfs may be a cleaner way to both
>>> support DM correctly (stacked limits) and signal the support to the user. If you
>>> choose this approach, then this patch is not needed. The zoned_flags, or a
>>> regular queue flag like for RESET_ALL can be added in the offline ioctl patch.
>>
>> I see. If we can reach an agreement that the above is a bad
>> understanding on my side, I will be happy to translate this into a sysfs
>> entry. If it is OK, I'll give it this week in the mailing list and send
>> a V4 next week.
>
>It is all about device mapper support... How are you planning to do it using a
>queue flag without said flags being stackable as queue limits ?

I guess what I am struggle with here is that you see the capacity
feature as a different extension than the attributes and the offline
transition.

The way I see it, there are things the device supports and things that
the driver supports. ZAC/ZBC does not support a size != capacity, but
the driver sets these values to be the same. One thing I struggle with
is that saying that we support capacity and setting size = capacity puts
the burden in the user to check these values across all zones to
determine if they can support the driver or not. I think it would be
clear to have a feature explicitly stating that capacity != size.

For the attributes, this is very similar - some apply, some don't, and
we only report the ones that make sense for each driver. I do not see
how device mappers are treated differently here than in the existing
capacity features.

For the offline transition, I see that the current patches do not set
flags properly and we would need to inherit the underlying device if we
want to do it this way. I can understand from your comments that this is
not a good way of doing it. I see in one of your comments from V1 that
we could deal with this transition in the scsi driver and keep it in
memory (i.e., device is still read-only, but driver transitions to
offline) - can you comment on this?

Javier

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
  2020-07-03  6:28               ` Javier González
@ 2020-07-03  8:35                 ` Damien Le Moal
  -1 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-03  8:35 UTC (permalink / raw)
  To: Javier González
  Cc: linux-nvme, linux-block, hch, kbusch, sagi, axboe, mb,
	SelvaKumar S, Kanchan Joshi, Nitesh Shetty

On 2020/07/03 15:28, Javier González wrote:
> On 03.07.2020 05:20, Damien Le Moal wrote:
>> On 2020/07/02 19:27, Javier González wrote:
>>> On 02.07.2020 08:49, Damien Le Moal wrote:
>>>> On 2020/07/02 17:34, Javier González wrote:
>>>>> On 02.07.2020 07:54, Damien Le Moal wrote:
>>>>>> On 2020/07/02 15:55, Javier González wrote:
>>>>>>> From: Javier González <javier.gonz@samsung.com>
>>>>>>>
>>>>>>> As the zoned block device will have to deal with features that are
>>>>>>> optional for the backend device, add a flag field to inform the block
>>>>>>> layer about supported features. This builds on top of
>>>>>>> blk_zone_report_flags and extendes to the zone report code.
>>>>>>>
>>>>>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>>>>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>>>>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>>>>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>>>>>> ---
>>>>>>>  block/blk-zoned.c              | 3 ++-
>>>>>>>  drivers/block/null_blk_zoned.c | 2 ++
>>>>>>>  drivers/nvme/host/zns.c        | 1 +
>>>>>>>  drivers/scsi/sd.c              | 2 ++
>>>>>>>  include/linux/blkdev.h         | 3 +++
>>>>>>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>>>>> index 81152a260354..0f156e96e48f 100644
>>>>>>> --- a/block/blk-zoned.c
>>>>>>> +++ b/block/blk-zoned.c
>>>>>>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>>>>>  		return ret;
>>>>>>>
>>>>>>>  	rep.nr_zones = ret;
>>>>>>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>>>>>>> +	rep.flags = q->zone_flags;
>>>>>>
>>>>>> zone_flags seem to be a fairly generic flags field while rep.flags is only about
>>>>>> the reported descriptors structure. So you may want to define a
>>>>>> BLK_ZONE_REP_FLAGS_MASK and have:
>>>>>>
>>>>>> 	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>>>>>>
>>>>>> to avoid flags meaningless for the user being set.
>>>>>>
>>>>>> In any case, since *all* zoned block devices now report capacity, I do not
>>>>>> really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>>>>>> considering that you set the flag for all zoned device types, including scsi
>>>>>> which does not have zone capacity. This makes q->zone_flags rather confusing
>>>>>> instead of clearly defining the device features as you mentioned in the commit
>>>>>> message.
>>>>>>
>>>>>> I think it may be better to just drop this patch, and if needed, introduce the
>>>>>> zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).
>>>>>
>>>>> I am using this as a way to pass the OFFLINE support flag to the block
>>>>> layer. I used this too for the attributes. Are you thinking of a
>>>>> different way to send this?
>>>>>
>>>>> I believe this fits too for capacity, but we can just pass it in
>>>>> all report as before if you prefer.
>>>>
>>>> The point is that this patch as is does nothing and is needed as a preparatory
>>>> patch if we want to have the offline flag set in the report. But:
>>>> 1) As commented in the offline ioctl patch, I am not sure the flag makes a lot
>>>> of sense. sysfs or nothing at all may be OK as well. When we introduced the new
>>>> open/close/finish ioctls, we did not add flags to signal that the device
>>>> supports them. Granted, it was for nullblk and scsi, and both had the support.
>>>> But running an application using these on an old kernel, and you will get
>>>> -ENOTTY, meaning, not supported. So simply introducing the offline ioctl without
>>>> any flag would be OK I think.
>>>
>>> I see. My understanding after some comments from Christoph was that we
>>> should use these bits to signal any optional features / capabilities
>>> that would depend on the underlying driver, just as it is done with the
>>> capacity flag today.
>>>
>>> If not for the offline transition, for the attributes, I see it exactly
>>> as the same use case as capacity, where we signal that a new field is
>>> reported in the report structure.
>>>
>>> Am I missing something here?
>>
>> Using the report flags for reporting supported operations/features is fine, but
>> as already mentioned, then document that by changing the description of enum
>> blk_zone_report_flags. Right now, it still says:
>>
>> * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
>>
>> Which kind of really point to things the zone descriptor itself has compared to
>> earlier versions of the descriptor rather than what the device can do or not.
> 
> I see how the offline transition collides with this model. But can we
> agree that the zone attributes is something that fits here?

Flags may be fine. I still do not see it though due to your patches missing
device mapper support. Please send patches that cover all zoned block device
types, including device mapper to show that this is a good approach.

>> And as I argued already, using a flag is fine only for letting a user know about
>> a supported feature, but that is far from ideal to have device-mapper easily
>> discover what a target can support or not. For that, stacked queue limits are
>> much simpler. Which leads to exporting that limit in sysfs rather than using a
>> flag for the user interface.
> 
> See below
> 
>>
>>>> Since DM support for this would be nice too and we now are in a situation where
>>>> not all devices support the  ioctl, instead of a report flag (a little out of
>>>> place), a queue limit exported through sysfs may be a cleaner way to both
>>>> support DM correctly (stacked limits) and signal the support to the user. If you
>>>> choose this approach, then this patch is not needed. The zoned_flags, or a
>>>> regular queue flag like for RESET_ALL can be added in the offline ioctl patch.
>>>
>>> I see. If we can reach an agreement that the above is a bad
>>> understanding on my side, I will be happy to translate this into a sysfs
>>> entry. If it is OK, I'll give it this week in the mailing list and send
>>> a V4 next week.
>>
>> It is all about device mapper support... How are you planning to do it using a
>> queue flag without said flags being stackable as queue limits ?
> 
> I guess what I am struggle with here is that you see the capacity
> feature as a different extension than the attributes and the offline
> transition.
> 
> The way I see it, there are things the device supports and things that
> the driver supports. ZAC/ZBC does not support a size != capacity, but
> the driver sets these values to be the same. One thing I struggle with
> is that saying that we support capacity and setting size = capacity puts
> the burden in the user to check these values across all zones to
> determine if they can support the driver or not. I think it would be
> clear to have a feature explicitly stating that capacity != size.

From the user (application or in-kernel) perspective, *all* zoned device now
have a zone capacity. I do not think it matters if that comes from the device or
from the driver. It is unconditionally supported. The report flag
BLK_ZONE_REP_CAPACITY exist to signal to user applications that the zone
descriptor has the capactiy field, to allow for backward compatibility with
older kernels. See patches for fio or util-linux blkzone posted to see its use.

Having a flag that explicitly states that zone capacity != zone size would be
useful only and only if *all* zones of the device have that property. What if
only some zones have a capacity lower than the zone size ? The user would still
need a zone report to sort things out.

> For the attributes, this is very similar - some apply, some don't, and
> we only report the ones that make sense for each driver. I do not see
> how device mappers are treated differently here than in the existing
> capacity features.

The capacity feature is supported by *all* zoned block device. Support coming
from the device or from the driver does not matter. The API is consistent for
all device types. So in my opinion, there is no need to consider zone capacity
as an attribute. It is supported by everything, it is now part of the API,
signaled with BLK_ZONE_REP_CAPACITY for backward compatibility handling.

You are mixing up the report flags describing the structure of the zone
descriptor API with optional device features attributes. Adding an attribute
field to the report zones descriptors is fine. The presence of that field can be
signaled with a BLK_ZONE_REP_ATTR report flag as you already proposed. No problem.

But then, imagine a device mapper target using multiple devices with different
optional features. How do you propose that the device mapper target sorts this
out and come up with a coherent set of attributes for its target report zones ?
Please propose a solution for that. Your patches so far are all missing that.

> For the offline transition, I see that the current patches do not set
> flags properly and we would need to inherit the underlying device if we
> want to do it this way. I can understand from your comments that this is
> not a good way of doing it. I see in one of your comments from V1 that
> we could deal with this transition in the scsi driver and keep it in
> memory (i.e., device is still read-only, but driver transitions to
> offline) - can you comment on this?

That would not be persistent across reboot, unlike for ZNS drive. So forget
about this. That was wishful thinking from me. This will not work.

Explicit zone offlining is truly an optional feature of the device. Signaling
its support to the user is necessary, but again, I would like that to be done in
a way that does not force device mapper targets to forever report nothing for
the optional features. A flag field can certainly be used, but it has to  be
handled in blk_queue_stack_limits(). That may be as simple as taking a bitwise
AND of each device zone attribute flags.


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] block: Add zone flags to queue zone prop.
@ 2020-07-03  8:35                 ` Damien Le Moal
  0 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2020-07-03  8:35 UTC (permalink / raw)
  To: Javier González
  Cc: axboe, SelvaKumar S, sagi, Kanchan Joshi, linux-nvme,
	Nitesh Shetty, linux-block, kbusch, mb, hch

On 2020/07/03 15:28, Javier González wrote:
> On 03.07.2020 05:20, Damien Le Moal wrote:
>> On 2020/07/02 19:27, Javier González wrote:
>>> On 02.07.2020 08:49, Damien Le Moal wrote:
>>>> On 2020/07/02 17:34, Javier González wrote:
>>>>> On 02.07.2020 07:54, Damien Le Moal wrote:
>>>>>> On 2020/07/02 15:55, Javier González wrote:
>>>>>>> From: Javier González <javier.gonz@samsung.com>
>>>>>>>
>>>>>>> As the zoned block device will have to deal with features that are
>>>>>>> optional for the backend device, add a flag field to inform the block
>>>>>>> layer about supported features. This builds on top of
>>>>>>> blk_zone_report_flags and extendes to the zone report code.
>>>>>>>
>>>>>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>>>>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>>>>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>>>>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>>>>>> ---
>>>>>>>  block/blk-zoned.c              | 3 ++-
>>>>>>>  drivers/block/null_blk_zoned.c | 2 ++
>>>>>>>  drivers/nvme/host/zns.c        | 1 +
>>>>>>>  drivers/scsi/sd.c              | 2 ++
>>>>>>>  include/linux/blkdev.h         | 3 +++
>>>>>>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>>>>> index 81152a260354..0f156e96e48f 100644
>>>>>>> --- a/block/blk-zoned.c
>>>>>>> +++ b/block/blk-zoned.c
>>>>>>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>>>>>  		return ret;
>>>>>>>
>>>>>>>  	rep.nr_zones = ret;
>>>>>>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>>>>>>> +	rep.flags = q->zone_flags;
>>>>>>
>>>>>> zone_flags seem to be a fairly generic flags field while rep.flags is only about
>>>>>> the reported descriptors structure. So you may want to define a
>>>>>> BLK_ZONE_REP_FLAGS_MASK and have:
>>>>>>
>>>>>> 	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>>>>>>
>>>>>> to avoid flags meaningless for the user being set.
>>>>>>
>>>>>> In any case, since *all* zoned block devices now report capacity, I do not
>>>>>> really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>>>>>> considering that you set the flag for all zoned device types, including scsi
>>>>>> which does not have zone capacity. This makes q->zone_flags rather confusing
>>>>>> instead of clearly defining the device features as you mentioned in the commit
>>>>>> message.
>>>>>>
>>>>>> I think it may be better to just drop this patch, and if needed, introduce the
>>>>>> zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).
>>>>>
>>>>> I am using this as a way to pass the OFFLINE support flag to the block
>>>>> layer. I used this too for the attributes. Are you thinking of a
>>>>> different way to send this?
>>>>>
>>>>> I believe this fits too for capacity, but we can just pass it in
>>>>> all report as before if you prefer.
>>>>
>>>> The point is that this patch as is does nothing and is needed as a preparatory
>>>> patch if we want to have the offline flag set in the report. But:
>>>> 1) As commented in the offline ioctl patch, I am not sure the flag makes a lot
>>>> of sense. sysfs or nothing at all may be OK as well. When we introduced the new
>>>> open/close/finish ioctls, we did not add flags to signal that the device
>>>> supports them. Granted, it was for nullblk and scsi, and both had the support.
>>>> But running an application using these on an old kernel, and you will get
>>>> -ENOTTY, meaning, not supported. So simply introducing the offline ioctl without
>>>> any flag would be OK I think.
>>>
>>> I see. My understanding after some comments from Christoph was that we
>>> should use these bits to signal any optional features / capabilities
>>> that would depend on the underlying driver, just as it is done with the
>>> capacity flag today.
>>>
>>> If not for the offline transition, for the attributes, I see it exactly
>>> as the same use case as capacity, where we signal that a new field is
>>> reported in the report structure.
>>>
>>> Am I missing something here?
>>
>> Using the report flags for reporting supported operations/features is fine, but
>> as already mentioned, then document that by changing the description of enum
>> blk_zone_report_flags. Right now, it still says:
>>
>> * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
>>
>> Which kind of really point to things the zone descriptor itself has compared to
>> earlier versions of the descriptor rather than what the device can do or not.
> 
> I see how the offline transition collides with this model. But can we
> agree that the zone attributes is something that fits here?

Flags may be fine. I still do not see it though due to your patches missing
device mapper support. Please send patches that cover all zoned block device
types, including device mapper to show that this is a good approach.

>> And as I argued already, using a flag is fine only for letting a user know about
>> a supported feature, but that is far from ideal to have device-mapper easily
>> discover what a target can support or not. For that, stacked queue limits are
>> much simpler. Which leads to exporting that limit in sysfs rather than using a
>> flag for the user interface.
> 
> See below
> 
>>
>>>> Since DM support for this would be nice too and we now are in a situation where
>>>> not all devices support the  ioctl, instead of a report flag (a little out of
>>>> place), a queue limit exported through sysfs may be a cleaner way to both
>>>> support DM correctly (stacked limits) and signal the support to the user. If you
>>>> choose this approach, then this patch is not needed. The zoned_flags, or a
>>>> regular queue flag like for RESET_ALL can be added in the offline ioctl patch.
>>>
>>> I see. If we can reach an agreement that the above is a bad
>>> understanding on my side, I will be happy to translate this into a sysfs
>>> entry. If it is OK, I'll give it this week in the mailing list and send
>>> a V4 next week.
>>
>> It is all about device mapper support... How are you planning to do it using a
>> queue flag without said flags being stackable as queue limits ?
> 
> I guess what I am struggle with here is that you see the capacity
> feature as a different extension than the attributes and the offline
> transition.
> 
> The way I see it, there are things the device supports and things that
> the driver supports. ZAC/ZBC does not support a size != capacity, but
> the driver sets these values to be the same. One thing I struggle with
> is that saying that we support capacity and setting size = capacity puts
> the burden in the user to check these values across all zones to
> determine if they can support the driver or not. I think it would be
> clear to have a feature explicitly stating that capacity != size.

From the user (application or in-kernel) perspective, *all* zoned device now
have a zone capacity. I do not think it matters if that comes from the device or
from the driver. It is unconditionally supported. The report flag
BLK_ZONE_REP_CAPACITY exist to signal to user applications that the zone
descriptor has the capactiy field, to allow for backward compatibility with
older kernels. See patches for fio or util-linux blkzone posted to see its use.

Having a flag that explicitly states that zone capacity != zone size would be
useful only and only if *all* zones of the device have that property. What if
only some zones have a capacity lower than the zone size ? The user would still
need a zone report to sort things out.

> For the attributes, this is very similar - some apply, some don't, and
> we only report the ones that make sense for each driver. I do not see
> how device mappers are treated differently here than in the existing
> capacity features.

The capacity feature is supported by *all* zoned block device. Support coming
from the device or from the driver does not matter. The API is consistent for
all device types. So in my opinion, there is no need to consider zone capacity
as an attribute. It is supported by everything, it is now part of the API,
signaled with BLK_ZONE_REP_CAPACITY for backward compatibility handling.

You are mixing up the report flags describing the structure of the zone
descriptor API with optional device features attributes. Adding an attribute
field to the report zones descriptors is fine. The presence of that field can be
signaled with a BLK_ZONE_REP_ATTR report flag as you already proposed. No problem.

But then, imagine a device mapper target using multiple devices with different
optional features. How do you propose that the device mapper target sorts this
out and come up with a coherent set of attributes for its target report zones ?
Please propose a solution for that. Your patches so far are all missing that.

> For the offline transition, I see that the current patches do not set
> flags properly and we would need to inherit the underlying device if we
> want to do it this way. I can understand from your comments that this is
> not a good way of doing it. I see in one of your comments from V1 that
> we could deal with this transition in the scsi driver and keep it in
> memory (i.e., device is still read-only, but driver transitions to
> offline) - can you comment on this?

That would not be persistent across reboot, unlike for ZNS drive. So forget
about this. That was wishful thinking from me. This will not work.

Explicit zone offlining is truly an optional feature of the device. Signaling
its support to the user is necessary, but again, I would like that to be done in
a way that does not force device mapper targets to forever report nothing for
the optional features. A flag field can certainly be used, but it has to  be
handled in blk_queue_stack_limits(). That may be as simple as taking a bitwise
AND of each device zone attribute flags.


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2020-07-03  8:36 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-02  6:54 [V2 PATCH 0/4] ZNS: Extra features for current patche Javier González
2020-07-02  6:54 ` Javier González
2020-07-02  6:54 ` [PATCH 1/4] block: Add zone flags to queue zone prop Javier González
2020-07-02  6:54   ` Javier González
2020-07-02  7:54   ` Damien Le Moal
2020-07-02  7:54     ` Damien Le Moal
2020-07-02  8:34     ` Javier González
2020-07-02  8:34       ` Javier González
2020-07-02  8:49       ` Damien Le Moal
2020-07-02  8:49         ` Damien Le Moal
2020-07-02 10:27         ` Javier González
2020-07-02 10:27           ` Javier González
2020-07-03  5:20           ` Damien Le Moal
2020-07-03  5:20             ` Damien Le Moal
2020-07-03  6:28             ` Javier González
2020-07-03  6:28               ` Javier González
2020-07-03  8:35               ` Damien Le Moal
2020-07-03  8:35                 ` Damien Le Moal
2020-07-02  6:54 ` [PATCH 2/4] block: add support for zone offline transition Javier González
2020-07-02  6:54   ` Javier González
2020-07-02  8:10   ` Damien Le Moal
2020-07-02  8:10     ` Damien Le Moal
2020-07-02  8:39     ` Javier González
2020-07-02  8:39       ` Javier González
2020-07-02  8:52       ` Damien Le Moal
2020-07-02  8:52         ` Damien Le Moal
2020-07-02  6:54 ` [PATCH 3/4] nvme: Add consistency check for zone count Javier González
2020-07-02  6:54   ` Javier González
2020-07-02  8:16   ` Damien Le Moal
2020-07-02  8:16     ` Damien Le Moal
2020-07-02  8:19   ` Johannes Thumshirn
2020-07-02  8:19     ` Johannes Thumshirn
2020-07-02  8:27     ` Javier González
2020-07-02  8:27       ` Javier González
2020-07-02  6:54 ` [PATCH 4/4] block: add attributes to zone report Javier González
2020-07-02  6:54   ` Javier González
2020-07-02  8:27   ` Damien Le Moal
2020-07-02  8:27     ` Damien Le Moal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.